Abstract: Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.
Abstract: Environmental changes and major natural disasters are
most prevalent in the world due to the damage that humanity has
caused to nature and these damages directly affect the lives of
animals. Thus, the study of animal behavior and their interactions
with the environment can provide knowledge that guides researchers
and public agencies in preservation and conservation actions.
Exploratory analysis of animal movement can determine the patterns
of animal behavior and with technological advances the ability of
animals to be tracked and, consequently, behavioral studies have
been expanded. There is a lot of research on animal movement and
behavior, but we note that a proposal that combines resources and
allows for exploratory analysis of animal movement and provide
statistical measures on individual animal behavior and its interaction
with the environment is missing. The contribution of this paper is
to present the framework AniMoveMineR, a unified solution that
aggregates trajectory analysis and data mining techniques to explore
animal movement data and provide a first step in responding questions
about the animal individual behavior and their interactions with other
animals over time and space. We evaluated the framework through the
use of monitored jaguar data in the city of Miranda Pantanal, Brazil,
in order to verify if the use of AniMoveMineR allows to identify the
interaction level between these jaguars. The results were positive and
provided indications about the individual behavior of jaguars and
about which jaguars have the highest or lowest correlation.
Abstract: People, throughout the history, have made estimates
and inferences about the future by using their past experiences.
Developing information technologies and the improvements in the
database management systems make it possible to extract useful
information from knowledge in hand for the strategic decisions.
Therefore, different methods have been developed. Data mining by
association rules learning is one of such methods. Apriori algorithm,
one of the well-known association rules learning algorithms, is not
commonly used in spatio-temporal data sets. However, it is possible
to embed time and space features into the data sets and make Apriori
algorithm a suitable data mining technique for learning spatiotemporal
association rules. Lake Van, the largest lake of Turkey, is a
closed basin. This feature causes the volume of the lake to increase or
decrease as a result of change in water amount it holds. In this study,
evaporation, humidity, lake altitude, amount of rainfall and
temperature parameters recorded in Lake Van region throughout the
years are used by the Apriori algorithm and a spatio-temporal data
mining application is developed to identify overflows and newlyformed
soil regions (underflows) occurring in the coastal parts of
Lake Van. Identifying possible reasons of overflows and underflows
may be used to alert the experts to take precautions and make the
necessary investments.
Abstract: In order to help the expert to validate association rules
extracted from data, some quality measures are proposed in the
literature. We distinguish two categories: objective and subjective
measures. The first one depends on a fixed threshold and on data
quality from which the rules are extracted. The second one consists
on providing to the expert some tools in the objective to explore and
visualize rules during the evaluation step. However, the number of
extracted rules to validate remains high. Thus, the manually mining
rules task is very hard. To solve this problem, we propose, in this
paper, a semi-automatic method to assist the expert during the
association rule's validation. Our method uses rule-based
classification as follow: (i) We transform association rules into
classification rules (classifiers), (ii) We use the generated classifiers
for data classification. (iii) We visualize association rules with their
quality classification to give an idea to the expert and to assist him
during validation process.
Abstract: Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm which we called “MSFP-growth” that enhancing the FPgrowth algorithm by making infrequent child node pruning step with multiple minimum support using maximum constrains. The algorithm is implemented, and it is compared with other common algorithms: Apriori-multiple minimum supports using maximum constraints and FP-growth. The experimental results show that the rule mining from the proposed algorithm are interesting and our algorithm achieved better performance than other algorithms without scarifying the accuracy.
Abstract: Recommendation systems are widely used in
e-commerce applications. The engine of a current recommendation
system recommends items to a particular user based on user
preferences and previous high ratings. Various recommendation
schemes such as collaborative filtering and content-based approaches
are used to build a recommendation system. Most of current
recommendation systems were developed to fit a certain domain such
as books, articles, and movies. We propose1 a hybrid framework
recommendation system to be applied on two dimensional spaces
(User × Item) with a large number of Users and a small number
of Items. Moreover, our proposed framework makes use of both
favorite and non-favorite items of a particular user. The proposed
framework is built upon the integration of association rules mining
and the content-based approach. The results of experiments show
that our proposed framework can provide accurate recommendations
to users.
Abstract: In this paper, we present a recommendation library application on Android system. The objective of this system is to support and advice user to use library resources based on mobile application. We describe the design approaches and functional components of this system. The system was developed based on under association rules, Apriori algorithm. In this project, it was divided the result by the research purposes into 2 parts: developing the Mobile application for online library service and testing and evaluating the system. Questionnaires were used to measure user satisfaction with system usability by specialists and users. The results were satisfactory both specialists and users.
Abstract: Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted a significant attention from several researchers mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm including: rule learning, rule sorting, rule pruning, classifier building, and class prediction.
Abstract: This research aims to create a model for analysis of student behavior using Library resources based on data mining technique in case of Suan Sunandha Rajabhat University. The model was created under association rules, Apriori algorithm. The results were found 14 rules and the rules were tested with testing data set and it showed that the ability of classify data was 79.24percent and the MSE was 22.91. The results showed that the user’s behavior model by using association rule technique can use to manage the library resources.
Abstract: This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance the recommendation performance through reflecting the precise user’s preference. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have high likelihood to purchase products in each product group. Then, this study combines the results of each predictor using the multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses the market basket analysis to extract association rules for co-purchased products. Finally, the system selects customers who have high likelihood to purchase products in each product group and recommends proper products from same or different product groups to them through above two steps. We test the usability of the proposed system by using prototype and real-world transaction and profile data. In addition, we survey about user satisfaction for the recommended product list from the proposed system and the randomly selected product lists. The results also show that the proposed system may be useful in real-world online shopping store.
Abstract: In data mining, the association rules are used to search
for the relations of items of the transactions database. Following the
data is collected and stored, it can find rules of value through
association rules, and assist manager to proceed marketing strategy
and plan market framework. In this paper, we attempt fuzzy partition
methods and decide membership function of quantitative values of
each transaction item. Also, by managers we can reflect the
importance of items as linguistic terms, which are transformed as
fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth
(FWFP-Growth) is used to complete the process of data mining. The
method above is expected to improve Apriori algorithm for its better
efficiency of the whole association rules. An example is given to
clearly illustrate the proposed approach.
Abstract: With a surge of stream processing applications novel
techniques are required for generation and analysis of association
rules in streams. The traditional rule mining solutions cannot handle
streams because they generally require multiple passes over the data
and do not guarantee the results in a predictable, small time. Though
researchers have been proposing algorithms for generation of rules
from streams, there has not been much focus on their analysis.
We propose Association rule profiling, a user centric process for
analyzing association rules and attaching suitable profiles to them
depending on their changing frequency behavior over a previous
snapshot of time in a data stream.
Association rule profiles provide insights into the changing nature
of associations and can be used to characterize the associations. We
discuss importance of characteristics such as predictability of
linkages present in the data and propose metric to quantify it. We
also show how association rule profiles can aid in generation of user
specific, more understandable and actionable rules.
The framework is implemented as SUPAR: System for Usercentric
Profiling of Association Rules in streaming data. The
proposed system offers following capabilities:
i) Continuous monitoring of frequency of streaming item-sets
and detection of significant changes therein for association rule
profiling.
ii) Computation of metrics for quantifying predictability of
associations present in the data.
iii) User-centric control of the characterization process: user
can control the framework through a) constraint specification and b)
non-interesting rule elimination.
Abstract: Association rules are an important problem in data
mining. Massively increasing volume of data in real life databases
has motivated researchers to design novel and incremental algorithms
for association rules mining. In this paper, we propose an incremental
association rules mining algorithm that integrates shocking
interestingness criterion during the process of building the model. A
new interesting measure called shocking measure is introduced. One
of the main features of the proposed approach is to capture the user
background knowledge, which is monotonically augmented. The
incremental model that reflects the changing data and the user beliefs
is attractive in order to make the over all KDD process more
effective and efficient. We implemented the proposed approach and
experiment it with some public datasets and found the results quite
promising.
Abstract: The Kansei engineering is a technology which
converts human feelings into quantitative terms and helps designers
develop new products that meet customers- expectation. Standard
Kansei engineering procedure involves finding relationships between
human feelings and design elements of which many researchers have
found forward and backward relationship through various soft
computing techniques. In this paper, we proposed the framework of
Kansei engineering linking relationship not only between human
feelings and design elements, but also the whole part of product, by
constructing association rules. In this experiment, we obtain input
from emotion score that subjects rate when they see the whole part of
the product by applying semantic differentials. Then, association
rules are constructed to discover the combination of design element
which affects the human feeling. The results of our experiment
suggest the pattern of relationship of design elements according to
human feelings which can be derived from the whole part of product.
Abstract: Biclustering is a very useful data mining technique for
identifying patterns where different genes are co-related based on a
subset of conditions in gene expression analysis. Association rules
mining is an efficient approach to achieve biclustering as in
BIMODULE algorithm but it is sensitive to the value given to its
input parameters and the discretization procedure used in the
preprocessing step, also when noise is present, classical association
rules miners discover multiple small fragments of the true bicluster,
but miss the true bicluster itself. This paper formally presents a
generalized noise tolerant bicluster model, termed as μBicluster. An
iterative algorithm termed as BIDENS based on the proposed model
is introduced that can discover a set of k possibly overlapping
biclusters simultaneously. Our model uses a more flexible method to
partition the dimensions to preserve meaningful and significant
biclusters. The proposed algorithm allows discovering biclusters that
hard to be discovered by BIMODULE. Experimental study on yeast,
human gene expression data and several artificial datasets shows that
our algorithm offers substantial improvements over several
previously proposed biclustering algorithms.
Abstract: In an era of knowledge explosion, the growth of data
increases rapidly day by day. Since data storage is a limited resource,
how to reduce the data space in the process becomes a challenge issue.
Data compression provides a good solution which can lower the
required space. Data mining has many useful applications in recent
years because it can help users discover interesting knowledge in large
databases. However, existing compression algorithms are not
appropriate for data mining. In [1, 2], two different approaches were
proposed to compress databases and then perform the data mining
process. However, they all lack the ability to decompress the data to
their original state and improve the data mining performance. In this
research a new approach called Mining Merged Transactions with the
Quantification Table (M2TQT) was proposed to solve these problems.
M2TQT uses the relationship of transactions to merge related
transactions and builds a quantification table to prune the candidate
itemsets which are impossible to become frequent in order to improve
the performance of mining association rules. The experiments show
that M2TQT performs better than existing approaches.
Abstract: There are several approaches in trying to solve the
Quantitative 1Structure-Activity Relationship (QSAR) problem.
These approaches are based either on statistical methods or on
predictive data mining. Among the statistical methods, one should
consider regression analysis, pattern recognition (such as cluster
analysis, factor analysis and principal components analysis) or partial
least squares. Predictive data mining techniques use either neural
networks, or genetic programming, or neuro-fuzzy knowledge. These
approaches have a low explanatory capability or non at all. This
paper attempts to establish a new approach in solving QSAR
problems using descriptive data mining. This way, the relationship
between the chemical properties and the activity of a substance
would be comprehensibly modeled.
Abstract: The purpose of this research aims to discover the
knowledge for analysis student motivation behavior on e-Learning
based on Data Mining Techniques, in case of the Information
Technology for Communication and Learning Course at Suan
Sunandha Rajabhat University. The data mining techniques was
applied in this research including association rules, classification
techniques. The results showed that using data mining technique can
indicate the important variables that influence the student motivation
behavior on e-Learning.
Abstract: The problem of frequent itemset mining is considered in this paper. One new technique proposed to generate frequent patterns in large databases without time-consuming candidate generation. This technique is based on focusing on transaction instead of concentrating on itemset. This algorithm based on take intersection between one transaction and others transaction and the maximum shared items between transactions computed instead of creating itemset and computing their frequency. With applying real life transactions and some consumption is taken from real life data, the significant efficiency acquire from databases in generation association rules mining.
Abstract: In data mining, the association rules are used to find
for the associations between the different items of the transactions
database. As the data collected and stored, rules of value can be found
through association rules, which can be applied to help managers
execute marketing strategies and establish sound market frameworks.
This paper aims to use Fuzzy Frequent Pattern growth (FFP-growth)
to derive from fuzzy association rules. At first, we apply fuzzy
partition methods and decide a membership function of quantitative
value for each transaction item. Next, we implement FFP-growth
to deal with the process of data mining. In addition, in order to
understand the impact of Apriori algorithm and FFP-growth algorithm
on the execution time and the number of generated association
rules, the experiment will be performed by using different sizes of
databases and thresholds. Lastly, the experiment results show FFPgrowth
algorithm is more efficient than other existing methods.