Abstract: In recent years, research on knowledge sources, decision support systems, data mining, and knowledge discovery in databases has grown in importance, and each of these areas is considered to affect the others. In this article, we merge information sources and knowledge sources to propose a knowledge-based system for management, based on storing and retrieving knowledge, in order to manage information and improve decision making and resource use. We use data mining and the Apriori algorithm in the knowledge discovery process. One problem with the Apriori algorithm is that the user must specify the minimum support threshold. Imagine that a user wants to apply the Apriori algorithm to a database with millions of transactions. The user certainly does not have the necessary knowledge of all transactions in that database and therefore cannot specify a suitable threshold. Our purpose in this article is to improve the Apriori algorithm. To this end, we use fuzzy logic to group the data into clusters before applying the Apriori algorithm to the database, and we also try to suggest the most suitable threshold to the user automatically.
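To make the threshold problem concrete, the following is a minimal sketch of the textbook Apriori procedure in Python. It is not the improved method proposed in the abstract; the function name `apriori` and the toy transactions in the usage note are assumptions chosen for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Illustrative textbook version; `min_support` is a fraction in [0, 1].
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the database.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # One database scan per level to count candidate supports.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent
```

On a hypothetical database of four transactions, `[{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]`, a threshold of 0.5 keeps five itemsets while 0.75 keeps only two, which illustrates how strongly the result depends on a value the user has to guess.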
Abstract: Throughout history, people have made estimates
and inferences about the future by using their past experiences.
Developing information technologies and improvements in
database management systems make it possible to extract useful
information from the knowledge at hand for strategic decisions.
Different methods have therefore been developed. Data mining by
association rule learning is one such method. The Apriori algorithm,
one of the well-known association rule learning algorithms, is not
commonly used on spatio-temporal data sets. However, it is possible
to embed time and space features into the data sets and make the Apriori
algorithm a suitable data mining technique for learning spatio-temporal
association rules. Lake Van, the largest lake in Turkey, is a
closed basin. This feature causes the volume of the lake to increase or
decrease as the amount of water it holds changes. In this study,
evaporation, humidity, lake altitude, rainfall, and
temperature parameters recorded in the Lake Van region over the
years are used by the Apriori algorithm, and a spatio-temporal data
mining application is developed to identify overflows and newly formed
soil regions (underflows) occurring in the coastal parts of
Lake Van. Identifying possible causes of overflows and underflows
may help alert experts to take precautions and make the
necessary investments.
Abstract: In this paper, we present a library recommendation application for the Android system. The objective of this system is to support and advise users in using library resources through a mobile application. We describe the design approaches and functional components of this system. The system was developed using association rules mined with the Apriori algorithm. The project was divided, according to the research purposes, into two parts: developing the mobile application for the online library service, and testing and evaluating the system. Questionnaires were used to measure satisfaction with system usability among specialists and users. The results were satisfactory for both specialists and users.
Abstract: This research aims to create a model for analyzing student behavior in using library resources, based on data mining techniques, with Suan Sunandha Rajabhat University as a case study. The model was created using association rules with the Apriori algorithm. The analysis yielded 14 rules, which were tested on a test data set; the classification accuracy was 79.24 percent and the MSE was 22.91. The results show that the user behavior model built with the association rule technique can be used to manage library resources.
Abstract: In data mining, association rules are used to search
for relations among items in a transaction database. Once the
data are collected and stored, valuable rules can be found through
association rules, helping managers develop marketing strategies
and plan market frameworks. In this paper, we apply fuzzy partition
methods and determine a membership function for the quantitative value of
each transaction item. In addition, managers can express the
importance of items as linguistic terms, which are transformed into
fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth
(FWFP-Growth) is used to complete the data mining process. The
proposed method is expected to improve on the Apriori algorithm in the
efficiency of the overall association rule mining. An example is given to
clearly illustrate the proposed approach.
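As a rough illustration of the fuzzy partition step, the sketch below maps a purchased quantity to membership degrees in three linguistic regions using triangular membership functions. The abstract does not state the shape or boundaries of its membership functions, so the triangular form, the `REGIONS` values, and the function names are all assumptions.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge

# Hypothetical fuzzy partition of item quantities into three regions.
REGIONS = {
    "low": (-4, 1, 6),
    "middle": (1, 6, 11),
    "high": (6, 11, 16),
}

def fuzzify(quantity, regions=REGIONS):
    """Return the membership degree of `quantity` in each linguistic region."""
    return {label: triangular(quantity, *abc) for label, abc in regions.items()}
```

Under these assumed regions, a quantity of 4 belongs partly to "low" and partly to "middle", and not at all to "high". A fuzzy mining step would then accumulate such degrees across transactions instead of counting crisp occurrences.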
Abstract: The Neuro-Fuzzy hybridization scheme has attracted
research interest in pattern classification over the past decade. The
present paper proposes a novel Modified Adaptive Fuzzy Inference
Engine (MAFIE) for pattern classification. A modified Apriori
algorithm technique is utilized to derive a minimal set of decision
rules from input-output data sets. A TSK-type fuzzy inference
system is constructed by the automatic generation of membership
functions and rules by the fuzzy c-means clustering and Apriori
algorithm technique, respectively. The generated adaptive fuzzy
inference engine is adjusted by the least-squares fit and a conjugate
gradient descent algorithm towards better performance with a
minimal set of rules. The proposed MAFIE is able to reduce the
number of rules, which increases exponentially as more input
variables are involved. The performance of the proposed MAFIE is
compared with other existing pattern classification
schemes using Fisher's Iris and Wisconsin breast cancer data sets and
is shown to be very competitive.
Abstract: In data mining, association rules are used to find
associations between the different items in a transaction
database. As data are collected and stored, valuable rules can be found
through association rules, which can be applied to help managers
execute marketing strategies and establish sound market frameworks.
This paper aims to use Fuzzy Frequent Pattern growth (FFP-growth)
to derive fuzzy association rules. First, we apply fuzzy
partition methods and determine a membership function for the quantitative
value of each transaction item. Next, we implement FFP-growth
to carry out the data mining process. In addition, in order to
understand the impact of the Apriori algorithm and the FFP-growth algorithm
on the execution time and the number of generated association
rules, experiments are performed using different database sizes
and thresholds. Lastly, the experimental results show that the FFP-growth
algorithm is more efficient than other existing methods.
Abstract: In a virtual organization, the Knowledge Discovery (KD)
service comprises distributed data resources and computing grid nodes.
A computational grid is integrated with a data grid to form a Knowledge
Grid, which implements the Apriori algorithm for mining association
rules on a grid network. This paper describes the development of a parallel
and distributed version of the Apriori algorithm on the Globus Toolkit using
the Message Passing Interface extended with Grid Services (MPICH-G2).
The Knowledge Grid is built on top of the data and
computational grids to support decision making in real-time
applications. The case study in this paper describes the design and
implementation of local and global mining of frequent itemsets. The
experiments were conducted on different configurations of the grid
network, and computation time was recorded for each operation. We
analyzed our results across the various grid configurations; they show that
the speedup in computation time is almost superlinear.
Abstract: This paper proposes an auto-classification algorithm
for Web pages using data mining techniques. We consider the
problem of discovering association rules between terms in a set of
Web pages belonging to a category in a search engine database, and
present an auto-classification algorithm for solving this problem that
is fundamentally based on the Apriori algorithm. The proposed
technique has two phases. The first is a training phase, in which
human experts determine the categories of different Web pages and
the supervised data mining algorithm combines these categories
with appropriately weighted index terms according to the most highly
supported rules among the most frequent words. The second is
the categorization phase, in which a web crawler crawls the
World Wide Web to build a database categorized according to the
result of the data mining approach. This database contains URLs and
their categories.
Abstract: This paper sets out the feasibility and importance of applying data mining to Web log mining and points out some problems with conventional search engines. It then offers an improved algorithm based on the original AprioriAll algorithm, which has been widely used in Web log mining. The new algorithm adds the User ID property at every step of producing the candidate set and at every step of scanning the database, and uses it to decide whether an item in the candidate set should be put into the large set that will be used to produce the next candidate set. At the same time, in order to reduce the number of database scans, the new algorithm uses the property of the Apriori algorithm to limit the size of the candidate set as soon as it is produced. Test results show that the improved algorithm has lower time and space complexity, suppresses noise better, and fits within memory capacity.
Abstract: A large amount of valuable information is available in
plain text clinical reports. New techniques and technologies are
applied to extract information from these reports. In this study, we
developed a domain-based software system to transform 600
otorhinolaryngology discharge notes into a structured form for
extracting clinical data from them. In order to decrease
the system processing time, the discharge notes were transformed into a data
table after preprocessing. Several word lists were compiled to
identify common sections in the discharge notes, including patient
history, age, problems, and diagnosis. The N-gram method was used
to discover term co-occurrences within each section. Using this
method, a dataset of concept candidates was generated for the
validation step, and then the Predictive Apriori algorithm for Association
Rule Mining (ARM) was applied to validate the candidate concepts.
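The N-gram co-occurrence step can be sketched as follows. This is a minimal illustration with whitespace tokenization, not the system described in the abstract; the function names and the sample section texts in the usage note are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_ngrams(sections, n=2):
    """Count n-grams over a list of section texts (lowercased, split on whitespace)."""
    counts = Counter()
    for text in sections:
        counts.update(ngrams(text.lower().split(), n))
    return counts
```

For two hypothetical diagnosis lines, `count_ngrams(["chronic otitis media", "acute otitis media"])` counts the bigram `("otitis", "media")` twice. Frequently recurring n-grams of this kind would then serve as concept candidates for the validation step.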
Abstract: This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), relies on keyword features to discover association rules among the keywords labeling the documents. The EART system ignores the order in which words occur and instead focuses on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TF-IDF), used for keyword/feature selection to automatically select the most discriminative keywords for association rule generation, and that it uses a data mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming, and indexing of the documents), an Association Rule Mining (ARM) phase (applying our algorithm for Generating Association Rules based on a Weighting scheme, GARW), and a Visualization phase (visualization of results). Experiments were conducted on Web news documents related to the outbreak of bird flu. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and an evaluation of the extracted association rules.
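The keyword weighting idea can be illustrated with a short TF-IDF sketch. The abstract does not say which TF-IDF variant EART uses, so the common form tf × log(N / df) is assumed here, and the function name `tfidf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    Assumed variant: (term count / document length) * log(N / document frequency).
    Returns one {term: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return weights
```

With three hypothetical documents, `[["bird", "flu", "outbreak"], ["bird", "flu", "vaccine"], ["bird", "market"]]`, the term "bird" appears everywhere and receives weight zero, while "outbreak" outranks "flu" in the first document. Terms with the highest weights would be the ones retained as keywords for rule generation.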
Abstract: In an era of intense competition, understanding and satisfying
customers' requirements are critical tasks for a company
to make a profit. Customer relationship management (CRM) has thus
become an important business issue. With the help of
data mining techniques, managers can explore and analyze
large quantities of data to discover meaningful patterns and
rules. Among these methods, the well-known association rule is the most
commonly seen. This paper builds on the Apriori algorithm and uses
a genetic algorithm combined with a data mining method to discover fuzzy
classification rules. The mined results can be applied in CRM to
help decision makers make correct business decisions on marketing
strategies.
Abstract: A data cutting and sorting method (DCSM) is proposed
to optimize the performance of data mining. DCSM reduces the
calculation time by getting rid of redundant data during the data
mining process. In addition, DCSM minimizes the computational units
by splitting the database and by sorting data by support count. In
searching for the relationship between metabolic syndrome
and lifestyle in the health examination database of an electronics
manufacturing company, DCSM demonstrates higher search
efficiency than the traditional Apriori algorithm in tests with different
support counts.