Abstract: Recommender systems have been developed to provide content and services suited to users based on their behaviors and interests. Due to information overload in online discussion forums and users' diverse interests, recommending relevant topics and threads is considered helpful for improving the ease of forum usage. Recommendations are needed even more in educational forums, to help learners find relevant information. We present a hybrid thread recommender system for MOOC forums that applies social network analysis and association rule mining techniques. Initial results indicate that the proposed recommender system performs comparatively well given the limited data available from users' previous posts in the forum.
Abstract: Construction defects are a major source of negative impacts on project performance, including schedule delays and cost overruns. Since construction defects generally occur when a few associated causes combine, a thorough understanding of defect causality is required in order to prevent construction defects more systematically. To address this issue, this paper uses association rule mining (ARM) to quantify the causality between defect causes, and social network analysis (SNA) to find indirect causality among them. The suggested approach is validated with 350 defect instances from concrete works in 32 projects in Korea. The results show that the interrelationships revealed by the approach reflect the characteristics of the concrete task and the important causes that should be prevented.
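The combination of ARM and SNA described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the defect instances, cause names, confidence threshold, and function names below are all hypothetical. The sketch estimates the confidence of "cause a implies cause b" from co-occurrence in defect instances, keeps strong rules as directed edges, and then treats causes reachable only through an intermediate cause as indirectly related.

```python
from collections import deque
from itertools import permutations

# Hypothetical defect instances: each set lists the causes recorded for one defect.
instances = [
    {"design_error", "rework"},
    {"design_error", "rework"},
    {"design_error", "rework"},
    {"rework", "cracking"},
    {"rework", "cracking"},
    {"rework", "cracking"},
    {"rework", "cracking"},
    {"cracking"},
]

def rule_confidence(instances, a, b):
    """Estimate conf(a -> b) = count(a and b) / count(a)."""
    with_a = [s for s in instances if a in s]
    if not with_a:
        return 0.0
    return sum(1 for s in with_a if b in s) / len(with_a)

def build_graph(instances, min_conf):
    """Directed graph with an edge a -> b whenever conf(a -> b) >= min_conf."""
    causes = sorted(set().union(*instances))
    graph = {c: set() for c in causes}
    for a, b in permutations(causes, 2):
        if rule_confidence(instances, a, b) >= min_conf:
            graph[a].add(b)
    return graph

def indirect_targets(graph, source):
    """Causes reachable from source, excluding its direct successors."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {source} - graph[source]

graph = build_graph(instances, min_conf=0.5)  # threshold chosen for illustration
indirect = indirect_targets(graph, "design_error")
```

In this toy data, "design_error" never co-occurs with "cracking", yet the graph links them through "rework", which is the kind of indirect relation a network view can expose.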
Abstract: Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple-support frequent pattern growth algorithm, called “MSFP-growth”, that enhances the FP-growth algorithm by adding an infrequent child node pruning step with multiple minimum supports using maximum constraints. The algorithm is implemented and compared with two other common algorithms: Apriori with multiple minimum supports using maximum constraints, and FP-growth. The experimental results show that the rules mined by the proposed algorithm are interesting and that the algorithm achieves better performance than the other algorithms without sacrificing accuracy.
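To make the "multiple minimum supports using maximum constraints" criterion concrete, here is a minimal sketch. It uses brute-force enumeration rather than FP-growth or the MSFP-growth pruning step, and it assumes the usual reading of the maximum constraint: an itemset's threshold is the largest minimum support among its items. The transactions, the per-item thresholds in `mis`, and the function name are hypothetical.

```python
from itertools import combinations

def frequent_itemsets_max_constraint(transactions, mis, max_size=3):
    """Return {itemset: support} for itemsets meeting the maximum constraint.

    mis maps each item to its own minimum support; an itemset is kept when
    its support is at least the largest minimum support of its items.
    """
    n = len(transactions)
    items = sorted(mis)
    result = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            count = sum(1 for t in transactions if set(combo) <= t)
            support = count / n
            threshold = max(mis[i] for i in combo)
            if support >= threshold:
                result[combo] = support
    return result

# Hypothetical data: "caviar" is rare, so it gets a lower item-level threshold.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "caviar"},
    {"bread"},
    {"milk", "caviar"},
]
mis = {"bread": 0.5, "milk": 0.5, "caviar": 0.25}
found = frequent_itemsets_max_constraint(transactions, mis)
```

Here the pair ("bread", "caviar") has support 0.25, which would pass caviar's own threshold but fails the pair's threshold of 0.5, so it is dropped.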
Abstract: The use of eXtensible Markup Language (XML) in
web, business and scientific databases has led to the development of
methods, techniques and systems to manage and analyze XML data.
Semi-structured documents suffer from heterogeneity and high
dimensionality. XML structure and content mining represent a
convergence of research in semi-structured data and text mining. As
the information available on the internet grows drastically, extracting
knowledge from XML documents becomes a harder task. Indeed,
documents are often so large that the data set returned as the answer to a
query may be too big to convey the required information. To
improve query answering, a Semantic Tree Based Association
Rule (STAR) mining method is proposed. This method provides
intentional information by considering the structure, content and the
semantics of the content. The method is applied to the Reuters dataset
and the results show that the proposed method performs well.
Abstract: Frequent pattern mining is the process of finding a
pattern (a set of items, subsequences, substructures, etc.) that occurs
frequently in a data set. It was proposed in the context of frequent
itemsets and association rule mining. Frequent pattern mining is used
to find inherent regularities in data, such as which products are often
purchased together. Its applications include basket data analysis,
cross-marketing, catalog design, sale campaign analysis, Web log
(click stream) analysis, and DNA sequence analysis. However, one of
the bottlenecks of frequent itemset mining is that, as the data grow,
the amount of time and resources required to mine the data
increases at an exponential rate. In this investigation a new algorithm
is proposed which can be used as a pre-processor for frequent itemset
mining. FASTER (FeAture SelecTion using Entropy and Rough sets)
is a hybrid pre-processor algorithm which utilizes entropy and rough
sets to carry out record reduction and feature (attribute) selection
respectively. FASTER for frequent itemset mining can produce a
speed-up of 3.1 times when compared to the original algorithm while
maintaining an accuracy of 71%.
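The abstract does not say how FASTER applies entropy, so the following is only a sketch of the underlying quantity: the Shannon entropy of a column of discrete values. The function name and sample values are hypothetical, and nothing here reproduces the FASTER pre-processor itself.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    # Sum of p * log2(1 / p) over the observed values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical attribute columns from a transaction table.
balanced = ["yes", "yes", "no", "no"]      # maximally uncertain for two values
constant = ["yes", "yes", "yes", "yes"]    # carries no information
```

A constant column has zero entropy and a balanced two-valued column has one bit, which is why entropy is a common starting point for deciding what data can be reduced before mining.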
Abstract: This research aims to create a model for analyzing student behavior in using library resources, based on data mining techniques, in the case of Suan Sunandha Rajabhat University. The model was created with association rules using the Apriori algorithm. The analysis found 14 rules, which were tested with a testing data set; the ability to classify data was 79.24 percent and the MSE was 22.91. The results showed that the user behavior model built with the association rule technique can be used to manage library resources.
Abstract: The exponential increase in the volume of medical image databases has imposed new challenges on clinical routine in maintaining patient history, diagnosis, treatment and monitoring. With the advent of data mining and machine learning techniques, it is possible to automate and/or assist physicians in clinical diagnosis. In this research, a medical image classification framework using data mining techniques is proposed. It involves feature extraction, feature selection, feature discretization and classification. In the classification phase, the performance of the traditional k-nearest neighbor (kNN) classifier is improved using a feature weighting scheme and distance-weighted voting instead of simple majority voting. Feature weights are calculated using the interestingness measures used in association rule mining. Experiments on retinal fundus images show that the proposed framework improves the classification accuracy of traditional kNN from 78.57 % to 92.85 %.
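The two kNN modifications named in this abstract, feature weighting and distance-weighted voting, can be sketched as follows. This is an illustration under assumptions, not the paper's framework: the feature weights are hand-picked rather than derived from interestingness measures, the vote weight 1/d is one common choice, and the training points and names are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, feature_weights, k=3):
    """Classify query with feature-weighted distance and 1/d vote weights.

    train is a list of (feature_tuple, label) pairs.
    """
    scored = []
    for features, label in train:
        # Weighted Euclidean distance: each squared difference is scaled
        # by the weight of its feature.
        d = math.sqrt(sum(w * (a - b) ** 2
                          for w, a, b in zip(feature_weights, features, query)))
        scored.append((d, label))
    scored.sort(key=lambda pair: pair[0])

    votes = defaultdict(float)
    for d, label in scored[:k]:
        votes[label] += 1.0 / (d + 1e-9)  # small constant avoids division by zero
    return max(votes, key=votes.get)

# Hypothetical data: the second feature is noise, so it is given weight 0.
train = [
    ((0.0, 9.0), "healthy"),
    ((0.1, 1.0), "healthy"),
    ((1.0, 5.0), "diseased"),
]
prediction = weighted_knn_predict(train, (0.95, 5.0), feature_weights=(1.0, 0.0), k=3)
```

With k = 3, a simple majority vote would return "healthy" (two of three neighbors), whereas the distance-weighted vote favors the single very close "diseased" neighbor.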
Abstract: In this paper, the application of rule mining to
examine the factors that affect supplier selection is reviewed in the
following three sections: 1) criteria selection and information
gathering, 2) performing association rule mining, and 3) validation and
constituting the rule base. Afterwards, a few applications of the rule base
are explained. Then, a numerical example is presented and analyzed
with the Clementine software. Some of the extracted rules as well as the
results are presented at the end.
Abstract: There are several approaches to solving the
Quantitative Structure-Activity Relationship (QSAR) problem.
These approaches are based either on statistical methods or on
predictive data mining. Among the statistical methods, one should
consider regression analysis, pattern recognition (such as cluster
analysis, factor analysis and principal components analysis) or partial
least squares. Predictive data mining techniques use either neural
networks, or genetic programming, or neuro-fuzzy knowledge. These
approaches have a low explanatory capability or none at all. This
paper attempts to establish a new approach to solving QSAR
problems using descriptive data mining. This way, the relationship
between the chemical properties and the activity of a substance
would be comprehensibly modeled.
Abstract: This research aims to discover
knowledge for analyzing student motivation behavior in e-Learning
based on data mining techniques, in the case of the Information
Technology for Communication and Learning Course at Suan
Sunandha Rajabhat University. The data mining techniques
applied in this research include association rules and classification
techniques. The results showed that using data mining techniques can
indicate the important variables that influence the student motivation
behavior on e-Learning.
Abstract: A generic and extendible Multi-Agent Data Mining
(MADM) framework, MADMF (the Multi-Agent Data Mining
Framework) is described. The central feature of the framework is that
it avoids the use of agreed meta-language formats by supporting a
framework of wrappers.
The advantage offered is that the framework is easily extendible,
so that further data agents and mining agents can simply be added to
the framework. A demonstration MADMF framework is currently
available. The paper includes details of the MADMF architecture and
the wrapper principle incorporated into it. A full description and
evaluation of the framework's operation is provided by considering
two MADM scenarios.
Abstract: The inherent flexibility of XML in both structure
and semantics makes mining XML data a complex task with
more challenges than traditional association rule mining in
relational databases. In this paper, we propose a new model for the
effective extraction of generalized association rules from an XML
document collection. We directly use frequent subtree mining
techniques in the discovery process and do not ignore the tree
structure of the data in the final rules. The frequent subtrees, based on the
user-provided support, are split into complementary subtrees to form the
rules. We explain our model in multiple steps, from data preparation
to rule generation.
Abstract: To overcome the product overload of Internet
shoppers, we introduce a semantic recommendation procedure which
is more efficient when applied to Internet shopping malls. The
suggested procedure recommends semantic products to
customers and is based on Web usage mining, product
classification, association rule mining, and purchase frequency.
We applied the procedure to the MovieLens data set for
performance evaluation, and some experimental results are provided.
The experimental results have shown superior performance in
terms of coverage and precision.
Abstract: Numerical analysis naturally finds applications in all
fields of engineering and the physical sciences, but in the
21st century, the life sciences and even the arts have adopted
elements of scientific computation. Numerical data analysis
has become a key process in research and development in all of these fields [6].
In this paper we have made an attempt to analyze the specified
numerical patterns with reference to the association rule mining
techniques with minimum confidence and minimum support mining
criteria. The extracted rules and analyzed results are graphically
demonstrated. Association rules are a simple but very useful form of
data mining that describe the probabilistic co-occurrence of certain
events within a database [7]. They were originally designed to
analyze market-basket data, in which the likelihood of items being
purchased together within the same transaction is analyzed.
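The minimum support and minimum confidence criteria mentioned in this abstract can be sketched on a toy market-basket example. The transactions, thresholds, and function names below are hypothetical, and the sketch only checks a single rule rather than mining all rules.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as supp(A and C) / supp(A)."""
    supp_a = support(transactions, antecedent)
    if supp_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / supp_a

def rule_passes(transactions, antecedent, consequent, min_sup, min_conf):
    """True when the rule meets both the support and confidence thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) >= min_sup
            and confidence(transactions, antecedent, consequent) >= min_conf)

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"jam"},
]
bread_butter_conf = confidence(transactions, {"bread"}, {"butter"})
```

For the rule bread -> butter, the support is 2/4 and the confidence is 2/3, so the rule passes thresholds such as 0.5 and 0.6 but fails a confidence threshold of 0.7.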
Abstract: A large amount of valuable information is available in
plain text clinical reports. New techniques and technologies are
applied to extract information from these reports. In this study, we
developed a domain-based software system to transform 600
otorhinolaryngology discharge notes into a structured form for
extracting clinical data. In order to decrease
the system's processing time, the discharge notes were transformed into a data
table after preprocessing. Several word lists were compiled to
identify common sections in the discharge notes, including patient
history, age, problems, and diagnosis. The N-gram method was used
to discover term co-occurrences within each section. Using this
method, a dataset of concept candidates was generated for the
validation step, and then the Predictive Apriori algorithm for Association
Rule Mining (ARM) was applied to validate the candidate concepts.
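The N-gram step can be sketched as counting adjacent word sequences within each section's text. The sample section texts, the whitespace tokenization, and the function names are assumptions made for illustration; the study's own preprocessing and word lists are not reproduced here.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token sequences in tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_ngrams(sections, n=2):
    """Count n-grams across a list of section texts (lowercased, split on spaces)."""
    counts = Counter()
    for text in sections:
        counts.update(ngrams(text.lower().split(), n))
    return counts

# Hypothetical "diagnosis" section texts from two discharge notes.
diagnosis_sections = [
    "Chronic otitis media left ear",
    "chronic otitis media with effusion",
]
bigram_counts = count_ngrams(diagnosis_sections, n=2)
candidates = [gram for gram, c in bigram_counts.items() if c >= 2]
```

N-grams that recur across notes, such as ("chronic", "otitis") here, are the kind of concept candidates that a later validation step could confirm or reject.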
Abstract: DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. Because most cellular processes are regulated by changes in gene expression, it is important to extract biologically meaningful information from this huge amount of expression data in order to know the current state of the cell. Association rule mining techniques are helpful for finding association relationships among genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.
Abstract: This research aims to create a model for analyzing student motivation behavior in e-Learning based on association rule mining techniques, in the case of the Information Technology for Communication and Learning Course at Suan Sunandha Rajabhat University. The model was created using association rules, a data mining technique, with a minimum confidence threshold. The results showed that the student motivation behavior model built with the association rule technique can indicate the important variables that influence student motivation behavior in e-Learning.
Abstract: This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), relies on keyword features to discover association rules among the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur and instead focuses on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TF-IDF), for keyword/feature selection that automatically selects the most discriminative keywords for use in association rule generation, and uses a data mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were conducted on web-page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and an evaluation of the extracted association rules.
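The TF-IDF keyword selection step can be sketched as follows. The abstract does not give the exact weighting formula used by EART, so this assumes one common variant, term frequency times log(N / document frequency); the tokenized documents and function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Return one {term: tf-idf score} dict per document.

    docs is a list of token lists. tf is the term's relative frequency in the
    document and idf is log(N / df), where df counts documents containing the term.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores

def top_keywords(docs, k=2):
    """The k highest-scoring terms per document (ties broken alphabetically)."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in tfidf_scores(docs)]

# Hypothetical, already tokenized and stemmed news documents.
docs = [
    ["bird", "flu", "outbreak", "bird"],
    ["flu", "vaccine", "trial"],
    ["flu", "market", "prices"],
]
keywords = top_keywords(docs, k=2)
```

Because "flu" appears in every document, its score is zero and it is not selected, while terms concentrated in one document rank highest. The selected keywords per document could then serve as the transaction items for association rule mining.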