Abstract: A large amount of valuable information is available in
plain text clinical reports. New techniques and technologies are
applied to extract information from these reports. In this study, we
developed a domain-based software system to transform 600
Otorhinolaryngology discharge notes into a structured form for
extracting clinical data. To decrease the system's processing
time, the discharge notes were transformed into a data
table after preprocessing. Several word lists were compiled to
identify common sections in the discharge notes, including patient
history, age, problems, and diagnosis. The n-gram method was used
to discover term co-occurrences within each section. Using this
method, a dataset of concept candidates was generated for the
validation step, and then the Predictive Apriori algorithm for Association
Rule Mining (ARM) was applied to validate the candidate concepts.
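The n-gram step described above can be illustrated with a minimal sketch. It is not the authors' system: the tokenized sections, the `ngram_counts` helper, and the `min_count` threshold are all hypothetical, chosen only to show how contiguous term co-occurrences within a section could yield concept candidates.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical, already tokenized "diagnosis" sections from three notes.
sections = [
    ["chronic", "otitis", "media", "left", "ear"],
    ["acute", "otitis", "media"],
    ["chronic", "otitis", "media", "right", "ear"],
]

# Accumulate bigram counts over all sections.
bigram_totals = Counter()
for tokens in sections:
    bigram_totals.update(ngram_counts(tokens, 2))

# Keep bigrams seen at least min_count times as concept candidates
# (min_count is an assumed threshold, not a value from the study).
min_count = 2
candidates = sorted(g for g, c in bigram_totals.items() if c >= min_count)
print(candidates)
```

In a pipeline like the one described, such candidates would then be passed to a separate validation step; the Predictive Apriori validation itself is not shown here.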
Abstract: Most existing text mining approaches are
designed with the transaction database model in mind. Thus, the
mined dataset is structured using just one concept, the “transaction”,
whereas the whole dataset is modeled using the “set” abstract type. In
such cases, the structure of the whole dataset and the relationships
among the transactions themselves are not modeled and,
consequently, not considered in the mining process.
We believe that taking into account the structural properties of
hierarchically structured information (e.g., textual documents)
in the mining process can lead to better results. For this purpose, a
hierarchical association rule mining approach for textual documents
is proposed in this paper, and the classical set-oriented mining
approach is reconsidered in favor of a Directed Acyclic Graph (DAG)
oriented approach. Natural language processing techniques are used
to obtain the DAG structure. Based on this graph model, a
hierarchical bottom-up algorithm is proposed. The main idea is that
each node is mined with its parent node.
Abstract: DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. Because most cellular processes are regulated by changes in gene expression, it is important to extract biologically meaningful information from this huge amount of expression data in order to know the current state of the cell. Association rule mining techniques help to find association relationships between genes. Numerous association rule mining algorithms have been developed to analyze this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.
Abstract: This research aims to create a model for analyzing student motivation behavior in e-Learning based on association rule mining techniques, using the case of the Information Technology for Communication and Learning Course at Suan Sunandha Rajabhat University. The model was built using association rules, a data mining technique, with a minimum confidence threshold. The results showed that the student motivation behavior model built with the association rule technique can indicate the important variables that influence student motivation behavior in e-Learning.
Abstract: This paper presents a system for discovering
association rules from collections of unstructured documents called
EART (Extract Association Rules from Text). The EART system
handles text only, not images or figures. EART discovers association
rules amongst keywords labeling the collection of textual documents.
The main characteristic of EART is that the system integrates XML
technology (to transform unstructured documents into structured
documents) with an Information Retrieval scheme (TF-IDF) and a Data
Mining technique for association rule extraction. EART relies on
word features to extract association rules. It consists of four phases:
structure phase, index phase, text mining phase and visualization
phase. Our work depends on the analysis of the keywords in the
extracted association rules through the co-occurrence of the keywords
in one sentence in the original text and the existence of the keywords
in one sentence without co-occurrence. Experiments were carried out on a
collection of scientific documents selected from MEDLINE that are
related to the outbreak of H5N1 avian influenza virus.
Abstract: It has been established that microRNAs (miRNAs) play
an important role in gene expression through post-transcriptional regulation
of messenger RNAs (mRNAs). However, the precise relationships
between microRNAs and their target genes in terms of numbers,
types and biological relevance remain largely unclear. Dissecting the
miRNA-target relationships will provide more insight into miRNA
target identification and validation and thereby promote the understanding
of miRNA function. In miRBase, miRanda is the key
algorithm used for target prediction in Zebrafish. This algorithm
is high-throughput but produces many false positives (noise). Since
validating a large number of targets through laboratory experiments
is very time consuming, computational methods for miRNA
target validation need to be developed. In this paper, we present an
integrative method to investigate several aspects of the relationships
between miRNAs and their targets, with the final purpose of extracting
high-confidence targets from the pool of miRanda-predicted targets. This is
achieved using techniques ranging from statistical tests to
clustering and association rules. Our research focuses on Zebrafish.
It was found that validated targets are not necessarily associated with
the highest sequence matching. In addition, for some miRNA families,
the frequency of their predicted targets is significantly higher in the
genomic region near their own physical location. Finally, in a case
study of dre-miR-10 and dre-miR-196, it was found that the predicted
target genes hoxd13a, hoxd11a, hoxd10a and hoxc4a of dre-miR-10
and hoxa9a, hoxc8a and hoxa13a of dre-miR-196 have
characteristics similar to those of validated target genes and therefore
represent high-confidence target candidates.
Abstract: This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features to discover association rules amongst keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, focusing instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TF-IDF), for keyword/feature selection that automatically selects the most discriminative keywords for use in association rule generation, and uses a Data Mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were carried out on web page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and evaluation of the extracted association rules.
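The TF-IDF keyword selection mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the GARW algorithm or the EART implementation: the toy documents, the `tfidf` helper, the plain `tf * log(N / df)` weighting, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical preprocessed (filtered, stemmed) documents.
docs = [
    ["virus", "outbreak", "poultry", "virus"],
    ["virus", "vaccine", "poultry"],
    ["virus", "outbreak", "human"],
]

weights = tfidf(docs)

# Keep only terms whose weight exceeds an assumed threshold.
threshold = 0.3
keywords = [sorted(t for t, w in d.items() if w > threshold) for d in weights]
print(keywords)
```

Here "virus" occurs in every document, so its weight is zero and it is dropped as non-discriminative; the surviving keyword sets per document are the kind of input an association rule miner would then work on.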
Abstract: This paper describes an approach to predicting
incoming and outgoing data rates in a network system using
association rule discovery, which is one of the data mining
techniques. Incoming and outgoing data volumes at each
time and network bandwidth are network performance
parameters that are needed to address traffic problems, since
congestion and data loss are important network problems. The result
of this technique can predict future network traffic. In addition,
this research is useful for network routing selection and network
performance improvement.
Abstract: The MATCH project [1] entails the development of an
automatic diagnosis system that aims to support the treatment of colon
cancer diseases by discovering mutations that occur in tumour
suppressor genes (TSGs) and contribute to the development of
cancerous tumours. The constitution of the system is based on a)
colon cancer clinical data and b) biological information that will be
derived by data mining techniques from genomic and proteomic
sources. The core mining module will consist of popular, well
tested hybrid feature extraction methods and new combined
algorithms designed especially for the project. Elements of rough
sets, evolutionary computing, cluster analysis, self-organizing maps
and association rules will be used to discover the annotations
between genes and their influence on tumours [2]-[11].
The methods used to process the data have to address its high
complexity, potential inconsistency and the problem of
missing values. They must integrate all the useful information
necessary to answer the expert's question. For this purpose, the system
has to learn from data, or allow a domain specialist to specify
interactively, the part of the knowledge structure it needs to answer a
given query. The program should also take into account the
importance/rank of the particular parts of the data it analyses and adjust
the algorithms used accordingly.
Abstract: In an era of great competition, understanding and satisfying
customers' requirements are critical tasks for a company
to make a profit. Customer relationship management (CRM) has thus
become an important business issue. With the help of
data mining techniques, a manager can explore and analyze
a large quantity of data to discover meaningful patterns and
rules. Among all methods, the well-known association rule is the most
commonly used. This paper is based on the Apriori algorithm and
combines genetic algorithms with a data mining method to discover fuzzy
classification rules. The mined results can be applied in CRM to
help decision makers make correct business decisions for marketing
strategies.
Abstract: Chess is an indoor game that improves
human confidence, concentration, planning skills and
knowledge. The main objective of this paper is to help chess
players improve their chess openings using data mining
techniques. Budding chess players usually practice by analyzing
various existing openings. Analyzing and correlating
thousands of openings becomes tedious and complex for them. The
work done in this paper is to analyze the best lines of the
Blackmar-Diemer Gambit (BDG), which opens with White d4..., using data
mining analysis. It is carried out on a collection of winning games
by applying association rules. The first step of this analysis is
assigning variables to each different sequence of moves. In the second
step, sequence association rules are generated to calculate the
support and confidence factors, which help us find the best
subsequent chess moves that may lead to a winning position.
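The support and confidence calculation for sequence rules can be sketched as below. This is a minimal illustration, not the paper's method: the four games are made-up move lists, and the prefix-based definitions in `support` and `confidence` are an assumption about how a sequence rule over opening moves might be scored.

```python
def support(games, prefix):
    """Fraction of games that start with the given move prefix."""
    k = len(prefix)
    return sum(1 for g in games if g[:k] == prefix) / len(games)

def confidence(games, antecedent, consequent):
    """Among games starting with antecedent, fraction continuing with consequent."""
    return support(games, antecedent + consequent) / support(games, antecedent)

# Hypothetical collection of winning games (moves in algebraic notation).
games = [
    ["d4", "d5", "e4", "dxe4", "Nc3", "Nf6", "f3"],
    ["d4", "d5", "e4", "dxe4", "Nc3", "Nf6", "f3"],
    ["d4", "d5", "e4", "dxe4", "Nc3", "Bf5"],
    ["d4", "d5", "e4", "e6"],
]

antecedent = ["d4", "d5", "e4", "dxe4", "Nc3"]
consequent = ["Nf6", "f3"]

rule_support = support(games, antecedent + consequent)
rule_confidence = confidence(games, antecedent, consequent)
print(rule_support, rule_confidence)
```

With these definitions, a continuation with high confidence is one that frequently follows the antecedent line among the winning games, which is how candidate "best" continuations could be ranked.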
Abstract: Searching the internet is currently very popular, especially in academia. The huge amount of educational information, such as research papers, overloads users. Community-based web sites have therefore been developed to help users search for information more easily by customizing a web site to the needs of each specific user or set of users. This paper proposes using association rules to analyze community groups in research paper bookmarking. A set of design goals for community group frameworks is developed and discussed. Additionally, the researchers analyze the initial relations between the antecedent and the consequent of a rule within the user groups using association rule discovery, in order to generate ideas for improving search result ranking and developing a recommender system.
Abstract: Web usage currently generates a huge amount of data from a
large number of users. In general, a proxy server is a system that supports
users' web usage and can be managed using hit rates. This
research tries to improve hit rates in a proxy system by applying a data
mining technique. The data set was collected from proxy servers in the
university, and relationships among several features were investigated.
The model is used to predict future website accesses. The association
rule technique is applied to find the relations among Date, Time, Main
Group web, Sub Group web, and Domain name in order to create the model.
The results showed that this technique can predict web content for the
next day; moreover, the future accesses of websites increased from
38.15% to 85.57%.
This model can predict web page accesses, which tends to increase
the efficiency of proxy servers as a result. In addition, the
performance of internet access will be improved, which helps to reduce
traffic in networks.