Abstract: MicroRNAs (miRNAs) are small non-coding RNAs found in
many different species. They play crucial roles in cancer, for example in
the biological processes of apoptosis and proliferation. Identifying
miRNA target genes can be an essential first step towards
revealing the role of miRNAs in various cancer types. In this paper,
we predict miRNA target genes for lung cancer by combining the
prediction scores of the miRanda and PITA algorithms into a
feature vector for each miRNA-target interaction. Machine-learning
algorithms are then applied to make the final prediction. The
approach developed in this study should be of value for future studies
into understanding the role of miRNAs in molecular mechanisms
enabling lung cancer formation.
Abstract: Owing to rapid technological innovation,
a tremendous amount of data is accumulating all over the world in
every domain, such as Pattern Recognition, Machine Learning, Spatial
Data Mining, Image Analysis, Fraud Analysis, and the World Wide
Web. This growth makes it essential to develop
tools for data mining. The major aim of this paper is to
analyze various tools that are used to build analytical
or descriptive models for handling large amounts of information
efficiently and in a user-friendly way. In this survey, the tools are
described in terms of their technical paradigm,
graphical interface, and built-in multipath algorithms, which are
useful for handling significant amounts of data.
Abstract: Laban Movement Analysis (LMA), developed in the
dance community over the past seventy years, is an effective method
for observing, describing, notating, and interpreting human
movement to enhance communication and expression in everyday
and professional life. Many applications that use motion capture data
could benefit significantly if Laban qualities were
recognized automatically. This paper presents a method for automatically
recognizing Laban qualities from motion capture skeletal
recordings and demonstrates it on the output of Microsoft’s Kinect
V2 sensor.
Abstract: Devices in a pervasive computing system (PCS) are characterized by their context-awareness, which permits them to proactively provide adapted services to the user and to applications. To do so, context must be well understood and modeled in an appropriate form that enhances its sharing between devices and provides a high level of abstraction. The most interesting methods for modeling context are those based on ontologies; however, the majority of the proposed methods fail to propose a generic ontology for context, which limits their usability and keeps them specific to a particular domain. The adaptation task must be done automatically, without explicit intervention by the user. Devices in a PCS must acquire some intelligence that permits them to sense the current context and trigger the appropriate service or provide a service in a more suitable form. In this paper, we propose a generic service ontology for context modeling and a context-aware service adaptation based on a service-oriented definition of context.
Abstract: In this paper, a new learning algorithm based on a
hybrid metaheuristic integrating Differential Evolution (DE) and
Reduced Variable Neighborhood Search (RVNS) is introduced to train
the classification method PROAFTN. To apply PROAFTN, values of
several parameters need to be determined prior to classification. These
parameters include boundaries of intervals and relative weights for
each attribute. Based on these requirements, the hybrid approach,
named DEPRO-RVNS, is presented in this study. A
major problem when applying DE to some classification problems
is the premature convergence of some individuals to local optima.
To eliminate this shortcoming and to improve the exploration and
exploitation capabilities of DE, such individuals are iteratively
re-explored using RVNS. Based on the results generated on
both training and testing data, it is shown that the performance of
PROAFTN is significantly improved. Furthermore, the experimental
study shows that DEPRO-RVNS outperforms well-known machine
learning classifiers in a variety of problems.
Abstract: This paper presents the development of a Bayesian
belief network classifier for prediction of graft status and survival
period in renal transplantation using the patient profile information
prior to the transplantation. The objective was to explore the feasibility
of developing a decision-making tool for identifying the most suitable
recipient among the candidate pool members. The dataset was
compiled from University of Toledo Medical Center Hospital
patient data as reported to the United Network for Organ Sharing (UNOS), and contained
1228 patient records for the period 1987 through 2009. The
Bayes net classifiers were developed using the Weka machine
learning software workbench. Two separate classifiers were induced
from the data set, one to predict the status of the graft as either failed
or living, and a second classifier to predict the graft survival period.
The classifier for graft status prediction performed very well, with a
prediction accuracy of 97.8% and true positive rates of 0.967 and
0.988 for the living and failed classes, respectively. The second
classifier to predict the graft survival period yielded a prediction
accuracy of 68.2% and a true positive rate of 0.85 for the class
representing those instances with kidneys failing during the first year
following transplantation. Simulation results indicated that it is
feasible to develop a successful Bayesian belief network classifier for
prediction of graft status, but not the graft survival period, using the
information in the UNOS database.
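The accuracy and per-class true positive rate figures quoted above can be derived from a confusion matrix. The following is a minimal sketch of that arithmetic only; the function name `accuracy_and_tpr` and the counts are hypothetical and are not taken from the study's data.

```python
def accuracy_and_tpr(confusion):
    """Compute overall accuracy and per-class true positive rate.

    `confusion[actual][predicted]` holds the number of instances of class
    `actual` that the classifier labeled as `predicted`.
    """
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    # True positive rate of a class: correctly labeled / all instances of it.
    tpr = {
        label: row.get(label, 0) / sum(row.values())
        for label, row in confusion.items()
    }
    return correct / total, tpr


# Hypothetical counts for illustration only (NOT the study's results).
confusion = {
    "living": {"living": 90, "failed": 10},
    "failed": {"living": 5, "failed": 95},
}
accuracy, tpr = accuracy_and_tpr(confusion)
print(accuracy, tpr)
```

Reporting the true positive rate per class alongside accuracy matters when classes are imbalanced, since a high overall accuracy can hide poor performance on a small class.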
Abstract: To solve a specific problem in machine
learning, a solution is constructed from the data or by using a search
method. Genetic algorithms (GAs) are a machine learning model that can
be used to find a near-optimal solution. While the great advantage of
genetic algorithms is that they find a solution through
evolution, this is also their biggest disadvantage. Evolution is inductive:
in nature, life does not evolve towards a good solution but
away from bad circumstances. This can cause a species to evolve into
an evolutionary dead end. To reduce the effect of this
disadvantage, we propose a new learning tool (criterion) that can be
included in the generations of a genetic algorithm to compare the
previous population with the current population and then decide
whether it is more effective to continue with the previous population or the
current one. The proposed learning tool is called Keeping
Efficient Population (KEP). We applied a GA based on KEP to the
production line layout problem; as a result, KEP keeps the evaluation
moving in an increasing direction and stops any deviation in the evaluation.
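The population comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how two populations are compared, so mean fitness is assumed here; the toy OneMax objective stands in for the production line layout objective; and all function names are hypothetical.

```python
import random


def fitness(bits):
    # Toy objective (OneMax): number of 1 bits. A hypothetical stand-in
    # for the layout objective, which the abstract does not specify.
    return sum(bits)


def population_score(population):
    # ASSUMPTION: populations are compared by mean fitness.
    return sum(fitness(ind) for ind in population) / len(population)


def next_generation(population, rng, mutation_rate=0.05):
    # Ordinary GA step: tournament selection, one-point crossover,
    # and bit-flip mutation.
    def pick():
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    children = []
    while len(children) < len(population):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        children.append(child)
    return children


def run_ga_with_kep(generations=30, size=20, length=16, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(size)]
    history = [population_score(population)]
    for _ in range(generations):
        candidate = next_generation(population, rng)
        # KEP-style check: continue with the new population only if it is
        # at least as good as the previous one; otherwise keep the previous.
        if population_score(candidate) >= population_score(population):
            population = candidate
        history.append(population_score(population))
    return population, history


final_population, history = run_ga_with_kep()
print(history[0], history[-1])
```

Because a worse generation is never accepted, the recorded score cannot decrease from one generation to the next, which mirrors the claim that the evaluation keeps moving in an increasing direction.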
Abstract: We present an Electronic Nose (ENose) that is
aimed at identifying the presence of one of two gases and, possibly,
detecting the presence of a mixture of the two. The
concentrations of the components are also estimated for
volatile organic compounds (VOCs), here methanol and acetone, over
the ranges 40-400 and 22-220 ppm (parts per million), respectively.
Our system contains 8 sensors, 5 of them being gas sensors (of the
class TGS from FIGARO USA, INC., whose sensing element is a tin
dioxide (SnO2) semiconductor), the remaining being a temperature
sensor (LM35 from National Semiconductor Corporation), a
humidity sensor (HIH–3610 from Honeywell), and a pressure sensor
(XFAM from Fujikura Ltd.).
Our integrated hardware–software system uses machine
learning and least-squares regression first to identify
a new gas sample, or a mixture, and then to estimate the
concentrations. In particular, we adopt a training model using the
Support Vector Machine (SVM) approach with a linear kernel to teach
the system how to discriminate among different gases. Then we apply
another training model, using least-squares regression, to predict
the concentrations.
The experimental results demonstrate that the proposed
multiclass classification and regression scheme is effective in
identifying the tested VOCs, methanol and acetone, with
96.61% correctness. The concentration predictions have correlation
coefficients of 0.979 and 0.964 between the predicted and real
concentrations of methanol and acetone, respectively.
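The regression stage and the reported correlation coefficient can be illustrated with a minimal sketch. It assumes a single sensor feature and a straight-line model, which is a simplification of the multi-sensor system described above; the readings, concentrations, and function names are hypothetical and are not measurements from the paper.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical sensor readings and reference concentrations (ppm).
sensor = [0.10, 0.25, 0.40, 0.55, 0.70]
methanol_ppm = [40, 135, 215, 315, 400]

slope, intercept = fit_line(sensor, methanol_ppm)
predicted = [slope * s + intercept for s in sensor]
r = pearson(predicted, methanol_ppm)
print(slope, intercept, r)
```

The correlation between predicted and reference concentrations is the kind of figure the abstract reports for methanol and acetone; in practice it would be computed on held-out samples rather than on the training data used here.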
Abstract: Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in the document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain-independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000's AutoSummarize feature. The domain independence of this algorithm has also been confirmed in our experiments.
Abstract: Case-Based Reasoning (CBR) is a machine
learning approach to problem solving and learning that has attracted a lot
of attention over the last few years. In general, CBR is composed of
four main phases: retrieve the most similar case or cases, reuse the
case to solve the problem, revise or adapt the proposed solution, and
retain the learned cases by returning them to the case base for
future learning. Unfortunately, in many cases, this retain process
causes uncontrolled growth of the case base, which affects the
competence and performance of CBR systems. This paper proposes a
competence-based maintenance method for CBR based on a deletion
policy. The method has three main steps. Step 1,
formulate problems. Step 2, determine coverage and reachability sets
based on coverage value. Step 3, reduce case base size. The results
obtained show that the proposed method performs better than the
existing methods currently discussed in the literature.
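Steps 2 and 3 above can be sketched as follows. The abstract does not define its sets or its deletion rule, so this sketch assumes the definitions commonly used in case-base maintenance (the coverage of a case is the set of cases it can solve; its reachability set is the set of cases that can solve it) and a simple hypothetical deletion rule. The `solves` predicate and the numeric toy cases are illustrative only.

```python
def coverage_sets(cases, solves):
    # Coverage of c: every case in the base that c can be used to solve.
    return {c: {t for t in cases if solves(c, t)} for c in cases}


def reachability_sets(cases, solves):
    # Reachability of t: every case in the base that can be used to solve t.
    return {t: {c for c in cases if solves(c, t)} for t in cases}


def reduce_case_base(cases, solves):
    """Hypothetical deletion policy: visit cases from smallest to largest
    coverage and delete a case only if every original case can still be
    solved by some case that remains."""
    coverage = coverage_sets(cases, solves)
    kept = list(cases)
    for c in sorted(cases, key=lambda case: len(coverage[case])):
        remaining = [k for k in kept if k != c]
        if all(any(solves(k, t) for k in remaining) for t in cases):
            kept = remaining
    return kept


def solves(c, t):
    # Toy similarity rule: a case solves any case within distance 1.
    return abs(c - t) <= 1


cases = [1, 2, 3, 6]
coverage = coverage_sets(cases, solves)
reachability = reachability_sets(cases, solves)
reduced = reduce_case_base(cases, solves)
print(reduced)  # → [2, 6]
```

In this toy run, cases 1 and 3 are deleted because case 2 covers everything they cover, while case 6 is kept because no other case can solve it.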
Abstract: This research presents a system for post-processing of
data that takes mined flat rules as input and discovers crisp as well as
fuzzy hierarchical structures using a Learning Classifier System
approach. A Learning Classifier System (LCS) is a machine
learning technique that combines evolutionary computing,
reinforcement learning, supervised or unsupervised learning, and
heuristics to produce adaptive systems. An LCS learns by interacting
with an environment from which it receives feedback in the form of
numerical reward. Learning is achieved by trying to maximize the
amount of reward received. A crisp description of a concept usually
cannot represent human knowledge completely and practically. In the
proposed Learning Classifier System, the initial population is constructed
as a random collection of HPR-trees (related production rules), and
crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is
suggested for the proposed system, and a fitness function based on a
Subsumption Matrix (SM) is proposed. Genetic
operators suited to the chosen chromosome representation
are proposed, along with a reward and
punishment scheme for implementing reinforcement. Experimental results are
presented to demonstrate the performance of the proposed system.
Abstract: This paper explores the effectiveness of machine
learning techniques in detecting firms that issue fraudulent financial
statements (FFS) and deals with the identification of factors
associated with FFS. To this end, a number of experiments have been
conducted using representative learning algorithms, which were
trained using a data set of 164 fraud and non-fraud Greek firms in the
recent period 2001-2002. The decision of which particular method to
choose is a complicated problem. A good alternative to choosing
only one method is to create a hybrid forecasting system
incorporating a number of possible solution methods as components
(an ensemble of classifiers). For this purpose, we have implemented
a hybrid decision support system that combines the representative
algorithms using a stacking variant methodology and achieves better
performance than any of the simple or ensemble methods examined. To
sum up, this study indicates that the investigation of financial
information can be used in the identification of FFS and underlines the
importance of financial ratios.
Abstract: Data is available in abundance in any business
organization. It includes records for finance, maintenance,
inventory, progress reports, etc. As time progresses, the data keeps
accumulating, and the challenge is to extract information from
this data bank. Knowledge discovery from these large and complex
databases is the key problem of this era. Data mining and machine
learning techniques are needed that can scale to the size of the
problems and can be customized to the business application. To
develop accurate and relevant information for a particular
problem, business analysts need to develop multidimensional models
that give reliable information so that they can take the right
decision for that problem. If the multidimensional model does
not possess advanced features, accuracy cannot be expected.
The present work involves the development of a multidimensional
data model incorporating advanced features. The criterion of
computation is based on data precision and on the inclusion of a slowly
changing time dimension. The final results are displayed in graphical
form.
Abstract: This paper discusses the design of knowledge
integration for clinical information extracted from distributed medical
ontologies in order to improve a machine learning-based multilabel
coding assignment system. The proposed approach is
implemented using a decision tree machine learning technique
on university hospital data for patients with Coronary Heart
Disease (CHD). The preliminary results show
that the use of medical ontologies improves the overall
system performance.