Abstract: The purpose of this paper is to propose a framework for constructing correct parallel processing programs based on Equivalent Transformation Framework (ETF). ETF regards computation as In the framework, a problem-s domain knowledge and a query are described in definite clauses, and computation is regarded as transformation of the definite clauses. Its meaning is defined by a model of the set of definite clauses, and the transformation rules generated must preserve meaning. We have proposed a parallel processing method based on “specialization", a part of operation in the transformations, which resembles substitution in logic programming. The method requires “Memo-tree", a history of specialization to maintain correctness. In this paper we proposes the new method for the specialization-base parallel processing without Memo-tree.
Abstract: The main aim of this study is to identify the most
influential variables that cause defects on the items produced by a
casting company located in Turkey. To this end, one of the items
produced by the company with high defective percentage rates is
selected. Two approaches-the regression analysis and decision treesare
used to model the relationship between process parameters and
defect types. Although logistic regression models failed, decision tree
model gives meaningful results. Based on these results, it can be
claimed that the decision tree approach is a promising technique for
determining the most important process variables.
Abstract: Sleep stage scoring is the process of classifying the
stage of the sleep in which the subject is in. Sleep is classified into
two states based on the constellation of physiological parameters.
The two states are the non-rapid eye movement (NREM) and the
rapid eye movement (REM). The NREM sleep is also classified into
four stages (1-4). These states and the state wakefulness are
distinguished from each other based on the brain activity. In this
work, a classification method for automated sleep stage scoring
based on a single EEG recording using wavelet packet decomposition
was implemented. Thirty two ploysomnographic recording from the
MIT-BIH database were used for training and validation of the
proposed method. A single EEG recording was extracted and
smoothed using Savitzky-Golay filter. Wavelet packets
decomposition up to the fourth level based on 20th order Daubechies
filter was used to extract features from the EEG signal. A features
vector of 54 features was formed. It was reduced to a size of 25 using
the gain ratio method and fed into a classifier of regression trees. The
regression trees were trained using 67% of the records available. The
records for training were selected based on cross validation of the
records. The remaining of the records was used for testing the
classifier. The overall correct rate of the proposed method was found
to be around 75%, which is acceptable compared to the techniques in
the literature.
Abstract: The next stage of the home networking environment is
supposed to be ubiquitous, where each piece of material is equipped
with an RFID (Radio Frequency Identification) tag. To fully support
the ubiquitous environment, home networking middleware should be
able to recommend home services based on a user-s interests and
efficiently manage information on service usage profiles for the users.
Therefore, USN (Ubiquitous Sensor Network) technology, which
recognizes and manages a appliance-s state-information (location,
capabilities, and so on) by connecting RFID tags is considered. The
Intelligent Multi-Agent Middleware (IMAM) architecture was
proposed to intelligently manage the mobile RFID-based home
networking and to automatically supply information about home
services that match a user-s interests. Evaluation results for
personalization services for IMAM using Bayesian-Net and Decision
Trees are presented.
Abstract: In a lowland dipterocarp forest, we assessed the impact of canopy openness (CO) and the resultant changes under different logging systems using hemispherical photography. CO was assessed in a primary forest and two forests logged selectively using reduced impact logging. At one site, 3-m-wide strip cutting was conducted for line planting. From the comparison of CO among the three sites, we found significant changes caused by logging. However, no significant difference was observed between the two logged sites. Strip cutting treatment did not affect CO. One year after, significant canopy closure occurred in both of the logged sites. Canopy closure was significant regardless of the disturbance element, logging gap, skid trail, or strip cutting line. Significant establishment of seedlings within a year was observed in the strip cutting line. Seedling establishment seemed to contribute to rapid canopy closure and prospected to affect to the survival and growth of planted trees.
Abstract: The study on the tree growth for four species groups of commercial timber in Koh Kong province, Cambodia-s tropical rainforest is described. The simulation for these four groups had been successfully developed in the 5-year interval through year-60. Data were obtained from twenty permanent sample plots in the duration of thirteen years. The aim for this study was to develop stand table simulation system of tree growth by the species group. There were five steps involved in the development of the tree growth simulation: aggregate the tree species into meaningful groups by using cluster analysis; allocate the trees in the diameter classes by the species group; observe the diameter movement of the species group. The diameter growth rate, mortality rate and recruitment rate were calculated by using some mathematical formula. Simulation equation had been created by combining those parameters. Result showed the dissimilarity of the diameter growth among species groups.
Abstract: Avoiding learning failures in mathematics e-learning environments caused by emotional problems in students with autism has become an important topic for combining of special education with information and communications technology. This study presents an adaptive emotional adjustment model in mathematics e-learning for students with autism, emphasizing the lack of emotional perception in mathematics e-learning systems. In addition, an emotion classification for students with autism was developed by inducing emotions in mathematical learning environments to record changes in the physiological signals and facial expressions of students. Using these methods, 58 emotional features were obtained. These features were then processed using one-way ANOVA and information gain (IG). After reducing the feature dimension, methods of support vector machines (SVM), k-nearest neighbors (KNN), and classification and regression trees (CART) were used to classify four emotional categories: baseline, happy, angry, and anxious. After testing and comparisons, in a situation without feature selection, the accuracy rate of the SVM classification can reach as high as 79.3-%. After using IG to reduce the feature dimension, with only 28 features remaining, SVM still has a classification accuracy of 78.2-%. The results of this research could enhance the effectiveness of eLearning in special education.
Abstract: In this paper we present a novel approach for wavelet compression of electrocardiogram (ECG) signals based on the set partitioning in hierarchical trees (SPIHT) coding algorithm. SPIHT algorithm has achieved prominent success in image compression. Here we use a modified version of SPIHT for one dimensional signals. We applied wavelet transform with SPIHT coding algorithm on different records of MIT-BIH database. The results show the high efficiency of this method in ECG compression.
Abstract: Software metric is a measure of some property of a
piece of software or its specification. The aim of this paper is to
present an application of evolutionary decision trees in software
engineering in order to classify the software modules that have or
have not one or more reported defects. For this some metrics are used
for detecting the class of modules with defects or without defects.
Abstract: In this paper, a neural tree (NT) classifier having a
simple perceptron at each node is considered. A new concept for
making a balanced tree is applied in the learning algorithm of the
tree. At each node, if the perceptron classification is not accurate and
unbalanced, then it is replaced by a new perceptron. This separates
the training set in such a way that almost the equal number of patterns
fall into each of the classes. Moreover, each perceptron is trained only
for the classes which are present at respective node and ignore other
classes. Splitting nodes are employed into the neural tree architecture
to divide the training set when the current perceptron node repeats
the same classification of the parent node. A new error function based
on the depth of the tree is introduced to reduce the computational
time for the training of a perceptron. Experiments are performed to
check the efficiency and encouraging results are obtained in terms of
accuracy and computational costs.
Abstract: Although backpropagation ANNs generally predict
better than decision trees do for pattern classification problems, they
are often regarded as black boxes, i.e., their predictions cannot be
explained as those of decision trees. In many applications, it is
desirable to extract knowledge from trained ANNs for the users to
gain a better understanding of how the networks solve the problems.
A new rule extraction algorithm, called rule extraction from artificial
neural networks (REANN) is proposed and implemented to extract
symbolic rules from ANNs. A standard three-layer feedforward ANN
is the basis of the algorithm. A four-phase training algorithm is
proposed for backpropagation learning. Explicitness of the extracted
rules is supported by comparing them to the symbolic rules generated
by other methods. Extracted rules are comparable with other methods
in terms of number of rules, average number of conditions for a rule,
and predictive accuracy. Extensive experimental studies on several
benchmarks classification problems, such as breast cancer, iris,
diabetes, and season classification problems, demonstrate the
effectiveness of the proposed approach with good generalization
ability.
Abstract: Ensemble learning algorithms such as AdaBoost and
Bagging have been in active research and shown improvements in
classification results for several benchmarking data sets with mainly
decision trees as their base classifiers. In this paper we experiment to
apply these Meta learning techniques with classifiers such as random
forests, neural networks and support vector machines. The data sets
are from MAGIC, a Cherenkov telescope experiment. The task is to
classify gamma signals from overwhelmingly hadron and muon
signals representing a rare class classification problem. We compare
the individual classifiers with their ensemble counterparts and
discuss the results. WEKA a wonderful tool for machine learning has
been used for making the experiments.
Abstract: A sequential decision problem, based on the task ofidentifying the species of trees given acoustic echo data collectedfrom them, is considered with well-known stochastic classifiers,including single and mixture Gaussian models. Echoes are processedwith a preprocessing stage based on a model of mammalian cochlearfiltering, using a new discrete low-pass filter characteristic. Stoppingtime performance of the sequential decision process is evaluated andcompared. It is observed that the new low pass filter processingresults in faster sequential decisions.
Abstract: This paper presents an information retrieval model on
XML documents based on tree matching. Queries and documents are
represented by extended trees. An extended tree is built starting from
the original tree, with additional weighted virtual links between each
node and its indirect descendants allowing to directly reach each
descendant. Therefore only one level separates between each node
and its indirect descendants. This allows to compare the user query
and the document with flexibility and with respect to the structural
constraints of the query. The content of each node is very important to
decide weither a document element is relevant or not, thus the content
should be taken into account in the retrieval process. We separate
between the structure-based and the content-based retrieval processes.
The content-based score of each node is commonly based on the
well-known Tf × Idf criteria. In this paper, we compare between
this criteria and another one we call Tf × Ief. The comparison
is based on some experiments into a dataset provided by INEX1 to
show the effectiveness of our approach on one hand and those of
both weighting functions on the other.
Abstract: This research presents a system for post processing of
data that takes mined flat rules as input and discovers crisp as well as
fuzzy hierarchical structures using Learning Classifier System
approach. Learning Classifier System (LCS) is basically a machine
learning technique that combines evolutionary computing,
reinforcement learning, supervised or unsupervised learning and
heuristics to produce adaptive systems. A LCS learns by interacting
with an environment from which it receives feedback in the form of
numerical reward. Learning is achieved by trying to maximize the
amount of reward received. Crisp description for a concept usually
cannot represent human knowledge completely and practically. In the
proposed Learning Classifier System initial population is constructed
as a random collection of HPR–trees (related production rules) and
crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is
suggested for the proposed system and based on Subsumption Matrix
(SM), a suitable fitness function is proposed. Suitable genetic
operators are proposed for the chosen chromosome representation
method. For implementing reinforcement a suitable reward and
punishment scheme is also proposed. Experimental results are
presented to demonstrate the performance of the proposed system.
Abstract: Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.
Abstract: Chronic hepatitis B can evolve to cirrhosis and liver
cancer. Interferon is the only effective treatment, for carefully selected
patients, but it is very expensive. Some of the selection criteria are
based on liver biopsy, an invasive, costly and painful medical procedure.
Therefore, developing efficient non-invasive selection systems,
could be in the patients benefit and also save money. We investigated
the possibility to create intelligent systems to assist the Interferon
therapeutical decision, mainly by predicting with acceptable accuracy
the results of the biopsy. We used a knowledge discovery in integrated
medical data - imaging, clinical, and laboratory data. The resulted
intelligent systems, tested on 500 patients with chronic hepatitis
B, based on C5.0 decision trees and boosting, predict with 100%
accuracy the results of the liver biopsy. Also, by integrating the other
patients selection criteria, they offer a non-invasive support for the
correct Interferon therapeutic decision. To our best knowledge, these
decision systems outperformed all similar systems published in the
literature, and offer a realistic opportunity to replace liver biopsy in
this medical context.
Abstract: The forest stand consisted of four layers. The species
composition between the third and the bottom layers was almost
similar, whereas it was almost exclusive between the top and the lower
three layers. The values of Shannon-s index H' and Pielou-s index
J ' tended to increase from the bottom layer upward, except for
H' -value of the top layer. The values of H' and J ' were 4.21 bit
and 0.73, respectively, for the total stand. High woody species
diversity of the forest depended on large trees in the upper layers,
which trend was different from a subtropical evergreen broadleaf
forest grown in silicate habitat in the northern part of Okinawa Island.
The spatial distribution of trees was overlapped between the third and
the bottom layers, whereas it was independent or slightly exclusive
between the top and the lower three layers. Mean tree weight of each
layer decreased from the top toward the bottom layer, whereas the
corresponding tree density increased from the top downward. This
relationship was analogous to the process of self-thinning plant
populations.
Abstract: In this paper a new method is suggested for
distributed data-mining by the probability patterns. These patterns
use decision trees and decision graphs. The patterns are cared to be
valid, novel, useful, and understandable. Considering a set of
functions, the system reaches to a good pattern or better objectives.
By using the suggested method we will be able to extract the useful
information from massive and multi-relational data bases.
Abstract: Term Extraction, a key data preparation step in Text
Mining, extracts the terms, i.e. relevant collocation of words,
attached to specific concepts (e.g. genetic-algorithms and decisiontrees
are terms associated to the concept “Machine Learning" ). In
this paper, the task of extracting interesting collocations is achieved
through a supervised learning algorithm, exploiting a few
collocations manually labelled as interesting/not interesting. From
these examples, the ROGER algorithm learns a numerical function,
inducing some ranking on the collocations. This ranking is optimized
using genetic algorithms, maximizing the trade-off between the false
positive and true positive rates (Area Under the ROC curve). This
approach uses a particular representation for the word collocations,
namely the vector of values corresponding to the standard statistical
interestingness measures attached to this collocation. As this
representation is general (over corpora and natural languages),
generality tests were performed by experimenting the ranking
function learned from an English corpus in Biology, onto a French
corpus of Curriculum Vitae, and vice versa, showing a good
robustness of the approaches compared to the state-of-the-art Support
Vector Machine (SVM).