Abstract: Wrist pulse analysis for the identification of health status is found in ancient Indian as well as Chinese literature. Preprocessing of the wrist pulse is necessary to remove outlier pulses and fluctuations prior to the analysis of the pulse pressure signal. This paper discusses the identification of irregular pulses present in the pulse series and the intricacies associated with the extraction of time-domain pulse features. A Dynamic Time Warping (DTW) approach has been utilized for the identification of outlier pulses in the wrist pulse series. The ambiguity present in the identification of pulse features is resolved with the help of the first derivative of the ensemble average of the wrist pulse series. An algorithm for detecting the tidal and dicrotic notches in individual wrist pulse segments is proposed.
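As a rough illustration of the outlier-screening step, the Python sketch below flags pulses whose mean DTW distance to the rest of the series is anomalously large. The segmentation into individual pulses, the z-score threshold, and the toy data are our illustrative assumptions, not the paper's exact procedure.

```python
# A minimal sketch of DTW-based outlier-pulse screening, assuming the pulse
# series has already been segmented into individual pulses (the threshold
# and the demo data below are illustrative choices, not the authors').
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def flag_outlier_pulses(pulses, z_thresh=2.0):
    """Flag pulses whose average DTW distance to the others is anomalous."""
    k = len(pulses)
    mean_dist = np.zeros(k)
    for i in range(k):
        mean_dist[i] = np.mean([dtw_distance(pulses[i], pulses[j])
                                for j in range(k) if j != i])
    z = (mean_dist - mean_dist.mean()) / (mean_dist.std() + 1e-12)
    return np.where(z > z_thresh)[0]        # indices of suspected outliers

# Toy demonstration: nine similar pulses plus one distorted one.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
template = np.exp(-((t - 0.3) ** 2) / 0.01)          # main systolic peak
pulses = [template + 0.02 * rng.standard_normal(100) for _ in range(9)]
pulses.append(np.roll(template, 40) * 0.5)           # an irregular pulse
print(flag_outlier_pulses(pulses))                   # -> [9]
```

DTW is a natural fit here because pulse durations vary beat to beat, so a direct point-wise distance would penalize normal timing jitter that the warping absorbs.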
Abstract: Interactions among proteins are the basis of various life events, so it is important to recognize and study protein interaction sites. A control set containing 149 protein molecules was used here. Ten features were extracted, and four sample sets, each containing nine sliding windows, were built according to these features. Each of the four sample sets was processed by a Radial Basis Function neural network optimized by Particle Swarm Optimization, yielding four groups of results. Finally, these four groups of results were integrated by decision fusion (DF) and Genetic Algorithm based Selective Ensemble (GASEN). Better accuracy was obtained with DF and GASEN, demonstrating the effectiveness of the integrated methods.
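To make the GASEN integration step concrete, the sketch below shows the core idea in Python: a genetic algorithm evolves one weight per ensemble member on validation predictions, and members whose evolved weight exceeds 1/N are kept and combined. The GA settings and the random stand-in "member outputs" are our assumptions, not the paper's actual RBF-network outputs.

```python
# A compact, illustrative sketch of GASEN-style selective ensembling.
import numpy as np

rng = np.random.default_rng(1)

def gasen_select(member_probs, y_val, pop=50, gens=100, mut=0.05):
    """member_probs: (N_members, n_samples) predicted P(class=1) on validation data."""
    N = member_probs.shape[0]
    population = rng.random((pop, N))
    def fitness(w):
        w = w / w.sum()                       # normalized combination weights
        err = np.mean((w @ member_probs > 0.5) != y_val)
        return -err                           # higher fitness = lower error
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in population])
        parents = population[np.argsort(scores)[-pop // 2:]]   # truncation selection
        children = parents[rng.integers(0, len(parents), pop - len(parents))]
        children = children + mut * rng.standard_normal(children.shape)
        population = np.vstack([parents, np.clip(children, 1e-6, None)])
    best = population[np.argmax([fitness(ind) for ind in population])]
    best = best / best.sum()
    return np.where(best > 1.0 / N)[0]        # GASEN keeps weights above 1/N

# Toy validation data: 6 members, 200 samples; members 0-2 are informative.
y_val = rng.integers(0, 2, 200)
member_probs = np.vstack([0.7 * y_val + 0.3 * rng.random(200) if i < 3
                          else rng.random(200) for i in range(6)])
print("selected members:", gasen_select(member_probs, y_val))  # typically 0-2
```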
Abstract: Bagging and boosting are among the most popular re-sampling ensemble methods that generate and combine a diversity of regression models using the same learning algorithm as base-learner. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble that averages the outputs of bagging and boosting ensembles with 10 sub-learners each. We compared it with simple bagging and boosting ensembles of 25 sub-learners on standard benchmark datasets, and the proposed ensemble gave better accuracy.
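A minimal scikit-learn sketch of the averaging idea follows: a bagging ensemble and a boosting ensemble of 10 sub-learners each, combined by averaging their predictions. The dataset and hyper-parameters are illustrative placeholders, not the benchmarks used in the paper.

```python
# Averaging a bagging and a boosting regression ensemble (10 learners each)
# and comparing against 25-learner bagging and boosting baselines.
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, BaggingRegressor, VotingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=10, random_state=0)
boost = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4),
                          n_estimators=10, random_state=0)
combo = VotingRegressor([("bagging", bag), ("boosting", boost)])  # averages outputs

for name, model in [("bagging(25)", BaggingRegressor(n_estimators=25, random_state=0)),
                    ("boosting(25)", AdaBoostRegressor(n_estimators=25, random_state=0)),
                    ("averaged 10+10", combo)]:
    print(name, cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```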
Abstract: In this paper we designed and implemented a new ensemble of classifiers based on a sequence of classifiers, each specialized in the regions of the training dataset where the errors of its previously trained counterparts are concentrated. To separate these regions, and to determine the aptitude of each classifier to respond properly to a new case, another set of classifiers, built hierarchically, was used. We explored a selection-based variant to combine the base classifiers. We validated this model with different base classifiers using 37 training datasets. A statistical comparison of these models with the well-known Bagging and Boosting methods was carried out, obtaining significantly superior results with the hierarchical ensemble using a Multilayer Perceptron as base classifier. We thus demonstrated the efficacy of the proposed ensemble, as well as its applicability to general problems.
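The sketch below illustrates the selection-based idea under simplifying assumptions: a second-level "aptitude" model per base classifier predicts whether that classifier will answer a given case correctly, and the most trusted base classifier is selected per test instance. This is a generic gating scheme meant to convey the concept, not the paper's exact hierarchical construction.

```python
# A rough gating sketch: per-base "aptitude" models select which MLP answers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.05,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

bases, gates = [], []
for seed in range(3):
    mlp = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500, random_state=seed)
    # Out-of-fold correctness labels avoid the base grading its own training fit.
    correct = (cross_val_predict(mlp, X_tr, y_tr, cv=3) == y_tr).astype(int)
    bases.append(mlp.fit(X_tr, y_tr))
    gates.append(LogisticRegression(max_iter=1000).fit(X_tr, correct))

aptitude = np.column_stack([g.predict_proba(X_te)[:, 1] for g in gates])
chosen = aptitude.argmax(axis=1)                   # most trusted base per case
all_preds = np.column_stack([b.predict(X_te) for b in bases])
preds = all_preds[np.arange(len(X_te)), chosen]
print("selection-ensemble accuracy:", (preds == y_te).mean())
```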
Abstract: Bagging and boosting are among the most popular re-sampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble that combines, by voting, bagging and boosting ensembles with 10 sub-classifiers each. We compared it with simple bagging and boosting ensembles of 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
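A parallel sketch for this classification variant: bagging and boosting ensembles of 10 sub-classifiers each, combined by soft voting via scikit-learn. The noisy synthetic dataset and settings are illustrative, not the paper's experimental setup.

```python
# Soft-voting over a bagging and a boosting classification ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=15, flip_y=0.1,  # noisy labels
                           random_state=0)

vote = VotingClassifier(
    estimators=[("bagging", BaggingClassifier(n_estimators=10, random_state=0)),
                ("boosting", AdaBoostClassifier(n_estimators=10, random_state=0))],
    voting="soft")  # combine the two ensembles' class probabilities

for name, model in [("bagging(25)", BaggingClassifier(n_estimators=25, random_state=0)),
                    ("boosting(25)", AdaBoostClassifier(n_estimators=25, random_state=0)),
                    ("voted 10+10", vote)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```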
Abstract: Leo Breiman's Random Forests (RF) is a recent development in tree-based classifiers and has quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved classification results on standard datasets. Ensemble learning algorithms such as AdaBoost and Bagging have been the subject of active research and have shown improved classification results on several benchmark datasets, mainly with decision trees as their base classifiers. In this paper we experiment with applying these meta-learning techniques to random forests. We examine the behavior of ensembles of random forests on standard datasets from the UCI repository, compare the original random forest algorithm with its ensemble counterparts, and discuss the results.
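A brief sketch of the "meta-ensemble of random forests" experiment in scikit-learn: bagging and AdaBoost with a small random forest as the base classifier, compared to a plain random forest. The dataset and ensemble sizes here are placeholders for the UCI sets used in the paper.

```python
# Bagged and boosted random forests versus a plain random forest.
from sklearn.datasets import load_breast_cancer      # a UCI-derived dataset
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=10, random_state=0)  # base forest

models = {
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "bagged RF":     BaggingClassifier(rf, n_estimators=5, random_state=0),
    "boosted RF":    AdaBoostClassifier(rf, n_estimators=5, random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```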
Abstract: Instead of traditional (nominal) classification, we investigate ordinal classification, or ranking. An enhanced method based on an ensemble of Support Vector Machines (SVMs) is proposed. Each binary classifier is trained with specific weights for each object in the training dataset. Experiments on benchmark datasets and synthetic data indicate that the performance of our approach is comparable to that of state-of-the-art kernel methods for ordinal regression. The ensemble method, which is straightforward to implement, provides a very good sensitivity-specificity trade-off for the highest and lowest ranks.
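A hedged sketch of the ordinal-decomposition idea: for K ranks, K-1 binary SVMs each learn "is the rank greater than k?", with per-object sample weights, and predictions are assembled by counting threshold votes. The particular weighting scheme shown (emphasizing objects near each threshold) is our illustrative assumption, not necessarily the authors' choice.

```python
# Ordinal classification via an ensemble of weighted binary SVMs.
import numpy as np
from sklearn.svm import SVC

def fit_ordinal_svms(X, y, ranks):
    models = []
    for k in ranks[:-1]:                       # one binary task per threshold
        target = (y > k).astype(int)
        w = 1.0 / (1.0 + np.abs(y - k - 0.5))  # assumed: near-threshold objects weigh more
        models.append(SVC(kernel="rbf").fit(X, target, sample_weight=w))
    return models

def predict_rank(models, X, ranks):
    votes = np.sum([m.predict(X) for m in models], axis=0)  # thresholds passed
    return ranks[0] + votes

# Toy ordinal data: rank grows with the first feature.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = np.digitize(X[:, 0], [-0.8, 0.0, 0.8])     # ranks 0..3
ranks = np.arange(4)
models = fit_ordinal_svms(X, y, ranks)
print("train accuracy:", (predict_rank(models, X, ranks) == y).mean())
```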
Abstract: This paper illustrates the use of a combined neural network model for the classification of electrocardiogram (ECG) beats. We present a trainable neural network ensemble approach to developing a customized ECG beat classifier, in an effort to further improve the performance of ECG processing and to offer individualized health care.
We propose a three-stage technique for distinguishing premature ventricular contractions (PVCs) from normal beats and other heart diseases. The method comprises denoising, feature extraction, and classification stages. First, we investigate the application of the stationary wavelet transform (SWT) for noise reduction of the ECG signals. The feature extraction module then extracts 10 ECG morphological features and one timing-interval feature. Finally, a number of multilayer perceptron (MLP) neural networks with different topologies are designed.
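For the denoising stage, the sketch below shows SWT-based noise reduction with PyWavelets: soft-threshold the detail coefficients, then reconstruct. The wavelet choice, decomposition level, threshold rule, and toy signal are illustrative assumptions, not necessarily those used in the paper.

```python
# SWT denoising sketch: universal soft threshold on detail coefficients.
import numpy as np
import pywt

def swt_denoise(signal, wavelet="db4", level=3):
    coeffs = pywt.swt(signal, wavelet, level=level)       # [(cA, cD), ...]
    denoised = []
    for cA, cD in coeffs:
        sigma = np.median(np.abs(cD)) / 0.6745            # robust noise estimate
        thr = sigma * np.sqrt(2 * np.log(len(signal)))    # universal threshold
        denoised.append((cA, pywt.threshold(cD, thr, mode="soft")))
    return pywt.iswt(denoised, wavelet)

# Toy ECG-like signal; length must be a multiple of 2**level for swt.
t = np.linspace(0, 1, 512)
clean = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
noisy = clean + 0.3 * np.random.default_rng(0).standard_normal(512)
print("residual RMS:", np.sqrt(np.mean((swt_denoise(noisy) - clean) ** 2)))
```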
The performance of the different combination methods, as well as the efficiency of the whole system, is presented. Among them, Stacked Generalization, the proposed trainable combined neural network model, achieves the highest recognition rate, around 95%. This network therefore proves to be a suitable candidate for ECG signal diagnosis systems. ECG samples corresponding to the different ECG beat types were extracted from the MIT-BIH arrhythmia database for the study.
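A simplified sketch of the stacked-generalization combiner: several MLPs with different topologies as level-0 models and a logistic-regression level-1 combiner, via scikit-learn's StackingClassifier. The synthetic features stand in for the 10 morphological plus 1 timing features; the real study used MIT-BIH beats.

```python
# Stacked generalization over MLPs with varied topologies.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=800, n_features=11, n_informative=8,
                           n_classes=3, random_state=0)  # e.g. normal/PVC/other

level0 = [(f"mlp{i}", MLPClassifier(hidden_layer_sizes=h, max_iter=1000,
                                    random_state=0))
          for i, h in enumerate([(10,), (20,), (15, 10)])]  # varied topologies
stack = StackingClassifier(estimators=level0,
                           final_estimator=LogisticRegression(max_iter=1000))
print("stacked accuracy:", cross_val_score(stack, X, y, cv=5).mean())
```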
Abstract: Clustering ensembles combine multiple partitions generated by different clustering algorithms into a single clustering solution. They have emerged as a prominent method for improving the robustness, stability, and accuracy of unsupervised classification solutions. Many contributions have so far been made toward finding consensus clusterings, and one of the major problems in clustering ensembles is the choice of consensus function. In this paper we first introduce clustering ensembles, the representation of multiple partitions, the associated challenges, and a taxonomy of combination algorithms. Second, we describe consensus functions in clustering ensembles, including hypergraph partitioning, the voting approach, mutual information, co-association based functions, and the finite mixture model, and explain their advantages, disadvantages, and computational complexity. Finally, we compare the characteristics of clustering ensemble algorithms, such as computational complexity, robustness, simplicity, and accuracy, on the different datasets used in previous techniques.
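An illustrative sketch of one of the consensus functions surveyed, the co-association approach: count how often each pair of points is clustered together across the ensemble, then cluster that similarity matrix. The base clusterings and the final average-linkage step are example choices, not a prescription from the survey.

```python
# Co-association consensus: cluster the pairwise co-clustering frequencies.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Ensemble of partitions: k-means with varying k and seeds.
labels = [KMeans(n_clusters=k, n_init=5, random_state=s).fit_predict(X)
          for s, k in enumerate([2, 3, 3, 4, 5])]

n = len(X)
co = np.zeros((n, n))
for lab in labels:
    co += (lab[:, None] == lab[None, :])      # 1 when a pair shares a cluster
co /= len(labels)

# scikit-learn >= 1.2 uses `metric`; older versions call this `affinity`.
consensus = AgglomerativeClustering(n_clusters=3, metric="precomputed",
                                    linkage="average").fit_predict(1.0 - co)
print(np.bincount(consensus))                  # sizes of consensus clusters
```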
Abstract: In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in the remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification focuses on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE, and random subspace) with a univariate decision tree as the base classifier. Two training datasets, one without any noise and the other with 20 percent noise, were used to judge the performance of the different ensemble approaches. Results with the noise-free dataset suggest an improvement of about 4% in classification accuracy with all ensemble approaches over the univariate decision tree classifier; the highest classification accuracy, 87.43%, was achieved by the boosted decision tree. A comparison of results with the noisy dataset suggests that the bagging, DECORATE, and random subspace approaches work well with these data, whereas the performance of the boosted decision tree degrades to a classification accuracy of 79.7%, which is even lower than the 80.02% achieved by the unboosted decision tree classifier.
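A small sketch reproducing the qualitative noise effect described above: inject random label noise and compare a single decision tree against bagged and boosted trees. The synthetic data stands in for the land-cover imagery of the study, and DECORATE and random subspace are omitted since they have no direct scikit-learn implementation.

```python
# Bagging versus boosting under 20% label noise.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=12, random_state=0)
rng = np.random.default_rng(0)
y_noisy = y.copy()
flip = rng.choice(len(y), size=len(y) // 5, replace=False)  # 20% label noise
y_noisy[flip] = 1 - y_noisy[flip]

for name, model in [("tree", DecisionTreeClassifier(random_state=0)),
                    ("bagging", BaggingClassifier(n_estimators=50, random_state=0)),
                    ("boosting", AdaBoostClassifier(n_estimators=50, random_state=0))]:
    clean = cross_val_score(model, X, y, cv=5).mean()
    noisy = cross_val_score(model, X, y_noisy, cv=5).mean()
    print(f"{name:8s} clean={clean:.3f} noisy={noisy:.3f}")
```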
Abstract: The subset selection approach to polynomial regression model building assumes that the chosen fixed full set of predefined basis functions contains a subset sufficient to describe the target relation well. However, in most cases the necessary set of basis functions is not known and must be guessed, a potentially non-trivial (and lengthy) trial-and-error process. In our research we consider a potentially more efficient approach, Adaptive Basis Function Construction (ABFC), which lets the model building method itself construct the basis functions necessary for creating a model of arbitrary complexity with adequate predictive performance. However, two issues plague the methods of both subset selection and ABFC to some extent, especially when working with relatively small data samples: selection bias and selection instability. We try to correct these issues through model post-evaluation using cross-validation and model ensembling. To evaluate the proposed method, we empirically compare it with ABFC methods without ensembling, with a widely used method of subset selection, as well as with some other well-known regression modeling methods, using publicly available datasets.
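For context, the sketch below shows the baseline setting the abstract contrasts with ABFC: greedy forward subset selection over a fixed pool of polynomial basis functions, with cross-validation used to post-evaluate candidate models. All choices (degree, stopping rule, toy data) are illustrative; ABFC itself grows the basis functions rather than picking from a predefined pool.

```python
# Forward subset selection over a fixed polynomial basis, scored by CV.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, (120, 2))
y = 1.5 * X[:, 0] ** 2 - X[:, 0] * X[:, 1] + rng.normal(0, 0.2, 120)

Phi = PolynomialFeatures(degree=4).fit_transform(X)   # fixed full basis set
selected, remaining = [0], list(range(1, Phi.shape[1]))
best_score = -np.inf
while remaining:
    scores = [(cross_val_score(LinearRegression(), Phi[:, selected + [j]], y,
                               cv=5).mean(), j) for j in remaining]
    score, j = max(scores)
    if score <= best_score:                            # stop when CV stops improving
        break
    best_score = score
    selected.append(j)
    remaining.remove(j)
print("selected basis columns:", selected, "CV R^2:", round(best_score, 3))
```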
Abstract: We propose a fast and robust hierarchical face detection system which finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich the feature types and feature numbers for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and reinforce them into the final ensemble classifier. Compared with standard exhaustive Adaboost feature selection, the new PSO-Adaboost algorithm reduces training time by a factor of up to 20. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large window size in the first stage. Nonlinear SVM classifiers are used instead of decision stump functions in the last stage to remove the remaining complex non-face patterns that cannot be rejected in the previous two stages. Experimental results show our detector achieves superior performance on the CMU+MIT frontal face dataset.
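A schematic sketch of the PSO-for-feature-selection idea inside one boosting round: particles encode a (feature, threshold) decision stump, and PSO minimizes the weighted error instead of exhaustively scanning every feature/threshold pair. This is a generic illustration of the concept on random data, not the paper's PSO-Adaboost algorithm or its Haar-like feature pool.

```python
# PSO search for the best weighted-error stump in a single boosting round.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 200))                 # 200 candidate features
y = (X[:, 57] > 0.2).astype(int)                    # feature 57 is discriminative
w = np.full(len(y), 1.0 / len(y))                   # AdaBoost sample weights

def stump_error(pos):
    f = int(np.clip(pos[0], 0, X.shape[1] - 1))
    pred = (X[:, f] > pos[1]).astype(int)
    return min(np.sum(w * (pred != y)), np.sum(w * (pred == y)))  # either polarity

n_particles, iters = 30, 40
pos = np.column_stack([rng.uniform(0, 200, n_particles),
                       rng.uniform(-2, 2, n_particles)])
vel = np.zeros_like(pos)
pbest, pbest_err = pos.copy(), np.array([stump_error(p) for p in pos])
gbest = pbest[pbest_err.argmin()].copy()
for _ in range(iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    err = np.array([stump_error(p) for p in pos])
    improved = err < pbest_err
    pbest[improved], pbest_err[improved] = pos[improved], err[improved]
    gbest = pbest[pbest_err.argmin()].copy()
print("best stump feature:", int(gbest[0]), "weighted error:", stump_error(gbest))
```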