Abstract: This work proposes a method for contextual sentiment analysis that uses a supervised learning algorithm without requiring extensive training of annotators. To achieve this goal, a web platform was developed to perform the entire procedure outlined in this paper. The main contribution of the pipeline described in this article is to simplify and automate the annotation process through a system that analyzes the congruence between annotations. This ensured satisfactory results even without annotators specialized in the research context, and avoided the generation of biased training data for the classifiers. To evaluate the approach, a case
study was conducted on an entrepreneurship blog. The experimental results were consistent with those reported in the literature for annotation using a formalized process with experts.
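The abstract above does not say how congruence between annotations is measured. As a minimal sketch, assuming a simple majority-agreement rule (the function name, the threshold, and the sample labels are all hypothetical), an item is kept for training only when enough annotators agree on its label:

```python
from collections import Counter


def congruent_labels(annotations, min_agreement=0.6):
    """Keep items whose most common label reaches the agreement threshold.

    annotations: dict mapping an item id to the list of labels given by
    different annotators. Returns a dict mapping item id to the agreed label.
    The 0.6 threshold is an illustrative assumption, not a value from the paper.
    """
    agreed = {}
    for item, labels in annotations.items():
        if not labels:
            continue  # nothing to agree on
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            agreed[item] = label
    return agreed


# Hypothetical annotations from three untrained annotators.
votes = {
    "post-1": ["pos", "pos", "neg"],  # 2 of 3 agree: kept
    "post-2": ["neg", "pos", "neu"],  # no agreement: dropped
    "post-3": ["neu", "neu", "neu"],  # unanimous: kept
}
print(congruent_labels(votes))  # {'post-1': 'pos', 'post-3': 'neu'}
```

Dropping low-agreement items is one way to avoid feeding contradictory labels to the classifier. A chance-corrected statistic such as Cohen's kappa would be a stricter alternative.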
Abstract: This paper examines the intensive suburbanization and urbanization of the city of Olomouc and the Olomouc region in general over the period 1986–2009. A Remote Sensing approach that tracks changes in Land Cover units is proposed to quantify the state and trends of urbanization in their temporal and spatial aspects. The work consisted of two experiments, Experiment 1 and Experiment 2, which applied two different image classification solutions to produce Land Cover maps for each time slice of the 1986–2009 period available in the Landsat image set. Experiment 1 used unsupervised classification, while Experiment 2 used semi-supervised classification combining object-based and pixel-based classifiers. The resulting Land Cover maps were subsequently quantified for the proportion of the urban area unit and its trend through time, and for the stability of the urban area unit, yielding the relation between the spatial and temporal development of the urban area unit. Some outcomes seem promising, but there is clear room for improvement in the source data and in the processing and filtering.
Abstract: Naïve Bayes classifiers are simple probabilistic
classifiers. Classification extracts patterns from a data file containing a set
of labeled training examples and is currently one of the most
significant areas in data mining. However, Naïve Bayes assumes
independence among the features. Structural learning among the
features can therefore help in the classification problem. In this study,
structural learning in a Bayesian Network is proposed for cases
where there are relationships between the features when using
Naïve Bayes. The improvement in classification obtained with structural
learning is shown for cases where relationships exist between the features,
that is, when they are not independent.
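The difference between the two models in the abstract above can be written out as a factorization. This is a sketch under the assumption that the learned network keeps the class c as a parent of every feature and adds feature-to-feature edges; the symbol pa(x_i), for the feature parents of x_i found by structural learning, is notation introduced here and not taken from the paper:

```latex
% Naive Bayes: features x_1..x_n are conditionally independent given the class c
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c)

% Bayesian network with learned structure: each feature may additionally
% depend on its parent features pa(x_i)
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P\bigl(x_i \mid c, \mathrm{pa}(x_i)\bigr)
```

When every pa(x_i) is empty, the second expression reduces to the first. This is why a gain is only expected when the features are in fact dependent.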
Abstract: Combined therapy using Interferon and Ribavirin is the standard treatment for patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient’s response to the treatment from clinical information, in order to protect patients from intolerable side effects and wasted expense. Different machine learning techniques have been developed for this purpose, among them Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in predicting virological response to the standard treatment of HCV from clinical information. Data from 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT: 150 cases were used to train the classifiers and 50 cases were used to test them. The experimental results showed that both techniques gave acceptable results; however, the best accuracy reached 92% for AC and 80% for DT.
Abstract: Researchers have developed various
tools and methodologies for effective clinical decision-making.
Among those decisions, the diagnosis of chest pain diseases is one of the
important issues, especially in the emergency department. To
improve the ability of physicians in diagnosis, many researchers have
developed diagnosis intelligence by using machine learning and data
mining. However, most conventional methodologies have been
based on a single classifier for disease classification and
prediction, which shows only moderate performance. This study utilizes an
ensemble strategy to combine multiple different classifiers to help
physicians diagnose chest pain diseases more accurately.
Specifically, the ensemble strategy is applied by integrating
decision trees, neural networks, and support vector machines. The
ensemble models are applied to real-world emergency data. This study
shows that the performance of the ensemble models is superior to that of each
of the single classifiers.
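The abstract above does not state how the three classifiers are integrated. Majority voting is one common combination rule, and the sketch below assumes it. The label names and the per-classifier predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equally long label lists, one per classifier.
    With an odd number of classifiers and two classes there are no ties.
    """
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three already-trained classifiers on four patients.
tree = ["cardiac", "other", "cardiac", "other"]
net = ["cardiac", "cardiac", "other", "other"]
svm = ["other", "cardiac", "cardiac", "other"]
print(majority_vote([tree, net, svm]))
# ['cardiac', 'cardiac', 'cardiac', 'other']
```

Each classifier disagrees with the other two on exactly one of the first three patients, and the vote overrules that single dissent every time. This is the effect an ensemble relies on.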
Abstract: This paper reports a new pattern recognition approach for face recognition. The biological model of the light receptors in human eyes, cones and rods, and the way they are associated with pattern vision in humans form the basis of this approach. The functional model is simulated using CWD and WPD. The paper also discusses the experiments performed for face recognition using the features extracted from images in the AT&T face database. Artificial Neural Network and k-Nearest Neighbour classifier algorithms are employed for recognition. A feature vector is formed for each of the face images in the database, and recognition accuracies are computed and compared across the classifiers. Simulation results show that the proposed method outperforms traditional feature extraction methods for pattern recognition in terms of recognition accuracy for face images with pose and illumination variations.
Abstract: In this paper, a second order autoregressive (AR)
model is proposed to discriminate alcoholics using single trial
gamma band Visual Evoked Potential (VEP) signals with 3 different
classifiers: Simplified Fuzzy ARTMAP (SFA) neural network (NN),
Multilayer-perceptron-backpropagation (MLP-BP) NN and Linear
Discriminant (LD). Electroencephalogram (EEG) signals were
recorded from alcoholic and control subjects during the presentation
of visuals from the Snodgrass and Vanderwart picture set. Single trial
VEP signals were extracted from EEG signals using Elliptic filtering
in the gamma band spectral range. A second order AR model was
used as gamma band VEP exhibits pseudo-periodic behaviour and
second order AR is optimal to represent this behaviour. This
circumvents the requirement of having to use some criteria to choose
the correct order. Averaged discrimination errors of 2.6%, 2.8%
and 11.9% were obtained with the LD, MLP-BP and SFA classifiers, respectively. The
low LD discrimination error shows the validity of the proposed
method for discriminating between alcoholic and control subjects.
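A second order AR model describes each sample as x[t] = a1*x[t-1] + a2*x[t-2] + e[t], and the two coefficients can serve as features for a classifier. The abstract above does not name the estimation method, so the sketch below assumes ordinary least squares. The test signal is synthetic, not VEP data.

```python
def fit_ar2(x):
    """Least-squares estimate of (a1, a2) in x[t] = a1*x[t-1] + a2*x[t-2] + e[t].

    Solves the 2x2 normal equations directly with Cramer's rule.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        s11 += x[t - 1] * x[t - 1]
        s12 += x[t - 1] * x[t - 2]
        s22 += x[t - 2] * x[t - 2]
        b1 += x[t] * x[t - 1]
        b2 += x[t] * x[t - 2]
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("signal too short or degenerate for an AR(2) fit")
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2


# Synthetic noise-free signal with known coefficients. The pair (1.5, -0.9)
# gives complex roots, i.e. a damped oscillation (pseudo-periodic behaviour).
signal = [1.0, 0.5]
for _ in range(48):
    signal.append(1.5 * signal[-1] - 0.9 * signal[-2])

a1, a2 = fit_ar2(signal)
print(round(a1, 6), round(a2, 6))  # recovers approximately 1.5 and -0.9
```

Because the order is fixed at two, each trial reduces to a pair of numbers and no order selection criterion is needed.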
Abstract: Combining classifiers is a useful method for solving
complex problems in machine learning. The ECOC (Error Correcting
Output Codes) method has been widely used for designing combining
classifiers with an emphasis on the diversity of classifiers. In this
paper, in contrast to the standard ECOC approach in which individual
classifiers are chosen homogeneously, classifiers are selected
according to the complexity of the corresponding binary problem. We
use the SATIMAGE database (containing 6 classes) for our experiments.
The recognition error rate of our proposed method is 10.37%, which
indicates a considerable improvement in comparison with the
conventional ECOC and stacked generalization methods.
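In ECOC, every class is assigned a binary codeword, one binary classifier is trained per code bit, and a sample is assigned to the class whose codeword is nearest to the predicted bit string. The sketch below shows only this decoding step. It uses a hypothetical 4-class, 7-bit code for brevity, not the 6-class SATIMAGE setup or the classifier selection proposed in the paper.

```python
def hamming(a, b):
    """Number of positions in which two equally long bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(code_matrix, bits):
    """Return the class whose codeword is closest (in Hamming distance) to bits."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], bits))


# Hypothetical code with minimum pairwise distance 4, so any single
# wrong binary classifier can be corrected.
code_matrix = {
    "class1": (1, 1, 1, 1, 1, 1, 1),
    "class2": (0, 0, 0, 0, 1, 1, 1),
    "class3": (0, 0, 1, 1, 0, 0, 1),
    "class4": (0, 1, 0, 1, 0, 1, 0),
}

# Outputs of the seven binary classifiers for one sample: the codeword of
# class3 with its first bit flipped by a classifier error.
predicted_bits = (1, 0, 1, 1, 0, 0, 1)
print(ecoc_decode(code_matrix, predicted_bits))  # class3
```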
Abstract: Tumor classification is a key area of research in the
field of bioinformatics. Microarray technology is commonly used in
the study of disease diagnosis using gene expression levels. The
main drawback of gene expression data is that it contains thousands
of genes and very few samples. Feature selection methods are used
to select the informative genes from the microarray. These methods
considerably improve the classification accuracy. In the proposed
method, Genetic Algorithm (GA) is used for effective feature
selection. Informative genes are identified based on the T-Statistics,
Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate
solutions of GA are obtained from top-m informative genes. The
classification accuracy of k-Nearest Neighbor (kNN) method is used
as the fitness function for GA. In this work, kNN and Support Vector
Machine (SVM) are used as the classifiers. The experimental results
show that the proposed work is suitable for effective feature
selection. With the help of the selected genes, the GA-kNN method
achieves 100% accuracy in 4 datasets and the GA-SVM method
achieves it in 5 out of 10 datasets. GA combined with kNN and SVM
is demonstrated to be an accurate method for microarray
based tumor classification.
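The top-m ranking that seeds the GA can be illustrated with one of the three statistics. The abstract above does not define SNR, so the sketch assumes the definition commonly used for microarray data: the difference of the class means divided by the sum of the class standard deviations. Gene names and expression values are hypothetical.

```python
from statistics import mean, stdev


def snr(class_a, class_b):
    """Signal-to-noise ratio of one gene: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def top_m_genes(expression, labels, m):
    """Rank genes by |SNR| between the two classes and return the top m names."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 1]
        b = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(snr(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:m]


# Hypothetical expression levels: three tumor samples (1), three normal (0).
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "g1": [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # well separated, low spread
    "g2": [2.0, 3.0, 4.0, 2.5, 3.5, 4.5],  # classes overlap
    "g3": [1.0, 2.0, 3.0, 6.0, 7.0, 8.0],  # separated, larger spread
}
print(top_m_genes(expression, labels, 2))  # ['g1', 'g3']
```

A gene whose values are constant in both classes would cause a division by zero. A real pipeline would need to filter such genes first.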
Abstract: The objective of this paper is to apply the support vector machine (SVM) approach to the classification of cancerous and normal regions of prostate images. Three kinds of textural features are extracted and used for the analysis: parameters of the Gauss-Markov random field (GMRF), correlation function and relative entropy. Prostate images are acquired by a system consisting of a microscope, video camera and a digitizing board. Cross-validated classification over a database of 46 images is implemented to evaluate the performance. In SVM classification, sensitivity and specificity of 96.2% and 97.0%, respectively, are achieved for data with a block size of 32x32 pixels, with an overall accuracy of 96.6%. Classification performance is compared with artificial neural network and k-nearest neighbor classifiers. Experimental results demonstrate that the SVM approach gives the best performance.
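The three figures reported above are related through the confusion matrix. In the sketch below the block counts are hypothetical and do not come from the paper. They were chosen only so that the formulas reproduce the reported percentages, under the assumption that the cancerous and normal classes are equally sized.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: cancerous blocks classified correctly / incorrectly.
    tn/fp: normal blocks classified correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 500 cancerous and 500 normal blocks.
sens, spec, acc = confusion_metrics(tp=481, fn=19, tn=485, fp=15)
print(round(sens, 3), round(spec, 3), round(acc, 3))  # 0.962 0.97 0.966
```

With balanced classes the accuracy is the mean of sensitivity and specificity, which is consistent with the reported 96.6% lying midway between 96.2% and 97.0%.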
Abstract: This paper presents a new technique for generating sets of synthetic classifiers to evaluate abstract-level combination methods. The sets differ in terms of both the recognition rates of the individual classifiers and their degree of similarity. For this purpose, each abstract-level classifier is considered as a random variable producing one class label as the output for an input pattern. From the initial set of classifiers, new, slightly different sets are generated by applying specific operators defined for this purpose. Finally, the sets of synthetic classifiers are used to estimate the performance of combination methods for abstract-level classifiers. The experimental results demonstrate the effectiveness of the proposed approach.
Abstract: Logic based methods for learning from structured data
are limited w.r.t. handling large search spaces, preventing large-sized
substructures from being considered by the resulting classifiers. A
novel approach to learning from structured data is introduced that
employs a structure transformation method, called finger printing, for
addressing these limitations. The method, which generates features
corresponding to arbitrarily complex substructures, is implemented in
a system, called DIFFER. The method is demonstrated to perform
comparably to an existing state-of-the-art method on some benchmark
data sets without requiring restrictions on the search space.
Furthermore, learning from the union of features generated by finger
printing and the previous method outperforms learning from each
individual set of features on all benchmark data sets, demonstrating
the benefit of developing complementary, rather than competing,
methods for structure classification.
Abstract: Power System Security is a major concern in real time
operation. The conventional method of security evaluation consists of
performing continuous load flow and transient stability studies with a
simulation program. This is highly time consuming and infeasible
for on-line application. Pattern Recognition (PR) is a promising
tool for on-line security evaluation. This paper proposes a Support
Vector Machine (SVM) based binary classification for static and
transient security evaluation. The proposed SVM based PR approach
is implemented on New England 39 Bus and IEEE 57 Bus systems.
The simulation results of the SVM classifier are compared with those of other
classifier algorithms such as Method of Least Squares (MLS), Multi-
Layer Perceptron (MLP) and Linear Discriminant Analysis (LDA)
classifiers.
Abstract: In this paper we compare the accuracy of data mining
methods for classifying students in order to predict students' class
grades. These predictions are useful for identifying weak
students and assisting management in taking remedial measures at early
stages, so as to produce excellent graduates who will graduate with at least a
second class upper. First, we examine the accuracy of single classifiers on
our data set and choose the best one; we then combine it with a
weak classifier in an ensemble using a simple voting method. The results
show that combining different classifiers outperformed the single
classifiers in predicting student performance.
Abstract: Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. The performance of neural network learning is known to be sensitive to the initial weights and architecture. This paper discusses the use of a decision tree classifier to initialize a multilayer neural network in order to improve text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree, from the root node to a final leaf, is used to initialize the multilayer neural network. The experimental evaluation demonstrates that this approach provides better classification accuracy on the Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with that of a multilayer neural network initialized with the traditional random method and with that of decision tree classifiers.
Abstract: The effectiveness of Artificial Neural Network (ANN)
and Support Vector Machine (SVM) classifiers for fault diagnosis of
rolling element bearings is presented in this paper. The
characteristic features of vibration signals of a rotating driveline that
was run in its normal condition and with faults introduced were used
as input to the ANN and SVM classifiers. Simple statistical features such
as standard deviation, skewness, kurtosis etc. of the time-domain
vibration signal segments, along with peaks of the signal and the peak of the
power spectral density (PSD), are used as input features for the ANN
and SVM classifiers. The effect of preprocessing the vibration
signal by Discrete Wavelet Transform (DWT) prior to feature
extraction is also studied. The experimental results show
that the SVM classifier performs better than the ANN in identifying the bearing
condition, and that pre-processing of the vibration signal
by DWT enhances the effectiveness of both the ANN and SVM classifiers.
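The time-domain features named in the abstract above can be computed per signal segment as sketched below. The conventions are assumptions, since the abstract does not specify them: population (divide-by-n) moments and non-excess kurtosis, so a Gaussian signal would score about 3. The PSD peak and the DWT step are omitted.

```python
from math import sqrt


def time_domain_features(segment):
    """Standard deviation, skewness, kurtosis and peak of one signal segment."""
    n = len(segment)
    mu = sum(segment) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mu) ** 2 for v in segment) / n
    m3 = sum((v - mu) ** 3 for v in segment) / n
    m4 = sum((v - mu) ** 4 for v in segment) / n
    std = sqrt(m2)
    return {
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2,
        "peak": max(abs(v) for v in segment),
    }


# A symmetric square-wave-like segment: zero skewness, kurtosis of 1.
print(time_domain_features([1.0, -1.0, 1.0, -1.0]))
# {'std': 1.0, 'skewness': 0.0, 'kurtosis': 1.0, 'peak': 1.0}
```

A constant segment has zero variance and would raise a division error, so a real pipeline would need to guard against that case.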
Abstract: Ensemble learning algorithms such as AdaBoost and
Bagging have been actively researched and have shown improvements in
classification results for several benchmarking data sets, mainly with
decision trees as their base classifiers. In this paper we experiment with
applying these meta-learning techniques to classifiers such as random
forests, neural networks and support vector machines. The data sets
are from MAGIC, a Cherenkov telescope experiment. The task is to
separate gamma signals from the overwhelming background of hadron and muon
signals, which represents a rare class classification problem. We compare
the individual classifiers with their ensemble counterparts and
discuss the results. WEKA, a tool for machine learning, was
used to run the experiments.
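Bagging trains each ensemble member on a bootstrap sample (drawn with replacement) of the training set and combines the members by majority vote. The sketch below is not the WEKA setup used in the paper. It substitutes a one-dimensional threshold stump as a hypothetical base classifier and uses toy data to keep the example short.

```python
import random
from collections import Counter


def train_stump(points):
    """Pick the threshold and orientation with the fewest training errors.

    points: list of (x, y) pairs with a 1-D feature x and a label y in {0, 1}.
    Returns a function mapping x to a predicted label.
    """
    best = None
    for threshold, _ in points:
        for above in (0, 1):  # label predicted when x >= threshold
            errors = sum(
                (above if x >= threshold else 1 - above) != y for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x >= threshold else 1 - above


def bagging(points, n_models=15, seed=0):
    """Train n_models stumps on bootstrap samples and return a voting predictor."""
    rng = random.Random(seed)
    models = [
        train_stump([rng.choice(points) for _ in points]) for _ in range(n_models)
    ]
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]


# Toy, linearly separable data (hypothetical).
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
predict = bagging(data)
print(predict(0.05), predict(0.95))
```

AdaBoost differs in that it reweights the training samples after each round instead of resampling them uniformly.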
Abstract: Classifier fusion may generate more accurate
classification than each of the basic classifiers. Fusion is often based
on fixed combination rules such as the product or the average. This paper
presents decision templates as a classifier fusion method for the
recognition of handwritten English and Farsi numerals (1-9).
The process involves extracting a feature vector from well-known
image databases. The extracted feature vector is fed to the multiple
classifier fusion stage. A set of experiments was conducted to compare
decision templates (DTs) with some combination rules. Decision
templates yield 97.99% and 97.28% for Farsi and
English handwritten digits, respectively.
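In decision-template fusion, the soft outputs of all classifiers for one sample form a decision profile (one row per classifier, one column per class). The template of a class is the mean profile of its training samples, and a new sample is assigned to the class with the nearest template. The sketch below assumes squared Euclidean distance as the similarity measure and uses hypothetical profiles with two classifiers and two classes.

```python
def flatten(profile):
    """Turn a decision profile (rows = classifiers, columns = classes) into a vector."""
    return [p for row in profile for p in row]


def build_templates(profiles, labels):
    """Average the decision profiles of each class into one template per class."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        flat = flatten(profile)
        if label not in sums:
            sums[label] = [0.0] * len(flat)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], flat)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def classify(templates, profile):
    """Return the class whose template has the smallest squared distance to profile."""
    flat = flatten(profile)

    def dist(c):
        return sum((t - p) ** 2 for t, p in zip(templates[c], flat))

    return min(templates, key=dist)


# Hypothetical training profiles: each row is one classifier's support
# for the classes ("0", "1").
train_profiles = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.7, 0.3], [0.6, 0.4]],
    [[0.2, 0.8], [0.1, 0.9]],
    [[0.4, 0.6], [0.3, 0.7]],
]
train_labels = ["0", "0", "1", "1"]
templates = build_templates(train_profiles, train_labels)

# The two classifiers disagree on this sample. Its profile is closer to
# the template of class "1".
print(classify(templates, [[0.6, 0.4], [0.3, 0.7]]))  # 1
```

Unlike a fixed rule such as the average, the templates are learned from training data, so the fusion can adapt to systematic biases of the individual classifiers.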
Abstract: In this paper we design and implement a new
ensemble of classifiers based on a sequence of classifiers, each
specialized in the regions of the training dataset where the errors of its
trained counterparts are concentrated. In order to separate these
regions, and to determine the aptitude of each classifier to properly
respond to a new case, another set of classifiers built
hierarchically was used. We explored a selection based variant to combine the
base classifiers. We validated this model with different base
classifiers using 37 training datasets. A statistical
comparison of these models with the well known Bagging and
Boosting was carried out, obtaining significantly superior results with the
hierarchical ensemble using Multilayer Perceptron as base classifier.
These results demonstrate the efficacy of the proposed ensemble,
as well as its applicability to general problems.
Abstract: In this paper, we propose a robust scheme for face alignment and recognition under various influences. For face representation, illumination and variable expressions are important factors that particularly affect the accuracy of facial localization and face recognition. To address these factors, we propose an approach consisting of two phases. The first phase preprocesses face images by means of the proposed illumination normalization method. With the proposed image blending, the locations of facial features can be fitted more efficiently and quickly. In addition, based on template matching, we further improve the active shape models (called IASM) to locate the face shape more precisely, which can raise the recognition rate in the next phase. The second phase performs feature extraction using principal component analysis and face recognition using support vector machine classifiers. The results show that the proposed method achieves good facial localization and face recognition under varied illumination and local distortion.