Abstract: Feature selection and attribute reduction are crucial
problems and widely used techniques in machine learning, data
mining, and pattern recognition for overcoming the well-known
curse of dimensionality. This paper
presents a feature selection method that efficiently carries out attribute
reduction, thereby selecting the most informative features of a dataset.
It consists of two components: 1) a measure for feature subset
evaluation, and 2) a search strategy. For the evaluation measure,
we have employed the fuzzy-rough dependency degree (FRDD)
of the lower approximation-based fuzzy-rough feature selection
(L-FRFS) method, owing to its effectiveness in feature selection.
For the search strategy, a modified binary shuffled frog leaping
algorithm (B-SFLA) is proposed. The proposed feature selection
method hybridizes the B-SFLA with the FRDD. Nine
classifiers have been employed to compare the proposed approach
with several existing methods over twenty-two datasets from the
UCI repository, including nine high-dimensional, large ones.
The experimental results demonstrate that the B-SFLA approach
significantly outperforms other metaheuristic methods in terms of the
number of selected features and the classification accuracy.
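As a rough illustration of the search component, the shuffled frog leaping idea can be sketched as follows: rank a population of binary feature masks by fitness, shuffle them into memeplexes, and move each memeplex's worst frog toward its best. The memeplex sizes, bit-copy probability, and replacement rule below are generic textbook choices, not the paper's exact B-SFLA variant, and the fitness function stands in for the FRDD-based subset evaluation.

```python
import random

def b_sfla(fitness, n_features, n_frogs=20, n_memeplexes=4,
           n_iters=30, seed=0):
    """Minimal binary shuffled frog leaping sketch (hypothetical
    parameters; not the authors' exact B-SFLA)."""
    rng = random.Random(seed)
    frogs = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_frogs)]
    for _ in range(n_iters):
        # Sort frogs by fitness (higher is better) and deal them
        # into memeplexes in round-robin order.
        frogs.sort(key=fitness, reverse=True)
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        for mem in memeplexes:
            best, worst = mem[0], mem[-1]
            # Move the worst frog toward the memeplex best by
            # copying each bit from the best with probability 0.5.
            candidate = [b if rng.random() < 0.5 else w
                         for b, w in zip(best, worst)]
            if fitness(candidate) > fitness(worst):
                worst[:] = candidate
            else:
                # Otherwise replace the worst frog with a random one.
                worst[:] = [rng.randint(0, 1) for _ in range(n_features)]
        frogs = [f for mem in memeplexes for f in mem]
    return max(frogs, key=fitness)
```

In a feature selection setting, `fitness` would reward a high dependency degree while penalizing the number of selected features.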
Abstract: Brain-Computer Interfaces (BCIs) measure brain
signal activity, intentionally or unintentionally induced by users,
and provide a communication channel that does not depend on the
brain’s normal output pathways of peripheral nerves and muscles.
Feature Selection (FS) is a global optimization problem in machine
learning that reduces the number of features and removes irrelevant
and noisy data while maintaining acceptable recognition accuracy.
It is a vital step affecting the performance of a pattern recognition
system. This study presents a new Binary Particle Swarm
Optimization (BPSO) based feature selection algorithm. A
Multi-Layer Perceptron Neural Network (MLPNN) classifier, trained
with the backpropagation and Levenberg-Marquardt algorithms,
classifies the selected features.
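A minimal sketch of binary PSO for feature selection follows. The sigmoid transfer function and the inertia/acceleration constants are generic textbook choices, not necessarily those used in this study, and `fitness` stands in for the classifier-based subset evaluation.

```python
import math
import random

def bpso_select(fitness, n_features, n_particles=15, n_iters=40, seed=0):
    """Minimal binary PSO feature selection sketch (hypothetical
    constants; the study's exact BPSO variant may differ)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5   # inertia and acceleration weights
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = max(pbest, key=fitness)[:]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Sigmoid transfer maps velocity to the probability
                # of setting the bit (i.e., selecting the feature).
                p_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < p_one else 0
            if fitness(pos[i]) > fitness(pbest[i]):
                pbest[i] = pos[i][:]
                if fitness(pbest[i]) > fitness(gbest):
                    gbest = pbest[i][:]
    return gbest
```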
Abstract: With the evolution of technology, the expression of
opinions has shifted to the digital world. The domain of politics,
one of the hottest topics in opinion mining research, is merged here
with behavioral analysis for determining political affiliation in
texts, which constitutes the subject of this paper. This study aims
to classify the text of news articles and blogs as either Republican
or Democrat with the minimum number of features. As
an initial set, 68 features, 64 of which were Linguistic Inquiry
and Word Count (LIWC) features, were tested against 14 benchmark
classification algorithms. In later experiments, the dimensionality
of the feature vector was reduced using 7 feature selection
algorithms. The results show that the “Decision Tree”, “Rule
Induction”, and “M5 Rule” classifiers, when used with the “SVM”
and “IGR” feature selection algorithms, performed best, reaching up
to 82.5% accuracy on the given dataset. Further tests on a single
feature and on linguistics-based feature sets showed similar results.
The feature “Function”, an aggregate feature of the linguistic
category, was found to be the most discriminative of the 68 features,
with an accuracy of 81% in classifying articles as either Republican
or Democrat.
Abstract: As a popular rank-reduced vector space approach,
Latent Semantic Indexing (LSI) has been used in information
retrieval and other applications. In this paper, an LSI-based content
vector model for text classification is presented, which constructs
multiple augmented category LSI spaces and classifies texts by their
content. The model integrates the class-discriminative information
from the training data and is equipped with several pertinent feature
selection and text classification algorithms. The proposed classifier
has been applied to email classification, and experiments on a
benchmark spam testing corpus (PU1) show that the approach is a
competitive alternative to other email classifiers based on the
well-known SVM and naïve Bayes algorithms.
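The rank-reduction step underlying LSI can be illustrated with a truncated SVD of a term-document matrix. This is a generic illustration only; the paper's model additionally builds augmented per-category LSI spaces, which is not shown here.

```python
import numpy as np

def lsi_doc_vectors(td, k):
    """Rank-k latent vectors for each document (one column per
    document), obtained by truncating the SVD of the term-document
    matrix td (terms x documents)."""
    U, s, Vt = np.linalg.svd(td, full_matrices=False)
    # Keep the top-k singular directions: coordinates are S_k V_k^T.
    return np.diag(s[:k]) @ Vt[:k, :]

def cosine(a, b):
    """Cosine similarity between two latent document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Documents with similar term-usage patterns end up close in the latent space even when their raw term vectors differ, which is what makes the reduced representation useful for classification.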
Abstract: In this study we focus on improving the performance
of a cue-based Motor Imagery Brain-Computer Interface (BCI). For
this purpose, a data fusion approach is applied to the outputs of
different classifiers to make the best decision. In the first step, the
Distinction Sensitive Learning Vector Quantization method is used
as a feature selection method to determine the most informative
frequencies in the recorded signals, and its performance is evaluated
by a frequency search method. Then the informative features are
extracted by the wavelet packet transform. In the next step, 5
different types of classification methods are applied. The
methodologies are tested on BCI Competition II dataset III; the best
obtained accuracy is 85% and the best kappa value is 0.8. In the
final step, the ordered weighted averaging (OWA) method is used to
properly aggregate the classifier outputs. Using OWA enhances the
system accuracy to 95% and the kappa value to 0.9, and the OWA
computation itself takes only 50 milliseconds.
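Ordered weighted averaging applies its weights to the sorted scores rather than to fixed classifiers, which is what distinguishes it from a plain weighted vote. A minimal sketch follows; the weight vector here is illustrative, not the one tuned in the study.

```python
def owa(scores, weights):
    """Ordered weighted averaging: weights are applied to the
    descending-sorted scores, not to particular classifiers."""
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

def owa_decision(class_scores, weights):
    """Aggregate per-class confidences from several classifiers and
    pick the class with the highest OWA value. `class_scores` maps
    each class label to the list of classifier confidences for it."""
    return max(class_scores, key=lambda c: owa(class_scores[c], weights))
```

With weights concentrated on the first positions, OWA behaves like a max (optimistic fusion); with uniform weights it reduces to the plain average.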