Abstract: A generic and extendible Multi-Agent Data Mining
(MADM) framework, MADMF (the Multi-Agent Data Mining
Framework) is described. The central feature of the framework is that
it avoids the use of agreed meta-language formats by supporting a
framework of wrappers.
The advantage offered is that the framework is easily extendible,
so that further data agents and mining agents can simply be added to
the framework. A demonstration MADMF framework is currently
available. The paper includes details of the MADMF architecture and
the wrapper principle incorporated into it. A full description and
evaluation of the framework's operation is provided by considering
two MADM scenarios.
Abstract: Power System Security is a major concern in real time
operation. The conventional method of security evaluation consists of
performing continuous load flow and transient stability studies with
a simulation program. This is highly time-consuming and infeasible
for on-line application. Pattern Recognition (PR) is a promising
tool for on-line security evaluation. This paper proposes a Support
Vector Machine (SVM) based binary classification for static and
transient security evaluation. The proposed SVM based PR approach
is implemented on New England 39 Bus and IEEE 57 Bus systems.
The simulation results of the SVM classifier are compared with those
of other classifier algorithms such as Method of Least Squares (MLS),
Multi-Layer Perceptron (MLP) and Linear Discriminant Analysis (LDA)
classifiers.
Abstract: In order to be able to automatically differentiate
between two modes of permanent flow of a liquid simulating blood,
it was imperative to put together a data bank. Thus, the acquisition of
the various amplitude spectra of the Doppler signal of this liquid in
laminar flow and other spectra in turbulent flow enabled us to
distinguish automatically between the two modes. According
to the number of parameters and their nature, a comparative study
allowed us to choose the best classifier.
Abstract: In this paper we propose a new approach for flexible document categorization according to the document type or genre instead of topic. Our approach implements two homogeneous classifiers: a contextual classifier and a logical classifier. The contextual classifier is based on the document URL, whereas the logical classifier uses the logical structure of the document to perform the categorization. The final categorization is obtained by combining the contextual and logical categorizations. In our approach, each document is assigned to all predefined categories with different membership degrees. Our experiments demonstrate that our approach performs better than other genre categorization approaches.
Abstract: In this paper we compare the accuracy of data mining
methods for classifying students in order to predict each student's
class grade. These predictions are useful for identifying weak
students and assisting management in taking remedial measures at
early stages, so as to produce excellent graduates who will graduate
with at least a second class upper. First, we examine the accuracy of
single classifiers on our data set and choose the best one; we then
combine it with a weak classifier in an ensemble to produce a simple
voting method. The results we present show that combining different
classifiers outperformed the single classifiers for predicting student
performance.
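The simple voting method mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so plain majority voting is assumed here; the function name `majority_vote`, the tie-breaking rule, and the toy grade labels are hypothetical and not taken from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list of labels per classifier.
    Assumption: ties go to the earliest classifier in the list, so the
    strongest single classifier should be listed first.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # first vote (in classifier order) that reaches the top count
        combined.append(next(v for v in votes if counts[v] == top))
    return combined

# Toy predictions for four students (illustrative labels only).
strong = ["first", "upper", "lower", "upper"]
weak_a = ["upper", "upper", "lower", "lower"]
weak_b = ["upper", "upper", "upper", "lower"]
voted = majority_vote([strong, weak_a, weak_b])
```

Listing the best single classifier first means it decides any tie, which is one plausible way to pair a strong classifier with weaker ones.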
Abstract: An evolutionary method whose selection and recombination
operations are based on generalization error-bounds of
support vector machine (SVM) can select a subset of potentially
informative genes for SVM classifier very efficiently [7]. In this
paper, we will use the derivative of error-bound (first-order criteria)
to select and recombine gene features in the evolutionary process,
and compare the performance of the derivative of error-bound with
the error-bound itself (zero-order) in the evolutionary process. We
also investigate several error-bounds and their derivatives to compare
the performance, and find the best criteria for gene selection
and classification. We use 7 cancer-related human gene expression
datasets to evaluate the performance of the zero-order and first-order
criteria of error-bounds. Though both criteria follow the same strategy
in theory, experimental results demonstrate the best criterion
for microarray gene expression data.
Abstract: This paper presents and evaluates a new classification
method that aims to improve classifier performance and speed up
the training process. The proposed approach, called labeled
classification, seeks to improve convergence of the BP
(backpropagation) algorithm through the addition of an extra feature
(a label) to all training examples. To classify a new example, tests
are carried out with each label. The simplicity of implementation is
the main advantage of this approach because no modifications are
required in the training algorithms. Therefore, it can be used with
other acceleration and stabilization techniques. In this work, two
models of the labeled classification are proposed: the LMLP
(Labeled Multi Layered Perceptron) and the LNFC (Labeled Neuro
Fuzzy Classifier). These models are tested using the Iris, wine,
texture and human thigh databases to evaluate their performance.
Abstract: Digital news on a variety of topics is abundant on the
internet. The problem is to classify news into its appropriate
category so that users can find relevant news rapidly. A classifier
engine is used to assign each news item automatically to its
respective category. This research employs a Support Vector Machine
(SVM) to classify Indonesian news. SVM is a robust method for
classifying binary classes. The core of SVM processing is the
formation of an optimum separating plane between the different
classes. For the multiclass problem, a mechanism called one against
one is used to combine the binary classification results. Documents
were taken from the Indonesian digital news site, www.kompas.com.
The experiment showed a promising result, with an accuracy rate of
85%. The system is therefore feasible for Indonesian news
classification.
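The one-against-one mechanism mentioned above can be sketched as follows. This is only an illustration of the voting step: the binary deciders are toy stand-ins for trained SVMs, and the category names, scores, and function names are assumptions rather than details from the paper.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(classes, pairwise, x):
    """Predict a class by letting every pair of classes cast one vote.

    pairwise[(a, b)] is a binary decider that returns a or b for input x.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    # The class that wins the most pairwise contests is the prediction;
    # on a tie, max() keeps the class listed first in `classes`.
    return max(classes, key=lambda c: votes[c])

def make_decider(a, b):
    # Hypothetical stand-in for a trained binary SVM: it simply picks
    # the class with the larger toy score.
    return lambda x: a if x[a] >= x[b] else b

classes = ["sports", "economy", "politics"]
pairwise = {(a, b): make_decider(a, b) for a, b in combinations(classes, 2)}
doc = {"sports": 0.2, "economy": 0.9, "politics": 0.5}
label = one_vs_one_predict(classes, pairwise, doc)
```

For k categories this scheme trains k(k-1)/2 binary classifiers, three in the toy example.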
Abstract: The drug discovery process starts with protein
identification because proteins are responsible for many functions
required for the maintenance of life. Protein identification further
requires determination of protein function. The proposed method
develops a classifier for human protein function prediction. The
model uses a decision tree for the classification process. The
protein function is predicted on the basis of matched
sequence-derived features for each protein function. The research
work includes the development of a tool which determines
sequence-derived features by analyzing different parameters. The
other sequence-derived features are determined using various
web-based tools.
Abstract: The computer game industry has experienced exponential
growth in recent years. A game is a recreational activity involving
one or more players. Game input is information such as data,
commands, etc., which is passed to the game system at run time from
an external source. Conversely, game output is information which
is generated by the game system and passed to an external target,
but which is not used internally by the game. This paper identifies a
new classification scheme for game input and output, which is based
on the player's input and output. Using this scheme, a relationship
table for the game input classifier and output classifier is developed.
Abstract: In this research study, an intelligent detection system
to support medical diagnosis and detection of abnormal lesions by
processing endoscopic images is presented. The images used in this
study have been obtained using the M2A Swallowable Imaging
Capsule - a patented, video color-imaging disposable capsule.
Schemes have been developed to extract texture features from the
fuzzy texture spectra in the chromatic and achromatic domains for a
selected region of interest from each color component histogram of
endoscopic images. An advanced fuzzy inference neural network,
which combines fuzzy systems and artificial neural networks, and
the concept of fusing multiple classifiers dedicated to specific
feature parameters have also been adopted in this paper. The high
detection accuracy achieved by the proposed system indicates that
such intelligent schemes could be used as a supplementary
diagnostic tool in endoscopy.
Abstract: This paper proposes new hybrid approaches for face
recognition. Gabor wavelet representation of face images is an
effective approach for both facial action recognition and face
identification. Performing dimensionality reduction and linear
discriminant analysis on the downsampled Gabor wavelet faces can
increase the discriminative ability. Nearest feature space is
extended to various similarity measures. In our experiments, the
proposed Gabor wavelet faces combined with the extended neural net
feature space classifier show very good performance, achieving a
maximum correct recognition rate of 93% on the ORL data set without
any preprocessing step.
Abstract: Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. The performance of neural network learning is known to be sensitive to the initial weights and architecture. This paper discusses the use of multilayer neural network initialization with a decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree, from the root node to a final leaf, is used for initialization of the multilayer neural network. The experimental evaluation demonstrates that this approach provides better classification accuracy on the Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with that of a multilayer neural network initialized with the traditional random method and with that of decision tree classifiers.
Abstract: In this work, we present a novel active learning approach
for learning a visual object detection system. Our system
is composed of an active learning mechanism as a wrapper around
a sub-algorithm which implements an online boosting-based learning
object detector. At its core is a combination of a bootstrap procedure
and a semi-automatic learning process based on the online boosting
procedure. The idea is to exploit the availability of the classifier
during learning to automatically label training samples and
incrementally improve the classifier. This addresses the issue of
reducing labeling effort while obtaining better performance. In
addition, we propose a verification process for further improvement
of the classifier. The idea is to allow re-updating on seen data
during learning to stabilize the detector. The main contribution of
this empirical study is a demonstration that active learning based on
an online boosting approach trained in this manner can achieve
results comparable to, or even better than, a framework trained in
the conventional manner using much more labeling effort. Empirical
experiments on challenging data sets for specific object detection
problems show the effectiveness of our approach.
Abstract: In this study, a classification-based video
super-resolution method using an artificial neural network (ANN) is
proposed to enhance low-resolution (LR) frames to high-resolution (HR)
frames. The proposed method consists of four main steps:
classification, motion-trace volume collection, temporal adjustment,
and ANN prediction. A classifier is designed based on the edge
properties of a pixel in the LR frame to identify the spatial information.
To exploit the spatio-temporal information, a motion-trace volume is
collected using motion estimation, which can eliminate unfathomable
object motion in the LR frames. In addition, temporal lateral process is
employed for volume adjustment to reduce unnecessary temporal
features. Finally, ANN is applied to each class to learn the complicated
spatio-temporal relationship between LR and HR frames. Simulation
results show that the proposed method successfully improves both
peak signal-to-noise ratio and perceptual quality.
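Peak signal-to-noise ratio, the objective metric named above, is conventionally defined as 10 log10(MAX^2 / MSE). The sketch below computes it for two frames given as flat pixel sequences; the function name `psnr` and the default peak value of 255 (8-bit pixels) are assumptions for illustration, not details from the paper.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error between the reference (HR) and reconstructed frame.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy two-pixel "frames": each pixel is off by 10 grey levels.
score = psnr([100, 100], [110, 90])
```

A higher value means the reconstructed frame is closer to the reference.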
Abstract: Distributed denial-of-service (DDoS) attacks pose a
serious threat to network security. There have been a lot of
methodologies and tools devised to detect DDoS attacks and reduce
the damage they cause. Still, most of the methods cannot
simultaneously achieve (1) efficient detection with a small number of
false alarms and (2) real-time transfer of packets. Here, we introduce
a method for proactive detection of DDoS attacks, by classifying the
network status, to be utilized in the detection stage of the proposed
anti-DDoS framework. Initially, we analyse the DDoS architecture
and obtain details of its phases. Then, we investigate the procedures
of DDoS attacks and select variables based on these features. Finally,
we apply the k-nearest neighbour (k-NN) method to classify the
network status into each phase of a DDoS attack. The simulation results
showed that each phase of the attack scenario is classified well and
that a DDoS attack could be detected at an early stage.
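The k-NN classification step described above can be sketched in a few lines. The two-dimensional feature vectors, the phase labels ("normal", "pre-attack", "attack"), and the function name `knn_classify` are hypothetical; the abstract does not list the actual variables or phases used.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_vector, label) pairs; Euclidean distance is assumed.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return counts.most_common(1)[0][0]

# Toy, normalized network-status samples (illustrative values only).
train = [
    ((0.10, 0.10), "normal"), ((0.20, 0.10), "normal"), ((0.15, 0.20), "normal"),
    ((0.50, 0.60), "pre-attack"), ((0.55, 0.50), "pre-attack"), ((0.60, 0.60), "pre-attack"),
    ((0.90, 0.90), "attack"), ((0.95, 0.85), "attack"), ((0.85, 0.95), "attack"),
]
status = knn_classify(train, (0.52, 0.58), k=3)
```

Classifying a sample into an early phase such as the hypothetical "pre-attack" label is what would allow detection before the attack itself begins.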
Abstract: Many digital signal processing techniques have been used to automatically distinguish protein coding regions (exons) from non-coding regions (introns) in DNA sequences. In this work, we have characterized these sequences according to their nonlinear dynamical features such as moment invariants, correlation dimension, and largest Lyapunov exponent estimates. We have applied our model to a number of real sequences encoded into a time series using EIIP sequence indicators. In order to discriminate between coding and non-coding DNA regions, the phase space trajectory was first reconstructed for coding and non-coding regions. Nonlinear dynamical features are extracted from those regions and used to investigate the differences between them. Our results indicate that the nonlinear dynamical characteristics yield significant differences between coding regions (CR) and non-coding regions (NCR) in DNA sequences. Finally, the classifier is tested on real genes where coding and non-coding regions are well known.
Abstract: The effectiveness of Artificial Neural Network (ANN)
and Support Vector Machine (SVM) classifiers for fault diagnosis of
rolling element bearings is presented in this paper. The
characteristic features of vibration signals of a rotating driveline
that was run in its normal condition and with faults introduced were
used as input to the ANN and SVM classifiers. Simple statistical
features such as standard deviation, skewness, kurtosis etc. of the
time-domain vibration signal segments, along with peaks of the signal
and the peak of the power spectral density (PSD), are used as input
features for the ANN and SVM classifiers. The effect of preprocessing
the vibration signal by Discrete Wavelet Transform (DWT) prior to
feature extraction is also studied. The experimental results show
that the performance of the SVM classifier in identifying bearing
condition is better than that of the ANN, and that preprocessing of
the vibration signal by DWT enhances the effectiveness of both the
ANN and SVM classifiers.
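The simple time-domain statistics listed above can be computed as in the sketch below. It assumes population (biased) moment estimates and non-excess kurtosis, since the abstract does not state which definitions were used; the function name and the toy segment are illustrative only.

```python
import math

def time_domain_features(segment):
    """Standard deviation, skewness, kurtosis and peak of one signal segment.

    Assumption: population moments (divide by n) and non-excess kurtosis.
    """
    n = len(segment)
    mean = sum(segment) / n
    m2 = sum((x - mean) ** 2 for x in segment) / n  # variance
    m3 = sum((x - mean) ** 3 for x in segment) / n
    m4 = sum((x - mean) ** 4 for x in segment) / n
    if m2 == 0:
        raise ValueError("segment is constant; skewness and kurtosis are undefined")
    std = math.sqrt(m2)
    return {
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2,
        "peak": max(abs(x) for x in segment),
    }

# Toy segment standing in for a short window of a vibration signal.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Each segment's dictionary of values would then form one feature vector for a classifier.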
Abstract: One of the approaches enabling people with amputated
limbs to establish some sort of interface with the real world includes
the utilization of the myoelectric signal (MES) from the remaining
muscles of those limbs. The MES can be used as a control input to a
multifunction prosthetic device. In this control scheme, known as the
myoelectric control, a pattern recognition approach is usually utilized
to discriminate between the MES signals that belong to different
classes of the forearm movements. Since the MES is recorded using
multiple channels, the feature vector size can become very large. In
order to reduce the computational cost and enhance the generalization
capability of the classifier, a dimensionality reduction method is
needed to identify an informative yet moderate size feature set. This
paper proposes a new fuzzy version of the well-known Fisher's
Linear Discriminant Analysis (LDA) feature projection technique.
Furthermore, based on the fact that certain muscles might contribute
more to the discrimination process, a novel feature weighting scheme
is also presented by employing Particle Swarm Optimization (PSO)
for estimating the weight of each feature. The new method, called
PSOFLDA, is tested on real MES datasets and compared with other
techniques to prove its superiority.
Abstract: An emotional speech recognition system for smartphone
applications is proposed in this study, to be combined with 3G mobile
communications and social networks to provide users and their groups
with more interaction and care. This study developed
a mechanism using the support vector machines (SVM) to recognize
the emotions of speech such as happiness, anger, sadness and normal.
The mechanism uses a hierarchical classifier to adjust the weights of
acoustic features and divides various parameters into the categories of
energy and frequency for training. In this study, 28 commonly used
acoustic features including pitch and volume were proposed for
training. In addition, a time-frequency parameter obtained by
continuous wavelet transforms was also used to identify the accent and
intonation in a sentence during the recognition process. The Berlin
Database of Emotional Speech was used by dividing the speech into
male and female data sets for training. According to the experimental
results, the accuracies of male and female test sets were increased by
4.6% and 5.2% respectively after using the time-frequency parameter
for classifying happy and angry emotions. For the classification of all
emotions, the average accuracy, including male and female data, was
63.5% for the test set and 90.9% for the whole data set.