Abstract: Tumor classification is a key area of research in the
field of bioinformatics. Microarray technology is commonly used in
the study of disease diagnosis using gene expression levels. The
main drawback of gene expression data is that it contains thousands
of genes and a very few samples. Feature selection methods are used
to select the informative genes from the microarray. These methods
considerably improve the classification accuracy. In the proposed
method, Genetic Algorithm (GA) is used for effective feature
selection. Informative genes are identified based on the T-Statistics,
Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate
solutions of GA are obtained from top-m informative genes. The
classification accuracy of k-Nearest Neighbor (kNN) method is used
as the fitness function for GA. In this work, kNN and Support Vector
Machine (SVM) are used as the classifiers. The experimental results
show that the proposed work is suitable for effective feature
selection. With the help of the selected genes, GA-kNN method
achieves 100% accuracy in 4 datasets and GA-SVM method
achieves in 5 out of 10 datasets. The GA with kNN and SVM
methods are demonstrated to be an accurate method for microarray
based tumor classification.
Abstract: The objective of this paper, is to apply support vector machine (SVM) approach for the classification of cancerous and normal regions of prostate images. Three kinds of textural features are extracted and used for the analysis: parameters of the Gauss- Markov random field (GMRF), correlation function and relative entropy. Prostate images are acquired by the system consisting of a microscope, video camera and a digitizing board. Cross-validated classification over a database of 46 images is implemented to evaluate the performance. In SVM classification, sensitivity and specificity of 96.2% and 97.0% are achieved for the 32x32 pixel block sized data, respectively, with an overall accuracy of 96.6%. Classification performance is compared with artificial neural network and k-nearest neighbor classifiers. Experimental results demonstrate that the SVM approach gives the best performance.
Abstract: A spatial classification technique incorporating a State of Art Feature Extraction algorithm is proposed in this paper for classifying a heterogeneous classes present in hyper spectral images. The classification accuracy can be improved if and only if both the feature extraction and classifier selection are proper. As the classes in the hyper spectral images are assumed to have different textures, textural classification is entertained. Run Length feature extraction is entailed along with the Principal Components and Independent Components. A Hyperspectral Image of Indiana Site taken by AVIRIS is inducted for the experiment. Among the original 220 bands, a subset of 120 bands is selected. Gray Level Run Length Matrix (GLRLM) is calculated for the selected forty bands. From GLRLMs the Run Length features for individual pixels are calculated. The Principle Components are calculated for other forty bands. Independent Components are calculated for next forty bands. As Principal & Independent Components have the ability to represent the textural content of pixels, they are treated as features. The summation of Run Length features, Principal Components, and Independent Components forms the Combined Features which are used for classification. SVM with Binary Hierarchical Tree is used to classify the hyper spectral image. Results are validated with ground truth and accuracies are calculated.
Abstract: This paper proposes to use ETM+ multispectral data
and panchromatic band as well as texture features derived from the
panchromatic band for land cover classification. Four texture features
including one 'internal texture' and three GLCM based textures
namely correlation, entropy, and inverse different moment were used
in combination with ETM+ multispectral data. Two data sets
involving combination of multispectral, panchromatic band and its
texture were used and results were compared with those obtained by
using multispectral data alone. A decision tree classifier with and
without boosting were used to classify different datasets. Results
from this study suggest that the dataset consisting of panchromatic
band, four of its texture features and multispectral data was able to
increase the classification accuracy by about 2%. In comparison, a
boosted decision tree was able to increase the classification accuracy
by about 3% with the same dataset.
Abstract: Near-infrared (NIR) spectroscopy is a widely used
method for material identification for laboratory and industrial applications.
While standard spectrometers only allow measurements at
one sampling point at a time, NIR Spectral Imaging techniques can
measure, in real-time, both the size and shape of an object as well as
identify the material the object is made of. The online classification
and sorting of recovered paper with NIR Spectral Imaging (SI)
is used with success in the paper recycling industry throughout
Europe. Recently, the globalisation of the recycling material streams
caused that water-based flexographic-printed newspapers mainly from
UK and Italy appear also in central Europe. These flexo-printed
newspapers are not sufficiently de-inkable with the standard de-inking
process originally developed for offset-printed paper. This de-inking
process removes the ink from recovered paper and is the fundamental
processing step to produce high-quality paper from recovered paper.
Thus, the flexo-printed newspapers are a growing problem for the
recycling industry as they reduce the quality of the produced paper
if their amount exceeds a certain limit within the recovered paper
material.
This paper presents the results of a research project for the
development of an automated entry inspection system for recovered
paper that was jointly conducted by CTR AG (Austria) and PTS
Papiertechnische Stiftung (Germany). Within the project an NIR
SI prototype for the identification of flexo-printed newspaper has
been developed. The prototype can identify and sort out flexoprinted
newspapers in real-time and achieves a detection accuracy
for flexo-printed newspaper of over 95%. NIR SI, the technology the
prototype is based on, allows the development of inspection systems
for incoming goods in a paper production facility as well as industrial
sorting systems for recovered paper in the recycling industry in the
near future.
Abstract: Linear Discrimination Analysis (LDA) is a linear
solution for classification of two classes. In this paper, we propose a
variant LDA method for multi-class problem which redefines the
between class and within class scatter matrices by incorporating a
weight function into each of them. The aim is to separate classes as
much as possible in a situation that one class is well separated from
other classes, incidentally, that class must have a little influence on
classification. It has been suggested to alleviate influence of classes
that are well separated by adding a weight into between class scatter
matrix and within class scatter matrix. To obtain a simple and
effective weight function, ordinary LDA between every two classes
has been used in order to find Fisher discrimination value and passed
it as an input into two weight functions and redefined between class
and within class scatter matrices. Experimental results showed that
our new LDA method improved classification rate, on glass, iris and
wine datasets, in comparison to different versions of LDA.
Abstract: Logic based methods for learning from structured data
is limited w.r.t. handling large search spaces, preventing large-sized
substructures from being considered by the resulting classifiers. A
novel approach to learning from structured data is introduced that
employs a structure transformation method, called finger printing, for
addressing these limitations. The method, which generates features
corresponding to arbitrarily complex substructures, is implemented in
a system, called DIFFER. The method is demonstrated to perform
comparably to an existing state-of-art method on some benchmark
data sets without requiring restrictions on the search space.
Furthermore, learning from the union of features generated by finger
printing and the previous method outperforms learning from each
individual set of features on all benchmark data sets, demonstrating
the benefit of developing complementary, rather than competing,
methods for structure classification.
Abstract: This paper proposes an auto-classification algorithm
of Web pages using Data mining techniques. We consider the
problem of discovering association rules between terms in a set of
Web pages belonging to a category in a search engine database, and
present an auto-classification algorithm for solving this problem that
are fundamentally based on Apriori algorithm. The proposed
technique has two phases. The first phase is a training phase where
human experts determines the categories of different Web pages, and
the supervised Data mining algorithm will combine these categories
with appropriate weighted index terms according to the highest
supported rules among the most frequent words. The second phase is
the categorization phase where a web crawler will crawl through the
World Wide Web to build a database categorized according to the
result of the data mining approach. This database contains URLs and
their categories.
Abstract: Information on weed distribution within the field is
necessary to implement spatially variable herbicide application.
Since hand labor is costly, an automated weed control system could be
feasible. This paper deals with the development of an algorithm for
real time specific weed recognition system based on Histogram
Analysis of an image that is used for the weed classification. This
algorithm is specifically developed to classify images into broad and
narrow class for real-time selective herbicide application. The
developed system has been tested on weeds in the lab, which have
shown that the system to be very effectiveness in weed identification.
Further the results show a very reliable performance on images of
weeds taken under varying field conditions. The analysis of the results
shows over 95 percent classification accuracy over 140 sample images
(broad and narrow) with 70 samples from each category of weeds.
Abstract: Software Architecture plays a key role in software development but absence of formal description of Software Architecture causes different impede in software development. To cope with these difficulties, ontology has been used as artifact. This paper proposes ontology for Software Architectural design based on IEEE model for architecture description and Kruchten 4+1 model for viewpoints classification. For categorization of style and views, ISO/IEC 42010 has been used. Corpus method has been used to evaluate ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.
Abstract: The automatic discrimination of seismic signals is an important practical goal for earth-science observatories due to the large amount of information that they receive continuously. An essential discrimination task is to allocate the incoming signal to a group associated with the kind of physical phenomena producing it. In this paper, two classes of seismic signals recorded routinely in geophysical laboratory of the National Center for Scientific and Technical Research in Morocco are considered. They correspond to signals associated to local earthquakes and chemical explosions. The approach adopted for the development of an automatic discrimination system is a modular system composed by three blocs: 1) Representation, 2) Dimensionality reduction and 3) Classification. The originality of our work consists in the use of a new wavelet called "modified Mexican hat wavelet" in the representation stage. For the dimensionality reduction, we propose a new algorithm based on the random projection and the principal component analysis.
Abstract: An evolutionary method whose selection and recombination
operations are based on generalization error-bounds of
support vector machine (SVM) can select a subset of potentially
informative genes for SVM classifier very efficiently [7]. In this
paper, we will use the derivative of error-bound (first-order criteria)
to select and recombine gene features in the evolutionary process,
and compare the performance of the derivative of error-bound with
the error-bound itself (zero-order) in the evolutionary process. We
also investigate several error-bounds and their derivatives to compare
the performance, and find the best criteria for gene selection
and classification. We use 7 cancer-related human gene expression
datasets to evaluate the performance of the zero-order and first-order
criteria of error-bounds. Though both criteria have the same strategy
in theoretically, experimental results demonstrate the best criterion
for microarray gene expression data.
Abstract: Texture information plays increasingly an important
role in remotely sensed imagery classification and many pattern
recognition applications. However, the selection of relevant textural
features to improve this classification accuracy is not a straightforward
task. This work investigates the effectiveness of two Mutual
Information Feature Selector (MIFS) algorithms to select salient
textural features that contain highly discriminatory information for
multispectral imagery classification. The input candidate features are
extracted from a SPOT High Resolution Visible(HRV) image using
Wavelet Transform (WT) at levels (l = 1,2).
The experimental results show that the selected textural features
according to MIFS algorithms make the largest contribution to
improve the classification accuracy than classical approaches such
as Principal Components Analysis (PCA) and Linear Discriminant
Analysis (LDA).
Abstract: Digital news with a variety topics is abundant on the
internet. The problem is to classify news based on its appropriate
category to facilitate user to find relevant news rapidly. Classifier
engine is used to split any news automatically into the respective
category. This research employs Support Vector Machine (SVM) to
classify Indonesian news. SVM is a robust method to classify
binary classes. The core processing of SVM is in the formation of an
optimum separating plane to separate the different classes. For
multiclass problem, a mechanism called one against one is used to
combine the binary classification result. Documents were taken
from the Indonesian digital news site, www.kompas.com. The
experiment showed a promising result with the accuracy rate of 85%.
This system is feasible to be implemented on Indonesian news
classification.
Abstract: Face authentication for access control is a face
membership authentication which passes the person of the incoming
face if he turns out to be one of an enrolled person based on face
recognition or rejects if not. Face membership authentication belongs
to the two class classification problem where SVM(Support Vector
Machine) has been successfully applied and shows better performance
compared to the conventional threshold-based classification. However,
most of previous SVMs have been trained using image feature vectors
extracted from face images of each class member(enrolled
class/unenrolled class) so that they are not robust to variations in
illuminations, poses, and facial expressions and much affected by
changes in member configuration of the enrolled class
In this paper, we propose an effective face membership
authentication method based on SVM using class discriminating
features which represent an incoming face image-s associability with
each class distinctively. These class discriminating features are weakly
related with image features so that they are less affected by variations
in illuminations, poses and facial expression.
Through experiments, it is shown that the proposed face
membership authentication method performs better than the threshold
rule-based or the conventional SVM-based authentication methods and
is relatively less affected by changes in member size and membership.
Abstract: The main objective of this work is to provide a fault detection and isolation based on Markov parameters for residual generation and a neural network for fault classification. The diagnostic approach is accomplished in two steps: In step 1, the system is identified using a series of input / output variables through an identification algorithm. In step 2, the fault is diagnosed comparing the Markov parameters of faulty and non faulty systems. The Artificial Neural Network is trained using predetermined faulty conditions serves to classify the unknown fault. In step 1, the identification is done by first formulating a Hankel matrix out of Input/ output variables and then decomposing the matrix via singular value decomposition technique. For identifying the system online sliding window approach is adopted wherein an open slit slides over a subset of 'n' input/output variables. The faults are introduced at arbitrary instances and the identification is carried out in online. Fault residues are extracted making a comparison of the first five Markov parameters of faulty and non faulty systems. The proposed diagnostic approach is illustrated on benchmark problems with encouraging results.
Abstract: A new approach for protection of power transformer is
presented using a time-frequency transform known as Wavelet transform.
Different operating conditions such as inrush, Normal, load,
External fault and internal fault current are sampled and processed
to obtain wavelet coefficients. Different Operating conditions provide
variation in wavelet coefficients. Features like energy and Standard
deviation are calculated using Parsevals theorem. These features
are used as inputs to PNN (Probabilistic neural network) for fault
classification. The proposed algorithm provides more accurate results
even in the presence of noise inputs and accurately identifies inrush
and fault currents. Overall classification accuracy of the proposed
method is found to be 96.45%. Simulation of the fault (with and
without noise) was done using MATLAB AND SIMULINK software
taking 2 cycles of data window (40 m sec) containing 800 samples.
The algorithm was evaluated by using 10 % Gaussian white noise.
Abstract: To overcome the product overload of Internet
shoppers, we introduce a semantic recommendation procedure which
is more efficient when applied to Internet shopping malls. The
suggested procedure recommends the semantic products to the
customers and is originally based on Web usage mining, product
classification, association rule mining, and frequently purchasing.
We applied the procedure to the data set of MovieLens Company for
performance evaluation, and some experimental results are provided.
The experimental results have shown superior performance in
terms of coverage and precision.
Abstract: The objective of this paper is to a design of pattern
classification model based on the back-propagation (BP) algorithm for
decision support system. Standard BP model has done full connection
of each node in the layers from input to output layers. Therefore, it
takes a lot of computing time and iteration computing for good
performance and less accepted error rate when we are doing some
pattern generation or training the network.
However, this model is using exclusive connection in between
hidden layer nodes and output nodes. The advantage of this model is
less number of iteration and better performance compare with standard
back-propagation model. We simulated some cases of classification
data and different setting of network factors (e.g. hidden layer number
and nodes, number of classification and iteration). During our
simulation, we found that most of simulations cases were satisfied by
BP based using exclusive connection network model compared to
standard BP. We expect that this algorithm can be available to
identification of user face, analysis of data, mapping data in between
environment data and information.
Abstract: This paper aims to present a survey of object
recognition/classification methods based on image moments. We
review various types of moments (geometric moments, complex
moments) and moment-based invariants with respect to various
image degradations and distortions (rotation, scaling, affine
transform, image blurring, etc.) which can be used as shape
descriptors for classification. We explain a general theory how to
construct these invariants and show also a few of them in explicit
forms. We review efficient numerical algorithms that can be used
for moment computation and demonstrate practical examples of
using moment invariants in real applications.