Abstract: Information on weed distribution within the field is
necessary to implement spatially variable herbicide application.
Since hand labor is costly, an automated weed control system could be
feasible. This paper deals with the development of an algorithm for
real time specific weed recognition system based on Histogram
Analysis of an image that is used for the weed classification. This
algorithm is specifically developed to classify images into broad and
narrow class for real-time selective herbicide application. The
developed system has been tested on weeds in the lab, which have
shown that the system to be very effectiveness in weed identification.
Further the results show a very reliable performance on images of
weeds taken under varying field conditions. The analysis of the results
shows over 95 percent classification accuracy over 140 sample images
(broad and narrow) with 70 samples from each category of weeds.
Abstract: Texture information plays increasingly an important
role in remotely sensed imagery classification and many pattern
recognition applications. However, the selection of relevant textural
features to improve this classification accuracy is not a straightforward
task. This work investigates the effectiveness of two Mutual
Information Feature Selector (MIFS) algorithms to select salient
textural features that contain highly discriminatory information for
multispectral imagery classification. The input candidate features are
extracted from a SPOT High Resolution Visible(HRV) image using
Wavelet Transform (WT) at levels (l = 1,2).
The experimental results show that the selected textural features
according to MIFS algorithms make the largest contribution to
improve the classification accuracy than classical approaches such
as Principal Components Analysis (PCA) and Linear Discriminant
Analysis (LDA).
Abstract: Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. Performance of neural networks learning is known to be sensitive to the initial weights and architecture. This paper discusses the use multilayer neural network initialization with decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree from root node until a final leave is used for initialization of multilayer neural network. The experimental evaluation demonstrates this approach provides better classification accuracy with Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with multilayer neural network initialized with traditional random method and decision tree classifiers.
Abstract: The identification and classification of weeds are of
major technical and economical importance in the agricultural
industry. To automate these activities, like in shape, color and
texture, weed control system is feasible. The goal of this paper is to
build a real-time, machine vision weed control system that can detect
weed locations. In order to accomplish this objective, a real-time
robotic system is developed to identify and locate outdoor plants
using machine vision technology and pattern recognition. The
algorithm is developed to classify images into broad and narrow class
for real-time selective herbicide application. The developed
algorithm has been tested on weeds at various locations, which have
shown that the algorithm to be very effectiveness in weed
identification. Further the results show a very reliable performance
on weeds under varying field conditions. The analysis of the results
shows over 90 percent classification accuracy over 140 sample
images (broad and narrow) with 70 samples from each category of
weeds.
Abstract: A new approach for protection of power transformer is
presented using a time-frequency transform known as Wavelet transform.
Different operating conditions such as inrush, Normal, load,
External fault and internal fault current are sampled and processed
to obtain wavelet coefficients. Different Operating conditions provide
variation in wavelet coefficients. Features like energy and Standard
deviation are calculated using Parsevals theorem. These features
are used as inputs to PNN (Probabilistic neural network) for fault
classification. The proposed algorithm provides more accurate results
even in the presence of noise inputs and accurately identifies inrush
and fault currents. Overall classification accuracy of the proposed
method is found to be 96.45%. Simulation of the fault (with and
without noise) was done using MATLAB AND SIMULINK software
taking 2 cycles of data window (40 m sec) containing 800 samples.
The algorithm was evaluated by using 10 % Gaussian white noise.
Abstract: Avoiding learning failures in mathematics e-learning environments caused by emotional problems in students with autism has become an important topic for combining of special education with information and communications technology. This study presents an adaptive emotional adjustment model in mathematics e-learning for students with autism, emphasizing the lack of emotional perception in mathematics e-learning systems. In addition, an emotion classification for students with autism was developed by inducing emotions in mathematical learning environments to record changes in the physiological signals and facial expressions of students. Using these methods, 58 emotional features were obtained. These features were then processed using one-way ANOVA and information gain (IG). After reducing the feature dimension, methods of support vector machines (SVM), k-nearest neighbors (KNN), and classification and regression trees (CART) were used to classify four emotional categories: baseline, happy, angry, and anxious. After testing and comparisons, in a situation without feature selection, the accuracy rate of the SVM classification can reach as high as 79.3-%. After using IG to reduce the feature dimension, with only 28 features remaining, SVM still has a classification accuracy of 78.2-%. The results of this research could enhance the effectiveness of eLearning in special education.
Abstract: Developing an accurate classifier for high dimensional microarray datasets is a challenging task due to availability of small sample size. Therefore, it is important to determine a set of relevant genes that classify the data well. Traditionally, gene selection method often selects the top ranked genes according to their discriminatory power. Often these genes are correlated with each other resulting in redundancy. In this paper, we have proposed a hybrid method using feature ranking and wrapper method (Genetic Algorithm with multiclass SVM) to identify a set of relevant genes that classify the data more accurately. A new fitness function for genetic algorithm is defined that focuses on selecting the smallest set of genes that provides maximum accuracy. Experiments have been carried on four well-known datasets1. The proposed method provides better results in comparison to the results found in the literature in terms of both classification accuracy and number of genes selected.
Abstract: The feature of HIV genome is in a wide range because
of it is highly heterogeneous. Hence, the infection ability of the virus changes related with different chemokine receptors. From this point,
R5 and X4 HIV viruses use CCR5 and CXCR5 coreceptors respectively while R5X4 viruses can utilize both coreceptors. Recently, in Bioinformatics, R5X4 viruses have been studied to
classify by using the coreceptors of HIV genome.
The aim of this study is to develop the optimal Multilayer
Perceptron (MLP) for high classification accuracy of HIV sub-type viruses. To accomplish this purpose, the unit number in hidden layer
was incremented one by one, from one to a particular number. The statistical data of R5X4, R5 and X4 viruses was preprocessed by the
signal processing methods. Accessible residues of these virus sequences were extracted and modeled by Auto-Regressive Model
(AR) due to the dimension of residues is large and different from each other. Finally the pre-processed dataset was used to evolve MLP with various number of hidden units to determine R5X4
viruses. Furthermore, ROC analysis was used to figure out the optimal MLP structure.
Abstract: In this paper we propose a robust environmental sound classification approach, based on spectrograms features driven from log-Gabor filters. This approach includes two methods. In the first methods, the spectrograms are passed through an appropriate log-Gabor filter banks and the outputs are averaged and underwent an optimal feature selection procedure based on a mutual information criteria. The second method uses the same steps but applied only to three patches extracted from each spectrogram.
To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on Multiclass Support Vector Machines show that the second method is the most efficient with an average classification accuracy of 89.62 %.
Abstract: Discovering new biological knowledge from the highthroughput biological data is a major challenge to bioinformatics today. To address this challenge, we developed a new approach for protein classification. Proteins that are evolutionarily- and thereby functionally- related are said to belong to the same classification. Identifying protein classification is of fundamental importance to document the diversity of the known protein universe. It also provides a means to determine the functional roles of newly discovered protein sequences. Our goal is to predict the functional classification of novel protein sequences based on a set of features extracted from each protein sequence. The proposed technique used datasets extracted from the Structural Classification of Proteins (SCOP) database. A set of spectral domain features based on Fast Fourier Transform (FFT) is used. The proposed classifier uses multilayer back propagation (MLBP) neural network for protein classification. The maximum classification accuracy is about 91% when applying the classifier to the full four levels of the SCOP database. However, it reaches a maximum of 96% when limiting the classification to the family level. The classification results reveal that spectral domain contains information that can be used for classification with high accuracy. In addition, the results emphasize that sequence similarity measures are of great importance especially at the family level.
Abstract: In order to develop forest management strategies in
tropical forest in Malaysia, surveying the forest resources and
monitoring the forest area affected by logging activities is essential.
There are tremendous effort has been done in classification of land
cover related to forest resource management in this country as it is a
priority in all aspects of forest mapping using remote sensing and
related technology such as GIS. In fact classification process is a
compulsory step in any remote sensing research. Therefore, the main
objective of this paper is to assess classification accuracy of
classified forest map on Landsat TM data from difference number of
reference data (200 and 388 reference data). This comparison was
made through observation (200 reference data), and interpretation
and observation approaches (388 reference data). Five land cover
classes namely primary forest, logged over forest, water bodies, bare
land and agricultural crop/mixed horticultural can be identified by
the differences in spectral wavelength. Result showed that an overall
accuracy from 200 reference data was 83.5 % (kappa value
0.7502459; kappa variance 0.002871), which was considered
acceptable or good for optical data. However, when 200 reference
data was increased to 388 in the confusion matrix, the accuracy
slightly improved from 83.5% to 89.17%, with Kappa statistic
increased from 0.7502459 to 0.8026135, respectively. The accuracy
in this classification suggested that this strategy for the selection of
training area, interpretation approaches and number of reference data
used were importance to perform better classification result.
Abstract: The design of a pattern classifier includes an attempt
to select, among a set of possible features, a minimum subset of
weakly correlated features that better discriminate the pattern classes.
This is usually a difficult task in practice, normally requiring the
application of heuristic knowledge about the specific problem
domain. The selection and quality of the features representing each
pattern have a considerable bearing on the success of subsequent
pattern classification. Feature extraction is the process of deriving
new features from the original features in order to reduce the cost of
feature measurement, increase classifier efficiency, and allow higher
classification accuracy. Many current feature extraction techniques
involve linear transformations of the original pattern vectors to new
vectors of lower dimensionality. While this is useful for data
visualization and increasing classification efficiency, it does not
necessarily reduce the number of features that must be measured
since each new feature may be a linear combination of all of the
features in the original pattern vector. In this paper a new approach is
presented to feature extraction in which feature selection, feature
extraction, and classifier training are performed simultaneously using
a genetic algorithm. In this approach each feature value is first
normalized by a linear equation, then scaled by the associated weight
prior to training, testing, and classification. A knn classifier is used to
evaluate each set of feature weights. The genetic algorithm optimizes
a vector of feature weights, which are used to scale the individual
features in the original pattern vectors in either a linear or a nonlinear
fashion. By this approach, the number of features used in classifying
can be finely reduced.
Abstract: Text categorization is the problem of classifying text
documents into a set of predefined classes. In this paper, we
investigated three approaches to build a meta-classifier in order to
increase the classification accuracy. The basic idea is to learn a metaclassifier
to optimally select the best component classifier for each
data point. The experimental results show that combining classifiers
can significantly improve the accuracy of classification and that our
meta-classification strategy gives better results than each individual
classifier. For 7083 Reuters text documents we obtained a
classification accuracies up to 92.04%.
Abstract: In the recent works related with mixture discriminant
analysis (MDA), expectation and maximization (EM) algorithm is
used to estimate parameters of Gaussian mixtures. But, initial values
of EM algorithm affect the final parameters- estimates. Also, when
EM algorithm is applied two times, for the same data set, it can be
give different results for the estimate of parameters and this affect the
classification accuracy of MDA. Forthcoming this problem, we use
Self Organizing Mixture Network (SOMN) algorithm to estimate
parameters of Gaussians mixtures in MDA that SOMN is more robust
when random the initial values of the parameters are used [5]. We
show effectiveness of this method on popular simulated waveform
datasets and real glass data set.
Abstract: The knowledge base of welding defect recognition is
essentially incomplete. This characteristic determines that the recognition results do not reflect the actual situation. It also has a further influence on the classification of welding quality. This paper is
concerned with the study of a rough set based method to reduce the influence and improve the classification accuracy. At first, a rough set
model of welding quality intelligent classification has been built. Both condition and decision attributes have been specified. Later on, groups
of the representative multiple compound defects have been chosen
from the defect library and then classified correctly to form the
decision table. Finally, the redundant information of the decision table has been reducted and the optimal decision rules have been reached. By this method, we are able to reclassify the misclassified defects to
the right quality level. Compared with the ordinary ones, this method
has higher accuracy and better robustness.
Abstract: An unsupervised classification algorithm is derived
by modeling observed data as a mixture of several mutually
exclusive classes that are each described by linear combinations of
independent non-Gaussian densities. The algorithm estimates the
data density in each class by using parametric nonlinear functions
that fit to the non-Gaussian structure of the data. This improves
classification accuracy compared with standard Gaussian mixture
models. When applied to textures, the algorithm can learn basis
functions for images that capture the statistically significant structure
intrinsic in the images. We apply this technique to the problem of
unsupervised texture classification and segmentation.
Abstract: Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In Neural Network that address classification problems, training set, testing set, learning rate are considered as key tasks. That is collection of input/output patterns that are used to train the network and used to assess the network performance, set the rate of adjustments. This paper describes a proposed back propagation neural net classifier that performs cross validation for original Neural Network. In order to reduce the optimization of classification accuracy, training time. The feasibility the benefits of the proposed approach are demonstrated by means of five data sets like contact-lenses, cpu, weather symbolic, Weather, labor-nega-data. It is shown that , compared to exiting neural network, the training time is reduced by more than 10 times faster when the dataset is larger than CPU or the network has many hidden units while accuracy ('percent correct') was the same for all datasets but contact-lences, which is the only one with missing attributes. For contact-lences the accuracy with Proposed Neural Network was in average around 0.3 % less than with the original Neural Network. This algorithm is independent of specify data sets so that many ideas and solutions can be transferred to other classifier paradigms.
Abstract: Content-based music retrieval generally involves analyzing, searching and retrieving music based on low or high level features of a song which normally used to represent artists, songs or music genre. Identifying them would normally involve feature extraction and classification tasks. Theoretically the greater features analyzed, the better the classification accuracy can be achieved but with longer execution time. Technique to select significant features is important as it will reduce dimensions of feature used in classification and contributes to the accuracy. Artificial Immune System (AIS) approach will be investigated and applied in the classification task. Bio-inspired audio content-based retrieval framework (B-ACRF) is proposed at the end of this paper where it embraces issues that need further consideration in music retrieval performances.
Abstract: The identification and classification of the spine deformity play an important role when considering surgical planning for adolescent patients with idiopathic scoliosis. The subject of this article is the Lenke classification of scoliotic spines using Cobb angle measurements. The purpose is two-fold: (1) design a rulebased diagram to assist clinicians in the classification process and (2) investigate a computer classifier which improves the classification time and accuracy. The rule-based diagram efficiency was evaluated in a series of scoliotic classifications by 10 clinicians. The computer classifier was tested on a radiographic measurement database of 603 patients. Classification accuracy was 93% using the rule-based diagram and 99% for the computer classifier. Both the computer classifier and the rule based diagram can efficiently assist clinicians in their Lenke classification of spine scoliosis.
Abstract: Research into the problem of classification of sonar signals has been taken up as a challenging task for the neural networks. This paper investigates the design of an optimal classifier using a Multi layer Perceptron Neural Network (MLP NN) and Support Vector Machines (SVM). Results obtained using sonar data sets suggest that SVM classifier perform well in comparison with well-known MLP NN classifier. An average classification accuracy of 91.974% is achieved with SVM classifier and 90.3609% with MLP NN classifier, on the test instances. The area under the Receiver Operating Characteristics (ROC) curve for the proposed SVM classifier on test data set is found as 0.981183, which is very close to unity and this clearly confirms the excellent quality of the proposed classifier. The SVM classifier employed in this paper is implemented using kernel Adatron algorithm is seen to be robust and relatively insensitive to the parameter initialization in comparison to MLP NN.