Abstract: Clusters of Microcalcifications (MCCs) are the most frequent symptom of Ductal Carcinoma in Situ (DCIS) recognized by mammography. The Least-Squares Support Vector Machine (LS-SVM) is a variant of the standard SVM. In this paper, LS-SVM is proposed as a classifier for labelling MCCs as benign or malignant, based on relevant features extracted from enhanced mammograms. To establish the credibility of the LS-SVM classifier for this task, its relative performance with different kernel functions is compared using confusion matrices and ROC analysis. Experiments are performed on data extracted from mammogram images in the DDSM database. A total of 380 suspicious areas are collected, comprising 235 malignant and 145 benign samples. A set of 50 features is calculated for each suspicious area, and an optimal subset of the 23 most suitable features is then selected by Particle Swarm Optimization (PSO). The results of the proposed study are quite promising.
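As an illustration of the confusion-matrix evaluation mentioned in this abstract, the following is a minimal sketch, not the authors' code. The label convention (1 = malignant, 0 = benign), the function names, and the toy predictions are all assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels; `positive` marks the malignant class (assumed)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy labels and predictions, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(summary_metrics(*confusion_counts(y_true, y_pred)))
```

Comparing kernels then amounts to computing these quantities once per kernel on the same held-out samples.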
Abstract: Sea level rise threatens to increase the impact of future
storms and hurricanes on coastal communities. Accurate sea level
change prediction and supplementation is an important task in planning
construction and human activities in coastal and oceanic areas. In
this study, a support vector machine (SVM) is proposed to predict
daily tidal levels along the Jeddah Coast, Saudi Arabia. The optimal
parameter values of the kernel function are determined using a genetic
algorithm. The SVM results are compared with field data and
with a back-propagation neural network (BPNN). Of the two models, the
SVM is superior to the BPNN and has better generalization performance.
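The genetic-algorithm parameter search mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the GA operators, so truncation selection, arithmetic crossover, Gaussian mutation and elitism are choices made here, and `placeholder_validation_error` is a hypothetical stand-in for the cross-validated SVM error over (log10 C, log10 gamma).

```python
import random


def placeholder_validation_error(params):
    """Hypothetical error surface; a real study would train and validate an SVM here."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_scale=0.1, seed=0):
    """Minimise `fitness` inside the box `bounds` with a small real-coded GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0]]                 # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]  # arithmetic crossover
            child = [max(lo, min(hi, g + rng.gauss(0.0, mutation_scale * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]        # clipped Gaussian mutation
            children.append(child)
        population = children
    return min(population, key=fitness), history


best, history = genetic_search(placeholder_validation_error,
                               [(-2.0, 4.0), (-5.0, 1.0)])
print(best, history[0], placeholder_validation_error(best))
```

Because the best individual is always carried over, the recorded best error never increases from one generation to the next.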
Abstract: A BCI (Brain Computer Interface) is a communication system that translates brain messages into computer commands. With the help of computer programs, such systems can recognize imagined tasks. Feature extraction is an important stage of EEG classification because it affects both the accuracy and the computation time of signal processing. In this study, we process the signal in three steps: active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we apply Student’s 2D sample t-statistics to continuous wavelet transforms for active segment selection in order to reduce the computation time. Next, features are extracted using two well-known fractal dimension estimators, Katz and Higuchi. In the classification stage, we use ANFIS (Adaptive Neuro-Fuzzy Inference System), FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminant Analysis), and SVM (Support Vector Machine) classifiers. Our results show that active segment selection reduces the computation time, and that fractal dimension features with ANFIS on the selected active segments perform best among the investigated methods for EEG classification.
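To make the Katz fractal feature concrete, here is a minimal sketch of one common form of the estimator. It assumes a uniformly sampled 1-D signal and measures curve length from amplitude differences only, which is a simplification; the abstract does not say which variant the authors used, and the Higuchi estimator is not shown.

```python
import math


def katz_fd(signal):
    """Katz fractal dimension of a 1-D sequence (amplitude-only variant, assumed).

    n is the number of steps, L the total curve length, and d the largest
    distance from the first sample to any later sample.
    """
    n = len(signal) - 1
    length = sum(abs(signal[i + 1] - signal[i]) for i in range(n))
    diameter = max(abs(x - signal[0]) for x in signal[1:])
    return math.log10(n) / (math.log10(n) + math.log10(diameter / length))


# A straight ramp is as simple as a curve gets; an irregular trace scores higher.
ramp = [float(i) for i in range(10)]
irregular = [0.0, 3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0]
print(katz_fd(ramp), katz_fd(irregular))
```

The formula degenerates when the denominator is zero (for example, a perfectly alternating two-level signal), so a production version would need to guard that case.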
Abstract: In recent years, intrusions on computer networks have become a major security threat, so it is important to impede them. Preventing an intrusion relies entirely on detecting it, which is the primary concern of any security tool such as an intrusion detection system (IDS). It is therefore imperative to detect network attacks accurately. Numerous intrusion detection techniques are available, but the main issue is their performance. The performance of an IDS can be improved by increasing the detection rate and reducing false positives. Existing intrusion detection techniques are limited by their use of the raw dataset for classification: redundancy can confuse the classifier and lead to incorrect classification. To minimize this problem, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Local Binary Pattern (LBP) can be applied to transform raw features into a principal feature space and to select features based on their sensitivity, which can be determined from the eigenvalues. To refine the selected features further, greedy search, backward elimination, and Particle Swarm Optimization (PSO) can be used to obtain a subset of features with optimal sensitivity and the highest discriminatory power. This optimal feature subset is used to perform classification. For classification, Support Vector Machine (SVM) and Multilayer Perceptron (MLP) are used because of their proven classification ability. The Knowledge Discovery and Data Mining (KDD’99) Cup dataset is used as a benchmark for evaluating security detection mechanisms. The proposed approach can provide an optimal intrusion detection mechanism that outperforms existing approaches and can minimize the number of features while maximizing the detection rate.
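The role of eigenvalues in ranking principal directions can be illustrated with a small sketch. It uses power iteration to find only the dominant eigenpair of the sample covariance matrix, which is a simplification chosen for this example (a full PCA would extract all components); the toy data are invented and are not KDD'99 records.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of a list of feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def dominant_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix via power iteration."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v


# Two perfectly correlated toy features: all variance lies along one direction.
rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = covariance_matrix(rows)
eigenvalue, direction = dominant_eigenpair(cov)
explained = eigenvalue / (cov[0][0] + cov[1][1])  # share of total variance
print(eigenvalue, direction, explained)
```

Power iteration assumes the starting vector is not orthogonal to the dominant eigenvector; the uniform start used here is adequate for this toy case but is not a general guarantee.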
Abstract: This paper presents a content-based image retrieval (CBIR) framework with relevance feedback (RF) based on the combined learning of support vector machines (SVM) and AdaBoost. The framework incorporates only the most relevant images obtained from both learning algorithms. To speed up the system, it removes from the database the irrelevant images returned by the SVM learner. This is the key to achieving effective retrieval performance in terms of time and accuracy. The experimental results show that the framework significantly improves retrieval effectiveness and thus the overall retrieval performance.
Abstract: Red blood cells (RBCs) are among the most
commonly and intensively studied types of blood cells in cell biology.
Anemia is a lack of RBCs, characterized by a hemoglobin level below
the normal level. In this study, an image-processing-based
methodology was developed to localize and extract RBCs
from microscopic images. A machine learning approach is then
adopted to classify the localized anemic RBC images. Several
textural and geometrical features are calculated for each extracted
RBC. The training set of features was analyzed using principal
component analysis (PCA). With the proposed method, RBCs were
isolated in 4.3 seconds from an image containing 18 to 27 cells. The
reasons for using PCA are its low computational complexity and
its suitability for finding the most discriminating features, which can
lead to accurate classification decisions. Our classifiers yielded
accuracy rates of 100%, 99.99%, and 96.50% for the K-nearest neighbor
(K-NN) algorithm, the support vector machine (SVM), and the radial
basis function neural network (RBFNN), respectively. Classification
was evaluated in terms of sensitivity, specificity, and kappa statistics. In
conclusion, the classification results were obtained within a short
time, and the results improved when PCA was used.
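The kappa statistic named in this abstract can be computed as in the sketch below. It assumes Cohen's kappa between true and predicted labels, which is the usual meaning in classifier evaluation but is not spelled out in the abstract; the labels are toy values.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance agreement."""
    y_true, y_pred = list(y_true), list(y_pred)
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1.0 - expected)


# Toy example: 6 of 8 predictions agree with the truth on balanced classes.
print(cohen_kappa([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1]))
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which makes it more informative than raw accuracy on imbalanced data.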
Abstract: In this paper, an extreme learning machine with an automatic segmentation algorithm is applied to heart disorder classification from heart sound signals. From continuous heart sound signals, the starting points of the first (S1) and the second (S2) heart pulses are extracted and corrected by utilizing an inter-pulse histogram. From the corrected pulse positions, a single period of the heart sound signal is extracted and converted to a feature vector that includes the mel-scaled filter bank energy coefficients and the envelope coefficients of uniform-sized sub-segments. An extreme learning machine is used to classify the feature vector. In our cardiac disorder classification and detection experiments with 9 cardiac disorder categories, the proposed method shows significantly better performance than the multi-layer perceptron, support vector machine, and hidden Markov model; it achieves a classification accuracy of 81.6% and a detection accuracy of 96.9%.
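For readers unfamiliar with mel-scaled filter banks, the sketch below shows how filter centre frequencies are typically placed. It assumes the widely used conversion mel = 2595 log10(1 + f/700); the abstract does not state which mel formula, how many filters, or which frequency range the authors used, so the 10 filters over 0 to 1000 Hz here are purely illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595 * log10 form, assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Centre frequencies (Hz) of n_filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Illustrative settings only: 10 filters between 0 Hz and 1000 Hz.
for center in mel_center_frequencies(10, 0.0, 1000.0):
    print(round(center, 1))
```

Even spacing in mels gives finer resolution at low frequencies; the filter bank energy coefficients are then the signal energy captured by a filter around each centre.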
Abstract: Text categorization is the problem of classifying text
documents into a set of predefined classes. After a preprocessing
step, the documents are typically represented as large sparse vectors.
When training classifiers on large collections of documents, both the
time and memory restrictions can be quite prohibitive. This justifies
the application of feature selection methods to reduce the
dimensionality of the document-representation vector. In this paper,
we present three feature selection methods: Information Gain,
Support Vector Machine feature selection (called SVM_FS) and
Genetic Algorithm with SVM (called GA_SVM). We show that the
best results were obtained with the GA_SVM method for a relatively
small dimension of the feature vector.
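Of the three methods listed, Information Gain is the simplest to show in code. The sketch below assumes discrete feature values (for example, term present or absent) and uses invented toy data; SVM_FS and GA_SVM are not illustrated.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in class entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy corpus of four documents in two classes (hypothetical term indicators).
labels = ["sport", "sport", "finance", "finance"]
term_goal = [1, 1, 0, 0]   # appears only in sport documents
term_the = [1, 0, 1, 0]    # appears equally in both classes
print(information_gain(term_goal, labels), information_gain(term_the, labels))
```

Ranking terms by this score and keeping the top few is what reduces the dimensionality of the document vectors.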
Abstract: Identifying cancer genes that might predict
the clinical behavior of different types of cancer is
challenging because of the huge number of genes and the small number
of patient samples. A new method is proposed based on
supervised classification learning with support vector machines
(SVMs). The solution introduces the
Maximized Margin (MM) into the subset criterion, which makes it
possible to approach the lowest generalization error rate. In class
prediction problems, gene selection is essential to improve accuracy
and to identify genes relevant to cancer. The performance of the new
method was evaluated in experiments on real-world data, where it
gave better classification accuracy.
Abstract: In this paper, we propose a method of resolving dependency ambiguities of Korean subordinate clauses based on Support Vector Machines (SVMs). Dependency analysis of clauses is well known to be one of the most difficult tasks in parsing sentences, especially in Korean. In order to solve this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation among the verb phrase, verb and endings in the clauses. As a result, the problem is represented as a binary classification task. In order to apply SVMs to this problem, we selected two kinds of features: static and dynamic features. The experimental results on the STEP2000 corpus show that our system achieves an accuracy of 73.5%.
Abstract: Early and accurate detection of Alzheimer's disease (AD) is an important stage in the treatment of individuals suffering from AD. We present an approach based on the use of structural magnetic resonance imaging (sMRI) phase images to distinguish between normal controls (NC), mild cognitive impairment (MCI) and AD patients with a clinical dementia rating (CDR) of 1. The independent component analysis (ICA) technique is used to extract useful features, which form the inputs to support vector machine (SVM), K nearest neighbour (kNN) and multilayer artificial neural network (ANN) classifiers that discriminate between the three classes. The results obtained are encouraging in terms of classification accuracy and confirm the usefulness of phase images for classifying the different stages of Alzheimer's disease.
Abstract: Script identification is one of the challenging steps in the development of an optical character recognition system for bilingual or multilingual documents. In this paper, an attempt is made to identify English numerals at the word level in Punjabi documents using Gabor features. A support vector machine (SVM) classifier with five-fold cross-validation is used to classify the word images. The results obtained are quite encouraging: the average accuracy with the RBF, polynomial and linear kernel functions is greater than 99%.
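The five-fold cross-validation protocol can be sketched as below. This is a generic, unshuffled split written for illustration; the abstract does not say whether the folds were shuffled or stratified, and the sample count of 12 is arbitrary.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Each sample is used for testing exactly once across the five folds.
for train, test in k_fold_indices(12, k=5):
    print(len(train), test)
```

The reported average accuracy would be the mean of the per-fold test accuracies, with a classifier retrained on each fold's training indices.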
Abstract: This paper presents a wavelet transform and Support
Vector Machine (SVM) based algorithm for estimating fault location
on transmission lines. The discrete wavelet transform (DWT) is used
for data pre-processing, and the resulting data are used for training and
testing the SVM. Five types of mother wavelet are used for signal processing to
identify a suitable wavelet family that is more appropriate for use in
estimating fault location. The results demonstrated the ability of SVM
to generalize the situation from the provided patterns and to
accurately estimate the location of faults with varying fault resistance.
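A single level of the discrete wavelet transform can be written in a few lines for the Haar wavelet. Haar is used here only because it is the simplest mother wavelet; the abstract does not name the five wavelets that were compared, and the toy samples below are not fault recordings.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def energy(coefficients):
    """Sum of squared coefficients, a simple candidate input feature (assumed)."""
    return sum(c * c for c in coefficients)


approx, detail = haar_dwt([4.0, 6.0, 10.0, 12.0])
print(approx, detail, energy(approx) + energy(detail))
```

Because the transform is orthonormal, the energies of the approximation and detail coefficients add up to the energy of the original samples.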
Abstract: The purpose of this work is the development of an
automatic classification system which could be useful for radiologists
in the investigation of breast cancer. The software has been designed
in the framework of the MAGIC-5 collaboration.
In the automatic classification system, suspicious regions with a
high probability of containing a lesion are extracted from the image as
regions of interest (ROIs). Each ROI is characterized by some
features based on morphological lesion differences.
Several classifiers, namely a Feed Forward Neural Network, a K-Nearest
Neighbours classifier and a Support Vector Machine, are used to
distinguish the pathological records from the healthy ones.
The results obtained in terms of sensitivity (percentage of
pathological ROIs correctly classified) and specificity (percentage of
non-pathological ROIs correctly classified) are presented through
the Receiver Operating Characteristic (ROC) curve. In particular, the
best performance is an area under the ROC curve of 88% ± 1%, obtained
with the Feed Forward Neural Network.
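The area under the ROC curve can be computed directly from classifier scores, as in the sketch below. It uses the rank-based (Mann-Whitney) formulation, which equals the area under the empirical ROC curve; the scores and labels are toy values, with 1 assumed to mark a pathological ROI.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy scores: one pathological ROI is ranked below one healthy ROI.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration but would be replaced by a sort-based computation on large data sets.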
Abstract: To date, researchers have developed various
tools and methodologies for effective clinical decision-making.
Among these decisions, the diagnosis of chest pain diseases is an
important issue, especially in an emergency department. To
improve physicians' diagnostic ability, many researchers have
developed diagnostic intelligence using machine learning and data
mining. However, most conventional methodologies have been
based on a single classifier for disease classification and
prediction, which shows moderate performance. This study utilizes an
ensemble strategy that combines multiple different classifiers to help
physicians diagnose chest pain diseases more accurately.
Specifically, the ensemble strategy integrates
decision trees, neural networks, and support vector machines. The
ensemble models are applied to real-world emergency data. This study
shows that the performance of the ensemble models is superior to that
of each single classifier.
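One simple way to combine the three classifier types is majority voting, sketched below. This is an assumption made for illustration: the abstract does not state how the ensemble combines its members, and the per-classifier predictions are invented (1 and 0 stand for two hypothetical diagnoses).

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote per sample."""
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of a decision tree, a neural network and an SVM on four cases.
tree_pred = [1, 0, 1, 1]
nn_pred = [1, 1, 0, 1]
svm_pred = [0, 0, 1, 1]
print(majority_vote([tree_pred, nn_pred, svm_pred]))
```

With an odd number of classifiers and two classes there are no ties; other settings would need an explicit tie-breaking rule.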
Abstract: Purpose: To explore the use of the Curvelet transform to
extract texture features of pulmonary nodules in CT images, and of a
support vector machine to establish a prediction model for small solitary
pulmonary nodules, in order to improve the rate of detection and
diagnosis of early-stage lung cancer. Methods: 2461 benign or
malignant small solitary pulmonary nodules in CT images from 129
patients were collected. Fourteen Curvelet transform textural features
were used as parameters to establish the support vector machine
prediction model. Results: Compared with other methods, using 252 texture
features as parameters to establish the prediction model is more appropriate.
The classification consistency, sensitivity and specificity of the
model are 81.5%, 93.8% and 38.0%, respectively. Conclusion: Based
on texture features extracted with the Curvelet transform, the support
vector machine prediction model is sensitive to lung cancer and can
improve the rate of diagnosis of early-stage lung cancer to some
extent.
Abstract: Acoustical properties of speech have been shown to
be related to the mental state of speakers with symptoms of depression
and remission. This paper describes a way of
distinguishing depressed patients from remitted subjects based on
measurable acoustic changes in their speech. The vocal-tract-related
frequency characteristics of speech samples from female
remitted and depressed patients were analyzed using speech
processing techniques and then evaluated statistically by
cross-validation with a Support Vector Machine. Our results
show a correct separation rate of 93% for the classifier
when tested with the subject-based
feature model and 88% with the frame-based model, using
the same speech samples collected from interview
sessions between patients and psychiatrists during hospital visits.
Abstract: The paper discusses the results obtained in predicting
reinforcement in singly reinforced beams using Neural Networks (NN),
Support Vector Machines (SVMs) and Tree Based Models. A major
advantage of SVMs over NNs is that they minimize a bound on the
generalization error of the model rather than a bound on the
mean square error over the data set, as is done in NNs. The Tree Based
approach divides the problem into a small number of subproblems to
reach a conclusion. Data were generated for different
beam parameters, with the reinforcement calculated using the limit state
method, for model creation and validation. The results of this
study suggest remarkably good performance of the tree based and
SVM models. Further, this study found that these two techniques
work well and even better than Neural Network methods. A
comparison of predicted values with actual values suggests a very
good correlation coefficient with all four techniques.
Abstract: This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows an expensive generalized kernel to be used without additional cost. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and examine the influence of the working set size on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow extensive parameter search methods to be used to optimize classification accuracy.
Abstract: Predicting protein-protein interactions represents a key step in understanding protein functions, because proteins usually work in the context of other proteins and rarely function alone. Machine learning techniques have been applied to predict protein-protein interactions. However, most of these techniques treat the task as a binary classification problem. Although it is easy to obtain a dataset of interacting proteins as positive examples, there are no experimentally confirmed non-interacting proteins to serve as negative examples. Therefore, in this paper we treat the task as a one-class classification problem using one-class support vector machines (SVM). Using only positive examples (interacting protein pairs) in the training phase, the one-class SVM achieves an accuracy of about 80%. These results imply that protein-protein interactions can be predicted using a one-class classifier with accuracy comparable to that of binary classifiers that use artificially constructed negative examples.