Abstract: As in today's semiconductor industries test costs can make up to 50 percent of the total production costs, an efficient test error detection becomes more and more important. In this paper, we present a new machine learning approach to test error detection that should provide a faster recognition of test system faults as well as an improved test error recall. The key idea is to learn a classifier ensemble, detecting typical test error patterns in wafer test results immediately after finishing these tests. Since test error detection has not yet been discussed in the machine learning community, we define central problem-relevant terms and provide an analysis of important domain properties. Finally, we present comparative studies reflecting the failure detection performance of three individual classifiers and three ensemble methods based upon them. As base classifiers we chose a decision tree learner, a support vector machine and a Bayesian network, while the compared ensemble methods were simple and weighted majority vote as well as stacking. For the evaluation, we used cross validation and a specially designed practical simulation. By implementing our approach in a semiconductor test department for the observation of two products, we proofed its practical applicability.
Abstract: Ensemble learning algorithms such as AdaBoost and
Bagging have been in active research and shown improvements in
classification results for several benchmarking data sets with mainly
decision trees as their base classifiers. In this paper we experiment to
apply these Meta learning techniques with classifiers such as random
forests, neural networks and support vector machines. The data sets
are from MAGIC, a Cherenkov telescope experiment. The task is to
classify gamma signals from overwhelmingly hadron and muon
signals representing a rare class classification problem. We compare
the individual classifiers with their ensemble counterparts and
discuss the results. WEKA a wonderful tool for machine learning has
been used for making the experiments.
Abstract: Support Vector Machine (SVM) is a statistical
learning tool that was initially developed by Vapnik in 1979 and later
developed to a more complex concept of structural risk minimization
(SRM). SVM is playing an increasing role in applications to
detection problems in various engineering problems, notably in
statistical signal processing, pattern recognition, image analysis, and
communication systems. In this paper, SVM was applied to the
detection of SAR (synthetic aperture radar) images in the presence of
partially developed speckle noise. The simulation was done for single
look and multi-look speckle models to give a complete overlook and
insight to the new proposed model of the SVM-based detector. The
structure of the SVM was derived and applied to real SAR images
and its performance in terms of the mean square error (MSE) metric
was calculated. We showed that the SVM-detected SAR images have
a very low MSE and are of good quality. The quality of the
processed speckled images improved for the multi-look model.
Furthermore, the contrast of the SVM detected images was higher
than that of the original non-noisy images, indicating that the SVM
approach increased the distance between the pixel reflectivity levels
(the detection hypotheses) in the original images.
Abstract: In this paper we propose a robust environmental sound classification approach, based on spectrograms features driven from log-Gabor filters. This approach includes two methods. In the first methods, the spectrograms are passed through an appropriate log-Gabor filter banks and the outputs are averaged and underwent an optimal feature selection procedure based on a mutual information criteria. The second method uses the same steps but applied only to three patches extracted from each spectrogram.
To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on Multiclass Support Vector Machines show that the second method is the most efficient with an average classification accuracy of 89.62 %.
Abstract: The nature of consumer products causes the difficulty
in forecasting the future demands and the accuracy of the forecasts
significantly affects the overall performance of the supply chain
system. In this study, two data mining methods, artificial neural
network (ANN) and support vector machine (SVM), were utilized to
predict the demand of consumer products. The training data used was
the actual demand of six different products from a consumer product
company in Thailand. The results indicated that SVM had a better
forecast quality (in term of MAPE) than ANN in every category of
products. Moreover, another important finding was the margin
difference of MAPE from these two methods was significantly high
when the data was highly correlated.
Abstract: The standard investigational method for obstructive
sleep apnea syndrome (OSAS) diagnosis is polysomnography (PSG),
which consists of a simultaneous, usually overnight recording of
multiple electro-physiological signals related to sleep and
wakefulness. This is an expensive, encumbering and not a readily
repeated protocol, and therefore there is need for simpler and easily
implemented screening and detection techniques. Identification of
apnea/hypopnea events in the screening recordings is the key factor
for the diagnosis of OSAS. The analysis of a solely single-lead
electrocardiographic (ECG) signal for OSAS diagnosis, which may
be done with portable devices, at patient-s home, is the challenge of
the last years. A novel artificial neural network (ANN) based
approach for feature extraction and automatic identification of
respiratory events in ECG signals is presented in this paper. A
nonlinear principal component analysis (NLPCA) method was
considered for feature extraction and support vector machine for
classification/recognition. An alternative representation of the
respiratory events by means of Kohonen type neural network is
discussed. Our prospective study was based on OSAS patients of the
Clinical Hospital of Pneumology from Iaşi, Romania, males and
females, as well as on non-OSAS investigated human subjects. Our
computed analysis includes a learning phase based on cross signal
PSG annotation.
Abstract: Face recognition is a technique to automatically
identify or verify individuals. It receives great attention in
identification, authentication, security and many more applications.
Diverse methods had been proposed for this purpose and also a lot of
comparative studies were performed. However, researchers could not
reach unified conclusion. In this paper, we are reporting an extensive
quantitative accuracy analysis of four most widely used face
recognition algorithms: Principal Component Analysis (PCA),
Independent Component Analysis (ICA), Linear Discriminant
Analysis (LDA) and Support Vector Machine (SVM) using AT&T,
Sheffield and Bangladeshi people face databases under diverse
situations such as illumination, alignment and pose variations.
Abstract: This paper aims to develop a NOx emission model of
an acid gas incinerator using Nelder-Mead least squares support
vector regression (LS-SVR). Malaysia DOE is actively imposing the
Clean Air Regulation to mandate the installation of analytical
instrumentation known as Continuous Emission Monitoring System
(CEMS) to report emission level online to DOE . As a hardware
based analyzer, CEMS is expensive, maintenance intensive and often
unreliable. Therefore, software predictive technique is often
preferred and considered as a feasible alternative to replace the
CEMS for regulatory compliance. The LS-SVR model is built based
on the emissions from an acid gas incinerator that operates in a LNG
Complex. Simulated Annealing (SA) is first used to determine the
initial hyperparameters which are then further optimized based on the
performance of the model using Nelder-Mead simplex algorithm.
The LS-SVR model is shown to outperform a benchmark model
based on backpropagation neural networks (BPNN) in both training
and testing data.
Abstract: The paper investigates the potential of support vector
machines and Gaussian process based regression approaches to
model the oxygen–transfer capacity from experimental data of
multiple plunging jets oxygenation systems. The results suggest the
utility of both the modeling techniques in the prediction of the
overall volumetric oxygen transfer coefficient (KLa) from operational
parameters of multiple plunging jets oxygenation system. The
correlation coefficient root mean square error and coefficient of
determination values of 0.971, 0.002 and 0.945 respectively were
achieved by support vector machine in comparison to values of
0.960, 0.002 and 0.920 respectively achieved by Gaussian process
regression. Further, the performances of both these regression
approaches in predicting the overall volumetric oxygen transfer
coefficient was compared with the empirical relationship for multiple
plunging jets. A comparison of results suggests that support vector
machines approach works well in comparison to both empirical
relationship and Gaussian process approaches, and could successfully
be employed in modeling oxygen-transfer.
Abstract: In comparison to the original SVM, which involves a
quadratic programming task; LS–SVM simplifies the required
computation, but unfortunately the sparseness of standard SVM is
lost. Another problem is that LS-SVM is only optimal if the training
samples are corrupted by Gaussian noise. In Least Squares SVM
(LS–SVM), the nonlinear solution is obtained, by first mapping the
input vector to a high dimensional kernel space in a nonlinear
fashion, where the solution is calculated from a linear equation set. In
this paper a geometric view of the kernel space is introduced, which
enables us to develop a new formulation to achieve a sparse and
robust estimate.
Abstract: The one-class support vector machine “support vector
data description” (SVDD) is an ideal approach for anomaly or outlier
detection. However, for the applicability of SVDD in real-world
applications, the ease of use is crucial. The results of SVDD are
massively determined by the choice of the regularisation parameter C
and the kernel parameter of the widely used RBF kernel. While for
two-class SVMs the parameters can be tuned using cross-validation
based on the confusion matrix, for a one-class SVM this is not
possible, because only true positives and false negatives can occur
during training. This paper proposes an approach to find the optimal
set of parameters for SVDD solely based on a training set from
one class and without any user parameterisation. Results on artificial
and real data sets are presented, underpinning the usefulness of the
approach.
Abstract: In the automotive industry test drives are being conducted
during the development of new vehicle models or as a part of
quality assurance of series-production vehicles. The communication
on the in-vehicle network, data from external sensors, or internal
data from the electronic control units is recorded by automotive
data loggers during the test drives. The recordings are used for fault
analysis. Since the resulting data volume is tremendous, manually
analysing each recording in great detail is not feasible.
This paper proposes to use machine learning to support domainexperts
by preventing them from contemplating irrelevant data and
rather pointing them to the relevant parts in the recordings. The
underlying idea is to learn the normal behaviour from available
recordings, i.e. a training set, and then to autonomously detect
unexpected deviations and report them as anomalies.
The one-class support vector machine “support vector data description”
is utilised to calculate distances of feature vectors. SVDDSUBSEQ
is proposed as a novel approach, allowing to classify subsequences
in multivariate time series data. The approach allows to
detect unexpected faults without modelling effort as is shown with
experimental results on recordings from test drives.
Abstract: According to the statistics, the prevalence of congenital hearing loss in Taiwan is approximately six thousandths; furthermore, one thousandths of infants have severe hearing impairment. Hearing ability during infancy has significant impact in the development of children-s oral expressions, language maturity, cognitive performance, education ability and social behaviors in the future. Although most children born with hearing impairment have sensorineural hearing loss, almost every child more or less still retains some residual hearing. If provided with a hearing aid or cochlear implant (a bionic ear) timely in addition to hearing speech training, even severely hearing-impaired children can still learn to talk. On the other hand, those who failed to be diagnosed and thus unable to begin hearing and speech rehabilitations on a timely manner might lose an important opportunity to live a complete and healthy life. Eventually, the lack of hearing and speaking ability will affect the development of both mental and physical functions, intelligence, and social adaptability. Not only will this problem result in an irreparable regret to the hearing-impaired child for the life time, but also create a heavy burden for the family and society. Therefore, it is necessary to establish a set of computer-assisted predictive model that can accurately detect and help diagnose newborn hearing loss so that early interventions can be provided timely to eliminate waste of medical resources. This study uses information from the neonatal database of the case hospital as the subjects, adopting two different analysis methods of using support vector machine (SVM) for model predictions and using logistic regression to conduct factor screening prior to model predictions in SVM to examine the results. The results indicate that prediction accuracy is as high as 96.43% when the factors are screened and selected through logistic regression. Hence, the model constructed in this study will have real help in clinical diagnosis for the physicians and actually beneficial to the early interventions of newborn hearing impairment.
Abstract: One important objective in Precision Agriculture is to minimize the volume of herbicides that are applied to the fields through the use of site-specific weed management systems. In order to reach this goal, two major factors need to be considered: 1) the similar spectral signature, shape and texture between weeds and crops; 2) the irregular distribution of the weeds within the crop's field. This paper outlines an automatic computer vision system for the detection and differential spraying of Avena sterilis, a noxious weed growing in cereal crops. The proposed system involves two processes: image segmentation and decision making. Image segmentation combines basic suitable image processing techniques in order to extract cells from the image as the low level units. Each cell is described by two area-based attributes measuring the relations among the crops and the weeds. From these attributes, a hybrid decision making approach determines if a cell must be or not sprayed. The hybrid approach uses the Support Vector Machines and the Fuzzy k-Means methods, combined through the fuzzy aggregation theory. This makes the main finding of this paper. The method performance is compared against other available strategies.
Abstract: Identity verification of authentic persons by their multiview faces is a real valued problem in machine vision. Multiview faces are having difficulties due to non-linear representation in the feature space. This paper illustrates the usability of the generalization of LDA in the form of canonical covariate for face recognition to multiview faces. In the proposed work, the Gabor filter bank is used to extract facial features that characterized by spatial frequency, spatial locality and orientation. Gabor face representation captures substantial amount of variations of the face instances that often occurs due to illumination, pose and facial expression changes. Convolution of Gabor filter bank to face images of rotated profile views produce Gabor faces with high dimensional features vectors. Canonical covariate is then used to Gabor faces to reduce the high dimensional feature spaces into low dimensional subspaces. Finally, support vector machines are trained with canonical sub-spaces that contain reduced set of features and perform recognition task. The proposed system is evaluated with UMIST face database. The experiment results demonstrate the efficiency and robustness of the proposed system with high recognition rates.
Abstract: In this work a new offline signature recognition system
based on Radon Transform, Fractal Dimension (FD) and Support Vector Machine (SVM) is presented. In the first step, projections of
original signatures along four specified directions have been performed using radon transform. Then, FDs of four obtained
vectors are calculated to construct a feature vector for each
signature. These vectors are then fed into SVM classifier for recognition of signatures. In order to evaluate the effectiveness of
the system several experiments are carried out. Offline signature
database from signature verification competition (SVC) 2004 is used
during all of the tests. Experimental result indicates that the proposed method achieved high accuracy rate in signature recognition.
Abstract: An image texture analysis and target recognition approach of using an improved image texture feature coding method (TFCM) and Support Vector Machine (SVM) for target detection is presented. With our proposed target detection framework, targets of interest can be detected accurately. Cascade-Sliding-Window technique was also developed for automated target localization. Application to mammogram showed that over 88% of normal mammograms and 80% of abnormal mammograms can be correctly identified. The approach was also successfully applied to Synthetic Aperture Radar (SAR) and Ground Penetrating Radar (GPR) images for target detection.
Abstract: In this study, we propose a tongue diagnosis method
which detects the tongue from face image and divides the tongue area into six areas, and finally generates tongue coating ratio of each area.
To detect the tongue area from face image, we use ASM as one of the active shape models. Detected tongue area is divided into six areas
widely used in the Korean traditional medicine and the distribution of tongue coating of the six areas is examined by SVM(Support Vector
Machine). For SVM, we use a 3-dimensional vector calculated by PCA(Principal Component Analysis) from a 12-dimentional vector
consisting of RGB, HIS, Lab, and Luv. As a result, we detected the tongue area stably using ASM and found that PCA and SVM helped
raise the ratio of tongue coating detection.
Abstract: Support vector machines (SVMs) are considered to be
the best machine learning algorithms for minimizing the predictive
probability of misclassification. However, their drawback is that for
large data sets the computation of the optimal decision boundary is a
time consuming function of the size of the training set. Hence several
methods have been proposed to speed up the SVM algorithm. Here
three methods used to speed up the computation of the SVM
classifiers are compared experimentally using a musical genre
classification problem. The simplest method pre-selects a random
sample of the data before the application of the SVM algorithm. Two
additional methods use proximity graphs to pre-select data that are
near the decision boundary. One uses k-Nearest Neighbor graphs and
the other Relative Neighborhood Graphs to accomplish the task.
Abstract: Steel surface defect detection is essentially one of
pattern recognition problems. Support Vector Machines (SVMs) are
known as one of the most proper classifiers in this application. In this
paper, we introduce a more accurate classification method by using
SVMs as our final classifier of the inspection system. In this scheme,
multiclass classification task is performed based on the "one-againstone"
method and different kernels are utilized for each pair of the
classes in multiclass classification of the different defects.
In the proposed system, a decision tree is employed in the first
stage for two-class classification of the steel surfaces to "defect" and
"non-defect", in order to decrease the time complexity. Based on
the experimental results, generated from over one thousand images,
the proposed multiclass classification scheme is more accurate than
the conventional methods and the overall system yields a sufficient
performance which can meet the requirements in steel manufacturing.