Abstract: Although the field of parametric Pattern Recognition (PR) has been thoroughly studied for over five decades, the use of the Order Statistics (OS) of the distributions to achieve this has not been reported. The pioneering work on using OS for classification was presented in [1] for the Uniform distribution, where it was shown that optimal PR can be achieved in a counter-intuitive manner, diametrically opposed to the Bayesian paradigm, i.e., by comparing the testing sample to a few samples distant from the mean. This must be contrasted with the Bayesian paradigm in which, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the optimal strategy would be to achieve this based on the (Mahalanobis) distance from the corresponding central points, for example, the means. In [2], we showed that the results could be extended for a few symmetric distributions within the exponential family. In this paper, we attempt to extend these results significantly by considering asymmetric distributions within the exponential family, for some of which even the closed form expressions of the cumulative distribution functions are not available. These distributions include the Rayleigh, Gamma and certain Beta distributions. As in [1] and [2], the new scheme, referred to as Classification by Moments of Order Statistics (CMOS), attains an accuracy very close to the optimal Bayes’ bound, as has been shown both theoretically and by rigorous experimental testing.
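The idea of classifying against points away from the mean can be illustrated for the Uniform case. The sketch below is not the CMOS scheme as published. It uses only the standard result that the k-th smallest of n samples from Uniform(a, b) has expected value a + (b - a) k / (n + 1). The function names, the pairing of the (n-k+1)-th moment of the left class with the k-th moment of the right class, and the restriction to two classes with moderate overlap are all assumptions made for illustration.

```python
def uniform_os_mean(a, b, k, n):
    """Expected value of the k-th smallest of n samples from Uniform(a, b)."""
    return a + (b - a) * k / (n + 1)


def classify_by_os(x, class1, class2, k=1, n=2):
    """Assign x to the class whose reference point is nearer.

    class1 and class2 are (a, b) intervals, with class1 lying to the left.
    The reference points are the (n-k+1)-th order-statistic mean of class1
    and the k-th order-statistic mean of class2 (an assumed pairing).
    """
    p1 = uniform_os_mean(class1[0], class1[1], n - k + 1, n)
    p2 = uniform_os_mean(class2[0], class2[1], k, n)
    return 1 if abs(x - p1) <= abs(x - p2) else 2
```

For Uniform(0, 1) and Uniform(0.8, 1.8) the two reference points are 2/3 and about 1.133. Their midpoint, 0.9, is also the midpoint of the overlap region [0.8, 1.0], so in this example the nearest-point rule splits the overlap evenly between the two classes.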
Abstract: Fingerprint-based identification is one of the best-known
biometric systems in pattern recognition and has long been studied
because of its important role in forensic science, where it can
support the government criminal justice community. In this paper, we
propose a framework for identifying individuals by means of
fingerprints. Unlike most conventional fingerprint identification
frameworks, the extracted Geometrical Element Features (GEFs) go
through a Discretization process. The purpose of Discretization in
this study is to obtain unique individual features that reflect
individual variance, so that one person can be discriminated from
another. Previously, Discretization has been shown to be particularly
effective for identification on English handwriting, with an accuracy
of 99.9%, and for discriminating the handwriting of twins, with an
accuracy of 98%. Because of its high discriminative power, the method
is adopted in this framework as an independent method to assess the
accuracy of fingerprint identification. The experimental results show
that the identification accuracy of the proposed system using
Discretization is 100% for FVC2000, 93% for FVC2002 and 89.7% for
FVC2004, which is much better than that of the conventional
fingerprint identification system (72% for FVC2000, 26% for FVC2002
and 32.8% for FVC2004). The results indicate that the Discretization
approach boosts classification effectively and may therefore prove
suitable for other biometric features besides handwriting and
fingerprints.
Abstract: This paper proposes a novel hybrid algorithm for feature selection based on a binary ant colony and SVM. The final subset is obtained by eliminating features that produce noise or are strictly correlated with features already selected. The algorithm can improve classification accuracy with a small and appropriate feature subset. It is easy to implement and, because it uses a simple filter, its computational complexity is very low. The performance of the proposed algorithm is evaluated on a real rotary cement kiln dataset. The results show that our algorithm outperforms existing algorithms.
Abstract: In Geographic Information Systems, one source of the
required geographic data is the digitizing of analog maps and the
evaluation of aerial and satellite photos. In this study, a method is
discussed that can be used to extract vector features and create
vectorized drawing files from aerial photos. Software was also
developed for this purpose. Converting from raster to vector is known
as vectorization, and it is the most important step when creating
vectorized drawing files. In the developed algorithm, the aerial
photo is first preprocessed: it is converted to grayscale if
necessary, noise is reduced, filters are applied and the edges of the
objects are determined. After these steps, every pixel of the photo
is visited from the upper left to the lower right, its neighborhood
relationships are examined, and one-pixel-wide lines or polylines are
obtained. Lines that have already been obtained must be erased to
prevent confusion as vectorization continues, because otherwise they
can be perceived as new lines. Erasing them, however, can cause
discontinuity in the vector drawing, so the image is converted from
2 bit to 8 bit and the detected pixels are expressed with a different
value. In conclusion, the aerial photo can be converted to a vector
form that consists of lines and polylines and can be opened in any
CAD application.
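The tracing step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' software. It assumes that preprocessing has already produced a binary edge map stored as a list of lists (1 for an edge pixel, 0 otherwise). The function name `trace_polylines` and the marker value `VISITED = 2` are hypothetical. Traced pixels are relabelled rather than erased, which mirrors the idea of expressing detected pixels with a different value.

```python
# 8-connected neighbour offsets (row, column).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

EDGE = 1      # untraced edge pixel
VISITED = 2   # edge pixel already assigned to a polyline (hypothetical marker)


def trace_polylines(image):
    """Scan from upper left to lower right and follow one-pixel-wide lines.

    `image` is a list of lists of ints and is modified in place:
    traced pixels are relabelled VISITED instead of being erased.
    Returns a list of polylines, each a list of (row, col) tuples.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    polylines = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != EDGE:
                continue
            line = [(r, c)]
            image[r][c] = VISITED
            cr, cc = r, c
            while True:
                # Step to the first untraced neighbour, if any.
                for dr, dc in NEIGHBOURS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] == EDGE):
                        image[nr][nc] = VISITED
                        line.append((nr, nc))
                        cr, cc = nr, nc
                        break
                else:
                    break  # no untraced neighbour: the polyline ends here
            polylines.append(line)
    return polylines
```

Because traced pixels keep a nonzero value, a later pass can still see where lines meet, while the scan itself never starts a second polyline on a pixel it has already used.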
Abstract: Today the automobile and aerospace industries use Laser Beam Welding as a clean, non-contact source of heating and fusion for joining sheets. Welding performance depends mainly on the laser welding parameters. This review reports some concepts related to Artificial Neural Networks and how they can be applied to model weld bead geometry and mechanical properties in terms of equipment parameters, in order to evaluate their accuracy and compare it with that of traditional modeling schemes. The review covers the output features of Titanium and Aluminium welds, namely weld bead geometry and mechanical properties such as ultimate tensile strength, yield strength, elongation and reduction of area, as modeled using Artificial Neural Networks.
Abstract: Occurrences of spurious crests on the troughs of large,
relatively steep second-order Stokes waves are anomalous and not an
inherent characteristic of real waves. Here, the effects of such
occurrences on the statistics described by the standard second-order
stochastic model are examined theoretically and by way of
simulations. Theoretical results and simulations indicate that when
spurious occurrences are sufficiently large, the standard model leads
to physically unrealistic surface features and inaccuracies in the
statistics of various surface features, in particular, the troughs and
thus zero-crossing heights of large waves. Whereas inaccuracies can
be fairly noticeable for long-crested waves in both deep and
shallower depths, they tend to become relatively insignificant in
directional waves.
Abstract: A registration framework for image-guided robotic
surgery is proposed for three emergency neurosurgical procedures,
namely Intracranial Pressure (ICP) Monitoring, External Ventricular
Drainage (EVD) and evacuation of a Chronic Subdural Haematoma
(CSDH). The registration paradigm uses CT and white light as
modalities. This paper presents two simulation studies for a
preliminary evaluation of the registration protocol: (1) The loci of the
Target Registration Error (TRE) in the patient's axial, coronal and
sagittal views were simulated based on a Fiducial Localisation Error
(FLE) of 5 mm and (2) Simulation of the actual framework using
projected views from a surface rendered CT model to represent white
light images of the patient. Craniofacial features were employed as
the registration basis to map the CT space onto the simulated
intraoperative space. Photogrammetry experiments on an artificial
skull were also performed to benchmark the results obtained from the
second simulation. The results of both simulations show that the
proposed protocol can provide a 5 mm accuracy for these
neurosurgical procedures.
Abstract: Financial forecasting using machine learning techniques has received considerable attention in the last decade. In this ongoing work, we show how machine learning of graphical models can infer visualized causal interactions between different banks in the Saudi equities market. One important discovery from such learned causal graphs is how companies influence each other and to what extent. In this work, a set of graphical models, namely Gaussian graphical models with newly developed ensemble penalized feature selection methods that combine a filtering method, a wrapper method and a regularizer, will be shown. A comparison between the different ensemble combinations will also be shown. The best ensemble method will be used to infer the causal relationships between banks in the Saudi equities market.
Abstract: In rotating machinery one of the critical components
that is prone to premature failure is the rolling bearing.
Consequently, early warning of an imminent bearing failure is
critical to the safety and reliability of any high-speed rotating
machine. This study is concerned with the application of Recurrence
Quantification Analysis (RQA) in fault detection of rolling element
bearings in rotating machinery. Based on the results from this study it
is reported that the RQA variable, percent determinism, is sensitive
to the type of fault investigated and therefore can provide useful
information on bearing damage in rolling element bearings.
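For readers unfamiliar with the RQA variable named above, percent determinism is commonly defined as the share of recurrent points that fall on diagonal lines of a recurrence plot. The sketch below is a simplified illustration and not the procedure used in the study. It assumes a scalar series with no time-delay embedding, an absolute-difference threshold `radius`, and a minimum diagonal length of two points. All names are hypothetical.

```python
def percent_determinism(series, radius, min_line=2):
    """Return %DET for a scalar series (no time-delay embedding; an assumption).

    Two samples i != j are recurrent when |x_i - x_j| <= radius.
    %DET is the percentage of recurrent points that lie on diagonal
    lines of at least `min_line` points; the main diagonal is excluded.
    """
    n = len(series)
    recurrent = [[abs(series[i] - series[j]) <= radius for j in range(n)]
                 for i in range(n)]
    total = 0      # all recurrent points above the main diagonal
    on_lines = 0   # recurrent points on sufficiently long diagonals
    for offset in range(1, n):           # each diagonal of the upper triangle
        run = 0
        for i in range(n - offset):
            if recurrent[i][i + offset]:
                total += 1
                run += 1
            else:
                if run >= min_line:
                    on_lines += run
                run = 0
        if run >= min_line:              # close a run that reaches the edge
            on_lines += run
    # The matrix is symmetric, so the lower triangle would double both
    # counts and leave the ratio unchanged.
    return 100.0 * on_lines / total if total else 0.0
```

A strictly periodic series such as `[0, 1, 0, 1, 0, 1]` with `radius=0` gives 100.0, since every recurrent point lies on a diagonal line, whereas a series with isolated recurrences gives a value near zero.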
Abstract: Purpose: To explore the use of the Curvelet transform to
extract texture features of pulmonary nodules in CT images, and of a
support vector machine to establish a prediction model for small
solitary pulmonary nodules, in order to improve the rate of detection
and diagnosis of early-stage lung cancer. Methods: 2461 benign or
malignant small solitary pulmonary nodules in CT images from 129
patients were collected. Fourteen Curvelet transform textural features
were used as parameters to establish the support vector machine
prediction model. Results: Compared with other methods, using 252
texture features as parameters to establish the prediction model is
more appropriate. The classification consistency, sensitivity and
specificity of the model are 81.5%, 93.8% and 38.0% respectively.
Conclusion: Based on texture features extracted with the Curvelet
transform, the support vector machine prediction model is sensitive to
lung cancer and can improve the rate of diagnosis of early-stage lung
cancer to some extent.
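The three figures reported in this abstract can be related to a confusion matrix as sketched below. The sketch assumes that "classification consistency" means overall accuracy, which the abstract does not state. The function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions.

    tp, fn: malignant nodules classified as malignant / as benign.
    tn, fp: benign nodules classified as benign / as malignant.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # malignant nodules correctly flagged
    specificity = tn / (tn + fp)   # benign nodules correctly cleared
    return accuracy, sensitivity, specificity
```

With made-up counts `diagnostic_rates(90, 10, 40, 60)`, sensitivity is high while specificity is low. The same pattern appears in the reported 93.8% and 38.0%: a model of this kind rarely misses a malignant nodule but often flags benign ones.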
Abstract: A novel method of learning complex fuzzy decision regions in the n-dimensional feature space is proposed. Through the fuzzy decision regions, a given pattern's membership value for every class is determined, instead of the conventional crisp class to which the pattern belongs. The n-dimensional fuzzy decision region is approximated by a union of hyperellipsoids. By explicitly parameterizing these hyperellipsoids, the decision regions are determined by estimating the parameters of each hyperellipsoid. A Genetic Algorithm (GA) is applied to estimate the parameters of each region component. With the global optimization ability of the GA, the learned decision region can be arbitrarily complex.
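A union of parameterized hyperellipsoids can be turned into a class membership value in several ways, and the abstract does not say which one is used. The sketch below makes three assumptions for illustration: the hyperellipsoids are axis-aligned, the membership in a single hyperellipsoid decays as exp(-d^2) with normalised distance d, and the fuzzy union is taken as the maximum. The parameter search by Genetic Algorithm is not shown, and all names are hypothetical.

```python
import math


def ellipsoid_membership(x, centre, radii):
    """Membership of point x in one axis-aligned hyperellipsoid.

    Uses exp(-d^2), where d is the normalised distance from the centre
    (d = 1 on the ellipsoid surface). This decay law is an assumption.
    """
    d2 = sum(((xi - ci) / ri) ** 2 for xi, ci, ri in zip(x, centre, radii))
    return math.exp(-d2)


def class_membership(x, ellipsoids):
    """Fuzzy union (maximum) over the hyperellipsoids of one class.

    `ellipsoids` is a list of (centre, radii) pairs; these are the
    parameters a search procedure such as a GA would have to estimate.
    """
    return max(ellipsoid_membership(x, c, r) for c, r in ellipsoids)
```

For a class described by two unit hyperellipsoids centred at (0, 0) and (5, 5), the point (0, 0) has membership 1.0 and the point (1, 0) has membership exp(-1), about 0.37.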
Abstract: In this paper, an Arabic letter recognition system based on Artificial Neural Networks (ANNs) and statistical analysis for feature extraction is presented. The ANN is trained using the Least Mean Squares (LMS) algorithm. In the proposed system, each typed Arabic letter is represented by a matrix of binary numbers that is used as input to a simple feature extraction system, whose output, together with the input matrix, is fed to an ANN. Simulation results are provided and show that the proposed system always produces a lower Mean Squared Error (MSE) and higher success rates than the current ANN solutions.
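The LMS rule mentioned in this abstract adjusts each weight in proportion to the output error and the corresponding input. The sketch below shows the rule for a single linear unit only. It is not the network described in the paper, and the function names, learning rate and toy data are assumptions made for illustration.

```python
def lms_train(samples, targets, learning_rate=0.1, epochs=50):
    """Train a single linear unit with the LMS (Widrow-Hoff) rule.

    samples: list of equal-length feature lists; targets: list of floats.
    Returns (weights, bias).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            output = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = target - output
            # Move each weight along the negative gradient of the squared error.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


def mean_squared_error(samples, targets, weights, bias):
    """Average squared difference between targets and linear outputs."""
    errors = [t - (sum(w * xi for w, xi in zip(weights, x)) + bias)
              for x, t in zip(samples, targets)]
    return sum(e * e for e in errors) / len(errors)
```

On a toy problem where the target is the sum of two binary inputs, the MSE starts at 1.5 with zero weights and falls towards zero as training proceeds, provided the learning rate is small enough for the updates to remain stable.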
Abstract: A feature weighting and selection method is proposed
which uses the structure of a weightless neuron and exploits the
principles that govern the operation of Genetic Algorithms and
Evolution. Features are coded onto chromosomes in a novel way
which allows weighting information regarding the features to be
directly inferred from the gene values. The proposed method is
significant in that it addresses several problems concerned with
algorithms for feature selection and weighting as well as providing
significant advantages such as speed, simplicity and suitability for
real-time systems.
Abstract: Acoustical properties of speech have been shown to be
related to the mental state of a speaker, such as depression and
remission. This paper describes a way to distinguish depressed
patients from remitted subjects based on measurable acoustic changes
in their speech. The vocal-tract-related frequency characteristics of
speech samples from female remitted and depressed patients were
analyzed via speech processing techniques and then evaluated
statistically by cross-validation with a Support Vector Machine. Our
comparative results show that the classifier correctly separated 93%
of cases when tested with the subject-based feature model and 88%
with the frame-based model, using the same speech samples collected
from hospital interview sessions between patients and psychiatrists.
Abstract: Association rule mining is an important problem in data
mining. The massively increasing volume of data in real-life databases
has motivated researchers to design novel and incremental algorithms
for association rule mining. In this paper, we propose an incremental
association rule mining algorithm that integrates a shocking
interestingness criterion into the process of building the model. A
new interestingness measure, called the shocking measure, is
introduced. One of the main features of the proposed approach is that
it captures the user's background knowledge, which is monotonically
augmented. An incremental model that reflects the changing data and
the user's beliefs is attractive because it makes the overall KDD
process more effective and efficient. We implemented the proposed
approach, experimented with it on some public datasets, and found the
results quite promising.
Abstract: An automated wood recognition system is designed to
classify tropical wood species. The wood features are extracted using
two feature extractors: the Basic Grey Level Aura Matrix (BGLAM)
technique and the statistical properties of pores distribution (SPPD)
technique. Because the separation boundaries between tropical wood
species are nonlinear, a pre-classification stage is proposed which
consists of K-means clustering and kernel discriminant analysis (KDA).
Finally, a Linear Discriminant Analysis (LDA) classifier and a
K-Nearest Neighbour (KNN) classifier are implemented for comparison
purposes. The study compares the system with and without
pre-classification using the KNN and LDA classifiers. The results
show that including the pre-classification stage improves the accuracy
of both the LDA and KNN classifiers by more than 12%.
Abstract: The ever increasing product diversity and competition on the market of goods and services have dictated the pace of growth in the number of advertisements. Despite their admittedly diminished effectiveness over recent years, advertisements remain the favored method of sales promotion. Consequently, the challenge for an advertiser is to explore every possible avenue of making an advertisement more noticeable, attractive and impellent for consumers. One way to achieve this is through invoking celebrity endorsements. On the one hand, the use of a celebrity to endorse a product involves substantial costs; on the other hand, it does not immediately guarantee the success of an advertisement. The question of how celebrities can be used in advertising to the best advantage is therefore of utmost importance. Celebrity endorsements have become commonplace: empirical evidence indicates that approximately 20 to 25 per cent of advertisements feature some famous person as a product endorser. The popularity of celebrity endorsements demonstrates the relevance of the topic, especially in the context of the current global economic downturn, when companies are forced to save in order to survive, yet simultaneously to invest heavily in advertising and sales promotion. The issue of the effective use of celebrity endorsements also figures prominently in the academic discourse. The study presented below is thus aimed at exploring which qualities (characteristics) of a celebrity endorser have an impact on the effectiveness of the advertisement in which he/she appears, and how.
Abstract: An appropriate project delivery system (PDS) is crucial
to the success of a construction project. Case-based Reasoning (CBR)
is a useful support for PDS selection. However, the traditional CBR
approach represents cases as attribute-value vectors without taking
relations among attributes into consideration, and cannot calculate
similarity when the structures of the cases are not strictly the
same. This paper therefore addresses the problem by adopting the
Relational Case-based Reasoning (RCBR) approach for PDS selection,
considering both structural similarity and feature similarity. To
develop the feature terms of construction projects, the criteria and
factors governing the PDS selection process are first identified.
Then the feature terms for the construction projects are developed.
Finally, the mechanism of similarity calculation and a case study
indicate how RCBR works for PDS selection. The adoption of RCBR in
PDS selection expands the scope of application of the traditional CBR
method and improves the accuracy of the PDS selection system.
Abstract: Feature selection has recently been the subject of intensive research in data mining, especially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in their attempts to construct models of data hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes Ant Colony Optimization (ACO) is presented. ACO is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previous knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.
Abstract: Segmentation, filtering out of measurement errors and
identification of breakpoints are integral parts of any analysis of
microarray data for the detection of copy number variation (CNV).
Existing algorithms designed for these tasks have had some successes
in the past, but they tend to be O(N^2) in either computation time or
memory requirement, or both, and the rapid advance of microarray
resolution has practically rendered such algorithms useless. Here we
propose an algorithm, SAD, that is much faster and much less thirsty
for memory (O(N) in both computation time and memory requirement)
and offers higher accuracy. The two key ingredients of SAD are the
fundamental assumption in statistics that measurement errors are
normally distributed and the mathematical relation that the product of
two Gaussians is another Gaussian (function). We have produced a
computer program for analyzing CNV based on SAD. In addition to
being fast and small it offers two important features: quantitative
statistics for predictions and, with only two user-decided parameters,
ease of use. Its speed shows little dependence on genomic profile.
Running on an average modern computer, it completes CNV analyses
for a 262 thousand-probe array in ~1 second and a 1.8 million-probe
array in 9 seconds.
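The mathematical relation cited in this abstract, that the product of two Gaussians is another Gaussian function, can be written out for the one-dimensional case. The sketch below gives the mean and variance of the resulting (unnormalised) Gaussian. How SAD applies this relation is not described in the abstract, so the function name and the reading of the result as a combination of two noisy estimates are assumptions.

```python
def gaussian_product(mean1, var1, mean2, var2):
    """Mean and variance of the Gaussian proportional to the product of
    N(mean1, var1) and N(mean2, var2), taken as functions of one variable.

    The product is not normalised; only its shape parameters are returned.
    """
    var = var1 * var2 / (var1 + var2)
    # Each mean is weighted by the other Gaussian's variance, so the
    # narrower (more precise) Gaussian pulls the result towards itself.
    mean = (mean1 * var2 + mean2 * var1) / (var1 + var2)
    return mean, var
```

Combining N(0, 1) with N(2, 1) gives mean 1.0 and variance 0.5. The resulting variance is always smaller than either input variance, and the computation takes constant time per pair, which is consistent with an algorithm that runs in O(N) overall.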