Abstract: Extensive research on speech processing for compression
and recognition over the last five decades has resulted in intense
competition among the various methods and paradigms introduced.
In this paper we review the different representations of speech in
the time-frequency and time-scale domains for the purpose of
compression and recognition, and examine how these representations
are used in a variety of related work. In particular, we emphasize
methods based on Fourier analysis and wavelet-based methods, along
with the advantages and disadvantages of both approaches.
Abstract: Current face recognition systems often use either SVM or
AdaBoost for the face detection part and PCA for the face
recognition part. In this paper, we offer a novel method that provides
not only a powerful face detection system based on
Six-segment-filters (SSR) and AdaBoost learning algorithms but also
a face recognition system. A new exclusive face detection
algorithm has been developed and connected with the recognition
algorithm. As a result, we obtained high overall system
performance compared with current systems. The proposed algorithm
was tested on the CMU, FERET, UNIBE, and MIT face databases, and
significant performance has been obtained.
Abstract: This work presents a novel means of extracting fixed-length parameters from voice signals, such that words can be recognized
in linear time. The power and the zero crossing rate are first
calculated segment by segment from a voice signal; by doing so, two
feature sequences are generated. We then construct an FIR system
across these two sequences. The parameters of this FIR system, used
as the input of a multilayer perceptron recognizer, can be derived by
recursive LSE (least-squares estimation), implying that the complexity of the overall process is linear in the signal size. In the second part of
this work, we introduce a weighting factor λ to emphasize recent
input; therefore, we can further recognize continuous speech signals.
Experiments employ the voice signals of numbers, from zero to nine, spoken in Mandarin Chinese. The proposed method is verified to
recognize voice signals efficiently and accurately.
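The first step of the abstract above, computing power and zero crossing rate segment by segment, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frame_features`, the use of non-overlapping frames, and the normalization of the zero crossing rate are all assumptions.

```python
def frame_features(signal, frame_len):
    """Return per-frame average power and zero crossing rate.

    Assumption: frames do not overlap and frame_len >= 2.
    """
    powers, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Average power of the frame.
        powers.append(sum(x * x for x in frame) / frame_len)
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings / (frame_len - 1))
    return powers, zcrs


powers, zcrs = frame_features(
    [1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0], frame_len=4
)
```

Both output sequences grow linearly with the signal length, which is consistent with the linear-time claim made for the overall process.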
Abstract: The objective of this paper is to propose an adaptive multi-threshold technique for image segmentation, specifically for object detection. Because different types of license plates are in use, the requirements of an automatic license plate recognition (LPR) system differ from country to country. The proposed technique is applied to a Malaysian LPR application. It is based on a Multi-Layer Perceptron trained by back propagation. The proposed adaptive threshold is introduced to find the optimum threshold values. The technique relies on the peak value of the graph of the number of objects versus a specific range of threshold values. The proposed approach improved overall performance compared with current optimal threshold techniques. Further improvement of this method is in progress to accommodate real-time system specifications.
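The idea of picking the threshold at the peak of the "number of objects versus threshold" graph can be sketched as below. This is an illustrative reading of the abstract only: the helper names, the use of 4-connectivity, and the rule "pixel >= threshold is foreground" are assumptions, not details taken from the paper.

```python
from collections import deque


def count_objects(image, threshold):
    """Count 4-connected regions of pixels with value >= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # flood-fill one object
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def peak_threshold(image, thresholds):
    """Return the threshold with the most objects, plus all counts."""
    counts = [count_objects(image, t) for t in thresholds]
    best = max(range(len(thresholds)), key=lambda i: counts[i])
    return thresholds[best], counts


image = [
    [200, 200, 90, 180, 180],
    [200, 200, 90, 180, 180],
    [20, 20, 20, 20, 20],
]
best, counts = peak_threshold(image, [50, 100, 190])
```

In this toy image the middle threshold separates the two bright regions, so the object count peaks there; too low a threshold merges them and too high a threshold loses one.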
Abstract: Urbanization in Yogyakarta Special
Province, Indonesia, has encouraged people to move to the city to find
jobs in the informal sector. They live in temporary houses on
the three main riverbanks: Gadjahwong, Code, and Winongo.
Triggered by the riverbanks' independent status, they use this space to
accommodate domestic, social, and economic activities because of
the non-standardized room sizes of their houses, in areas that are recognized
as environmental hazards. This recognition creates an ambivalent
perception when related to the twelfth point of the philosophy of the
community development concept: empowering individuals and
communities. Its spatial implications describe the
territory and place-making phenomena. By analyzing data
collected in the author's fundamental research, funded by the General
Directorate of Higher Education of Indonesia, this paper discusses
the spatial implications of the occupants' behavior and the
numerous perceptions of those phenomena.
Abstract: This paper describes a new supervised fusion (hybrid)
electrocardiogram (ECG) classification solution consisting of a new
QRS complex geometrical feature extraction as well as a new version
of the learning vector quantization (LVQ) classification algorithm
aimed at overcoming the stability-plasticity dilemma. Toward this
objective, after detection and delineation of the major events of the ECG
signal via an appropriate algorithm, each QRS region and its
corresponding discrete wavelet transform (DWT) are treated as
virtual images, and each of them is divided into eight polar sectors.
Then, the curve length of each excerpted segment is calculated
and is used as the element of the feature space. To increase the
robustness of the proposed classification algorithm against noise,
artifacts and arrhythmic outliers, a fusion structure consisting of
five different classifiers, namely a Support Vector Machine (SVM),
a Modified Learning Vector Quantization (MLVQ) classifier and three Multi
Layer Perceptron-Back Propagation (MLP–BP) neural networks with
different topologies, was designed and implemented. The new proposed
algorithm was applied to all 48 MIT–BIH Arrhythmia Database
records (within–record analysis), and the discrimination power of the
classifier in isolating the different beat types of each record was
assessed; as a result, an average accuracy of Acc=98.51%
was obtained. The proposed method was also applied to six types
of arrhythmia (Normal, LBBB, RBBB, PVC, APB, PB) belonging
to 20 different records of the aforementioned database
(between–record analysis), and an average value of Acc=95.6% was achieved.
To evaluate the performance of the new proposed hybrid learning
machine, the obtained results were compared with similar
peer–reviewed studies in this area.
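The geometrical feature described above (curve length per polar sector) can be illustrated roughly as follows. This is only a sketch under assumptions: the curve is given as sampled (x, y) points, each line segment is attributed to the sector containing its midpoint, and the function name `sector_curve_lengths` and the choice of sector origin are hypothetical rather than taken from the paper.

```python
import math


def sector_curve_lengths(points, center, n_sectors=8):
    """Sum polyline segment lengths per polar sector around `center`."""
    cx, cy = center
    width = 2 * math.pi / n_sectors
    lengths = [0.0] * n_sectors
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Assumption: a segment belongs to the sector of its midpoint.
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        angle = math.atan2(my - cy, mx - cx) % (2 * math.pi)
        sector = min(int(angle / width), n_sectors - 1)
        lengths[sector] += math.hypot(x1 - x0, y1 - y0)
    return lengths


features = sector_curve_lengths(
    [(1, 0), (1, 1), (0, 1), (-1, 1)], center=(0, 0)
)
```

The resulting eight-element list plays the role of one feature vector; in the abstract, such vectors are computed for both the QRS region and its DWT.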
Abstract: In this paper, we propose a face recognition algorithm
using AAM and Gabor features. Gabor feature vectors, which are well
known to be robust with respect to small variations of shape, scaling,
rotation, distortion, illumination and pose in images, are widely
employed as feature vectors in many object detection and
recognition algorithms. EBGM, which is prominent among face
recognition algorithms employing Gabor feature vectors, requires
localization of facial feature points where Gabor feature vectors are
extracted. However, the localization method employed in EBGM is based
on Gabor jet similarity and is sensitive to initial values. Wrong
localization of facial feature points degrades the face recognition rate. AAM
is known to be successfully applied to localization of facial feature
points. In this paper, we devise a facial feature point localization
method which first roughly estimates facial feature points using AAM
and then refines them using the Gabor jet similarity-based facial
feature localization method, with initial points set by the rough facial
feature points obtained from AAM, and propose a face recognition
algorithm using the devised localization method for facial feature
localization and Gabor feature vectors. It is observed through
experiments that such a cascaded localization method based on both
AAM and Gabor jet similarity is more robust than the localization
method based on only Gabor jet similarity. Also, it is shown that the
proposed face recognition algorithm using this devised localization
method and Gabor feature vectors performs better than
conventional face recognition algorithms, such as EBGM, that use a
Gabor jet similarity-based localization method and Gabor feature
vectors.
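The sensitivity to initial values mentioned above can be illustrated with a small sketch of similarity-based refinement. The similarity used here is the normalized dot product of jet magnitudes that is commonly associated with EBGM; the dictionary `jets_by_position`, the function names, and the square search window are hypothetical simplifications, and the rough starting point stands in for what an AAM fit would supply.

```python
import math


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets' magnitude coefficients."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(sum(a * a for a in jet_a) * sum(b * b for b in jet_b))
    return dot / norm if norm else 0.0


def refine_point(initial, jets_by_position, model_jet, radius=1):
    """Return the position near `initial` whose jet best matches the model.

    Assumption: at least one position lies within the search window.
    """
    x0, y0 = initial
    candidates = [
        p for p in jets_by_position
        if abs(p[0] - x0) <= radius and abs(p[1] - y0) <= radius
    ]
    return max(
        candidates,
        key=lambda p: jet_similarity(jets_by_position[p], model_jet),
    )


jets_by_position = {
    (5, 5): [1.0, 0.0],
    (6, 5): [0.6, 0.8],
    (9, 9): [0.0, 1.0],  # the true best match
}
model_jet = [0.0, 1.0]
far_start = refine_point((5, 5), jets_by_position, model_jet)
near_start = refine_point((8, 8), jets_by_position, model_jet)
```

Starting far from the true point, the local search settles on a weaker match; starting from a good rough estimate, it reaches the best one. That is the motivation for cascading a rough estimator in front of the similarity-based search.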
Abstract: A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, MFCC captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure, which improves the distinguishability of speaker-specific cues present in the higher frequency zone. Unlike high-level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases, namely YOHO (microphone speech) and POLYCOST (telephone speech), with Gaussian Mixture Models (GMM) as the classifier for various model orders.
Abstract: The Block Sorting problem is to sort a given
permutation by moving blocks. A block is defined as a substring
of the given permutation that is also a substring of the
identity permutation. Block Sorting has been proved to be
NP-hard. To date, two different 2-approximation algorithms
have been presented for Block Sorting; these are the best known
algorithms for the problem. In this work we present
a different characterization of Block Sorting in terms of a
transposition cycle graph. Then we suggest a heuristic,
which we show to exhibit a 2-approximation performance
guarantee for most permutations.
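To make the definition of a block concrete, the sketch below splits a permutation into blocks and performs one block move. It assumes blocks are taken to be maximal runs of consecutive increasing values; the function names are illustrative, and this is not the heuristic proposed in the paper.

```python
def blocks(perm):
    """Split a permutation into maximal runs of consecutive values."""
    result = []
    for value in perm:
        if result and value == result[-1][-1] + 1:
            result[-1].append(value)  # extends the current block
        else:
            result.append([value])    # starts a new block
    return result


def move_block(block_list, src, dst):
    """Move block `src` to index `dst` and return the new permutation."""
    remaining = block_list[:src] + block_list[src + 1:]
    moved = remaining[:dst] + [block_list[src]] + remaining[dst:]
    return [value for block in moved for value in block]


perm = [3, 4, 1, 2, 5]
parts = blocks(perm)                   # [[3, 4], [1, 2], [5]]
sorted_perm = move_block(parts, 1, 0)  # move block [1, 2] to the front
```

Here a single block move sorts the permutation, and the sorted result forms one block.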
Abstract: This paper presents a multi-context recurrent network for time series analysis. While simple recurrent networks (SRNs) are very popular among recurrent neural networks, they still have some shortcomings in terms of learning speed and accuracy that need to be addressed. To solve these problems, we propose a multi-context recurrent network (MCRN) with three different learning algorithms. The performance of this network is evaluated on real-world applications such as handwriting recognition and energy load forecasting. We study the performance of this network and compare it to a well-established SRN. The experimental results show that the MCRN is very efficient and well suited to time series analysis and its applications.
Abstract: The occurrence of ferroresonance is one of the causes of transformer degradation and failure, so recognizing ferroresonance is of particular importance. A novel method for classifying ferroresonance is presented in this paper. Using this method, ferroresonance can be discriminated from other transients such as capacitor switching, load switching, and transformer switching. The wavelet transform is used to decompose the signals, and a Competitive Neural Network is used for classification. Data for ferroresonance and other transients were obtained by simulation using the EMTP program. The signals were decomposed to six levels using the Daubechies wavelet transform. The energies of the six detail signals obtained by the wavelet transform are used for training and testing the Competitive Neural Network. Results show that the proposed procedure is efficient in distinguishing ferroresonance from other events.
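The feature described above, the energy of the detail signal at each of six decomposition levels, can be sketched as follows. Note the substitution: for brevity this example uses the Haar wavelet (the simplest member of the Daubechies family) rather than whichever Daubechies wavelet the authors used, and the function names and test signal are illustrative assumptions.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1 / math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail


def detail_energies(signal, levels):
    """Return the energy of the detail coefficients at each level."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies, approx


# A fast alternating signal: all energy lands in the finest detail level.
signal = [1.0, -1.0] * 32  # 64 samples allow six levels
energies, residual = detail_energies(signal, levels=6)
```

The six energies form a compact feature vector per transient; different transients distribute their energy differently across levels, which is what a classifier can exploit.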
Abstract: A Web-based learning tool, the Learn IN Context
(LINC) system, designed for and used in some of the institution's
courses in mixed-mode learning, is presented in this paper. This
mode combines face-to-face and distance approaches to education.
LINC can achieve both collaborative and competitive learning. In
order to provide both learners and tutors with a more natural way to
interact with e-learning applications, a conversational interface has
been included in LINC. Hence, the components and essential features
of LINC+, the voice enhanced version of LINC, are described. We
report evaluation experiments of LINC/LINC+ in a real use context
of a computer programming course taught at the Université de
Moncton (Canada). The findings show that when the learning
material is delivered in the form of a collaborative and voice-enabled
presentation, the majority of learners seem to be satisfied with this
new medium, and confirm that it does not negatively affect their
cognitive load.
Abstract: Detection, feature extraction and pose estimation of
people in images and video are made challenging by the variability of
human appearance, the complexity of natural scenes and the high
dimensionality of articulated body models, and have been an important
field in image, signal and vision computing in recent years. In this
paper, a system that distinguishes four types of people in 2D images is
proposed and tested. The system extracts the size and category of each
person (tall and fat, short and fat, tall and thin, or short and thin)
from the image. Whether a person is fat or thin is determined from the
measurements of the human body extracted from the image. The system also
extracts the dimensions of the human body, such as length and width, and
shows them in the output.
Abstract: The paper proposes a novel technique for iris
recognition using texture and phase features. Texture features are
extracted on the normalized iris strip using the Haar wavelet, while phase
features are obtained using the Log-Gabor wavelet. The matching
scores generated by the individual modules are combined using the
sum-of-scores technique. The system is tested on databases obtained from Bath
University and the Indian Institute of Technology Kanpur and achieves
accuracies of 95.62% and 97.66%, respectively. The FAR and FRR
of the combined system are also reduced comparatively.
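Sum-of-scores fusion and the two error rates mentioned above can be sketched as follows. The assumptions are that scores are similarities (higher means a better match), that both modules produce scores on comparable scales, and that a comparison is accepted when its score reaches the threshold; the numbers are made-up toy values, not results from the paper.

```python
def fuse(texture_scores, phase_scores):
    """Sum-of-scores fusion of two per-comparison score lists."""
    return [t + p for t, p in zip(texture_scores, phase_scores)]


def far_frr(genuine, impostor, threshold):
    """False accept rate and false reject rate at a given threshold."""
    far = sum(1 for s in impostor if s >= threshold) / len(impostor)
    frr = sum(1 for s in genuine if s < threshold) / len(genuine)
    return far, frr


# Toy scores for two genuine and two impostor comparisons.
genuine_texture, genuine_phase = [0.9, 0.4], [0.8, 0.7]
impostor_texture, impostor_phase = [0.5, 0.2], [0.1, 0.3]

single = far_frr(genuine_texture, impostor_texture, threshold=0.5)
fused = far_frr(
    fuse(genuine_texture, genuine_phase),
    fuse(impostor_texture, impostor_phase),
    threshold=1.0,
)
```

In this toy data the texture module alone makes one false accept and one false reject, while the fused scores separate the two groups cleanly.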
Abstract: Face recognition is a field with multidimensional
applications. A great deal of work has been done on most of the
details related to face recognition; face recognition using
PCA is one such approach. In this paper, PCA is used for feature
extraction, and the face under
consideration is matched with the test image using eigenface coefficients. The
crux of the work lies in optimizing the Euclidean distance and paving the
way to test the same algorithm using Matlab, which is an efficient tool
with a powerful user interface and simplicity in representing
complex images.
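Matching by Euclidean distance between eigenface coefficients can be sketched as below. The sketch assumes the mean face and the eigenfaces have already been computed by PCA elsewhere; the names `project` and `nearest_face`, the tiny two-pixel "images", and the gallery labels are purely illustrative.

```python
import math


def project(image, mean_face, eigenfaces):
    """Project a flattened image onto the eigenfaces."""
    centered = [p - m for p, m in zip(image, mean_face)]
    return [sum(e * c for e, c in zip(face, centered)) for face in eigenfaces]


def nearest_face(probe_coeffs, gallery):
    """Return the gallery label with the smallest Euclidean distance."""
    def distance(name):
        return math.sqrt(
            sum((a - b) ** 2 for a, b in zip(probe_coeffs, gallery[name]))
        )
    best = min(gallery, key=distance)
    return best, distance(best)


mean_face = [1.0, 1.0]
eigenfaces = [[1.0, 0.0], [0.0, 1.0]]  # assumed precomputed by PCA
gallery = {"alice": [2.0, 2.0], "bob": [5.0, 5.0]}

coeffs = project([3.0, 2.0], mean_face, eigenfaces)
label, dist = nearest_face(coeffs, gallery)
```

A practical system would typically also reject a probe whose smallest distance exceeds some threshold; that step is omitted here.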
Abstract: Eye localization is necessary for face recognition and
related application areas. Most eye localization algorithms reported
so far still need to be improved in precision and computation
time for successful applications. In this paper, we propose an eye
localization method based on multi-scale Gabor feature vectors, which is
more robust with respect to initial points. Eye localization based
on Gabor feature vectors first constructs an Eye Model Bunch
for each eye (left or right), which consists of n Gabor jets and
the average eye coordinates obtained from n model face
images, and then tries to localize the eyes in an incoming face image by
utilizing the fact that the true eye coordinates are most likely to be very
close to the position where the Gabor jet has the best Gabor jet
similarity match with a Gabor jet in the Eye Model Bunch. Similar
ideas have already been proposed in methods such as EBGM (Elastic Bunch
Graph Matching). However, the method used in EBGM is known not to be
robust with respect to initial values and may need an extensive search
range to achieve the required performance, and extensive search
ranges cause much more computational burden. In this paper, we
propose a multi-scale approach with only slightly increased computational
burden, in which one first localizes the eyes based on Gabor feature
vectors in a coarse face image obtained by downsampling the
original face image, and then localizes the eyes based on Gabor feature
vectors in the original-resolution face image, using the eye
coordinates localized in the coarse image as initial points.
Several experiments and comparisons with other eye localization
methods reported in other papers show the efficiency of our
proposed method.
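The coarse-to-fine search strategy can be sketched in a few lines. This is a simplified stand-in, not the authors' method: `score` is a hypothetical precomputed similarity map (in the abstract it would come from Gabor jet matching), and the coarse pass simply evaluates every `step`-th pixel instead of building a genuinely downsampled image.

```python
def best_position(score, rows, cols):
    """Return the (row, col) with the highest score among the candidates."""
    return max(
        ((r, c) for r in rows for c in cols),
        key=lambda p: score[p[0]][p[1]],
    )


def coarse_to_fine(score, step=2, radius=1):
    """Search a subsampled grid first, then refine at full resolution."""
    n_rows, n_cols = len(score), len(score[0])
    # Coarse pass: only every `step`-th pixel is evaluated.
    r0, c0 = best_position(
        score, range(0, n_rows, step), range(0, n_cols, step)
    )
    # Fine pass: small full-resolution window around the coarse estimate.
    rows = range(max(0, r0 - radius), min(n_rows, r0 + radius + 1))
    cols = range(max(0, c0 - radius), min(n_cols, c0 + radius + 1))
    return best_position(score, rows, cols)


# Toy similarity map on a 6x6 grid with its peak at (3, 3).
score = [[-((r - 3) ** 2 + (c - 3) ** 2) for c in range(6)] for r in range(6)]
eye = coarse_to_fine(score)
```

On this 6x6 map the two passes evaluate 9 + 9 positions instead of all 36, which illustrates why the multi-scale search adds only a small computational cost compared with widening the full-resolution search range.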
Abstract: In this paper, face images from the ORL database are
classified using Principal Component Analysis (PCA) and Kernel Principal
Component Analysis (KPCA) for feature extraction, together with an Elman
neural network and a Support Vector Machine (SVM) as classifiers.
The Elman network, a recurrent neural network, is proposed
for modeling storage systems, and it is also used to examine the
effect of the number of principal components on classification accuracy
and classification time. Classification is
carried out with various numbers of components, and the results obtained
with the Elman neural network and the support vector
machine are compared. In the best case, 97.41% recognition
accuracy is obtained.
Abstract: Support Vector Machine (SVM) is a statistical learning tool that was initially developed by Vapnik in 1979 and later extended to the more complex concept of structural risk minimization (SRM). SVM is playing an increasing role in detection problems in various engineering fields, notably in statistical signal processing, pattern recognition, image analysis, and communication systems. In this paper, SVM was applied to the detection of medical ultrasound images in the presence of partially developed speckle noise. The simulation was done for single-look and multi-look speckle models to give a complete overview of and insight into the newly proposed SVM-based detector. The structure of the SVM was derived and applied to clinical ultrasound images, and its performance in terms of the mean square error (MSE) metric was calculated. We showed that the SVM-detected ultrasound images have a very low MSE and are of good quality. The quality of the processed speckled images improved for the multi-look model. Furthermore, the contrast of the SVM-detected images was higher than that of the original non-noisy images, indicating that the SVM approach increased the distance between the pixel reflectivity levels (detection hypotheses) in the original images.
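For reference, the MSE metric used above to compare a processed image with a reference image can be computed as in this minimal sketch. Images are assumed to be equally sized lists of rows; the function name and the sample values are illustrative.

```python
def mse(reference, processed):
    """Mean square error between two equally sized 2D images."""
    n_pixels = sum(len(row) for row in reference)
    total = sum(
        (a - b) ** 2
        for ref_row, proc_row in zip(reference, processed)
        for a, b in zip(ref_row, proc_row)
    )
    return total / n_pixels


error = mse([[0, 2], [4, 6]], [[0, 0], [4, 2]])
```

A lower value means the processed image is closer to the reference, pixel by pixel.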
Abstract: We present a new method for the fully automatic 3D
reconstruction of the coronary artery centerlines, using two X-ray
angiogram projection images from a single rotating monoplane
acquisition system. During the first stage, the input images are
smoothed using curve evolution techniques. Next, a simple yet
efficient multiscale method, based on the information of the Hessian
matrix, for the enhancement of the vascular structure is introduced.
Hysteresis thresholding using different image quantiles is used to
threshold the arteries. This stage is followed by a thinning procedure
to extract the centerlines. The resulting skeleton image is then pruned
using morphological and pattern recognition techniques to remove
non-vessel like structures. Finally, edge-based stereo correspondence
is solved using a parallel evolutionary optimization method based on
f symbiosis. The detected 2D centerlines combined with disparity
map information allow the reconstruction of the 3D vessel
centerlines. The proposed method has been evaluated on patient data
sets.
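The hysteresis thresholding step can be sketched as follows. In the abstract the two thresholds are derived from image quantiles; in this sketch they are simply passed in as numbers, and the use of 8-connectivity and the function name are assumptions.

```python
from collections import deque


def hysteresis(image, low, high):
    """Keep pixels >= high, plus pixels >= low connected to them."""
    rows, cols = len(image), len(image[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed with the strong pixels.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))
    # Grow into weak pixels that touch an already kept pixel.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc] and image[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep


image = [
    [9, 5, 1, 1, 5],
    [1, 1, 1, 1, 1],
]
mask = hysteresis(image, low=4, high=8)
```

The weak pixel next to the strong one is kept, while the isolated weak pixel at the end of the first row is discarded. This behaviour suits thin vessel structures, where faint segments should survive only if they connect to clearly visible ones.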
Abstract: A human verification system is presented in this
paper. The system consists of several steps: background subtraction,
thresholding, line connection, region growing, morphology, star
skeletonization, feature extraction, feature matching, and decision
making. The proposed system combines the advantages of star
skeletonization and simple statistical features. Correlation matching
and probability voting are used for verification, followed by a
logical operation in the decision-making stage. The proposed system
uses a small number of features, and the system reliability is
convincing.
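The star skeletonization step can be illustrated with a rough sketch: compute the centroid of the silhouette contour, measure the distance from the centroid to each contour point, and take the local maxima of that distance as extremal points (for example head and limbs). This version assumes the contour is an ordered, closed list of points and omits the smoothing of the distance signal that is normally applied first; the function name and the toy contour are illustrative.

```python
import math


def star_skeleton(contour):
    """Return the centroid and the contour points at local distance maxima."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    dist = [math.hypot(x - cx, y - cy) for x, y in contour]
    n = len(dist)
    extremes = [
        contour[i]
        for i in range(n)
        # Compare with both neighbours, wrapping around the closed contour.
        if dist[i] > dist[i - 1] and dist[i] > dist[(i + 1) % n]
    ]
    return (cx, cy), extremes


# A four-pointed toy contour centred on the origin.
contour = [(2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1), (0, -2), (1, -1)]
center, tips = star_skeleton(contour)
```

Joining the centroid to each extremal point gives the "star", from which simple features such as limb lengths or angles can be derived.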