Abstract: This paper presents a formant-tracking linear prediction
(FTLP) model for speech processing in noise. The main focus of this
work is the detection of formant trajectory based on Hidden Markov
Models (HMM), for improved formant estimation in noise. The
approach proposed in this paper provides a systematic framework for
modelling and utilization of a time- sequence of peaks which satisfies
continuity constraints on parameter; the within peaks are modelled
by the LP parameters. The formant tracking LP model estimation
is composed of three stages: (1) a pre-cleaning multi-band spectral
subtraction stage to reduce the effect of residue noise on formants
(2) estimation stage where an initial estimate of the LP model of
speech for each frame is obtained (3) a formant classification using
probability models of formants and Viterbi-decoders. The evaluation
results for the estimation of the formant tracking LP model tested
in Gaussian white noise background, demonstrate that the proposed
combination of the initial noise reduction stage with formant tracking
and LPC variable order analysis, results in a significant reduction in
errors and distortions. The performance was evaluated with noisy
natual vowels extracted from international french and English vocabulary
speech signals at SNR value of 10dB. In each case, the
estimated formants are compared to reference formants.
Abstract: This paper investigates the performance of a speech
recognizer in an interactive voice response system for various coded
speech signals, coded by using a vector quantization technique namely
Multi Switched Split Vector Quantization Technique. The process of
recognizing the coded output can be used in Voice banking application.
The recognition technique used for the recognition of the coded speech
signals is the Hidden Markov Model technique. The spectral distortion
performance, computational complexity, and memory requirements of
Multi Switched Split Vector Quantization Technique and the
performance of the speech recognizer at various bit rates have been
computed. From results it is found that the speech recognizer is
showing better performance at 24 bits/frame and it is found that the
percentage of recognition is being varied from 100% to 93.33% for
various bit rates.
Abstract: A word recognition architecture based on a network
of neural associative memories and hidden Markov models has been
developed. The input stream, composed of subword-units like wordinternal
triphones consisting of diphones and triphones, is provided
to the network of neural associative memories by hidden Markov
models. The word recognition network derives words from this input
stream. The architecture has the ability to handle ambiguities on
subword-unit level and is also able to add new words to the
vocabulary during performance. The architecture is implemented to
perform the word recognition task in a language processing system
for understanding simple command sentences like “bot show apple".
Abstract: Current image-based individual human recognition
methods, such as fingerprints, face, or iris biometric modalities
generally require a cooperative subject, views from certain aspects,
and physical contact or close proximity. These methods cannot
reliably recognize non-cooperating individuals at a distance in the
real world under changing environmental conditions. Gait, which
concerns recognizing individuals by the way they walk, is a relatively
new biometric without these disadvantages. The inherent gait
characteristic of an individual makes it irreplaceable and useful in
visual surveillance.
In this paper, an efficient gait recognition system for human
identification by extracting two features namely width vector of
the binary silhouette and the MPEG-7-based region-based shape
descriptors is proposed. In the proposed method, foreground objects
i.e., human and other moving objects are extracted by estimating
background information by a Gaussian Mixture Model (GMM) and
subsequently, median filtering operation is performed for removing
noises in the background subtracted image. A moving target classification
algorithm is used to separate human being (i.e., pedestrian)
from other foreground objects (viz., vehicles). Shape and boundary
information is used in the moving target classification algorithm.
Subsequently, width vector of the outer contour of binary silhouette
and the MPEG-7 Angular Radial Transform coefficients are taken as
the feature vector. Next, the Principal Component Analysis (PCA)
is applied to the selected feature vector to reduce its dimensionality.
These extracted feature vectors are used to train an Hidden Markov
Model (HMM) for identification of some individuals. The proposed
system is evaluated using some gait sequences and the experimental
results show the efficacy of the proposed algorithm.