Abstract: This paper states the automatic speech recognition
problem and outlines the purpose of speech recognition and its fields
of application. Using Azerbaijani speech as the example, the
principles for building a speech recognition system and the problems
arising in such a system are investigated. The algorithms for
computing speech features, the main part of a speech recognition
system, are analyzed. From this point of view,
algorithms are developed for determining the Mel Frequency Cepstral
Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients
that express the basic speech features. Combined use of MFCC and LPC
cepstral features is proposed to improve the reliability of the
speech recognition system. To this end, the
recognition system is divided into MFCC and LPC-based recognition
subsystems. Training and recognition are carried out separately in
each subsystem, and the recognition system accepts a decision only
when both subsystems return the same result. This reduces the error
rate during recognition. Training and recognition are realized by
artificial
neural networks in the automatic speech recognition system. The
neural networks are trained by the conjugate gradient method. The
paper investigates problems related to the number of speech features
that arise when training the neural networks of the MFCC- and
LPC-based recognition subsystems. The variation in the results of
neural networks trained from different
initial points in the training process is analyzed. A methodology for
the combined use of neural networks trained from different initial
points is proposed to improve the reliability of the recognition
system and increase recognition quality, and the practical results
obtained are presented.
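
The agreement rule described in the abstract above (a result is accepted only when the separately trained recognizers return the same answer) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `agreed_label`, the use of `None` to signal rejection, and the example word labels are all assumptions made for this sketch.

```python
from typing import Optional, Sequence


def agreed_label(labels: Sequence[str]) -> Optional[str]:
    """Return the common label if every recognizer agrees, otherwise None.

    `labels` holds one predicted word per recognizer, e.g. the MFCC-based
    and LPC-based subsystems, or several networks trained from different
    initial points. Returning None for a rejection is an assumption of
    this sketch, not something the abstract specifies.
    """
    if not labels:
        return None
    first = labels[0]
    return first if all(label == first for label in labels) else None


# Hypothetical subsystem outputs for two utterances.
print(agreed_label(["bir", "bir"]))  # both subsystems agree: word accepted
print(agreed_label(["bir", "iki"]))  # disagreement: utterance rejected
```

Requiring unanimity trades some accepted utterances for fewer wrong acceptances, which matches the stated goal of a lower error rate.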
Abstract: This paper studies three approaches to person
verification and identification: fingerprint, face and voice
recognition. Face recognition uses parts-based
representation methods and a manifold learning approach. The
assessment criterion is recognition accuracy. The techniques under
investigation are: a) Local Non-negative Matrix Factorization
(LNMF); b) Independent Components Analysis (ICA); c) NMF with
sparse constraints (NMFsc); d) Locality Preserving Projections
(Laplacianfaces). Fingerprint recognition is approached by classical
minutiae (small graphical patterns) matching through image
segmentation, using a structural approach and a neural network as the
decision block. For voice/speaker recognition, mel-cepstral and
delta-delta mel-cepstral analysis are used as the main methods to
construct a supervised, speaker-dependent voice recognition
system. The final decision (e.g. “accept-reject” for a verification
task) is taken by majority voting over the three biometrics. The
preliminary results, obtained for medium-sized databases of
fingerprints, faces and voice recordings, indicate the feasibility of
our study and an overall recognition precision (about 92%) that would
permit the use of our system in a future complex biometric card.
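
The majority-voting fusion mentioned in the abstract above can be illustrated with a short sketch. The function name `majority_vote`, the string labels, and the choice to raise an error when no strict majority exists are assumptions of this example; the abstract does not say how ties are handled.

```python
from collections import Counter
from typing import Sequence


def majority_vote(decisions: Sequence[str]) -> str:
    """Return the decision backed by more than half of the voters.

    Raises ValueError when the input is empty or no decision reaches a
    strict majority (tie handling is an assumption of this sketch).
    """
    if not decisions:
        raise ValueError("at least one decision is required")
    winner, count = Counter(decisions).most_common(1)[0]
    if count * 2 <= len(decisions):
        raise ValueError("no strict majority among the decisions")
    return winner


# Hypothetical verification outputs: fingerprint, face, voice.
print(majority_vote(["accept", "reject", "accept"]))
```

With three voters and a binary accept/reject outcome a strict majority always exists; the error branch only matters for identification, where the three modalities could name three different people.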
Abstract: In view of the deficiencies of the available speech
recognition techniques, this paper presents an advanced method that
is able to classify speech signals with high accuracy (98%) in
minimal time. In the presented method, the recorded signal is first
preprocessed; this stage includes denoising with Mel Frequency
Cepstral Analysis and feature extraction using discrete wavelet
transform (DWT) coefficients. These features are then fed to a
Multilayer Perceptron (MLP) network for classification. Finally,
after training of the neural network, the effective features are
selected with the UTA algorithm.
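
As a rough illustration of DWT-based feature extraction, the sketch below applies a multi-level Haar transform and summarizes each sub-band by its energy. Both choices are assumptions: the abstract names neither the wavelet family nor how the coefficients are turned into features, and the function names `haar_dwt` and `dwt_energy_features` are hypothetical.

```python
import math
from typing import List, Tuple


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar DWT for an even-length signal."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def dwt_energy_features(signal: List[float], levels: int) -> List[float]:
    """Energy of each detail band, then of the final approximation band.

    Sub-band energy as the feature is an assumption of this sketch.
    The signal length must be divisible by 2 ** levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical 8-sample frame decomposed over 3 levels (4 features).
frame = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
print(dwt_energy_features(frame, levels=3))
```

Because the Haar transform used here is orthonormal, the feature values sum to the energy of the input frame, which gives a quick sanity check. A feature vector of this kind is what would be passed to the MLP classifier.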