Analysis of Combined Use of NN and MFCC for Speech Recognition

This paper presents the design and performance analysis of a speech
recognition system. The system recognizes the English words for the
digits 0-9, spoken by two different speakers and recorded in a
noise-free environment. For feature extraction, Mel frequency cepstral
coefficients (MFCC) are used, which yield a set of feature vectors
from the recorded speech samples. A neural network model is used to
enhance recognition performance: a feed-forward neural network
trained with the back-propagation algorithm. Other speech recognition
techniques, such as hidden Markov models (HMM) and dynamic time
warping (DTW), also exist. All experiments are carried out in
MATLAB.
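
The pipeline described above (MFCC feature extraction followed by a feed-forward network trained with back-propagation) can be illustrated with a minimal sketch. It is written in Python with NumPy rather than Matlab, and it is not the implementation used in the experiments. Every specific here is an assumption made for illustration: the 8 kHz sampling rate, 256-sample Hamming-windowed frames, 20 mel filters, 13 cepstral coefficients, a single tanh hidden layer of 16 units with a softmax output, and the function names themselves. Synthetic tones stand in for the recorded digits, so the printed accuracy says nothing about real speech.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, mid):            # rising edge of the triangle
            fb[i - 1, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):            # falling edge of the triangle
            fb[i - 1, k] = (hi - k) / max(hi - mid, 1)
    return fb

def mfcc(signal, sr=8000, n_fft=256, hop=128, n_filters=20, n_ceps=13):
    """Return one n_ceps-dimensional MFCC vector per frame."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II turns the log filterbank energies into cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ dct.T

def train_mlp(X, y, n_hidden=16, n_classes=10, lr=0.5, epochs=2000, seed=0):
    """Feed-forward network (tanh hidden layer, softmax output) trained by
    full-batch gradient descent with back-propagated gradients."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.1, (X.shape[1], n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, n_classes)); b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                 # one-hot targets
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)             # forward pass
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        dZ = (P - Y) / len(X)                # cross-entropy gradient at the logits
        dH = (dZ @ W2.T) * (1 - H ** 2)      # propagate back through tanh
        W2 -= lr * H.T @ dZ; b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)

def make_utterance(digit, rng, sr=8000, dur=0.5):
    """Hypothetical stand-in for a recorded digit: a tone whose pitch encodes the class."""
    t = np.arange(int(sr * dur)) / sr
    tone = np.sin(2 * np.pi * (300 + 300 * digit) * t + rng.uniform(0, 2 * np.pi))
    return tone + 0.01 * rng.normal(size=t.size)

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 8)         # 8 synthetic utterances per digit
# One feature vector per utterance: the MFCCs averaged over all frames.
feats = np.array([mfcc(make_utterance(d, rng)).mean(axis=0) for d in labels])
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
train = np.arange(len(labels)) % 4 != 0      # hold out every fourth utterance
params = train_mlp(feats[train], labels[train])
accuracy = np.mean(predict(params, feats[~train]) == labels[~train])
print(f"held-out accuracy: {accuracy:.2f}")
```

Averaging the MFCCs over frames gives a fixed-length input regardless of utterance duration, which is one simple way to feed variable-length speech to a feed-forward network; it discards temporal order, so a real system might instead concatenate a fixed number of frames.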




