Speech Coding and Recognition

This paper investigates the performance of a speech recognizer in an interactive voice response system when the input speech has been coded with a vector quantization technique, namely Multi Switched Split Vector Quantization (MSSVQ). Recognition of the coded output can be used in voice banking applications. The coded speech signals are recognized using the Hidden Markov Model (HMM) technique. The spectral distortion performance, computational complexity, and memory requirements of MSSVQ, together with the performance of the speech recognizer, have been computed at various bit rates. The results show that the speech recognizer performs better at 24 bits/frame, and that the recognition rate varies between 93.33% and 100% across the bit rates considered.
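As an illustration of the "split" component of the quantization technique named above, the following is a minimal sketch under stated assumptions, not the scheme evaluated in this paper. It shows only how a parameter vector can be divided into sub-vectors that are each quantized against their own codebook, and how the bits per frame follow from the codebook sizes. The switching and multi-stage components are omitted. The function names, the 2+2 split, and the toy codebooks are all hypothetical and chosen only to keep the example small.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_index(x: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codevector closest to x (squared Euclidean distance)."""
    best_i, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_d:
            best_i, best_d = i, d
    return best_i


def split_vq_encode(x: Vector, split_sizes: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize each sub-vector of x independently with its own codebook."""
    indices, start = [], 0
    for size, cb in zip(split_sizes, codebooks):
        indices.append(nearest_index(x[start:start + size], cb))
        start += size
    return indices


def split_vq_decode(indices: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Rebuild the full vector by concatenating the selected codevectors."""
    out: List[float] = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out


def bits_per_frame(codebooks: Sequence[Sequence[Vector]]) -> float:
    """Bits needed to transmit one index per sub-vector codebook."""
    return sum(math.log2(len(cb)) for cb in codebooks)


# Hypothetical toy setup: a 4-dimensional vector split into 2 + 2,
# with two codevectors (1 bit) per split.
SPLIT_SIZES = [2, 2]
CODEBOOKS = [
    [(0.1, 0.2), (0.3, 0.4)],
    [(0.5, 0.6), (0.7, 0.9)],
]
frame = [0.28, 0.41, 0.52, 0.63]

indices = split_vq_encode(frame, SPLIT_SIZES, CODEBOOKS)
reconstructed = split_vq_decode(indices, CODEBOOKS)
```

In a real coder the codebooks would be trained on speech data rather than written by hand, and the codebook sizes would be chosen so that the total matches the target bit rate, for example 24 bits/frame.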



