Various Speech Processing Techniques For Speech Compression And Recognition

Five decades of extensive research in speech processing for compression and recognition have produced intense competition among the methods and paradigms introduced. In this paper we survey the different representations of speech in the time-frequency and time-scale domains used for compression and recognition, and we examine how these representations are applied in a variety of related work. In particular, we emphasize methods based on Fourier analysis and methods based on wavelets, along with the advantages and disadvantages of both approaches.
