Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits

This paper evaluates sound parameterization methods for recognizing spoken Arabic words, specifically the digits zero to nine. Each isolated spoken word is represented by a single template built from a specific recognition feature, and recognition is based on the Euclidean distance to those templates. Recognition performance is analyzed for four parameterization features: Burg spectrum analysis, Walsh spectrum analysis, Thomson multitaper spectrum analysis, and Mel Frequency Cepstral Coefficients (MFCC). The main aim of this paper is to compare, analyze, and discuss the outcomes of spoken Arabic digit recognition systems based on the selected recognition features. The results confirm that MFCC features are a very promising choice for recognizing spoken Arabic digits.
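
The template-matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a template is formed, so averaging the training feature vectors is an assumption, and the function names `build_templates` and `recognize` and the toy 3-dimensional vectors are hypothetical stand-ins for real Burg, Walsh, multitaper, or MFCC features.

```python
import numpy as np

def build_templates(features_by_digit):
    """Form one template per digit (assumed here: the mean feature vector)."""
    return {
        digit: np.mean(np.asarray(vectors, dtype=float), axis=0)
        for digit, vectors in features_by_digit.items()
    }

def recognize(feature, templates):
    """Return the digit whose template is closest in Euclidean distance."""
    feature = np.asarray(feature, dtype=float)
    return min(templates, key=lambda d: np.linalg.norm(feature - templates[d]))

# Toy feature vectors standing in for real parameterization output.
training = {
    0: [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    1: [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
}
templates = build_templates(training)
predicted = recognize([0.9, 0.1, 0.0], templates)
```

Because the classifier is the same for every feature type, swapping the parameterization method changes only the vectors passed in, which is what makes a like-for-like comparison of the four features possible.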

