On Developing an Automatic Speech Recognition System for Standard Arabic Language

Automatic Speech Recognition (ASR) for the Arabic language is a challenging task. The difficulty stems mainly from the specificities of the language, which confront researchers with several obstacles, such as insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we develop an HMM-based ASR system for Standard Arabic (SA). Our main research goal is to select the most appropriate acoustic parameters for describing each audio frame, the acoustic models, and the speech recognition unit. To this end, we analyze the effect of varying the frame windowing (size and period), the number of acoustic parameters produced by the feature extraction methods traditionally used in ASR, the speech recognition unit, the number of Gaussians per HMM state, and the number of embedded re-estimations of the Baum-Welch algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus was collected, transcribed, and used throughout all experiments. A further evaluation was conducted on a speaker-independent continuous SA speech corpus. The phoneme recognition rate is 94.02%, which is relatively high compared with another ASR system evaluated on the same corpus.
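To make the frame windowing parameters (size and period) concrete, the following is a minimal sketch of how a speech signal might be cut into overlapping, Hamming-windowed frames before feature extraction. It is not the system described in this paper: the function name `frame_signal`, the 25 ms window size, the 10 ms frame period, and the 16 kHz sampling rate are illustrative assumptions, chosen only because such values are common in ASR front ends.

```python
import math


def frame_signal(samples, sample_rate, window_ms=25.0, period_ms=10.0):
    """Split a signal into overlapping Hamming-windowed frames.

    window_ms (frame size) and period_ms (frame shift) are hypothetical
    defaults, not values reported in the paper.
    """
    size = int(round(sample_rate * window_ms / 1000.0))   # samples per frame
    step = int(round(sample_rate * period_ms / 1000.0))   # samples between frame starts
    if size < 2 or step < 1:
        raise ValueError("window and period must span at least 2 and 1 samples")

    # Standard Hamming window coefficients for a frame of `size` samples.
    hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))
               for n in range(size)]

    frames = []
    # Only full frames are kept; trailing samples that do not fill a frame are dropped.
    for start in range(0, len(samples) - size + 1, step):
        frames.append([samples[start + n] * hamming[n] for n in range(size)])
    return frames


# One second of a constant signal at an assumed 16 kHz sampling rate.
signal = [1.0] * 16000
frames = frame_signal(signal, 16000)
print(len(frames), len(frames[0]))
```

A larger window size gives finer frequency resolution per frame, while a shorter period yields more frames per second; this trade-off is what varying the size and period explores.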



