On Developing an Automatic Speech Recognition System for Standard Arabic Language
Automatic Speech Recognition (ASR) for the Arabic
language is a challenging task. This is mainly due to the
specificities of the language, which confront researchers with
difficulties such as insufficient linguistic resources and the very
limited number of available transcribed Arabic speech corpora. In
this paper, we develop an HMM-based ASR system for the
Standard Arabic (SA) language. Our fundamental
research goal is to select the most appropriate acoustic parameters
describing each audio frame, acoustic models and speech recognition
unit. To this end, we analyze the effect of varying the frame
windowing (size and period), the number of acoustic parameters
produced by the feature extraction methods traditionally used in ASR,
the speech recognition unit, the number of Gaussians per HMM state and
the number of embedded re-estimations of the Baum-Welch algorithm. To
evaluate the proposed ASR system, a multi-speaker SA connected-digits
corpus is collected, transcribed and used throughout all experiments.
A further evaluation is conducted on a speaker-independent continuous
SA speech corpus. The phoneme recognition rate is 94.02%, which is
relatively high compared with another ASR system
evaluated on the same corpus.
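The frame windowing mentioned in the abstract (a window size and a frame period) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frame_signal`, the 16 kHz sample rate, the Hamming window and the 25 ms / 10 ms defaults are all assumptions chosen for illustration, and the size/period pairs in the loop are hypothetical values, not the ones tested in the paper.

```python
import math


def frame_signal(samples, sample_rate, window_ms=25.0, period_ms=10.0):
    """Split a signal into overlapping, Hamming-windowed frames.

    window_ms is the frame size and period_ms the shift between frame
    starts; both defaults are illustrative assumptions, not paper values.
    """
    size = int(round(sample_rate * window_ms / 1000.0))  # samples per frame
    step = int(round(sample_rate * period_ms / 1000.0))  # samples between frame starts
    if size <= 0 or step <= 0:
        raise ValueError("window size and period must be positive")

    # Hamming window coefficients (a single-sample window is left unscaled).
    if size == 1:
        window = [1.0]
    else:
        window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))
                  for n in range(size)]

    frames = []
    # Only complete frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - size + 1, step):
        frames.append([samples[start + n] * window[n] for n in range(size)])
    return frames


if __name__ == "__main__":
    rate = 16000  # assumed sample rate in Hz
    tone = [math.sin(2.0 * math.pi * 440.0 * t / rate) for t in range(rate)]
    # Hypothetical size/period pairs, to show how the frame count changes.
    for window_ms, period_ms in [(20.0, 10.0), (25.0, 10.0), (30.0, 15.0)]:
        frames = frame_signal(tone, rate, window_ms, period_ms)
        print(window_ms, period_ms, len(frames), len(frames[0]))
```

Each frame would then be passed to a feature extraction step; varying `window_ms` and `period_ms` changes both how many frames describe an utterance and how much signal each one covers.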
@article{"International Journal of Electrical, Electronic and Communication Sciences:55177",
  author = "R. Walha and F. Drira and H. El-Abed and A. M. Alimi",
  title = "On Developing an Automatic Speech Recognition System for Standard Arabic Language",
  abstract = "Automatic Speech Recognition (ASR) for the Arabic language is a challenging task. This is mainly due to the specificities of the language, which confront researchers with difficulties such as insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we develop an HMM-based ASR system for the Standard Arabic (SA) language. Our fundamental research goal is to select the most appropriate acoustic parameters describing each audio frame, acoustic models and speech recognition unit. To this end, we analyze the effect of varying the frame windowing (size and period), the number of acoustic parameters produced by the feature extraction methods traditionally used in ASR, the speech recognition unit, the number of Gaussians per HMM state and the number of embedded re-estimations of the Baum-Welch algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus is collected, transcribed and used throughout all experiments. A further evaluation is conducted on a speaker-independent continuous SA speech corpus. The phoneme recognition rate is 94.02%, which is relatively high compared with another ASR system evaluated on the same corpus.",
  keywords = "ASR, HMM, acoustical analysis, acoustic modeling, Standard Arabic language",
  volume = "6",
  number = "10",
  pages = "1154-6",
}