Efficient System for Speech Recognition using General Regression Neural Network
In this paper we present an efficient system for
speaker-independent speech recognition based on a neural network
approach. The proposed architecture comprises two phases: a
preprocessing phase, which consists of segmental normalization and
feature extraction, and a classification phase, which uses a neural
network based on nonparametric density estimation, namely the
general regression neural network (GRNN). The performance of the
proposed model is compared with that of similar recognition
systems based on the Multilayer Perceptron (MLP), the Recurrent
Neural Network (RNN), and the well-known discrete Hidden Markov
Model (HMM-VQ), which we also implemented.
Experimental results obtained with Arabic digits show that the
use of nonparametric density estimation with an appropriate
smoothing factor (spread) improves the generalization power of the
neural network. The word error rate (WER) is reduced significantly
relative to the baseline HMM method. The GRNN is thus a successful
alternative to the other neural networks and to the discrete HMM (DHMM).
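
To make the role of the smoothing factor concrete, the following is a minimal sketch of a GRNN used as a classifier. It assumes the standard GRNN formulation, in which the output is a Gaussian-kernel weighted average of the stored training targets, and it assumes one-hot targets so that the largest output selects the class. The function names `grnn_predict` and `grnn_classify`, the two-dimensional toy features, and the spread value are illustrative assumptions; the abstract does not specify the features, network size, or spread used in the actual system.

```python
import math
from typing import List, Sequence


def grnn_predict(x: Sequence[float],
                 train_x: Sequence[Sequence[float]],
                 train_y: Sequence[Sequence[float]],
                 spread: float) -> List[float]:
    """Return the kernel-weighted average of the training targets for input x."""
    if spread <= 0:
        raise ValueError("spread must be positive")
    # Squared Euclidean distance from x to every stored training pattern.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in train_x]
    # Shifting by the smallest distance avoids underflow; it cancels in the ratio.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * spread ** 2)) for d in d2]
    total = sum(weights)
    n_out = len(train_y[0])
    return [sum(w * y[k] for w, y in zip(weights, train_y)) / total
            for k in range(n_out)]


def grnn_classify(x: Sequence[float],
                  train_x: Sequence[Sequence[float]],
                  train_y: Sequence[Sequence[float]],
                  spread: float) -> int:
    """Pick the class whose output score is largest (assumes one-hot targets)."""
    scores = grnn_predict(x, train_x, train_y, spread)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data: two classes, two features per pattern.
TRAIN_X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
TRAIN_Y = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

if __name__ == "__main__":
    print(grnn_classify([0.1, 0.0], TRAIN_X, TRAIN_Y, spread=0.3))
    print(grnn_predict([0.5, 0.5], TRAIN_X, TRAIN_Y, spread=1000.0))
```

In this sketch a small spread makes the output follow the nearest stored patterns closely, while a very large spread drives every output toward the mean of the training targets, which is why the choice of spread governs generalization.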
@article{"International Journal of Information, Control and Computer Sciences:49190",
  author = "Abderrahmane Amrouche and Jean Michel Rouvaen",
  title = "Efficient System for Speech Recognition using General Regression Neural Network",
  keywords = "Speech Recognition, General Regression Neural Network, Hidden Markov Model, Recurrent Neural Network, Arabic Digits.",
  volume = "2",
  number = "4",
  pages = "992-7",
}