Neural Network Based Speech to Text in Malay Language

Malay speech-to-text is the conversion of spoken Malay into written text. Because recognition systems for the Malay language are still limited, this paper investigates recognition performance on ten Malay words taken from online Malay news. The methodology consists of three stages: preprocessing, feature extraction, and speech classification. In the preprocessing stage, the speech samples are filtered using pre-emphasis. Features are then extracted from the samples using Mel Frequency Cepstral Coefficients (MFCC). Finally, speech classification is performed using a Feedforward Neural Network (FFNN). The classification accuracy is further investigated as a function of the hidden layer size. In the experiments, the classifier with 40 hidden neurons achieves the highest classification rate, 94%.
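The first and last stages of this pipeline can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The pre-emphasis coefficient (0.97), the feature vector length (13), the tanh hidden activation, the softmax output, and all function names are assumptions made for illustration; only the use of pre-emphasis, MFCC features, an FFNN with 40 hidden neurons, and ten word classes comes from the text. The MFCC stage is not implemented here: a random vector stands in for the extracted features, and the network weights are untrained.

```python
import math
import random


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample passes through.

    alpha = 0.97 is a commonly used value and an assumption here.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Compute W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def ffnn_forward(features, hidden, output):
    """One hidden layer (tanh, assumed) followed by a softmax output layer."""
    h = [math.tanh(v) for v in dense(features, *hidden)]
    return softmax(dense(h, *output))


rng = random.Random(0)
N_FEATURES, N_HIDDEN, N_WORDS = 13, 40, 10  # 13 is assumed; 40 and 10 are from the text

hidden = init_layer(N_FEATURES, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_WORDS, rng)

# Stage 1: pre-emphasis on a toy signal.
emphasized = pre_emphasis([0.0, 1.0, 1.0, 1.0])

# Stage 2 (MFCC) is omitted; a random vector stands in for the features.
features = [rng.uniform(-1.0, 1.0) for _ in range(N_FEATURES)]

# Stage 3: class probabilities over the ten words (weights are untrained).
probs = ffnn_forward(features, hidden, output)
predicted_word_index = probs.index(max(probs))
```

In a study of hidden layer size such as the one described, `N_HIDDEN` would be the parameter varied between training runs, with classification accuracy measured on held-out samples for each setting.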




