Assamese Numeral Corpus for Speech Recognition using Cooperative ANN Architecture

A speech corpus is one of the major components of a speech processing system, where a primary requirement is to recognize an input sample. The quality and level of detail captured in the speech corpus directly affect the precision of recognition. The current work proposes a platform for speech corpus generation using an adaptive LMS filter and the LPC cepstrum, as part of an ANN-based speech recognition system designed exclusively to recognize isolated numerals of Assamese, a major language of the north-eastern part of India. The work focuses on designing an optimal feature extraction block and a few ANN-based cooperative architectures so that the performance of the speech recognition system can be improved.
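
To make the two signal-processing steps named above concrete, the following is a minimal sketch of a textbook LMS adaptive filter and of LPC cepstral coefficients obtained through the autocorrelation method and the Levinson-Durbin recursion. It is an illustration under stated assumptions, not the authors' implementation: the function names (`lms_filter`, `autocorr`, `levinson_durbin`, `lpc_cepstrum`), the tap count, the step size `mu`, and the test signal are all hypothetical choices made for this example.

```python
import random


def lms_filter(x, d, n_taps=4, mu=0.05):
    """Plain LMS: adapt FIR weights w so that the filtered x tracks d.

    n_taps and mu are illustrative values, not taken from the paper.
    Returns the filter output y, the error e = d - y, and the final weights.
    """
    w = [0.0] * n_taps
    y, e = [], []
    for n in range(len(x)):
        # Most recent n_taps input samples, newest first, zero-padded at the start.
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y_n = sum(wk * xk for wk, xk in zip(w, taps))
        e_n = d[n] - y_n
        # Stochastic-gradient weight update: w <- w + mu * e[n] * x[n].
        w = [wk + mu * e_n * xk for wk, xk in zip(w, taps)]
        y.append(y_n)
        e.append(e_n)
    return y, e, w


def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p with s[n] ~ sum_k a_k s[n-k].

    Returns (a, err), where a[0] holds a_1 and err is the prediction error.
    """
    a = []
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))) / err
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def lpc_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1 / (1 - sum_k a_k z^-k)."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if 1 <= n - k <= p:
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def demo():
    """Hypothetical usage: identify a 2-tap system, then extract LPC cepstra."""
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
    _, _, w = lms_filter(x, d, n_taps=2, mu=0.05)
    a, _ = levinson_durbin(autocorr(d[:240], 10), 10)
    return w, lpc_cepstrum(a, 12)
```

In this sketch the LMS weights adapt toward the coefficients of the synthetic system that produced `d`, and the cepstral recursion can be checked against a known case: for a single pole with a_1 = 0.5, the n-th cepstral coefficient is 0.5^n / n. A real front end would add framing, windowing, and pre-emphasis, which are omitted here.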



