Speaker Identification using Neural Networks

The speech signal conveys information about the identity of the speaker. Speaker identification is concerned with extracting the identity of the person speaking an utterance. As speech interaction with computers becomes more pervasive in activities such as telephone use, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker solely from vocal characteristics grows. This paper focuses on text-dependent speaker identification, which deals with detecting a particular speaker from a known population. The system prompts the user to provide a speech utterance. It then identifies the user by comparing the codebook of that utterance with those stored in the database, and lists the speakers most likely to have produced it. The speech signal is recorded for N speakers, and features are then extracted from it. Feature extraction uses linear predictive coding (LPC) coefficients, the average magnitude difference function (AMDF), and the discrete Fourier transform (DFT). The neural network is trained by applying these features as input parameters, and the features are stored as templates for later comparison. The features of the speaker to be identified are extracted and compared with the stored templates using the backpropagation algorithm. Here, the trained network corresponds to the output; the input is the extracted features of the speaker to be identified. The network adjusts its weights, and the best match identifies the speaker. Network performance is judged by the number of epochs required to reach the target.
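
As an illustration of one of the features named above, the following is a minimal sketch of the AMDF computed on a single frame, with the pitch period taken as the lag of the deepest AMDF valley. It is not the paper's implementation. The function names, the 8 kHz sampling rate, the 400-sample frame, the lag search range and the synthetic sine-wave input are all assumptions chosen for illustration.

```python
import math

def amdf(frame, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    n = len(frame)
    values = []
    for lag in range(max_lag + 1):
        # Mean absolute difference between the frame and its delayed copy.
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag))
        values.append(total / (n - lag))
    return values

def estimate_pitch_period(frame, min_lag, max_lag):
    """Lag (in samples) of the deepest AMDF valley within [min_lag, max_lag]."""
    d = amdf(frame, max_lag)
    return min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])

# Assumed test signal: a 100 Hz sine sampled at 8 kHz, so one period is 80 samples.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE) for i in range(400)]

# The search range is limited to less than two periods so that only one valley is found.
period = estimate_pitch_period(frame, min_lag=20, max_lag=120)
pitch_hz = SAMPLE_RATE / period
```

On real speech the frame would come from a recorded utterance rather than a sine wave, and the lag range would be set from the expected pitch range of the speakers.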

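The training and matching steps described in the abstract can be sketched with a small one-hidden-layer network trained by backpropagation, counting the epochs needed to reach an error target. This is a sketch under stated assumptions, not the paper's network. The two-element feature vectors, the two-speaker population, the layer sizes, the learning rate, the error target and all function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Return (hidden, output) activations; the last weight in each row is the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output

def train(samples, n_hidden, n_out, rate=0.5, target_error=0.01,
          max_epochs=20000, seed=0):
    """Online backpropagation; returns the weights and the per-epoch mean squared error."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    errors = []
    for _ in range(max_epochs):
        total = 0.0
        for x, target in samples:
            hidden, output = forward(x, w_hidden, w_out)
            # Output deltas: error times the sigmoid derivative.
            d_out = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
            # Hidden deltas: output deltas propagated back through w_out.
            d_hid = [h * (1 - h) * sum(d_out[k] * w_out[k][j] for k in range(n_out))
                     for j, h in enumerate(hidden)]
            # Weight adjustment for both layers.
            for k in range(n_out):
                for j, v in enumerate(hidden + [1.0]):
                    w_out[k][j] += rate * d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] += rate * d_hid[j] * v
            total += sum((t - o) ** 2 for t, o in zip(target, output))
        errors.append(total / (len(samples) * n_out))
        if errors[-1] <= target_error:
            break  # error target reached
    return w_hidden, w_out, errors

def identify(x, w_hidden, w_out):
    """Index of the speaker whose output unit responds most strongly."""
    _, output = forward(x, w_hidden, w_out)
    return max(range(len(output)), key=lambda k: output[k])

# Hypothetical feature vectors for two enrolled speakers, with one-hot targets.
samples = [
    ([0.1, 0.2], [1.0, 0.0]),
    ([0.2, 0.1], [1.0, 0.0]),
    ([0.8, 0.9], [0.0, 1.0]),
    ([0.9, 0.8], [0.0, 1.0]),
]
w_hidden, w_out, errors = train(samples, n_hidden=3, n_out=2)
epochs_used = len(errors)
speaker = identify([0.85, 0.85], w_hidden, w_out)
```

Here `epochs_used` plays the role of the epoch count that the abstract uses as its performance measure, and `identify` returns the index of the best-matching enrolled speaker for an unseen feature vector.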


