An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Volume: 8, Issue: 10, 2014, Page No: 1949-1958

International Journal of Information, Control and Computer Sciences

ISSN: 2517-9942

DOI: 10.5281/zenodo.1108442

Speaker Identification (SI) is the task of establishing the identity of an individual from his or her voice characteristics. SI is typically performed in two stages: training and testing. The training stage computes speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing stage, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved with recent advances in speech processing techniques, there is still room for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of Vector Quantization (VQ) and Gaussian Mixture Models (GMM), together with the Expectation Maximization (EM) algorithm, for speaker modeling. Using a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure when the GMM and VQ classifiers give contradictory identifications. Simulation results obtained on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
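
As a rough illustration of the VQ side of the modeling described in the abstract, the following Python sketch trains one LBG codebook per speaker and assigns a test utterance to the speaker whose codebook gives the lowest average distortion. It is not the authors' MATLAB implementation. The 2-D synthetic vectors stand in for MFCC frames, and the function names (`lbg`, `identify`, `avg_distortion`), the additive split step `eps`, and the fixed iteration count are assumptions made for this example. The GMM/EM stage, the VAD, and the relative-index confidence measure are not shown.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting (size assumed to be a power of two)."""
    codebook = [mean_vector(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then re-estimate centroids.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [mean_vector(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    return sum(sq_dist(v, codebook[nearest(v, codebook)]) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Return the speaker name whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(vectors, codebooks[name]))


def synthetic_features(center, n, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian points around a center."""
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]


rng = random.Random(0)
train = {
    "speaker_a": synthetic_features([0.0, 0.0], 200, rng),
    "speaker_b": synthetic_features([5.0, 5.0], 200, rng),
}
codebooks = {name: lbg(feats, size=4) for name, feats in train.items()}

test_utterance = synthetic_features([5.0, 5.0], 50, rng)
predicted = identify(test_utterance, codebooks)
print(predicted)
```

In a system like the one the abstract describes, the centroids returned by `lbg` could also serve as initial component means for the EM step of GMM training, which is the role the abstract assigns to LBG initialization.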
