Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Heart sound is an acoustic signal, and many of the techniques now used for human recognition from heart sounds are borrowed from speech recognition. One popular choice for feature extraction from acoustic signals is the Mel Frequency Cepstral Coefficients (MFCC), which map the signal onto a non-linear Mel scale that mimics human hearing. However, the Mel scale is almost linear in the frequency region of heart sounds, so MFCC should be expected to produce results similar to those of the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see whether it produces superior results to CC for a phonocardiogram (PCG) based human identification system. Results show that the MFCC system is still superior to CC despite its near-linear filter banks in the lower frequency range, giving a correct recognition rate of up to 95% for MFCC and 90% for CC. Further experiments show that the higher recognition rate is due to the use of filter banks and not to Mel scaling.
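The near-linearity of the Mel scale at low frequencies can be checked numerically. The sketch below is illustrative only: it assumes the commonly used mapping mel(f) = 2595 log10(1 + f/700), which the abstract does not specify, and it takes 20-500 Hz as a stand-in for the heart sound band and 20-8000 Hz as a stand-in for a speech band. Both bands and the helper names `hz_to_mel` and `max_linear_deviation` are assumptions made for this example, not values or code from the paper.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mels (assumed formula, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def max_linear_deviation(f_lo: float, f_hi: float, steps: int = 200) -> float:
    """Largest gap between the mel curve and the straight line joining its
    endpoints on [f_lo, f_hi], expressed as a fraction of the mel span."""
    m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    span = m_hi - m_lo
    worst = 0.0
    for i in range(steps + 1):
        f = f_lo + (f_hi - f_lo) * i / steps
        # Value a perfectly linear scale would give at this frequency.
        chord = m_lo + span * (f - f_lo) / (f_hi - f_lo)
        worst = max(worst, abs(hz_to_mel(f) - chord))
    return worst / span


if __name__ == "__main__":
    # Hypothetical bands chosen for illustration only.
    narrow = max_linear_deviation(20.0, 500.0)    # assumed heart sound band
    wide = max_linear_deviation(20.0, 8000.0)     # assumed speech band
    print(f"relative deviation from linear, 20-500 Hz:  {narrow:.3f}")
    print(f"relative deviation from linear, 20-8000 Hz: {wide:.3f}")
```

Under these assumptions, the deviation over the narrow low-frequency band is several times smaller than over the wide band, which is consistent with the argument that Mel warping contributes little in the heart sound region.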



