Puff Noise Detection and Cancellation for Robust Speech Recognition

This paper proposes an algorithm for detecting and attenuating puff noise, which occurs frequently in mobile environments. The baseline puff detection system is built on a Gaussian mixture model (GMM) and uses 39-dimensional Mel-frequency cepstral coefficient (MFCC) vectors as feature parameters. To improve detection performance, additional acoustic features that are effective for puff detection are proposed. Detected puff intervals are then attenuated by high-pass filtering. The system is evaluated by speech recognition rate, and a confusion matrix and an ROC curve are used to confirm its validity.
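
As an illustration of the attenuation step only, the following is a minimal sketch of high-pass filtering applied to detected puff intervals while the rest of the signal is left untouched. The filter type (a first-order RC high-pass), the 200 Hz cutoff, the 16 kHz sample rate, and the function names high_pass and attenuate_puffs are all assumptions made for this example; they are not taken from the proposed system. The intervals are assumed to come from a separate detector as (start, end) sample indices.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """First-order RC high-pass filter (an assumed filter design)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def attenuate_puffs(signal, puff_intervals, cutoff_hz=200.0, sample_rate=16000):
    """Return a copy of `signal` with each detected interval high-pass filtered.

    `puff_intervals` is a list of (start, end) sample indices, end exclusive.
    The cutoff and sample rate defaults are illustrative assumptions.
    """
    result = list(signal)
    for start, end in puff_intervals:
        result[start:end] = high_pass(signal[start:end], cutoff_hz, sample_rate)
    return result


# Usage: a constant (very low frequency) offset stands in for puff energy.
signal = [1.0] * 16000
cleaned = attenuate_puffs(signal, [(4000, 12000)])
```

Filtering only the flagged intervals keeps speech outside those intervals unmodified, at the cost of possible discontinuities at interval boundaries; a practical system would likely smooth the transitions.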



