Environmental Interference Cancellation of Speech with the Radial Basis Function Networks: An Experimental Comparison

In this paper, we use Radial Basis Function Networks (RBFN) to cancel environmental interference in speech signals. We show that the Second Order Thin-Plate Spline (SOTPS) kernel cancels the interference effectively. For comparison, we run the same experiments with the two most commonly used conventional RBFN kernels: the Gaussian and First Order TPS (FOTPS) basis functions. The speech signals were taken from the OGI Multi-Language Telephone Speech Corpus and were corrupted with six types of environmental noise from the NOISEX-92 database. Experimental results show that the SOTPS kernel can considerably outperform the Gaussian and FOTPS functions on the speech interference cancellation problem.
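
To make the three kernels being compared concrete, here is a minimal Python sketch. The abstract does not give the kernel formulas, so the forms below are assumptions: the Gaussian is written as exp(-r^2 / (2 sigma^2)), and the thin-plate splines use the common polyharmonic form r^(2k) ln(r), with k = 1 standing in for FOTPS and k = 2 for SOTPS. The function names, the `sigma` parameter, and the `rbfn_output` helper are hypothetical and are not taken from the paper.

```python
import math

def gaussian(r, sigma=1.0):
    """Gaussian basis (assumed form): exp(-r^2 / (2 * sigma^2))."""
    return math.exp(-(r * r) / (2.0 * sigma * sigma))

def tps(r, order=1):
    """Thin-plate spline basis (assumed form): r^(2 * order) * ln(r).

    order=1 stands in for FOTPS and order=2 for SOTPS.
    The value at r = 0 is defined as 0, which is the limit as r -> 0.
    """
    if r == 0.0:
        return 0.0
    return r ** (2 * order) * math.log(r)

def rbfn_output(x, centers, weights, kernel):
    """RBFN output: weighted sum of kernel responses to the
    Euclidean distances between input x and each center."""
    return sum(w * kernel(math.dist(x, c)) for w, c in zip(weights, centers))

# Hypothetical two-center network evaluated with each kernel.
centers = [[0.0, 0.0], [1.0, 0.0]]
weights = [2.0, 3.0]
x = [0.0, 0.0]
y_gauss = rbfn_output(x, centers, weights, gaussian)
y_fotps = rbfn_output(x, centers, weights, lambda r: tps(r, order=1))
y_sotps = rbfn_output(x, centers, weights, lambda r: tps(r, order=2))
```

In this sketch only the `kernel` argument changes between the three networks, which mirrors the kind of like-for-like comparison the abstract describes. How the centers and weights are trained is not stated here and is left out.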
