Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. However, the effects of echo path changes should be considered for eliminating the undesired echoes. We describe a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.

Comparison of Frequency Estimation Methods for Reflected Signals in Mobile Platforms

Precise frequency estimation methods for pulseshaped echoes are a prerequisite to determine the relative velocity between sensor and reflector. Signal frequencies are analysed using three different methods: Fourier Transform, Chirp ZTransform and the MUSIC algorithm. Simulations of echoes are performed varying both the noise level and the number of reflecting points. The superposition of echoes with a random initial phase is found to influence the precision of frequency estimation severely for FFT and MUSIC. The standard deviation of the frequency using FFT is larger than for MUSIC. However, MUSIC is more noise-sensitive. The distorting effect of superpositions is less pronounced in experimental data.