Abstract: In this study, we propose a novel technique for acoustic
echo suppression (AES) during speech recognition under barge-in
conditions. Conventional AES methods based on spectral subtraction
apply fixed weights to the estimated echo path transfer function
(EPTF) at the current signal segment and to the EPTF estimated until
the previous time interval. However, the effects of echo path changes
should be considered for eliminating the undesired echoes. We
describe a new approach that adaptively updates weight parameters in
response to abrupt changes in the acoustic environment due to
background noises or double-talk. Furthermore, we devised a voice
activity detector and an initial time-delay estimator for barge-in speech
recognition in communication networks. The initial time delay is
estimated using a log-spectral distance measure, as well as
cross-correlation coefficients. The experimental results show that the
developed techniques can be successfully applied in barge-in speech
recognition systems.
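
The cross-correlation part of the initial time-delay estimation can be illustrated with a minimal sketch. It assumes a plain normalized cross-correlation search over candidate lags; the function name `estimate_delay` and the parameter `max_lag` are hypothetical, and the log-spectral distance measure mentioned in the abstract is not modelled here.

```python
import math


def estimate_delay(reference, microphone, max_lag):
    """Return (lag, score): the lag in samples that maximizes the normalized
    cross-correlation between the reference (far-end) signal and the
    microphone signal, together with that correlation value.

    Assumption: the echo is a delayed, scaled copy of the reference.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        # Compare reference[n] with microphone[n + lag] over their overlap.
        n = min(len(reference), len(microphone) - lag)
        if n <= 0:
            break
        ref = reference[:n]
        mic = microphone[lag:lag + n]
        num = sum(r * m for r, m in zip(ref, mic))
        den = math.sqrt(sum(r * r for r in ref) * sum(m * m for m in mic))
        score = num / den if den > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

In a barge-in setting, the returned lag would be used to align the reference prompt with the microphone signal before echo suppression; a real system would likely work frame by frame rather than on whole signals.
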
Abstract: Speech enhancement is the process of removing
noise from a speech signal and improving its quality when the signal is
also contaminated with other kinds of distortion. This paper develops
an optimum cascaded system for speech enhancement.
This aim is attained without losing any relevant speech
information and without much computational or time complexity.
The LMS algorithm, spectral subtraction, and the Kalman filter are
deployed as the main denoising algorithms in this work. Since each of these
algorithms has its own shortcomings, this work
designs cascaded systems in different combinations and
evaluates these cascades with qualitative (listening) and
quantitative (SNR) tests.
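
As a rough illustration of cascading and of the quantitative (SNR) test, the sketch below chains denoising stages and scores the result. It is not the paper's system: only an LMS noise canceller is shown as a stage, it assumes a separate noise reference is available, and the names `cascade`, `lms_cancel`, and `snr_db` as well as the tap count and step size are hypothetical choices.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of an estimate measured against the clean signal."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


def lms_cancel(primary, noise_ref, taps=4, mu=0.01):
    """LMS adaptive noise canceller.

    The filter predicts the noise in `primary` from `noise_ref`;
    the prediction error is the enhanced output.
    """
    weights = [0.0] * taps
    output = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [noise_ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = primary[n] - noise_estimate
        weights = [w + 2.0 * mu * error * xi for w, xi in zip(weights, x)]
        output.append(error)
    return output


def cascade(signal, stages):
    """Apply each stage (a function from signal to signal) in order."""
    for stage in stages:
        signal = stage(signal)
    return signal
```

A spectral subtraction or Kalman filter stage could be added to the `stages` list in any order, and `snr_db` could then be used to compare the different combinations.
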
Abstract: In this paper we present an enhanced noise reduction method for robust speech recognition using an Adaptive Gain Equalizer with Nonlinear Spectral Subtraction. In the Adaptive Gain Equalizer (AGE) method, the input signal is divided into a number of subbands that are individually weighted in the time domain according to the short-time Signal-to-Noise Ratio (SNR) estimated in each subband at every time instant. The method focuses on enhancing the speech rather than on suppressing the noise. When AGE was analyzed under various noise conditions for speech recognition, it showed a clear failure point at an SNR of -5 dB, with inadequate noise suppression for SNRs below this point. This work proposes coupling AGE with Nonlinear Spectral Subtraction (AGE-NSS) for robust speech recognition. The experimental results show that our AGE-NSS outperforms AGE when the SNR drops below -5 dB.
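
The per-subband weighting can be sketched as follows. This is one common way to form an AGE-style gain, assumed here for illustration and not taken from the paper: a fast short-term average of the subband magnitude is divided by a slowly rising noise-floor estimate, and the ratio (a short-time SNR proxy) is used as a bounded gain. The function name `age_gains` and the parameters `alpha`, `beta`, `p`, and `max_gain` are hypothetical, and the subband filter bank is omitted.

```python
def age_gains(subband, alpha=0.05, beta=1e-4, p=1.0, max_gain=4.0):
    """Per-sample gains for one subband signal (AGE-style sketch).

    alpha:    smoothing factor of the short-term magnitude average
    beta:     slow upward drift of the noise-floor estimate
    p:        exponent applied to the short-time SNR proxy
    max_gain: upper bound on the gain
    """
    eps = 1e-12  # avoids division by zero on digital silence
    gains = []
    short_term = None
    noise_floor = None
    for x in subband:
        level = abs(x)
        if short_term is None:
            # Initialize both trackers from the first sample.
            short_term = level
            noise_floor = max(level, eps)
        else:
            short_term = (1.0 - alpha) * short_term + alpha * level
            if short_term <= noise_floor:
                # The floor follows the signal downwards immediately.
                noise_floor = max(short_term, eps)
            else:
                # Otherwise the floor creeps upwards slowly.
                noise_floor *= 1.0 + beta
        ratio = max(short_term / noise_floor, 1.0)
        gains.append(min(ratio ** p, max_gain))
    return gains
```

Multiplying each subband by its gains and summing the subbands would boost speech-dominated bands while leaving noise-only bands at unit gain, which matches the emphasis on enhancing speech rather than suppressing noise.
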
Abstract: This paper presents a formant-tracking linear prediction
(FTLP) model for speech processing in noise. The main focus of this
work is the detection of formant trajectory based on Hidden Markov
Models (HMM), for improved formant estimation in noise. The
approach proposed in this paper provides a systematic framework for
modelling and utilizing a time sequence of peaks that satisfies
continuity constraints on the parameters; the peaks are modelled
by the LP parameters. The formant-tracking LP model estimation
is composed of three stages: (1) a pre-cleaning multi-band spectral
subtraction stage to reduce the effect of residual noise on formants;
(2) an estimation stage where an initial estimate of the LP model of
speech for each frame is obtained; and (3) a formant classification stage using
probability models of formants and Viterbi decoders. The evaluation
results for the formant-tracking LP model, tested
against a Gaussian white noise background, demonstrate that the proposed
combination of the initial noise reduction stage with formant tracking
and variable-order LPC analysis yields a significant reduction in
errors and distortions. The performance was evaluated with noisy
natural vowels extracted from international French and English vocabulary
speech signals at an SNR of 10 dB. In each case, the
estimated formants are compared to reference formants.
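
The second stage, an initial per-frame LP estimate, can be sketched with the standard autocorrelation method and the Levinson-Durbin recursion. This is a generic textbook formulation assumed for illustration, not the paper's exact estimator; the function names are hypothetical, and windowing, the pre-cleaning stage, and the HMM/Viterbi formant classification are not shown.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no normalization)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, err), where a = [1, a1, ..., a_order] are the coefficients
    of A(z) = 1 + a1 z^-1 + ... and err is the final prediction error power.
    The predictor is x[n] ~ -(a1 x[n-1] + ... + a_order x[n-order]).
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

Formant candidates would then be read from the angles of the complex roots of A(z); that root-finding step is left out to keep the sketch dependency-free.
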
Abstract: In this work, we develop a speech denoising tool based on the discrete wavelet packet transform (DWPT). The tool is intended for recognition, coding, and synthesis applications. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the nonstationary noise level, we employ the spectral entropy. We evaluate our approach by comparing it with classical denoising methods based on thresholding and spectral subtraction. The experiments use speech signals corrupted by two kinds of noise: white noise and Volvo noise. The listening tests show that the proposed technique performs better than spectral subtraction. The SNR results show that it also outperforms the classical thresholding method that uses the modified hard thresholding function based on the μ-law algorithm.
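
The spectral entropy used for noise-level estimation can be sketched as below. This is a generic definition assumed for illustration: the power spectrum of a frame is normalized to a probability distribution and its Shannon entropy is scaled to [0, 1]. The function name `spectral_entropy` is hypothetical, the naive DFT is used only to keep the example free of dependencies, and how the paper maps entropy to thresholds or to the choice of zeroed nodes is not shown.

```python
import cmath
import math


def spectral_entropy(frame):
    """Normalized spectral entropy of one frame, in the range [0, 1].

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate energy concentrated in a few frequency bins.
    """
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")
    half = n // 2 + 1  # non-negative frequency bins of a real signal
    power = []
    for k in range(half):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0
    entropy = -sum((pw / total) * math.log(pw / total)
                   for pw in power if pw > 0.0)
    return entropy / math.log(half)
```

A frame-by-frame entropy track of this kind could flag noise-dominated frames, from which a time-varying noise level might be estimated.
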