Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain
This work presents a fusion of the Log Gabor Wavelet
(LGW) and a Maximum a Posteriori (MAP) estimator as a speech
enhancement tool for acoustical background noise reduction. The
probability density function (pdf) of the speech spectral amplitude is
approximated by a Generalized Laplacian Distribution (GLD).
Compared to earlier estimators, the proposed method models the
underlying statistics more accurately by appropriately
choosing the GLD model parameters. Experimental results show
that the proposed estimator yields a higher improvement in
Segmental Signal-to-Noise Ratio (S-SNR) and a lower Log-Spectral
Distortion (LSD) than other estimators in two different noisy
environments.
@article{Senapati49972,
  author = "Suman Senapati and Goutam Saha",
  title = "Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain",
  journal = "International Journal of Electrical, Electronic and Communication Sciences",
  abstract = "This work presents a fusion of the Log Gabor Wavelet (LGW) and a Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators, the proposed method models the underlying statistics more accurately by appropriately choosing the GLD model parameters. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and a lower Log-Spectral Distortion (LSD) than other estimators in two different noisy environments.",
  keywords = "Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator",
  volume = "2",
  number = "11",
  pages = "2456-7",
}