Time Delay Estimation Using Signal Envelopes for Synchronisation of Recordings

This work investigates a method of time delay estimation for
dual-channel acoustic signals (speech, music, etc.) recorded under
reverberant conditions. Standard methods based on cross-correlation
of the signals perform poorly under strong reverberation, large
distances between microphones, and asynchronous recording. Under
similar conditions, a method based on cross-correlation of the
temporal envelopes of the signals yields delay estimates of
acceptable quality. The method and its properties, including its
limits of applicability, are described and investigated in detail.
Estimation of the method's optimal parameters and a comparison with
other known methods of time delay estimation are also provided.
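
As a rough illustration of the idea of cross-correlating temporal envelopes rather than the raw waveforms, a minimal Python sketch is given here. It is not the implementation studied in this work: the envelope extractor (full-wave rectification followed by a moving average), the 20 ms smoothing window, and the function names `envelope` and `estimate_delay` are all assumptions made for the example.

```python
import numpy as np


def envelope(x, fs, win_ms=20.0):
    """Temporal envelope: rectify, then smooth with a moving average.

    This extractor and the default window length are illustrative
    assumptions, not parameters taken from the paper.
    """
    n = max(1, int(fs * win_ms / 1000.0))
    kernel = np.ones(n) / n
    return np.convolve(np.abs(x), kernel, mode="same")


def estimate_delay(x, y, fs, win_ms=20.0):
    """Estimate the delay of y relative to x, in seconds.

    A positive result means that y lags behind x.
    """
    ex = envelope(np.asarray(x, dtype=float), fs, win_ms)
    ey = envelope(np.asarray(y, dtype=float), fs, win_ms)

    # Remove the mean so the constant offset of the envelopes
    # does not dominate the correlation.
    ex = ex - ex.mean()
    ey = ey - ey.mean()

    # Linear cross-correlation via zero-padded FFT:
    # r[k] = sum_n ey[n + k] * ex[n]
    n_full = len(ex) + len(ey) - 1
    nfft = 1 << (n_full - 1).bit_length()
    spec = np.fft.rfft(ey, nfft) * np.conj(np.fft.rfft(ex, nfft))
    r = np.fft.irfft(spec, nfft)

    # Reorder so that lags run from -(len(ex) - 1) to len(ey) - 1.
    r = np.concatenate((r[-(len(ex) - 1):], r[:len(ey)]))
    lag = int(np.argmax(r)) - (len(ex) - 1)
    return lag / fs
```

Calling `estimate_delay(x, y, fs)` on two recordings sampled at the same rate `fs` returns the lag of the envelope cross-correlation peak. Because the envelopes vary slowly, the peak is broader than that of a waveform cross-correlation, so the smoothing window trades robustness against time resolution.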

 




