A Robust Method for Hand Tracking Using Mean-shift Algorithm and Kalman Filter in Stereo Color Image Sequences

Real-time hand tracking is a challenging task in many computer vision applications such as gesture recognition. This paper proposes a robust method for hand tracking in complex environments that combines Mean-shift analysis and a Kalman filter with a 3D depth map. The depth information, obtained by passive stereo measurement based on cross-correlation and the known calibration data of the cameras, resolves the overlap between the hands and the face. Mean-shift analysis uses the gradient of the Bhattacharyya coefficient as a similarity function to find the hand candidate that is most similar to a given hand target model. A Kalman filter is then used to estimate the position of the hand target. Tested on various video sequences, the tracker is robust to changes in hand shape as well as to partial occlusion.
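The two tracking components named above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. It assumes normalised color histograms, an Epanechnikov kernel (so one mean-shift update reduces to a weighted centroid), and a constant-velocity motion model applied to each image axis separately. All function names, class names, and noise parameters are hypothetical.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalised histograms (1.0 = identical)."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


def mean_shift_step(pixels, target_hist, cand_hist):
    """One mean-shift update of the window centre.

    pixels: iterable of (x, y, bin_index) inside the current window.
    Each pixel is weighted by sqrt(q_u / p_u), where q is the target model
    and p the current candidate histogram. Assumes an Epanechnikov kernel,
    for which the update is the weighted centroid of the window.
    """
    wx = wy = wsum = 0.0
    for x, y, b in pixels:
        if cand_hist[b] > 0.0:
            w = math.sqrt(target_hist[b] / cand_hist[b])
            wx += w * x
            wy += w * y
            wsum += w
    if wsum == 0.0:
        return None  # no usable pixels in the window
    return wx / wsum, wy / wsum


class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state [position, velocity].

    q (process noise) and r (measurement noise) are illustrative values,
    not parameters taken from the paper.
    """

    def __init__(self, q=1e-3, r=1.0, p0=1000.0):
        self.pos, self.vel = 0.0, 0.0
        self.P = [[p0, 0.0], [0.0, p0]]  # state covariance
        self.q, self.r = q, r

    def predict(self, dt=1.0):
        (p00, p01), (p10, p11) = self.P
        self.pos += self.vel * dt
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.pos

    def update(self, z):
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r              # innovation covariance, H = [1, 0]
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        y = z - self.pos              # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # P = (I - K H) P
        self.P = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.pos


def track_axis(measurements):
    """Filter a sequence of per-frame positions along one axis."""
    kf = ConstantVelocityKalman1D()
    estimates = []
    for z in measurements:
        kf.predict()
        estimates.append(kf.update(z))
    return kf, estimates
```

In a tracker of this kind, the mean-shift step would typically be repeated per frame until the window centre stops moving, and the converged centre would be passed as the measurement to one such filter per image axis.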



