An Adaptive Hand-Talking System for the Hearing Impaired

This paper presents an adaptive Chinese hand-talking system. Based on an analysis of the three data-collection strategies for new users, an adaptation framework that includes both supervised and unsupervised adaptation methods is proposed. For supervised adaptation, affinity propagation (AP) is used to extract exemplar subsets, and enhanced maximum a posteriori / vector field smoothing (eMAP/VFS) is proposed to pool the adaptation data among different models. For unsupervised adaptation, polynomial segment models (PSMs) are used to help hidden Markov models (HMMs) label the unlabeled data accurately; the resulting "labeled" data, together with the signer-independent models, are then input to the MAP algorithm to generate signer-adapted models. Experimental results show that the proposed framework can perform both supervised adaptation with a small amount of labeled data and unsupervised adaptation with a large amount of unlabeled data to tailor the original models, and that both improve the recognition rate.
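
As a rough illustration of the MAP step mentioned above, the following sketch applies the standard MAP update to a single Gaussian mean: the signer-independent mean acts as the prior and is pulled toward the new signer's data in proportion to how much data is available. This is not the paper's eMAP/VFS method, whose details are not given here. The function name map_adapt_mean, the prior weight tau and its default value, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    weights: Sequence[float],
    tau: float = 10.0,
) -> List[float]:
    """Standard MAP re-estimate of one Gaussian mean (illustrative only).

    prior_mean: signer-independent mean, used as the prior.
    frames:     adaptation feature vectors from the new signer.
    weights:    occupation probability of this Gaussian for each frame.
    tau:        prior weight (hypothetical default); larger values
                keep the result closer to the prior mean.
    """
    if len(frames) != len(weights):
        raise ValueError("frames and weights must have the same length")
    if tau <= 0:
        raise ValueError("tau must be positive")

    dim = len(prior_mean)
    occupancy = sum(weights)

    # Occupancy-weighted sum of the adaptation frames.
    weighted_sum = [0.0] * dim
    for frame, w in zip(frames, weights):
        for d in range(dim):
            weighted_sum[d] += w * frame[d]

    # Interpolate between the prior mean and the data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy example: two frames from a new signer, both fully assigned to this Gaussian.
si_mean = [0.0, 0.0]
signer_frames = [[2.0, 4.0], [2.0, 4.0]]
adapted = map_adapt_mean(si_mean, signer_frames, weights=[1.0, 1.0], tau=2.0)
print(adapted)  # [1.0, 2.0]
```

With no adaptation frames the function returns the prior mean unchanged, and as more data arrive the estimate moves toward the data mean. This behaviour is why MAP-style adaptation suits the small-data supervised case described in the abstract.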
