@article{"International Journal of Electrical, Electronic and Communication Sciences:53520",
  author   = "Sang-Wan Kim and Hoon Lee and Kyung-Ho Choi and Soon-Young Park",
  title    = "A Talking Head System for Korean Text",
  abstract = "A talking head system (THS) is presented to animate the face of a
              speaking 3D avatar so that it realistically pronounces given Korean
              text. The proposed system consists of a SAPI-compliant text-to-speech
              (TTS) engine and an MPEG-4-compliant face animation generator. The
              input to the THS is Unicode text to be spoken with synchronized lip
              shapes. The TTS engine generates a phoneme sequence with durations
              and audio data. The THS applies coarticulation rules to the phoneme
              sequence and sends a mouth animation sequence to the face modeler.
              By using the face animation generator, the proposed THS produces more
              natural lip sync and facial expressions than systems that use
              conventional visemes only. The experimental results show that our
              system has great potential for the implementation of a talking head
              for Korean text.",
  keywords = "Talking head, Lip sync, TTS, MPEG-4",
  volume   = "3",
  number   = "2",
  pages    = "224-4",
}