On a Pitch Duration Technique for Prosody Control

In this paper, we propose a method for altering duration in the frequency domain so that prosody can be controlled in real time after pitch alteration. A method that can freely alter duration, one component of prosodic information, could be used in several fields, such as pronunciation correction for people with speech impediments or language learning. The pitch is altered with the PSOLA synthesis method, which operates in the time domain. The duration of the pitch-altered speech, however, is changed in the frequency domain: we alter the duration using the Fast Fourier Transform (FFT). As a result, intelligibility when both pitch and duration are controlled is slightly lower than when only pitch is changed, but the proposed algorithm obtains a higher MOS score for naturalness.
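The abstract does not spell out the frequency-domain procedure. As an illustration only, and not the authors' algorithm, the sketch below shows one generic way to change the length of a speech segment through the Fourier transform: the spectrum is zero-padded (to lengthen) or truncated (to shorten) before the inverse transform. The names `dft` and `fourier_stretch` are hypothetical, and a naive DFT stands in for an FFT to keep the example dependency-free. Note that this simple form rescales frequency content along with length when played at the same sampling rate; how the paper combines the duration step with PSOLA pitch alteration is not stated here.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fourier_stretch(frame, new_len):
    """Resample `frame` to `new_len` samples by zero-padding or truncating its spectrum."""
    n = len(frame)
    spectrum = dft(frame)
    # Number of positive-frequency bins that fit in both lengths (Nyquist bin dropped).
    keep = (min(n, new_len) - 1) // 2
    out = []
    for m in range(new_len):
        acc = spectrum[0]  # DC term
        for k in range(1, keep + 1):
            # Positive bin k and its negative-frequency partner n - k,
            # evaluated on the new time grid of new_len samples.
            acc += spectrum[k] * cmath.exp(2j * math.pi * k * m / new_len)
            acc += spectrum[n - k] * cmath.exp(-2j * math.pi * k * m / new_len)
        # Dividing by the original length n preserves the signal amplitude.
        out.append((acc / n).real)
    return out


# Usage: a 32-sample frame holding three cycles of a cosine, stretched to 48 samples.
frame = [math.cos(2 * math.pi * 3 * t / 32) for t in range(32)]
longer = fourier_stretch(frame, 48)
```

In this usage example the output still holds three cycles, now spread over 48 samples instead of 32, which is the sense in which the segment's duration has been altered.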



