Continuous Text Translation Using Text Modeling in the Thetos System

This paper discusses a method of text modeling for Polish. The method transforms continuous input text into a text consisting of sentences in so-called canonical form, which is characterized by, among other things, a complete structure and the absence of anaphora and ellipsis. The transformation is lossless with respect to the content of the text being transformed. The method was developed for the Thetos system, which translates written Polish texts into Polish Sign Language. We believe the method can also be used in other applications that deal with natural language, e.g. in a text summary generator for Polish.



