Extracting Multiword Expressions in Machine Translation from English to Urdu using Relational Data Approach

Machine Translation (hereafter "MT") has faced many complex problems since its inception, and extracting multiword expressions is one of them. With existing solutions, finding multiword expressions while translating a sentence from English into Urdu takes a lot of time and occupies system resources. We have designed a simple relational data approach in which a bit is set in the dictionary (database) to mark multiword entries, so that multiword expressions can be found and handled. This approach handles multiword expressions efficiently.
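The idea of flagging multiword entries with a bit in the dictionary can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the schema or algorithm of the actual system: the table name `dictionary`, the column `starts_mwe`, the helper names `build_dictionary` and `segment`, the choice to set the bit on the first word of each multiword entry, and the longest-match lookup. A real translation dictionary would also carry the Urdu equivalents, which are omitted here.

```python
import sqlite3


def build_dictionary(entries):
    """Create an in-memory dictionary table (hypothetical schema).

    starts_mwe is the assumed "multiword bit": it is 1 when the word
    is the first word of at least one multiword entry.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dictionary ("
        " english TEXT PRIMARY KEY,"
        " starts_mwe INTEGER NOT NULL DEFAULT 0)"
    )
    for english in entries:
        conn.execute(
            "INSERT OR IGNORE INTO dictionary (english) VALUES (?)",
            (english,),
        )
        words = english.split()
        if len(words) > 1:
            # Make sure the first word has a row, then set its bit.
            conn.execute(
                "INSERT OR IGNORE INTO dictionary (english) VALUES (?)",
                (words[0],),
            )
            conn.execute(
                "UPDATE dictionary SET starts_mwe = 1 WHERE english = ?",
                (words[0],),
            )
    return conn


def segment(conn, sentence, max_len=4):
    """Split a sentence into translation units, keeping multiword
    expressions together. Multiword lookups are attempted only when
    the current word has its bit set (assumed strategy)."""
    tokens = sentence.lower().split()
    units = []
    i = 0
    while i < len(tokens):
        row = conn.execute(
            "SELECT starts_mwe FROM dictionary WHERE english = ?",
            (tokens[i],),
        ).fetchone()
        matched = 1
        if row is not None and row[0] == 1:
            # Try the longest candidate first, down to two words.
            for n in range(min(max_len, len(tokens) - i), 1, -1):
                candidate = " ".join(tokens[i:i + n])
                found = conn.execute(
                    "SELECT 1 FROM dictionary WHERE english = ?",
                    (candidate,),
                ).fetchone()
                if found:
                    matched = n
                    break
        units.append(" ".join(tokens[i:i + matched]))
        i += matched
    return units


conn = build_dictionary(["he", "will", "kick", "the", "bucket", "kick the bucket"])
print(segment(conn, "He will kick the bucket"))
# ['he', 'will', 'kick the bucket']
```

In this sketch, words whose bit is 0 cost a single lookup each, and the extra multiword queries are issued only for words that can actually begin a multiword expression, which is one plausible way a single bit could save time and resources.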



