A New Model of English-Vietnamese Bilingual Information Retrieval System

In this paper, we propose a new model of an English-Vietnamese bilingual information retrieval system. Although many cross-language information retrieval (CLIR) systems have been researched and built, the accuracy of search results across the languages these systems support still needs improvement, especially when retrieving bilingual documents. The problems identified in this paper are the limited quality of machine translation results and the very large document collections that must be searched. We therefore propose a different model to overcome these problems.



