OCIRS: An Ontology-based Chinese Idioms Retrieval System

Chinese idioms are traditional Chinese idiomatic expressions with specific meanings and fixed structures; they are widely used in classical Chinese and remain common in vernacular written and spoken Chinese today. Currently, Chinese idioms are looked up in glossaries by key character or key word through a morphological or pronunciation index, which cannot meet the need for semantic search. OCIRS is proposed to find the desired idiom when the user knows only its meaning and no key character or key word. The user's request, given as a sentence or phrase, is first analyzed through word segmentation, key word extraction, and semantic similarity computation, so that it can be mapped to the idiom domain ontology, which is constructed to provide ample semantic relations and to facilitate description logics-based reasoning for idiom retrieval. The experimental evaluation shows that OCIRS can retrieve idioms by their meaning and achieves preliminary results in satisfying users' requests.
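The retrieval flow described above (segment the request, extract key words, score semantic similarity against idiom concepts) can be illustrated with a minimal sketch. This is not the OCIRS implementation: the idiom entries, the concept labels, the `STOP_WORDS` and `SYNONYMS` tables, and the function names are all hypothetical, whitespace tokenization of an English gloss stands in for Chinese word segmentation, a synonym table stands in for thesaurus-based similarity, and Jaccard overlap stands in for whatever similarity measure and description logics reasoning the system actually uses.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; a real system would rely on a Chinese word segmenter.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "and", "is", "that",
              "something", "someone"}

# Hypothetical synonym table standing in for a thesaurus-based similarity measure.
SYNONYMS = {
    "unnecessary": "superfluous",
    "needless": "superfluous",
    "extra": "superfluous",
    "wreck": "ruin",
    "spoil": "ruin",
    "late": "belated",
    "fix": "repair",
    "mend": "repair",
}


@dataclass(frozen=True)
class Idiom:
    text: str            # the idiom itself
    pinyin: str          # romanized pronunciation (tone marks omitted)
    concepts: frozenset  # illustrative concept labels, a stand-in for ontology links


# Toy "ontology": three well-known idioms with hand-picked concept labels.
IDIOMS = [
    Idiom("画蛇添足", "hua she tian zu", frozenset({"superfluous", "add", "ruin"})),
    Idiom("亡羊补牢", "wang yang bu lao", frozenset({"belated", "repair", "loss"})),
    Idiom("守株待兔", "shou zhu dai tu", frozenset({"wait", "luck", "idle"})),
]


def extract_keywords(query):
    """Tokenize on whitespace, drop stop words, and normalize synonyms."""
    tokens = [t.strip(".,;:!?").lower() for t in query.split()]
    return {SYNONYMS.get(t, t) for t in tokens if t and t not in STOP_WORDS}


def similarity(keywords, concepts):
    """Jaccard overlap between query key words and an idiom's concept labels."""
    if not keywords or not concepts:
        return 0.0
    return len(keywords & concepts) / len(keywords | concepts)


def retrieve(query, top_k=1):
    """Return up to top_k idioms whose concepts overlap the query, best first."""
    keywords = extract_keywords(query)
    scored = [(similarity(keywords, idiom.concepts), idiom) for idiom in IDIOMS]
    scored = [(score, idiom) for score, idiom in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idiom.text for _, idiom in scored[:top_k]]


if __name__ == "__main__":
    print(retrieve("add something unnecessary and ruin the result"))
    print(retrieve("mend the fence late after a loss"))
```

In this sketch a request with no overlapping concept returns an empty list rather than a weak guess; a production system would more plausibly rank candidates by graded semantic similarity and use ontology relations to broaden the match.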



