Sounds Alike Name Matching for Myanmar Language

Personal name matching is a core task in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce, and record linkage systems, and it has motivated extensive research in the area. Traditional name matching methods are suited to English and other Latin-based languages. Asian languages that have no explicit word boundaries, such as the Myanmar language, still require a sounds-alike matching system for Unicode-based applications. We therefore propose a matching algorithm that derives sounds-alike (phonetic) patterns suited to Myanmar spelling. Given the nature of Myanmar script, the algorithm addresses word boundary fragmentation and character collation. A pattern conversion algorithm builds patterns from the fragmented and collated words, and we define Myanmar sounds-alike phonetic groups to support phonetic matching. Experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.
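
The fragmentation and phonetic grouping steps described above can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in this paper. The syllable-break rule is a common regular-expression heuristic for Unicode Myanmar text, the PHONETIC_GROUP table is a small hypothetical sample of letters that are usually pronounced alike, and the function names fragment, phonetic_key, and sounds_alike are assumptions made for illustration. Character collation (reordering) is not covered.

```python
import re

# Myanmar consonants are U+1000..U+1021; ASAT is U+103A; VIRAMA (stacker) is U+1039.
# Heuristic: a syllable starts at a consonant that is not stacked under the
# previous letter (not preceded by VIRAMA) and is not a syllable-final
# consonant (not followed by ASAT or VIRAMA).
SYLLABLE_START = re.compile("(?<!\u1039)([\u1000-\u1021])(?![\u103A\u1039])")

# Hypothetical sample of a sounds-alike group table: each key is rewritten to
# a representative letter with (usually) the same pronunciation.
PHONETIC_GROUP = {
    "\u1003": "\u1002",  # GHA -> GA
    "\u1013": "\u1012",  # DHA -> DA
    "\u1018": "\u1017",  # BHA -> BA
    "\u100F": "\u1014",  # NNA -> NA
    "\u1020": "\u101C",  # LLA -> LA
    "\u103C": "\u103B",  # medial RA -> medial YA
}


def fragment(text):
    """Split Unicode Myanmar text into syllable-like fragments."""
    marked = SYLLABLE_START.sub(lambda m: "|" + m.group(1), text)
    return [piece for piece in marked.split("|") if piece]


def phonetic_key(text):
    """Fragment the text, then map every character to its group representative."""
    return [
        "".join(PHONETIC_GROUP.get(ch, ch) for ch in syllable)
        for syllable in fragment(text)
    ]


def sounds_alike(a, b):
    """Two names match when their phonetic keys are identical."""
    return phonetic_key(a) == phonetic_key(b)


if __name__ == "__main__":
    name = "\u1019\u103C\u1014\u103A\u1019\u102C"  # the word "Myanmar"
    print(fragment(name))
    print(sounds_alike("\u1019\u103C", "\u1019\u103B"))
```

Comparing keys syllable by syllable keeps the match exact within each phonetic group. A production system would need a complete group table and handling for cases this heuristic ignores, such as kinzi, non-standard character order, and digits or punctuation.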



