Determining the Gender of Korean Names for Pronoun Generation

Scholarly

Volume:1, Issue: 8, 2007 Page No: 2437 - 2441

International Journal of Information, Control and Computer Sciences

ISSN: 2517-9942

DOI: 10.5281/zenodo.1061842

Classifying the gender of names correctly is an important task in Korean-English machine translation. When a sentence consists of two or more clauses and its only subject is given as a proper noun, the gender of that proper noun must be known to translate the sentence correctly, because a singular third-person pronoun carries gender in English but not in Korean. More generally, this task can be extended to the gender classification of Korean names in general. This paper proposes a statistical method for this problem. By treating a name as simply a sequence of syllables, statistics for each name can be obtained from a collection of names. An evaluation shows that the proposed method improves accuracy over simple look-up in the collection: the look-up method achieves 64.11% accuracy, whereas the proposed method achieves 81.49%. This suggests that the proposed method is better suited to the gender classification of Korean names.
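The abstract does not spell out the statistical model, so the following Python sketch is only one plausible reading of "a name as a sequence of syllables". It assumes a naive Bayes classifier over syllable counts with add-one smoothing, and contrasts it with a plain look-up baseline. The training list, the labels `"M"`/`"F"`, and the function names `train_model`, `classify`, and `lookup` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Toy, hand-made list for illustration only; NOT data from the paper.
TRAIN = [
    ("민준", "M"), ("준호", "M"), ("성민", "M"), ("현우", "M"),
    ("서연", "F"), ("지연", "F"), ("수진", "F"), ("은지", "F"),
]


def train_model(pairs):
    """Count names per gender and syllables per gender."""
    prior = Counter()                      # number of names per gender
    syllables = {}                         # gender -> Counter of syllables
    for name, gender in pairs:
        prior[gender] += 1
        # A precomposed Hangul syllable is a single code point,
        # so iterating over the string yields syllables.
        syllables.setdefault(gender, Counter()).update(name)
    vocab = set()
    for counts in syllables.values():
        vocab.update(counts)
    return {"prior": prior, "syllables": syllables, "vocab_size": len(vocab)}


def classify(model, name):
    """Return the gender with the highest smoothed log-probability."""
    total_names = sum(model["prior"].values())
    best_gender, best_score = None, -math.inf
    for gender in sorted(model["prior"]):  # sorted for deterministic ties
        counts = model["syllables"][gender]
        denom = sum(counts.values()) + model["vocab_size"]
        score = math.log(model["prior"][gender] / total_names)
        for syl in name:
            # Add-one smoothing (an assumption) keeps unseen syllables nonzero.
            score += math.log((counts[syl] + 1) / denom)
        if score > best_score:
            best_gender, best_score = gender, score
    return best_gender


def lookup(pairs, name):
    """Baseline: exact look-up; returns None for names not in the list."""
    return dict(pairs).get(name)


model = train_model(TRAIN)
unseen = "준우"                            # not in TRAIN as a whole name
print(lookup(TRAIN, unseen), classify(model, unseen))
```

The look-up baseline has no answer for a name that is absent from the collection, whereas the syllable statistics still yield a prediction. This is the kind of gap the reported accuracy difference reflects, although the paper's actual model and data may differ from this sketch.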

Authors:

Keywords:

