Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora
Because a one-to-one word translator cannot translate the pragmatic
aspects of Javanese, the parallel text alignment model described here
uses phrase pair combinations. The algorithm aligns the parallel text
automatically from the beginning to the end of each sentence. Although
the phrase pair combination outperforms the previous algorithm, it is
still inefficient: recording all possible combinations consumes more
database space and is time-consuming. The original algorithm is
therefore modified by applying the edit distance coefficient to improve
data-storage efficiency. As a result, data-storage consumption is
reduced by 90%, and the learning period is shortened as well (42 s).
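
The two ideas in the abstract can be sketched in a few lines of Python. The first half enumerates every contiguous phrase pair of a sentence pair, which shows why recording all combinations grows quickly. The second half computes a Levenshtein edit distance and a length-normalised coefficient. The abstract does not define the coefficient or say how it is applied, so the normalisation, the threshold rule, and all function names below are assumptions made for illustration only, not the authors' method.

```python
from itertools import product


def contiguous_phrases(sentence: str) -> list[str]:
    """Return every contiguous word sequence of a sentence."""
    words = sentence.split()
    return [
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    ]


def all_phrase_pairs(source: str, target: str) -> list[tuple[str, str]]:
    """Pair every source phrase with every target phrase (no pruning)."""
    return list(product(contiguous_phrases(source), contiguous_phrases(target)))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_distance_coefficient(a: str, b: str) -> float:
    """Assumed normalisation: distance divided by the longer string length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def prune_pairs(pairs, threshold: float):
    """Hypothetical rule: keep only pairs whose coefficient is <= threshold."""
    return [(s, t) for s, t in pairs
            if edit_distance_coefficient(s, t) <= threshold]


if __name__ == "__main__":
    # Placeholder tokens stand in for real Javanese and target-language words.
    pairs = all_phrase_pairs("a b c", "x y z")
    print(len(pairs))                           # 6 phrases x 6 phrases = 36
    print(edit_distance("kitten", "sitting"))   # 3
```

A sentence of n words has n(n+1)/2 contiguous phrases, so an unpruned table for a pair of three-word sentences already holds 36 entries. Any pruning criterion, whether the illustrative threshold above or the one used in the paper, reduces what must be stored.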
@article{"International Journal of Electrical, Electronic and Communication Sciences:56024",
  author = "Aji P. Wibawa and Andrew Nafalski and Neil Murray and Wayan F. Mahmudy",
  title = "Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora",
  abstract = "Because a one-to-one word translator cannot translate the pragmatic aspects of Javanese, the parallel text alignment model described here uses phrase pair combinations. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Although the phrase pair combination outperforms the previous algorithm, it is still inefficient: recording all possible combinations consumes more database space and is time-consuming. The original algorithm is therefore modified by applying the edit distance coefficient to improve data-storage efficiency. As a result, data-storage consumption is reduced by 90%, and the learning period is shortened as well (42 s).",
  keywords = "edit distance coefficient, Javanese, parallel text alignment, phrase pair combination",
  volume = "6",
  number = "9",
  pages = "967-5",
}