Indonesian News Classification using Support Vector Machine

Digital news on a wide variety of topics is abundant on the internet. The challenge is to classify each news item into its appropriate category so that users can find relevant news quickly. A classifier engine assigns news articles to their respective categories automatically. This research employs a Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method for binary classification; its core step is constructing an optimal separating hyperplane between the two classes. For the multiclass problem, a one-against-one mechanism is used to combine the binary classification results. Documents were taken from the Indonesian digital news site www.kompas.com. The experiment showed promising results, with an accuracy of 85%. The system is therefore feasible for Indonesian news classification.
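
To make the one-against-one mechanism concrete, the following is a minimal sketch of how pairwise binary decisions can be combined by majority vote. For k classes, one binary classifier is consulted per pair of classes, giving k(k-1)/2 decisions in total. The category names, the three-term count features, and the hand-set linear weights are all hypothetical and chosen only for illustration; in a real system each pairwise decision would come from a trained SVM, and the tie-breaking rule here is an arbitrary assumption rather than the method used in this research.

```python
from collections import Counter
from itertools import combinations

# Hypothetical categories and features (counts of three illustrative terms).
CLASSES = ["economy", "sports", "technology"]

# Hypothetical pairwise linear decision functions: (weights, bias).
# A score >= 0 votes for the first class of the pair, otherwise the second.
# In practice these would be learned by training one binary SVM per pair.
PAIR_MODELS = {
    ("economy", "sports"): ([1.0, -1.0, 0.0], 0.0),
    ("economy", "technology"): ([1.0, 0.0, -1.0], 0.0),
    ("sports", "technology"): ([0.0, 1.0, -1.0], 0.0),
}


def linear_decide(a, b, x):
    """Return the class (a or b) chosen by the pairwise linear model."""
    weights, bias = PAIR_MODELS[(a, b)]
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return a if score >= 0 else b


def one_against_one_predict(x, classes, binary_decide):
    """Combine all pairwise binary decisions by majority vote."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_decide(a, b, x)] += 1
    # Ties go to the class listed first (an arbitrary choice for this sketch).
    return max(classes, key=lambda c: votes[c])


# Feature vector: counts of the three hypothetical terms in one document.
document = [0, 3, 1]
print(one_against_one_predict(document, CLASSES, linear_decide))
```

With three classes the sketch consults three pairwise models; the class that wins the most pairwise comparisons is returned as the predicted category.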



