Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Network security attacks are violations of information security policy, and they have received much attention from the computational intelligence community in recent decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large volumes of network data or logs. The naïve Bayesian classifier is one of the most popular data mining algorithms for classification; when its attribute-independence assumption holds, it provides an optimal way to predict the class of an unknown example. Experiments have shown, however, that a single set of probabilities derived from the whole data set is often not enough to achieve a good classification rate. In this paper, we propose a new learning algorithm for mining network logs to detect network intrusions through a naïve Bayesian classifier. The algorithm first clusters the network logs into several groups based on the similarity of the logs, and then calculates the prior and conditional probabilities for each group of logs. To classify a new log, the algorithm determines which cluster the log belongs to and then uses that cluster's probability set to classify it. We tested the performance of the proposed algorithm on the KDD99 benchmark network intrusion detection dataset, and the experimental results show that it improves detection rates and reduces false positives for different types of network intrusions.
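The cluster-then-classify idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering method or similarity measure, so the sketch assumes categorical attributes, Hamming distance, a small k-modes-style clustering seeded with the first k distinct records, and Laplace smoothing. All function names and the toy records (protocol, service, flag) are hypothetical.

```python
from collections import Counter, defaultdict
import math

def hamming(a, b):
    """Number of attribute positions where two records differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest(record, modes):
    """Index of the cluster mode closest to the record."""
    return min(range(len(modes)), key=lambda i: hamming(record, modes[i]))

def cluster(records, k, iterations=5):
    """Assumed clustering step: k-modes style, seeded with the first k distinct records."""
    modes = list(dict.fromkeys(records))[:k]
    for _ in range(iterations):
        groups = defaultdict(list)
        for r in records:
            groups[nearest(r, modes)].append(r)
        # New mode = most common value of each attribute within the group.
        modes = [
            tuple(Counter(col).most_common(1)[0][0] for col in zip(*groups[i]))
            if groups[i] else modes[i]
            for i in range(len(modes))
        ]
    return modes

def fit_nb(records, labels):
    """Count class priors and per-attribute value frequencies for one group."""
    priors = Counter(labels)
    cond = defaultdict(Counter)   # (attribute index, class) -> value counts
    values = defaultdict(set)     # attribute index -> values seen in this group
    for r, c in zip(records, labels):
        for j, v in enumerate(r):
            cond[(j, c)][v] += 1
            values[j].add(v)
    return priors, cond, values

def predict_nb(model, record):
    """Return the class with the highest smoothed log-posterior."""
    priors, cond, values = model
    total = sum(priors.values())
    def log_posterior(c):
        score = math.log(priors[c] / total)
        for j, v in enumerate(record):
            # Laplace smoothing; the extra +1 reserves mass for unseen values.
            score += math.log((cond[(j, c)][v] + 1) / (priors[c] + len(values[j]) + 1))
        return score
    return max(priors, key=log_posterior)

def fit_clustered(records, labels, k):
    """Cluster the logs, then fit one probability set per cluster."""
    modes = cluster(records, k)
    by_cluster = defaultdict(lambda: ([], []))
    for r, c in zip(records, labels):
        recs, labs = by_cluster[nearest(r, modes)]
        recs.append(r)
        labs.append(c)
    models = {i: fit_nb(recs, labs) for i, (recs, labs) in by_cluster.items()}
    return modes, models

def classify(modes, models, record):
    """Find the record's cluster and classify with that cluster's probabilities."""
    i = nearest(record, modes)
    if i not in models:           # cluster had no training logs: fall back
        i = next(iter(models))
    return predict_nb(models[i], record)

# Toy, made-up logs: (protocol, service, flag)
train = [
    ("tcp", "http", "SF"), ("tcp", "http", "SF"), ("tcp", "http", "S0"),
    ("udp", "dns", "SF"), ("udp", "dns", "SF"), ("icmp", "ecr_i", "SF"),
]
labels = ["normal", "normal", "dos", "normal", "normal", "dos"]

modes, models = fit_clustered(train, labels, k=2)
print(classify(modes, models, ("tcp", "http", "S0")))
print(classify(modes, models, ("udp", "dns", "SF")))
```

The point of the design is that each cluster keeps its own priors and conditional probabilities, so a new log is scored only against statistics from similar logs rather than against one global probability set. The KDD99 data also contains continuous attributes, which this sketch does not handle; they would need discretization or a different likelihood model.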



