Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering
Network security attacks are violations of information security
policy, and they have received much attention from the computational
intelligence community in recent decades. Data mining has become a
very useful technique for detecting network intrusions by extracting
useful knowledge from large volumes of network data or logs. The naïve
Bayesian classifier is one of the most popular data mining algorithms
for classification, and it provides an optimal way to predict the
class of an unknown example. However, tests have shown that a single
set of probabilities derived from the data is not sufficient to
achieve a good classification rate. In this paper, we propose a new
learning algorithm for mining network logs to detect network
intrusions with a naïve Bayesian classifier. The algorithm first
clusters the network logs into several groups based on the similarity
of the logs, and then calculates the prior and conditional
probabilities for each group of logs. To classify a new log, the
algorithm determines which cluster the log belongs to and then uses
that cluster's probability set to classify it. We tested the
performance of the proposed algorithm on the KDD99 benchmark network
intrusion detection dataset, and the experimental results showed that
it improves detection rates and reduces false positives for different
types of network intrusions.
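
The cluster-then-classify idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering method or the smoothing scheme, so the sketch assumes categorical attributes, a single-pass assignment of each log to the nearest of a few seed logs by Hamming distance, and Laplace smoothing. The function names, the seed choice, and the toy records are all hypothetical; only the attribute names (protocol type, service, flag) follow the KDD99 format.

```python
import math
from collections import Counter, defaultdict

def hamming(a, b):
    """Number of attribute positions where two logs differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest(record, seeds):
    """Index of the closest seed log (ties go to the lowest index)."""
    return min(range(len(seeds)), key=lambda i: hamming(record, seeds[i]))

def train_nb(records, labels):
    """Count classes and per-attribute values for the logs of one cluster."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for rec, lab in zip(records, labels):
        for j, v in enumerate(rec):
            value_counts[(j, lab)][v] += 1
    return class_counts, value_counts

def predict_nb(model, record, n_values):
    """Return the class with the highest log-posterior (Laplace smoothing assumed)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior within this cluster
        for j, v in enumerate(record):
            score += math.log((value_counts[(j, c)][v] + 1) / (n_c + n_values[j]))
        if score > best_score:
            best, best_score = c, score
    return best

def fit(records, labels, seeds):
    """Group logs by nearest seed, then train one probability set per group.

    Assumption: seeds are distinct training logs, so every cluster that a
    new log can be routed to contains at least its own seed.
    """
    n_values = [len({r[j] for r in records}) for j in range(len(records[0]))]
    groups = defaultdict(lambda: ([], []))
    for rec, lab in zip(records, labels):
        recs, labs = groups[nearest(rec, seeds)]
        recs.append(rec)
        labs.append(lab)
    models = {k: train_nb(recs, labs) for k, (recs, labs) in groups.items()}
    return seeds, models, n_values

def classify(fitted, record):
    """Find the new log's cluster, then use that cluster's probability set."""
    seeds, models, n_values = fitted
    return predict_nb(models[nearest(record, seeds)], record, n_values)

# Toy logs (invented for illustration): (protocol_type, service, flag)
records = [
    ("tcp", "http", "SF"), ("tcp", "http", "SF"), ("tcp", "smtp", "SF"),
    ("tcp", "private", "S0"), ("tcp", "private", "S0"), ("tcp", "http", "S0"),
    ("udp", "domain_u", "SF"), ("udp", "private", "SF"),
]
labels = ["normal", "normal", "normal", "attack",
          "attack", "attack", "normal", "attack"]

fitted = fit(records, labels, seeds=[records[0], records[3]])
print(classify(fitted, ("tcp", "private", "S0")))
print(classify(fitted, ("tcp", "http", "SF")))
```

Because each cluster keeps its own priors and conditional probabilities, a log that lands in a cluster dominated by one traffic pattern is scored against statistics from similar logs only, rather than against one global probability set.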
@article{"International Journal of Information, Control and Computer Sciences:62393",
  author = "Dewan Md. Farid and Nouria Harbi and Suman Ahmmed and Md. Zahidur Rahman and Chowdhury Mofizur Rahman",
  title = "Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering",
  abstract = "Network security attacks are violations of information security policy, and they have received much attention from the computational intelligence community in recent decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large volumes of network data or logs. The naïve Bayesian classifier is one of the most popular data mining algorithms for classification, and it provides an optimal way to predict the class of an unknown example. However, tests have shown that a single set of probabilities derived from the data is not sufficient to achieve a good classification rate. In this paper, we propose a new learning algorithm for mining network logs to detect network intrusions with a naïve Bayesian classifier. The algorithm first clusters the network logs into several groups based on the similarity of the logs, and then calculates the prior and conditional probabilities for each group of logs. To classify a new log, the algorithm determines which cluster the log belongs to and then uses that cluster's probability set to classify it. We tested the performance of the proposed algorithm on the KDD99 benchmark network intrusion detection dataset, and the experimental results showed that it improves detection rates and reduces false positives for different types of network intrusions.",
  keywords = "Clustering, detection rate, false positive, naïve Bayesian classifier, network intrusion detection.",
  volume = "4",
  number = "6",
  pages = "1116-5",
}