Incorporating Multiple Supervised Learning Algorithms for Effective Intrusion Detection

As the Internet continues to expand, with an enormous number of
applications, cyber threats have increased significantly. Accurate
and timely detection of malicious traffic is therefore a critical
security concern in today's Internet.
One approach for intrusion detection is to use Machine Learning (ML) 
techniques. Several methods based on ML algorithms have been
introduced in recent years, but most are limited in detection
accuracy and/or in their time and space complexity. In this
work, we present a novel method for intrusion detection that 
incorporates a set of supervised learning algorithms. The proposed 
technique provides high accuracy and outperforms existing techniques
that rely on a single learning method. In addition, our
technique relies on partial flow information (rather than full 
information) for detection; it is therefore lightweight and well suited
to online operation with early identification. Using the publicly
available Mid-Atlantic CCDC intrusion dataset, we show that our
proposed technique yields a detection rate above 99% with a very low
false alarm rate (0.4%).
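
For readers unfamiliar with the two reported metrics, the sketch below
shows how they are commonly computed from confusion-matrix counts. It
assumes the usual definitions (detection rate = TP / (TP + FN), false
alarm rate = FP / (FP + TN)); the abstract does not state the exact
formulas used, and the function name and the counts are hypothetical,
chosen only for illustration.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (detection_rate, false_alarm_rate) from confusion-matrix counts.

    Assumed definitions (not confirmed by the abstract):
      detection rate   = TP / (TP + FN)   share of attack flows that are flagged
      false alarm rate = FP / (FP + TN)   share of benign flows that are flagged
    """
    attacks = tp + fn
    benign = fp + tn
    if attacks == 0 or benign == 0:
        raise ValueError("need at least one attack flow and one benign flow")
    return tp / attacks, fp / benign


# Hypothetical counts for illustration only; these are not results from the paper.
dr, far = detection_metrics(tp=995, fn=5, fp=4, tn=996)
print(f"detection rate = {dr:.3f}, false alarm rate = {far:.3f}")
```

Reporting both numbers together matters because a detector can reach a
high detection rate trivially by flagging everything, which the false
alarm rate would expose.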

 




