Evaluation of Ensemble Classifiers for Intrusion Detection

One of the major developments in machine learning in the past decade is the ensemble method, which builds a highly accurate classifier by combining many moderately accurate component classifiers. This work proposes two new ensemble classification methods, a homogeneous ensemble classifier built with bagging and a heterogeneous ensemble classifier built with arcing, and analyzes their performance in terms of accuracy. The classifier ensembles use Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The proposed approach consists of three main parts: a preprocessing phase, a classification phase, and a combining phase. Its feasibility and benefits are demonstrated through a wide range of comparative experiments on standard intrusion detection datasets. The performance of the proposed homogeneous and heterogeneous ensemble classifiers is compared with that of other standard ensemble methods: Error-Correcting Output Codes (ECOC) and Dagging for the homogeneous case, and majority voting and stacking for the heterogeneous case. The proposed ensemble methods significantly improve accuracy over the individual classifiers: the proposed bagged RBF and bagged SVM perform significantly better than ECOC and Dagging, and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Heterogeneous models also yield better results than homogeneous models on the standard intrusion detection datasets.
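
The bagging idea behind the homogeneous ensemble can be illustrated with a minimal sketch: train several copies of one base learner on bootstrap resamples of the training set, then combine their predictions by majority vote. The sketch below is not the authors' implementation. It assumes a nearest-centroid classifier as a simple stand-in for the RBF and SVM base classifiers, uses a synthetic two-class toy dataset rather than an intrusion detection dataset, and all function names (`train_centroid`, `predict_centroid`, `bagging_fit`, `bagging_predict`) are hypothetical.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Fit a nearest-centroid model (stand-in base learner, not RBF or SVM)."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    # One mean vector (centroid) per class label.
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict_centroid(model, features):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def bagging_fit(samples, n_models=15, seed=0):
    """Train n_models base learners, each on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) items with replacement.
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_centroid(boot))
    return models


def bagging_predict(models, features):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic, well-separated toy data (assumption: two features, two classes).
rng = random.Random(42)
train = [([rng.gauss(0, 1), rng.gauss(0, 1)], "normal") for _ in range(20)]
train += [([rng.gauss(6, 1), rng.gauss(6, 1)], "attack") for _ in range(20)]

models = bagging_fit(train)
print(bagging_predict(models, [0.5, -0.2]))
print(bagging_predict(models, [5.5, 6.1]))
```

An odd number of base learners avoids tied votes in the two-class case. Arcing differs from this sketch in that each resample is drawn with probabilities that favor previously misclassified examples, rather than uniformly.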
