Designing a Framework for Network Security Protection

As the Internet continues its rapid growth as the primary medium for communications and commerce, and as telecommunication networks and systems extend their global reach, digital information has become the most widely used and important information resource, and our dependence on the underlying cyber infrastructure has increased significantly. Unfortunately, as this dependence has grown, so has the threat to the cyber infrastructure from spammers, attackers, and criminal enterprises. In this paper, we propose a new machine-learning-based network intrusion detection framework for cyber security. The detection process of the framework consists of two stages: model construction and intrusion detection. In the model construction stage, a semi-supervised machine learning algorithm is applied to a collected set of network audit data to generate a profile of normal network behavior. In the intrusion detection stage, incoming network events are analyzed and compared with the patterns in the profile, and events that deviate sufficiently from the expected normal behavior are flagged as anomalies. The proposed framework is particularly applicable to situations where only a small amount of labeled network training data is available, which is typical of real-world network environments.
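
The two-stage process described above can be illustrated with a minimal sketch. The abstract does not specify the semi-supervised algorithm, the feature representation, or the distance threshold, so everything below is an assumption made for illustration: a seeded k-means-style self-training loop stands in for the model construction stage, events are plain numeric feature vectors, and the names `build_profile`, `detect`, `k`, `rounds`, and `margin` are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def nearest_dist(centroids, p):
    """Distance from p to its closest centroid."""
    return min(dist(c, p) for c in centroids)

def seed_centroids(points, k):
    """Pick k seeds from the labeled data by farthest-point selection."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: nearest_dist(centroids, p))
        centroids.append(list(far))
    return centroids

def update_centroids(centroids, points):
    """One k-means step: assign points to centroids, then re-average."""
    clusters = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda j: dist(centroids[j], p))
        clusters[idx].append(p)
    # An empty cluster keeps its previous centroid.
    return [mean(c) if c else old for c, old in zip(clusters, centroids)]

def build_profile(labeled_normal, unlabeled, k=2, rounds=5, margin=3.0):
    """Stage 1 (assumed variant): build a profile of normal behavior.

    Starts from a small labeled-normal set and, in each round, absorbs
    unlabeled events that fall within the current distance threshold.
    """
    centroids = seed_centroids(labeled_normal, k)
    trusted = list(labeled_normal)
    threshold = 0.0
    for _ in range(rounds):
        centroids = update_centroids(centroids, trusted)
        # Threshold is calibrated on the labeled events only (an assumption).
        threshold = margin * max(nearest_dist(centroids, p) for p in labeled_normal)
        absorbed = [p for p in unlabeled if nearest_dist(centroids, p) <= threshold]
        trusted = list(labeled_normal) + absorbed
    return centroids, threshold

def detect(profile, events):
    """Stage 2: flag events that lie too far from every normal centroid."""
    centroids, threshold = profile
    return [nearest_dist(centroids, e) > threshold for e in events]

# Toy two-feature example; the values are made up for illustration.
labeled_normal = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8)]
unlabeled = [(0.9, 1.1), (1.1, 1.2), (4.9, 5.0), (5.2, 4.9), (9.0, 0.5)]
profile = build_profile(labeled_normal, unlabeled)
flags = detect(profile, [(1.0, 1.1), (3.0, 3.0)])
```

In this toy run the event near an existing cluster is treated as normal and the event between the clusters is flagged. Real network audit records would first need to be encoded and scaled into numeric features, a step this sketch leaves out.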
