Neural Networks Learning Improvement using the K-Means Clustering Algorithm to Detect Network Intrusions

In the present work, we propose a new technique to enhance the learning capabilities and reduce the computational cost of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with a backpropagation learning mechanism. The K-means algorithm is first applied to the training dataset to reduce the number of samples presented to the neural network, by automatically selecting an optimal set of samples. The results obtained demonstrate that, when applied to the KDD99 dataset, the proposed technique performs exceptionally well in terms of both accuracy and computation time compared with a standard learning scheme that uses the full dataset.
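The sample-reduction step described above can be illustrated with a minimal sketch. The abstract does not specify how the samples are selected, so the code below makes several assumptions that are not taken from the paper: K-means is run separately on each class, the training sample nearest to each centroid is kept as a representative, and the names `kmeans`, `select_representatives`, and `k_per_class` are hypothetical. Only the Python standard library is used.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style K-means; returns the k centroids."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct positions drawn from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: attach each point to its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def select_representatives(samples, labels, k_per_class, seed=0):
    """Assumed selection rule: per class, keep the sample nearest each centroid."""
    chosen = []
    for label in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == label]
        points = [samples[i] for i in members]
        k = min(k_per_class, len(points))
        picked = set()  # indices, so one sample is never kept twice
        for centroid in kmeans(points, k, seed=seed):
            best = min(members, key=lambda i: squared_distance(samples[i], centroid))
            picked.add(best)
        chosen.extend((samples[i], label) for i in sorted(picked))
    return chosen


if __name__ == "__main__":
    # Synthetic two-class data standing in for KDD99 feature vectors.
    rng = random.Random(1)
    samples = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    samples += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(200)]
    labels = ["normal"] * 200 + ["attack"] * 200
    reduced = select_representatives(samples, labels, k_per_class=10)
    print(len(samples), "samples reduced to", len(reduced))
```

Under these assumptions, the reduced list of (sample, label) pairs would then be presented to the backpropagation-trained network in place of the full training set, which is where the reduction in computation time would come from.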



