Correlation-based Feature Selection using Ant Colony Optimization

Feature selection has recently been the subject of intensive research in data mining, especially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in constructing models of data hinges on the reliable identification of a small set of highly predictive attributes. Including irrelevant, redundant, and noisy attributes in the model-building phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes Ant Colony Optimization (ACO) is presented. ACO is a metaheuristic inspired by the behavior of real ants searching for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previously accumulated knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.
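To make the idea concrete, the following is a minimal sketch, not the paper's exact procedure, of how an ACO search can drive correlation-based feature selection: ants sample candidate subsets with probabilities shaped by pheromone levels (previous knowledge) and a per-feature class correlation (local heuristic), and subsets are scored with a CFS-style merit. All names and parameter values (n_ants, evaporation, subset_size, the synthetic data) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cfs_merit(X, y, subset):
    """CFS-style merit: average feature-class correlation divided by a term
    involving the average absolute feature-feature inter-correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    if k == 1:
        r_ff = 0.0
    else:
        r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                        for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def aco_feature_selection(X, y, n_ants=10, n_iters=30, subset_size=5,
                          alpha=1.0, beta=1.0, evaporation=0.2):
    n_features = X.shape[1]
    pheromone = np.ones(n_features)
    # Local heuristic: each feature's individual correlation with the class.
    eta = np.array([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in range(n_features)])
    best_subset, best_merit = None, -np.inf
    for _ in range(n_iters):
        subsets = []
        for _ in range(n_ants):
            # Selection probability combines pheromone and heuristic desirability.
            probs = (pheromone ** alpha) * (eta ** beta)
            probs = probs / probs.sum()
            subset = list(rng.choice(n_features, size=subset_size,
                                     replace=False, p=probs))
            subsets.append((subset, cfs_merit(X, y, subset)))
        # Evaporate pheromone, then reinforce the features used by the
        # iteration's best ant in proportion to its merit.
        pheromone *= (1.0 - evaporation)
        iter_best = max(subsets, key=lambda s: s[1])
        pheromone[iter_best[0]] += iter_best[1]
        if iter_best[1] > best_merit:
            best_subset, best_merit = iter_best[0], iter_best[1]
    return sorted(best_subset), best_merit

# Synthetic demo: 3 informative features out of 20.
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=200) > 0).astype(float)
print(aco_feature_selection(X, y))
```

In this sketch the pheromone update rewards only the best ant of each iteration; other reinforcement schemes (e.g., updating with every ant, or only with the global best) are equally standard choices in the ACO literature.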
