A Hybrid Feature Selection Method Using Resampling, Chi-Squared, and Consistency Evaluation Techniques

In this paper, a combined feature selection method is proposed that takes advantage of sample-domain filtering, resampling, and feature subset evaluation to reduce the dimensionality of large datasets and select reliable features. The method works in both the feature space and the sample domain, and combines Chi-squared ranking with Consistency attribute evaluation to identify reliable features. It consists of two phases: the first filters and resamples the sample domain, and the second applies a hybrid procedure that searches for an optimal feature subset using Chi-squared scoring, Consistency subset evaluation, and genetic search. Experiments on datasets of various sizes from the UCI Machine Learning Repository show that the performance of five classifiers (Naïve Bayes, Logistic, Multilayer Perceptron, Best-First Decision Tree, and JRip) improves simultaneously and that the classification error of these classifiers decreases considerably. The experiments also show that the proposed method outperforms other feature selection methods.
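To make the two-phase pipeline concrete, the listing below is a minimal Python sketch, not the paper's implementation: it assumes SMOTE-style oversampling for the resampling step, substitutes a toy majority-class consistency measure for the full Consistency subset evaluation, and uses illustrative genetic-search settings. The names hybrid_select, consistency_score, and genetic_search, and parameters such as k_chi, are hypothetical and introduced here only for illustration.

# Minimal sketch of the two-phase hybrid feature selection pipeline.
# Assumptions: SMOTE as the resampler, truncation-based discretization,
# and toy GA settings; none of these are specified by the paper itself.
from collections import Counter

import numpy as np
from imblearn.over_sampling import SMOTE      # assumed resampling technique
from sklearn.feature_selection import chi2

def consistency_score(X_sub, y):
    """Fraction of samples whose (truncation-discretized) feature pattern
    falls in its pattern's majority class -- a simple stand-in for a
    consistency-based subset evaluation."""
    groups = {}
    for pattern, label in zip(map(tuple, X_sub.astype(int)), y):
        groups.setdefault(pattern, []).append(label)
    majority = sum(Counter(ls).most_common(1)[0][1] for ls in groups.values())
    return majority / len(y)

def genetic_search(X, y, generations=30, pop_size=20, seed=0):
    """Evolve boolean feature masks scored by consistency (illustrative GA)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    def fitness(m):
        return consistency_score(X[:, m], y) if m.any() else 0.0
    pop = rng.random((pop_size, n)) < 0.5            # random initial masks
    for _ in range(generations):
        pop = pop[np.argsort([fitness(m) for m in pop])[::-1]]
        elite = pop[: pop_size // 2]                 # keep the better half
        children = []
        for _ in range(pop_size - len(elite)):
            a, b = elite[rng.integers(0, len(elite), size=2)]
            child = np.where(rng.random(n) < 0.5, a, b)   # uniform crossover
            child ^= rng.random(n) < 0.02                 # bit-flip mutation
            children.append(child)
        pop = np.vstack([elite, children])
    return max(pop, key=fitness)                     # best mask found

def hybrid_select(X, y, k_chi=50):
    """Phase 1: resample the sample domain; Phase 2: Chi-squared ranking
    followed by a consistency-scored genetic search."""
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)   # phase 1
    scores, _ = chi2(X_res, y_res)       # chi2 requires non-negative features
    top = np.argsort(scores)[::-1][:k_chi]                    # phase 2a
    mask = genetic_search(X_res[:, top], y_res)               # phase 2b
    return top[mask]                     # original indices of kept features

Calling hybrid_select(X, y) on a dataset with non-negative feature values returns the original column indices of the selected features, which could then be fed to the five classifiers evaluated in the experiments.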



