Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Abstract—Attribute or feature selection is one of the basic
strategies to improve the performance of data classification tasks
and, at the same time, to reduce the complexity of classifiers;
it is particularly important when the number
of attributes is relatively high. Its application to unsupervised
classification has been explored in only a limited number of experiments in
the literature. Evolutionary computation has already proven itself
to be a very effective choice to consistently reduce the number
of attributes towards a better classification rate and a simpler
semantic interpretation of the inferred classifiers. We present a feature
selection wrapper model composed of a multi-objective evolutionary
algorithm, the clustering method Expectation-Maximization (EM),
and the classifier C4.5 for the unsupervised classification of data
extracted from a psychological test named BASC-II (Behavior
Assessment System for Children - II ed.) with two objectives:
maximizing the likelihood of the clustering model and maximizing
the accuracy of the obtained classifier. We present a methodology
to integrate feature selection for unsupervised classification, model
evaluation, decision making (to choose the most satisfactory model
according to an a posteriori process in a multi-objective context), and
testing. We compare the performance of the classifiers obtained with the
multi-objective evolutionary algorithms ENORA and NSGA-II, and
the best solution is then validated by the psychologists that collected
the data.
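
To make the two-objective setting concrete, the following is a minimal sketch of the Pareto-dominance test that underlies the a posteriori decision step. It is not the implementation used in this work: the names `Candidate`, `dominates`, and `pareto_front` are hypothetical, and the log-likelihood and accuracy values are invented purely for illustration. In the actual model these values would come from EM and C4.5 evaluated on each feature subset.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """A feature subset with its two objective values (illustrative only)."""
    mask: Tuple[int, ...]   # 1 = feature kept, 0 = feature dropped
    log_likelihood: float   # objective 1: clustering likelihood (maximize)
    accuracy: float         # objective 2: classifier accuracy (maximize)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.log_likelihood >= b.log_likelihood and a.accuracy >= b.accuracy
    better = a.log_likelihood > b.log_likelihood or a.accuracy > b.accuracy
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Invented objective values, assumed for this example only.
candidates = [
    Candidate((1, 1, 0, 0), -120.0, 0.90),
    Candidate((1, 0, 1, 0), -100.0, 0.85),
    Candidate((0, 1, 1, 1), -130.0, 0.80),  # dominated by the first candidate
    Candidate((1, 1, 1, 0), -100.0, 0.80),  # dominated by the second candidate
]

front = pareto_front(candidates)
for c in front:
    print(c.mask, c.log_likelihood, c.accuracy)
```

The surviving candidates trade likelihood against accuracy, so neither is strictly preferable; choosing among them is the decision-making step that is left to an a posteriori process.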



