Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Feature and model selection have attracted considerable research attention because of their impact on classifiers' performance. The two selections are usually performed separately, but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier by identifying the best subset of variables together with the optimal parameter values. Although GA-SVM is an effective method, it is computationally expensive, so a rougher but cheaper evaluation criterion can be considered. This paper investigates a combined approach of genetic algorithms and kernel matrix criteria to perform feature and model selection simultaneously for SVM classification problems. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm (KMGA) method.
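The abstract stays at the idea level, so the following is a minimal, self-contained sketch of how such a kernel-matrix-driven GA could look. It is an assumption-laden illustration, not the authors' KMGA implementation: the chromosome encoding (binary feature mask plus one RBF gamma gene), the use of kernel-target alignment as the kernel matrix criterion, and all names (rbf_gram, kmga_sketch, and so on) are choices made for this example. The point it demonstrates is that each candidate is scored from a Gram matrix alone, with no SVM training inside the GA loop.

```python
# Illustrative sketch only (assumptions noted above); requires numpy.
import numpy as np

rng = np.random.default_rng(0)

def rbf_gram(X, gamma):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def alignment(K, y):
    # Kernel-target alignment: <K, yy^T>_F / (||K||_F * ||yy^T||_F),
    # where ||yy^T||_F = n for labels y in {-1, +1}.
    return np.sum(K * np.outer(y, y)) / (np.linalg.norm(K) * len(y))

def fitness(chrom, X, y):
    # Chromosome layout (an assumption of this sketch): a binary feature
    # mask followed by log10(gamma) as the final gene.
    mask = chrom[:-1].astype(bool)
    if not mask.any():
        return -1.0  # penalize the empty feature subset
    return alignment(rbf_gram(X[:, mask], 10.0 ** chrom[-1]), y)

def kmga_sketch(X, y, pop=20, gens=30, p_mut=0.1):
    n_feat = X.shape[1]
    P = np.hstack([rng.integers(0, 2, (pop, n_feat)).astype(float),
                   rng.uniform(-3.0, 1.0, (pop, 1))])
    for _ in range(gens):
        f = np.array([fitness(c, X, y) for c in P])
        # Binary tournament selection on alignment scores.
        idx = [max(rng.choice(pop, 2), key=lambda i: f[i]) for _ in range(pop)]
        P = P[idx]
        # One-point crossover on the feature-mask genes.
        for i in range(0, pop - 1, 2):
            cut = int(rng.integers(1, n_feat))
            P[i, :cut], P[i + 1, :cut] = P[i + 1, :cut].copy(), P[i, :cut].copy()
        # Bit-flip mutation of the mask, Gaussian jitter of the kernel gene.
        flips = rng.random((pop, n_feat)) < p_mut
        P[:, :n_feat] = np.where(flips, 1.0 - P[:, :n_feat], P[:, :n_feat])
        P[:, -1] += rng.normal(0.0, 0.1, pop)
    f = np.array([fitness(c, X, y) for c in P])
    best = P[np.argmax(f)]
    return best[:-1].astype(bool), 10.0 ** best[-1]

# Toy usage: only features 0 and 2 carry the class signal.
X = rng.normal(size=(80, 6))
y = np.where(X[:, 0] + X[:, 2] > 0.0, 1, -1)
mask, gamma = kmga_sketch(X, y)
print("selected features:", np.flatnonzero(mask), "gamma:", round(gamma, 4))
```

Because the alignment score only asks how well the Gram matrix matches the label matrix, each generation costs one O(n^2) kernel evaluation per candidate rather than a full SVM training and cross-validation run, which is what makes the rough criterion cheaper than a plain GA-SVM wrapper.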
