Feature Selection for Breast Cancer Diagnosis: A Case-Based Wrapper Approach

This article addresses feature selection for breast cancer diagnosis. The proposed method is a wrapper approach based on a genetic algorithm (GA) and case-based reasoning (CBR). The GA searches the space of possible feature subsets, and CBR is used to evaluate each candidate subset. Experimental results show that the proposed model is comparable to other models on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
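The wrapper loop described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, since the abstract does not state the GA operators, the CBR similarity measure, or the fitness function: candidate subsets are encoded as bit masks, CBR retrieval is approximated by leave-one-out 1-nearest-neighbour matching with squared Euclidean distance, and the GA uses tournament selection, one-point crossover, bit-flip mutation, and elitism. The function names (`nearest_neighbor_accuracy`, `ga_select`, `make_toy_data`) and the synthetic data are hypothetical and stand in for the WDBC dataset.

```python
import random


def nearest_neighbor_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only the features where mask is 1.

    Stand-in for CBR retrieval: each case is classified by the label of
    its most similar stored case (assumed squared Euclidean distance).
    """
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, query in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, case in enumerate(X):
            if k == i:
                continue  # leave the query case out of the case base
            d = sum((query[j] - case[j]) ** 2 for j in idx)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Search feature subsets with a simple GA; fitness is the 1-NN accuracy."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        return nearest_neighbor_accuracy(X, y, mask)

    # Random initial population, plus the "all features" mask as a baseline.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] * n
    best = max(pop, key=fitness)

    for _ in range(generations):
        scored = [(fitness(m), m) for m in pop]

        def tournament():
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)


def make_toy_data(n_cases=40, seed=1):
    """Synthetic two-class data: 2 informative features followed by 4 noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cases):
        label = i % 2
        informative = [label + rng.gauss(0, 0.3), label + rng.gauss(0, 0.3)]
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = ga_select(X, y)
    print("selected mask:", mask, "accuracy:", acc)
```

Because the all-features mask is seeded into the initial population and the best mask is carried over each generation, the returned subset scores at least as well as using every feature on the data the search saw. That score is optimistic for unseen cases, so a held-out test set would be needed for a fair comparison with other models.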



