Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis

Public health is one of today's most critical issues, so there is great interest in improving disease-detection technologies. Machine learning combined with feature selection has made it possible to aid the diagnosis of several diseases, such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm that selects the threshold parameter automatically, which helps improve generalization performance on high-dimensional data such as mass spectrometry. We performed a comparative analysis on multiple cancer datasets against the well-known Recursive Feature Elimination algorithm and against our original proposal; the results show improved classification performance that is highly competitive with current techniques.
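To make the idea of automatic threshold search concrete, the sketch below shows one plausible form it can take. This is an illustration under our own assumptions, not the exact Heat Map Based Feature Selection procedure: it uses per-feature ANOVA F-scores as a stand-in for the heat-map intensity, a linear SVM as the evaluator, and scikit-learn utilities throughout; the candidate-quantile grid and the helper name auto_threshold_select are likewise hypothetical.

# A minimal sketch of automatic threshold search over a per-feature
# relevance "heat map": score each feature, then pick the score cutoff
# that maximizes cross-validated accuracy instead of hand-tuning it.
# This is an assumed illustration, not the authors' exact algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def auto_threshold_select(X, y, n_thresholds=20, cv=5):
    """Return the feature mask whose relevance cutoff gives the best CV score."""
    scores, _ = f_classif(X, y)          # per-feature relevance (heat-map proxy)
    scores = np.nan_to_num(scores)
    # Candidate thresholds drawn from the empirical score distribution.
    candidates = np.quantile(scores, np.linspace(0.0, 0.95, n_thresholds))
    best_mask, best_cv = None, -np.inf
    for t in candidates:
        mask = scores > t
        if mask.sum() == 0:              # skip thresholds that discard everything
            continue
        cv_acc = cross_val_score(LinearSVC(dual=False), X[:, mask], y, cv=cv).mean()
        if cv_acc > best_cv:
            best_cv, best_mask = cv_acc, mask
    return best_mask, best_cv

if __name__ == "__main__":
    # High-dimensional toy data standing in for mass spectrometry features.
    X, y = make_classification(n_samples=100, n_features=2000,
                               n_informative=20, random_state=0)
    mask, acc = auto_threshold_select(X, y)
    print(f"kept {mask.sum()} / {X.shape[1]} features, CV accuracy {acc:.3f}")

The key design point the sketch captures is that the threshold is chosen by the data via cross-validation rather than fixed a priori, which is what allows the selection step to adapt across datasets of very different dimensionality.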



