Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

As DNA microarray data contain relatively small
sample sizes compared to the number of genes, high dimensional
models are often employed. In high dimensional models, the selection
of the tuning (or penalty) parameter is often one of the crucial
parts of the modeling. Cross-validation is one of the most common
methods for tuning parameter selection: it selects the parameter
value with the smallest cross-validated score. However, selecting a
single value as an ‘optimal’ value for the parameter can be very
unstable due to sampling variation, since the sample sizes of
microarray data are often small. Our approach is to first choose multiple candidate
values of the tuning parameter, then average the candidates with different weights
depending on their performance. The additional step of estimating the weights
and averaging the candidates rarely increases the computational cost,
while it can considerably improve on traditional cross-validation. Using
real and simulated data sets, we show that the values selected by the
suggested methods often lead to more stable parameter selection as well
as improved detection of significant genetic variables compared to
traditional cross-validation.
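To make the contrast concrete, the following is a minimal sketch in Python of the two selection strategies on synthetic small-n, large-p data: standard cross-validation, which keeps the single penalty value with the smallest cross-validated error, versus averaging several top candidates with performance-based weights. The lasso fit via scikit-learn and the softmax-style weighting scheme are illustrative assumptions, not the paper's exact estimator or weighting rule.

    # Sketch only: contrasts the single argmin-of-CV-error choice with a
    # weighted average of candidate penalties. The weighting rule below
    # (softmax of negative CV error over the k best candidates) is an
    # assumed, illustrative scheme, not the authors' estimator.
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(0)
    n, p = 60, 500                       # small n, large p, as in microarray data
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:5] = 2.0                       # only five truly active variables
    y = X @ beta + rng.standard_normal(n)

    lambdas = np.logspace(-2, 1, 30)     # grid of candidate tuning parameters
    cv = KFold(n_splits=5, shuffle=True, random_state=0)

    # Cross-validated prediction error for each candidate lambda
    cv_err = np.zeros(len(lambdas))
    for train, test in cv.split(X):
        for j, lam in enumerate(lambdas):
            fit = Lasso(alpha=lam, max_iter=10000).fit(X[train], y[train])
            cv_err[j] += np.mean((y[test] - fit.predict(X[test])) ** 2)
    cv_err /= cv.get_n_splits()

    # Traditional CV: a single 'optimal' value
    lam_cv = lambdas[np.argmin(cv_err)]

    # Weighted average of the k best candidates, weights decreasing in CV error
    k = 5
    top = np.argsort(cv_err)[:k]
    w = np.exp(-cv_err[top])
    w /= w.sum()
    lam_avg = float(np.dot(w, lambdas[top]))

    print(f"single CV choice: {lam_cv:.4f}, weighted average: {lam_avg:.4f}")

In repeated runs with different random folds, the averaged value tends to fluctuate less than the single argmin choice, which illustrates the instability of selecting one ‘optimal’ value that the abstract describes.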
