Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Sparse representation, which can represent high-dimensional data effectively, has been successfully applied to computer vision and pattern recognition problems. However, it does not take the label information of the data samples into account. To overcome this limitation, we develop a novel dimensionality reduction algorithm, discriminatively regularized sparse subspace learning (DR-SSL), in this paper. The proposed DR-SSL algorithm not only uses sparse representation to model the data but also effectively employs label information to guide the dimensionality reduction procedure. In addition, the presented algorithm can effectively deal with the out-of-sample problem. Experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.
