Dimension Reduction of Microarray Data Based on Local Principal Component

Analysis and visualization of microarray data is of great assistance to biologists and clinicians in the diagnosis and treatment of patients. It allows clinicians to better understand the structure of microarray data and facilitates the understanding of gene expression in cells. However, a microarray data set is complex: it has thousands of features and a very small number of observations. Such very high dimensional data often contains noise and non-useful information, with only a small number of features relevant to a disease or genotype. This paper proposes a non-linear dimensionality reduction algorithm, Local Principal Component (LPC), which aims to map high dimensional data to a lower dimensional space. The reduced data represents the most important variables underlying the original data. Experimental results and comparisons are presented to show the quality of the proposed algorithm. Moreover, the experiments show how this algorithm reduces high dimensional data whilst preserving, in the low dimensional space, the neighbourhoods that the points have in the high dimensional space.
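To make the notion of neighbourhood preservation concrete, the following is a minimal sketch of one way it could be measured. It is not the LPC algorithm, whose details are not given here: plain global PCA is used as a stand-in reduction, the data is synthetic rather than microarray data, and the names `pca_project`, `knn_indices` and `neighbourhood_overlap` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (global PCA stand-in)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def knn_indices(X, k):
    """Indices of the k nearest neighbours (Euclidean) of each row, excluding the row itself."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def neighbourhood_overlap(X_high, X_low, k):
    """Average fraction of each point's k nearest neighbours shared by both spaces."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    shared = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(shared)) / k

# Synthetic stand-in for a microarray matrix: few observations, many features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2000))

X_low = pca_project(X, n_components=5)
score = neighbourhood_overlap(X, X_low, k=5)
print(f"neighbourhood overlap at k=5: {score:.2f}")
```

A score of 1 means every point keeps exactly the same k nearest neighbours after reduction, and a score near 0 means the neighbourhoods are largely lost. Any reduction method could be substituted for `pca_project` to compare methods on the same footing.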




