A Novel Microarray Biclustering Algorithm

Biclustering aims to identify biclusters that reveal potential local patterns in a microarray matrix. A bicluster is a sub-matrix of the microarray consisting of a subset of genes that are co-regulated under a subset of conditions. In this study, we extend the idea of subspace clustering to present a K-biclusters clustering (KBC) algorithm for the microarray biclustering problem. Besides minimizing the dissimilarities between genes and bicluster centers within all biclusters, the objective function of the KBC algorithm also minimizes the residues within all biclusters, based on the mean squared residue model. In addition, the objective function maximizes the entropy of the conditions, which encourages more conditions to contribute to the identification of biclusters. The KBC algorithm adopts a K-means-type clustering process to optimize the partition into K biclusters efficiently. Experiments on a real microarray dataset are presented to show the performance of the proposed KBC algorithm.
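The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the KBC objective function, which the abstract does not state. It assumes the standard mean squared residue of Cheng and Church for a sub-matrix and the Shannon entropy of a set of condition weights. The function names `mean_squared_residue` and `weight_entropy`, the toy matrix, and the idea of one weight per condition are hypothetical choices made for illustration only.

```python
import math


def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix given by rows x cols.

    Assumption: the standard Cheng-Church definition, where the residue of
    an entry is a_ij - rowmean_i - colmean_j + overall_mean.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


def weight_entropy(weights):
    """Shannon entropy (natural log) of condition weights that sum to 1."""
    return -sum(w * math.log(w) for w in weights if w > 0)


# Toy expression matrix (hypothetical values): rows are genes, columns are conditions.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 1.0, 5.0],
]

# Genes 0-2 under conditions 0-2 differ only by a row offset, so the residue vanishes.
coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
# Swapping in gene 3 and condition 3 breaks the additive pattern.
incoherent = mean_squared_residue(data, rows=[0, 1, 3], cols=[0, 1, 3])

uniform = weight_entropy([0.25, 0.25, 0.25, 0.25])
peaked = weight_entropy([0.97, 0.01, 0.01, 0.01])
```

Under these assumptions, a coherent bicluster has a residue near zero, and weights spread evenly over the conditions have higher entropy than weights concentrated on a single condition. This is the sense in which maximizing entropy favors biclusters supported by more conditions.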



