Dynamic Bayesian Networks Modeling for Inferring Genetic Regulatory Networks by Search Strategy: Comparison between Greedy Hill Climbing and MCMC Methods

Using Dynamic Bayesian Networks (DBN) to model genetic regulatory networks from gene expression data is one of the major paradigms for inferring the interactions among genes. Averaging a collection of models for predicting network is desired, rather than relying on a single high scoring model. In this paper, two kinds of model searching approaches are compared, which are Greedy hill-climbing Search with Restarts (GSR) and Markov Chain Monte Carlo (MCMC) methods. The GSR is preferred in many papers, but there is no such comparison study about which one is better for DBN models. Different types of experiments have been carried out to try to give a benchmark test to these approaches. Our experimental results demonstrated that on average the MCMC methods outperform the GSR in accuracy of predicted network, and having the comparable performance in time efficiency. By proposing the different variations of MCMC and employing simulated annealing strategy, the MCMC methods become more efficient and stable. Apart from comparisons between these approaches, another objective of this study is to investigate the feasibility of using DBN modeling approaches for inferring gene networks from few snapshots of high dimensional gene profiles. Through synthetic data experiments as well as systematic data experiments, the experimental results revealed how the performances of these approaches can be influenced as the target gene network varies in the network size, data size, as well as system complexity.





References:
[1] T. Akutsu, S. Miyano, and S. Kuhara. Identification of genetic networks
from a small number of gene expression patterns under the boolean
network model. Pac. Symp. Biocomput, page 17C28, 1999.
[2] U. Alon, N. Barkai, D. Notterman, Ybarra S.and Mack D. Gish, K., and
A. J. Levine. Broad patterns of gene expression revealed by clustering
analysis of tumor and normal colon tissues probed by oligonucleotide
arrays. Proc. Nat. Acad. Sci. USA, 96:6745C6750, 1999.
[3] A. Ben-Dor, R. Shamir, and Z. Yakhini. Clustering gene expression
patterns. Journal of Computational Biology, 6:281C297, 1999.
[4] Xue-wen Chen, Gopalakrishna Anantha, and Xinkun Wang. An effective
structure learning method for constructing gene networks. Bioinformatics,
22(11):1367-1374, 2006.
[5] S. Chib and E. Greenberg. Understanding the metropolis-hastings
algorithm. Amer. Statist., 49:327-335, 1995.
[6] David Maxwell Chickering, David Heckerman, and Christopher Meek.
Large-sample learning of bayesian networks is np-hard. J. Mach. Learn.
Res., 5:1287-1330, 2004.
[7] F. Cooper and E. H. Herskovits. A bayesian method for the induction of
probabilistic networks from data. Machine Learning, 9:309-347, 1992.
[8] Honghua Dai, Gang Li, and Yiqing Tu. An empirical study of encoding
schemes and search strategies in discovering causal networks. In
Machine Learning: ECML 2002: 13th European Conference on Machine
Learning, Helsinki, Finland, August 19-23, 2002. Proceedings.
[9] LAWRENCE A. DAVID and CHRIS H. WIGGINS. Benchmarking of
Dynamic Bayesian Networks Inferred from Stochastic Time-Series Data.
Ann NY Acad Sci, 1115(1):90-101, 2007.
[10] M. Eisen, P. Spellman, P. Brown, and D. Botstein. Cluster analysis and
display of genomewide expression patterns. Proc. Nat. Acad. Sci. USA,
95(14863C14868), 1998.
[11] N. Friedman and D. Koller. Being bayesian about network structure: A
bayesian approach to structure discovery in bayesian networks. 2001.
[12] Dan Geiger and David Heckerman. Learning gaussian networks. (MSRTR-
94-10), 1994.
[13] A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Using
graphical models and genomic expression data to statistically validate
models of genetic regulatory networks. Pac. Symp. Biocomput., 6:422-
433, 2001.
[14] David Heckerman. Bayesian networks for data mining. Data Min.
Knowl. Discov., 1(1):79-119, 1997.
[15] David Heckerman, Dan Geiger, and David M. Chickering. Learning
bayesian networks: The combination of knowledge and statistical data.
Mach. Learn., 20(3):197-243, 1995.
[16] K. Hoffgen. Learning and robust learning of product distributions.
in:Sixth Annual Workshop on Computational Learning Theory, pages
77-83, 1993.
[17] Dirk Husmeier. Sensitivity and specificity of inferring genetic regulatory
interactions from microarray experiments with dynamic Bayesian
networks. Bioinformatics, 19(17):2271-2282, 2003.
[18] S. Imoto, T. Goto, and S. Miyano. Estimation of genetic networks and
functional structures between genes by using bayesian networks and
nonparametric regression. Pac. Symp. Biocomput., 7:175-186, 2002.
[19] Seiya Imoto, Tomoyuki Higuchi, Takao Goto, Kousuke Tashiro, Satoru
Kuhara, and Satoru Miyano. Combining microarrays and biological
knowledge for estimating gene networks via bayesian networks. In
CSB -03: Proceedings of the IEEE Computer Society Conference on
Bioinformatics, page 104, 2003.
[20] B. Kholodenko, A. Kiyatkin, F. Bruggeman, E. Sontag, H. Westerhoff,
and J. Hoek. Untangling the wires: a strategy to trace functional
interactions in signaling and gene networks. Proc. Natl. Acad. Sci.,
99:12841C12846, 2002.
[21] Tie-Fei Liu, Wing-Kin Sung, and Ankush Mittal. Learning multi-time
delay gene network using bayesian network framework. In ICTAI -04:
Proceedings of the 16th IEEE International Conference on Tools with
Artificial Intelligence (ICTAI-04), pages 640-645, 2004.
[22] D.J. Lockhart and E.A. Winzeler. Genomics, gene expression and dna
arrays. Nature, 405:827-836, 2000.
[23] N. Nariai, S. Kim, S. Imoto, and S. Miyano. Using protein-protein
interactions for refining gene networks estimated from microarray data
by bayesian networks. Pac Symp Biocomput, pages 336-347, 2004.
[24] D. Peer, A. Regev, G. Elidan, and Friedman. Inferring subnetworks from
perturbed expression profiles. Bioinformatics, 17:S215CS224, 2001.
[25] D. Peer, A. Regev, G. Elidan, and Friedman. Inferring subnetworks from
perturbed expression profiles. Bioinformatics, 17(S215CS224), 2001.
[26] a Phillip P. Le, a Amit Bahl, and Lyle H. Ungar. Using prior knowledge
to improve genetic network reconstruction from microarray data. In
Silico Biology, 4(0027), 2004.
[27] Claudia Rangel, John Angus, Zoubin Ghahramani, Maria Lioumi, Elizabeth
Sotheran, Alessia Gaiba, David L. Wild, and Francesco Falciani.
Modeling T-cell activation using gene expression profiling and statespace
models. Bioinformatics, 20(9):1361-1372, 2004.
[28] Claudia Rangel, David L. Wild, and Francesco Falciani. Modelling biological
responses using gene expression profiling and linear dynamical
systems. 2001.
[29] Nitzan Rosenfeld and Uri Alon. Response delays and the structure of
transcription networks. Journal of Molecular Biology, 329.
[30] Adriano V. Werhli and Dirk Husmeier. Reconstructing gene regulatory
networks with bayesian networks by combining expression data with
multiple sources of prior knowledge. Statistical Applications in Genetics
and Molecular Biology, 6.
[31] Z.Bar-Joseph, G.K.Gerber, T.I.Lee, N.J.Rinaldi, J.Y.Yoo, F.Robert,
D.B.Gordon, E.Fraenkel, T.S.Jaakkola, and R.A.Young. Computational
discovery of gene modules and regulatory networks. Nat. Biotechnol,
21:1337-1342, 2002.