An Integrative Bayesian Approach to Supporting the Prediction of Protein-Protein Interactions: A Case Study in Human Heart Failure

Recent years have seen a growing trend towards integrating multiple information sources to support large-scale prediction of protein-protein interaction (PPI) networks in model organisms. Despite advances in computational approaches, the combination of multiple "omic" datasets representing the same type of data, e.g. different gene expression datasets, has not been rigorously studied. In addition, the inference capability of powerful approaches, such as fully connected Bayesian networks, needs further investigation in the context of PPI network prediction. This paper addresses these limitations by proposing a Bayesian approach to integrate multiple datasets, some of which encode the same type of "omic" data, to support the identification of PPI networks. The reported case study combined three gene expression datasets relevant to human heart failure (HF). In comparison with two traditional methods, the naive Bayesian and maximum likelihood ratio approaches, the proposed technique can accurately identify known PPIs and can be applied to infer potentially novel interactions.



