A Bayesian Kernel for the Prediction of Protein-Protein Interactions

Understanding protein function is a major goal of the post-genomic era. Proteins rarely function alone; they usually work in the context of other proteins. Studying the interaction partners of a protein is therefore highly relevant to understanding its function. Machine learning techniques have been widely applied to predict protein-protein interactions, and kernel functions play an important role in the success of such techniques. Choosing an appropriate kernel function can improve the accuracy of a binary classifier such as the support vector machine (SVM). In this paper, we describe a Bayesian kernel for the support vector machine to predict protein-protein interactions. The Bayesian kernel can improve classifier performance by incorporating the probabilistic characteristics of the available experimental protein-protein interaction data, which were compiled from different sources. In addition, the probabilistic output of the Bayesian kernel can help biologists focus further research on the interactions predicted with high probability. The results show that the Bayesian kernel improves classifier accuracy compared with the standard SVM kernels, which implies that protein-protein interactions can be predicted more accurately with the Bayesian kernel than with the standard SVM kernels.



