Protein Residue Contact Prediction using Support Vector Machine

A protein residue contact map is a compact two-dimensional representation of a protein's three-dimensional structure. Because of the structural information a contact map holds, contact prediction has attracted considerable research attention over the past decade. Artificial intelligence approaches, including neural networks, genetic programming, hidden Markov models, and support vector machines, have been widely adopted for this task. However, prediction performance has not generalized well, probably because it depends on the data used to train the prediction model. This shows how strongly the features, or input information, affect prediction performance. In this research, a support vector machine was used to predict protein residue contact maps from different combinations of features in order to demonstrate and analyze the effectiveness of those features.
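To make the prediction target concrete, the following is a minimal sketch of how a contact map could be derived from residue coordinates. It is an illustration, not the method used in this work: the function name `contact_map`, the 8 Å distance threshold, the `min_separation` parameter, and the toy coordinates are all assumptions, and one representative atom per residue is assumed.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=1):
    """Build a symmetric 0/1 contact matrix from per-residue coordinates.

    coords: list of (x, y, z) tuples, one representative atom per residue.
    threshold: distance cutoff in angstroms (8.0 is an assumed value).
    min_separation: minimum sequence distance |i - j| for a pair to count.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            # Two residues are in contact when their distance is below the cutoff.
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical toy chain: three residues spaced 3.8 angstroms apart, one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

Each entry of the resulting matrix is a binary label for one residue pair, which is the kind of target a classifier such as a support vector machine could be trained to predict from sequence-derived features.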



