Molecular Evolutionary Analysis of Yeast Protein Interaction Network

Understanding life as a biological system requires an evolutionary perspective. Protein interaction data are rapidly accumulating and are well suited to system-level evolutionary analysis. We analyzed the yeast protein interaction network using both mathematical and biological approaches. In this poster presentation, we infer the evolutionary birth periods of yeast proteins by reconstructing their phylogenetic profiles. Hub proteins, which have a high connection degree, have been thought to be evolutionarily old. However, our analysis showed that hub proteins are entirely evolutionarily new. We also examined the evolutionary processes of protein complexes and found that the member proteins of a complex tend to have appeared in the same evolutionary period. These results suggest that the protein interaction network evolved through modules that form functional units. We also reconstructed standardized phylogenetic trees and calculated the evolutionary rates of yeast proteins; there was no obvious correlation between the evolutionary rates and the connection degrees of yeast proteins.
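
The idea of dating a protein from its phylogenetic profile and relating that date to its connection degree can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the taxonomic groups in `PERIODS`, the protein names, the toy interaction list, and the rule that a protein is as old as the most distant group containing a homolog (which ignores gene loss and horizontal transfer). It is not the procedure used in the study.

```python
from collections import defaultdict

# Hypothetical taxonomic groups, ordered from closest to yeast to most distant.
PERIODS = ["Saccharomyces", "Ascomycota", "Fungi", "Eukaryota", "Cellular organisms"]


def birth_period(profile):
    """Return the most distant group in which a homolog is present.

    profile: dict mapping a group name to True/False (homolog found or not).
    Assumption: presence in a distant group means the protein predates it.
    """
    oldest = PERIODS[0]  # a yeast protein is at least as old as its own lineage
    for period in PERIODS:
        if profile.get(period, False):
            oldest = period
    return oldest


def degrees(edges):
    """Count the number of interaction partners (connection degree) per protein."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def mean_degree_by_period(profiles, edges):
    """Average connection degree of the proteins assigned to each birth period."""
    deg = degrees(edges)
    grouped = defaultdict(list)
    for protein, profile in profiles.items():
        grouped[birth_period(profile)].append(deg.get(protein, 0))
    return {period: sum(v) / len(v) for period, v in grouped.items()}


# Toy data (hypothetical proteins P1..P4 and interactions).
profiles = {
    "P1": {"Saccharomyces": True, "Ascomycota": True, "Fungi": True, "Eukaryota": True},
    "P2": {"Saccharomyces": True},
    "P3": {"Saccharomyces": True, "Ascomycota": True},
    "P4": {"Saccharomyces": True},
}
edges = [("P2", "P1"), ("P2", "P3"), ("P2", "P4"), ("P1", "P3")]

result = mean_degree_by_period(profiles, edges)
print(result)
```

Comparing the per-period averages is one simple way to ask whether highly connected proteins are concentrated in old or new periods; a real analysis would need many genomes and a statistical test.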

Predicting Protein-Protein Interactions from Protein Sequences Using Phylogenetic Profiles

In this study, we develop a high-accuracy method for predicting protein-protein interactions. The key feature of the proposed method is that it uses only the sequence information of proteins to predict interactions. The method extracts phylogenetic profiles of proteins from their sequences. By combining the phylogenetic profiles of two proteins, which record the existence of homologs in different species, and fitting the combined profile to a statistical model, it is possible to predict the interaction status of the two proteins. For this purpose, we apply a collection of pattern recognition techniques to a dataset of combined phylogenetic profiles of protein pairs. To find the classification method that best predicts the interaction status of protein pairs, we applied Support Vector Machines, feature extraction using ReliefF, Naive Bayes classification, K-Nearest Neighbor classification, Decision Trees, and Random Forest classification. Random Forest classification outperformed all other methods, with a prediction accuracy of 76.93%.
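
The pipeline described above (build a combined profile for a protein pair, then classify it) can be sketched in a few lines of plain Python. The abstract does not say how the two profiles are combined, so the per-species sum used in `combine_profiles` is an assumption, as are the toy profiles and labels. A k-nearest-neighbor vote is used here only because it is short to write without external libraries; it is one of the methods listed above, not the best-performing one (Random Forest).

```python
from collections import Counter


def combine_profiles(a, b):
    """Combine two binary phylogenetic profiles species by species.

    Assumed encoding: 0 = homolog in neither protein's profile,
    1 = in exactly one, 2 = in both. The result is symmetric in a and b.
    """
    return tuple(x + y for x, y in zip(a, b))


def hamming(u, v):
    """Number of species at which two combined profiles differ."""
    return sum(1 for x, y in zip(u, v) if x != y)


def knn_predict(train, query, k=3):
    """Majority label among the k training pairs closest to the query.

    train: list of (combined_profile, label) with label 1 = interacting, 0 = not.
    """
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy binary profiles over five hypothetical species (1 = homolog present).
pA = (1, 1, 0, 1, 0)
pB = (1, 1, 0, 1, 0)
pC = (1, 0, 1, 0, 1)
pD = (0, 1, 0, 1, 1)
pE = (1, 0, 1, 0, 1)
pF = (0, 0, 1, 1, 0)

# Hypothetical labeled protein pairs.
train = [
    (combine_profiles(pA, pB), 1),
    (combine_profiles(pC, pE), 1),
    (combine_profiles(pA, pC), 0),
    (combine_profiles(pB, pD), 0),
    (combine_profiles(pD, pF), 0),
    (combine_profiles(pC, pF), 0),
]

# An unlabeled pair whose members share the same profile.
query = combine_profiles((1, 1, 0, 1, 0), (1, 1, 0, 1, 0))
prediction = knn_predict(train, query, k=1)
print(prediction)
```

In practice the profiles would span many genomes, and accuracy would be estimated by cross-validation on a labeled set of interacting and non-interacting pairs rather than on a single query.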