Abstract: Bioinformatics methods for predicting T cell
coreceptor usage from the membrane protein sequences of HIV-1 are
investigated. In this study, we aim to propose an effective prediction
method for dealing with the three-class classification problem of
CXCR4 (X4), CCR5 (R5) and CCR5/CXCR4 (R5X4). Our contributions
are as follows: 1) proposing a feature set of informative
physicochemical properties that, combined with an SVM, achieves a
prediction test accuracy of 81.48%, compared with 70.00% for the
existing method; 2) establishing a large, up-to-date data set,
expanded from 159 to 1225 sequences, to verify the proposed
prediction method, on which the mean test accuracy is 88.59%; and 3)
analyzing the set of 14 informative physicochemical properties to
further understand the characteristics of HIV-1 coreceptors.
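As an illustration of how a protein sequence might be turned into
physicochemical features before being passed to a classifier such as an
SVM, the sketch below averages per-residue property values over a
sequence. The two scales used (the Kyte-Doolittle hydropathy index and a
simplified side-chain charge) and the names `mean_property` and `encode`
are assumptions chosen for illustration. They are not the 14 properties
selected in this study, and the encoding actually used may differ.

```python
# Kyte-Doolittle hydropathy index (illustrative stand-in property,
# not one of the study's selected properties).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge near neutral pH (an assumption):
# K and R count as +1, D and E as -1, everything else as neutral.
CHARGE = {aa: 0.0 for aa in KYTE_DOOLITTLE}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def mean_property(sequence, scale):
    """Average a per-residue property, skipping unknown symbols."""
    values = [scale[res] for res in sequence.upper() if res in scale]
    return sum(values) / len(values) if values else 0.0


def encode(sequence):
    """Map a sequence to a fixed-length feature vector (hypothetical)."""
    return [
        mean_property(sequence, KYTE_DOOLITTLE),
        mean_property(sequence, CHARGE),
    ]


# Toy sequence, not taken from the study's data set.
features = encode("ACDKR")
```

Each sequence becomes a vector of the same length regardless of how long
the sequence is, which is what a classifier such as an SVM requires as
input.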
Abstract: The subchloroplast locations of proteins are correlated
with their functions. In contrast to the large number of available
protein sequences, much less is known about their locations and
functions. Experimental identification of protein locations and
functions is costly and time-consuming. Accurate prediction of
protein subchloroplast locations can accelerate the study of
protein functions in the chloroplast. This study proposes a Random
Forest-based method, ChloroRF, to predict protein subchloroplast
locations using interpretable physicochemical properties. In addition
to achieving high prediction accuracy, ChloroRF can select important
physicochemical properties. These properties are also analyzed to
provide insights into the underlying mechanism.
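To make concrete how a tree ensemble can rank input properties, the
sketch below trains a deliberately simplified forest of one-split trees
(decision stumps) on bootstrap samples with random feature subsets, and
scores each feature by the Gini impurity decrease it contributes. This is
a stand-in for a full Random Forest and is not the ChloroRF procedure
itself. The function names, the toy data, and the labels `loc_A` and
`loc_B` are all hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def best_stump(X, y, features):
    """Best single split: (gain, feature, threshold, left, right)."""
    parent, best = gini(y), None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(y)
            gain = parent - child
            if best is None or gain > best[0]:
                best = (gain, f, t, majority(left), majority(right))
    return best


def fit_stump_forest(X, y, n_trees=100, max_features=None, seed=0):
    """Bagged stumps; importance = normalized summed impurity decrease."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max_features or max(1, int(n_features ** 0.5))
    stumps, importance = [], [0.0] * n_features
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stump = best_stump(Xb, yb, rng.sample(range(n_features), k))
        if stump is None:
            continue
        stumps.append(stump)
        importance[stump[1]] += stump[0]
    total = sum(importance)
    if total > 0:
        importance = [v / total for v in importance]
    return stumps, importance


def predict(stumps, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right
                    for _, f, t, left, right in stumps)
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [[0.10, 0, 1], [0.15, 1, 0], [0.20, 0, 0], [0.25, 1, 1],
     [0.30, 0, 1], [0.35, 1, 0], [0.70, 0, 1], [0.75, 1, 0],
     [0.80, 0, 0], [0.85, 1, 1], [0.90, 0, 1], [0.95, 1, 0]]
y = ["loc_A"] * 6 + ["loc_B"] * 6

stumps, importance = fit_stump_forest(X, y, n_trees=100, max_features=2)
ranking = sorted(range(len(importance)),
                 key=lambda f: importance[f], reverse=True)
```

Here `ranking` orders the feature indices from most to least important.
With real data, each index would correspond to one physicochemical
property, so the top-ranked entries are the candidates for further
analysis.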
Abstract: γ-Turns play important roles in protein folding and
molecular recognition. The prediction and analysis of γ-turn types are
important both for protein structure prediction and for better
understanding the characteristics of different γ-turn types. This study
proposes a physicochemical property-based decision tree (PPDT)
method to interpretably predict γ-turn types. In addition to the good
prediction performance of PPDT, three simple, human-interpretable
IF-THEN rules are extracted from the decision tree
constructed by PPDT. The identified informative physicochemical
properties and concise rules provide a simple way of discriminating
and understanding γ-turn types.
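The way IF-THEN rules can be read off a decision tree may be sketched as
follows: every root-to-leaf path becomes one rule whose premise is the
conjunction of the split conditions along that path. The tree below, its
property names, its thresholds, and the labels `type_A` and `type_B` are
hypothetical placeholders. They are not the rules or properties reported
for PPDT.

```python
# A node is either a leaf label (str) or a tuple
# (feature_name, threshold, left_subtree, right_subtree);
# the left branch is taken when feature <= threshold.
# All names and thresholds here are made up for illustration.
TREE = (
    "property_1", 0.5,
    "type_A",
    ("property_2", 1.2, "type_B", "type_A"),
)


def extract_rules(node, conditions=()):
    """Turn each root-to-leaf path into one IF-THEN rule string."""
    if isinstance(node, str):
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node}"]
    feature, threshold, left, right = node
    return (
        extract_rules(left, conditions + (f"{feature} <= {threshold}",))
        + extract_rules(right, conditions + (f"{feature} > {threshold}",))
    )


def classify(node, sample):
    """Follow the tree for one sample given as {feature_name: value}."""
    while not isinstance(node, str):
        feature, threshold, left, right = node
        node = left if sample[feature] <= threshold else right
    return node


RULES = extract_rules(TREE)
# RULES[0] is "IF property_1 <= 0.5 THEN type_A"
```

A tree with three leaves yields exactly three rules, so keeping the tree
small is what keeps the rule set short enough to read.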