Abstract: A protein's subchloroplast location is correlated with its
function. Although a large number of protein sequences are
available, much less is known about their locations and functions.
Experimental identification of protein locations and functions is
costly and time-consuming. Accurate prediction of protein
subchloroplast locations can accelerate the study of protein
functions in the chloroplast. This study proposes a Random
Forest-based method, ChloroRF, to predict protein subchloroplast
locations using interpretable physicochemical properties. In addition
to achieving high prediction accuracy, ChloroRF selects important
physicochemical properties, which are analyzed to provide insights
into the underlying mechanism.
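As an illustration of what an interpretable physicochemical feature can look like, the sketch below averages one per-residue property over a sequence. The Kyte-Doolittle hydropathy scale is used only as a familiar example; the abstract does not say which properties ChloroRF uses, and the function name `mean_property` is hypothetical.

```python
# Kyte-Doolittle hydropathy index, used here as one example property.
# Assumption: the abstract does not name the properties ChloroRF selects.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_property(sequence, scale=KYTE_DOOLITTLE):
    """Average a per-residue property over a sequence.

    Residues that are not in the scale (e.g. 'X') are skipped.
    """
    values = [scale[aa] for aa in sequence.upper() if aa in scale]
    if not values:
        raise ValueError("sequence contains no residues covered by the scale")
    return sum(values) / len(values)
```

Computing one such average per property gives a fixed-length feature vector in which every entry has a direct physicochemical meaning, which is the kind of input a feature-ranking classifier such as a Random Forest could work with.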
Abstract: γ-Turns play important roles in protein folding and
molecular recognition. Predicting and analyzing γ-turn types is
important both for protein structure prediction and for better
understanding the characteristics of different γ-turn types. This study
proposes a physicochemical property-based decision tree (PPDT)
method for interpretable prediction of γ-turn types. In addition to
achieving good prediction performance, PPDT yields three simple,
human-interpretable IF-THEN rules extracted from the decision tree
it constructs. The identified informative physicochemical
properties and concise rules provide a simple way of discriminating
and understanding γ-turn types.
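To show how IF-THEN rules can be read off a decision tree, the sketch below walks a small hand-written tree and emits one rule per leaf. The tree is purely illustrative: the property names, thresholds, and leaf assignments are placeholders rather than the rules reported for PPDT, and `extract_rules` and `predict` are hypothetical helper names.

```python
# Hypothetical toy tree: property names, thresholds, and leaf labels are
# placeholders, not the tree or rules constructed by PPDT.
TOY_TREE = {
    "feature": "property_A", "threshold": 0.5,
    "left": {"label": "inverse"},
    "right": {
        "feature": "property_B", "threshold": 1.2,
        "left": {"label": "classic"},
        "right": {"label": "inverse"},
    },
}


def extract_rules(node, conditions=()):
    """Return one IF-THEN rule string per leaf (depth-first, left to right)."""
    if "label" in node:
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node['label']}"]
    feat, thr = node["feature"], node["threshold"]
    rules = extract_rules(node["left"], conditions + (f"{feat} <= {thr}",))
    rules += extract_rules(node["right"], conditions + (f"{feat} > {thr}",))
    return rules


def predict(node, sample):
    """Follow the tree from the root to a leaf for one sample (a dict)."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]
```

Each root-to-leaf path becomes one rule, so a tree with three leaves yields three rules; applying the rules to a sample gives the same answer as `predict`.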