Abstract: The subchloroplast location of a protein is correlated with
its function. Although a large number of protein sequences are
available, far less is known about their locations and functions.
Experimental identification of protein locations and functions is
costly and time-consuming. Accurate prediction of protein
subchloroplast locations can accelerate the study of protein
function in the chloroplast. This study proposes a Random
Forest-based method, ChloroRF, to predict protein subchloroplast
locations using interpretable physicochemical properties. In addition
to achieving high prediction accuracy, ChloroRF can select the
physicochemical properties that are most important for prediction.
These properties are also analyzed to provide insight into the
underlying mechanism.
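As an illustration of how a random forest can rank input properties, the following is a minimal sketch and not the ChloroRF implementation. It assumes synthetic data in which only column 0 is informative, and it uses permutation importance (the drop in held-out accuracy when one column is shuffled) as a stand-in, since the abstract does not state which importance measure ChloroRF uses. All function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, rng, n_feats, depth=0, max_depth=5):
    """Grow a tree; each split considers only n_feats randomly chosen features."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)  # leaf: predict the majority class
    best = None
    for f in rng.sample(range(len(rows[0])), n_feats):
        for t in set(r[f] for r in rows):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)
    _, f, t, left, right = best
    kids = [build_tree([rows[i] for i in side], [labels[i] for i in side],
                       rng, n_feats, depth + 1, max_depth) for side in (left, right)]
    return (f, t, kids[0], kids[1])

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, t, low, high = node
        node = low if row[f] <= t else high
    return node

def fit_forest(rows, labels, rng, n_trees=15):
    """Each tree is trained on a bootstrap sample of the training rows."""
    n_feats = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_feats))
    return forest

def accuracy(forest, rows, labels):
    hits = 0
    for row, y in zip(rows, labels):
        vote = majority([predict_tree(tree, row) for tree in forest])
        hits += vote == y
    return hits / len(rows)

def permutation_importance(forest, rows, labels, rng):
    """Accuracy drop when one feature column is shuffled across rows."""
    base = accuracy(forest, rows, labels)
    drops = []
    for f in range(len(rows[0])):
        col = [r[f] for r in rows]
        rng.shuffle(col)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, col)]
        drops.append(base - accuracy(forest, shuffled, labels))
    return drops

# Synthetic stand-in data (assumption): column 0 separates the two classes,
# columns 1-4 are pure noise.
rng = random.Random(0)
rows, labels = [], []
for i in range(200):
    y = i % 2
    rows.append([rng.gauss(3.0 * y, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)])
    labels.append(y)

forest = fit_forest(rows[:140], labels[:140], rng)
importances = permutation_importance(forest, rows[140:], labels[140:], rng)
ranking = sorted(range(len(importances)), key=lambda f: importances[f], reverse=True)
print("features ranked by importance:", ranking)
```

In a setting like the one described above, each column would hold one physicochemical property computed from a protein sequence, and the ranking would indicate which properties the classifier relies on most.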
Abstract: Leo Breiman's Random Forests (RF) is a recent
development in tree-based classifiers and has quickly proven to be one
of the most important algorithms in the machine learning literature.
It has shown robust and improved classification results on standard
data sets. Ensemble learning algorithms such as AdaBoost and
Bagging have been an active area of research and have improved
classification results on several benchmark data sets, mainly with
decision trees as their base classifiers. In this paper we
apply these meta-learning techniques to random forests. We
study how ensembles of random forests perform on standard
data sets from the UCI repository. We compare the
original random forest algorithm with its ensemble counterparts
and discuss the results.
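To make the idea of an ensemble of random forests concrete, the following is a minimal sketch of Bagging with a random forest as the base classifier. It is not the authors' code: it covers Bagging only (not AdaBoost), it assumes a small synthetic two-class data set rather than a UCI data set, and all function names and parameter values are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def bootstrap(rows, labels, rng):
    """Sample len(rows) examples with replacement."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]

def build_tree(rows, labels, rng, n_feats, depth=0, max_depth=5):
    """Grow a tree; each split considers only n_feats randomly chosen features."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)  # leaf: predict the majority class
    best = None
    for f in rng.sample(range(len(rows[0])), n_feats):
        for t in set(r[f] for r in rows):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)
    _, f, t, left, right = best
    kids = [build_tree([rows[i] for i in side], [labels[i] for i in side],
                       rng, n_feats, depth + 1, max_depth) for side in (left, right)]
    return (f, t, kids[0], kids[1])

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, t, low, high = node
        node = low if row[f] <= t else high
    return node

def fit_forest(rows, labels, rng, n_trees=10):
    """Plain random forest: bootstrapped trees with random feature subsets."""
    n_feats = max(1, int(len(rows[0]) ** 0.5))
    return [build_tree(*bootstrap(rows, labels, rng), rng, n_feats)
            for _ in range(n_trees)]

def predict_forest(forest, row):
    return majority([predict_tree(tree, row) for tree in forest])

def fit_bagged_forests(rows, labels, rng, n_forests=5, n_trees=10):
    """Bagging with RF as base learner: one forest per bootstrap sample."""
    return [fit_forest(*bootstrap(rows, labels, rng), rng, n_trees)
            for _ in range(n_forests)]

def predict_bagged(forests, row):
    return majority([predict_forest(forest, row) for forest in forests])

def accuracy(predict, model, rows, labels):
    return sum(predict(model, r) == y for r, y in zip(rows, labels)) / len(rows)

# Synthetic stand-in data (assumption): two Gaussian classes, four features.
rng = random.Random(0)
rows, labels = [], []
for i in range(200):
    y = i % 2
    rows.append([rng.gauss(2.0 * y, 1.0) for _ in range(4)])
    labels.append(y)
train_X, train_y, test_X, test_y = rows[:150], labels[:150], rows[150:], labels[150:]

forest = fit_forest(train_X, train_y, rng)
bagged = fit_bagged_forests(train_X, train_y, rng)
acc_single = accuracy(predict_forest, forest, test_X, test_y)
acc_bagged = accuracy(predict_bagged, bagged, test_X, test_y)
print("single forest:", acc_single, "bagged forests:", acc_bagged)
```

The comparison printed at the end mirrors the kind of comparison described above, a single random forest against its ensemble counterpart, but on toy data it says nothing about which variant performs better on real benchmarks.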