Abstract: Genome rearrangement is an important area in computational biology and bioinformatics. The basic problem in genome rearrangements is to compute the edit distance, i.e., the minimum number of operations needed to transform one genome into another. Unfortunately, unsigned genome rearrangement problem is NP-hard. In this study an improved ant colony optimization algorithm to approximate the edit distance is proposed. The main idea is to convert the unsigned permutation to signed permutation and evaluate the ants by using Kaplan algorithm. Two new operations are added to the standard ant colony algorithm: Replacing the worst ants by re-sampling the ants from a new probability distribution and applying the crossover operations on the best ants. The proposed algorithm is tested and compared with the improved breakpoint reversal sort algorithm by using three datasets. The results indicate that the proposed algorithm achieves better accuracy ratio than the previous methods.
Abstract: Understanding the cell's large-scale organization is an interesting task in computational biology. Thus, protein-protein interactions can reveal important organization and function of the cell. Here, we investigated the correspondence between protein interactions and function for the yeast. We obtained the correlations among the set of proteins. Then these correlations are clustered using both the hierarchical and biclustering methods. The detailed analyses of proteins in each cluster were carried out by making use of their functional annotations. As a result, we found that some functional classes appear together in almost all biclusters. On the other hand, in hierarchical clustering, the dominancy of one functional class is observed. In the light of the clustering data, we have verified some interactions which were not identified as core interactions in DIP and also, we have characterized some functionally unknown proteins according to the interaction data and functional correlation. In brief, from interaction data to function, some correlated results are noticed about the relationship between interaction and function which might give clues about the organization of the proteins, also to predict new interactions and to characterize functions of unknown proteins.
Abstract: A number of competing methodologies have been developed
to identify genes and classify DNA sequences into coding
and non-coding sequences. This classification process is fundamental
in gene finding and gene annotation tools and is one of the most
challenging tasks in bioinformatics and computational biology. An
information theory measure based on mutual information has shown
good accuracy in classifying DNA sequences into coding and noncoding.
In this paper we describe a species independent iterative
approach that distinguishes coding from non-coding sequences using
the mutual information measure (MIM). A set of sixty prokaryotes is
used to extract universal training data. To facilitate comparisons with
the published results of other researchers, a test set of 51 bacterial
and archaeal genomes was used to evaluate MIM. These results
demonstrate that MIM produces superior results while remaining
species independent.
Abstract: Detecting protein-protein interactions is a central problem in computational biology and aberrant such interactions may have implicated in a number of neurological disorders. As a result, the prediction of protein-protein interactions has recently received considerable attention from biologist around the globe. Computational tools that are capable of effectively identifying protein-protein interactions are much needed. In this paper, we propose a method to detect protein-protein interaction based on substring similarity measure. Two protein sequences may interact by the mean of the similarities of the substrings they contain. When applied on the currently available protein-protein interaction data for the yeast Saccharomyces cerevisiae, the proposed method delivered reasonable improvement over the existing ones.
Abstract: Understanding the cell's large-scale organization is an
interesting task in computational biology. Thus, protein-protein
interactions can reveal important organization and function of the
cell. Here, we investigated the correspondence between protein
interactions and function for the yeast. We obtained the correlations
among the set of proteins. Then these correlations are clustered using
both the hierarchical and biclustering methods. The detailed analyses
of proteins in each cluster were carried out by making use of their
functional annotations. As a result, we found that some functional
classes appear together in almost all biclusters. On the other hand, in
hierarchical clustering, the dominancy of one functional class is
observed. In brief, from interaction data to function, some correlated
results are noticed about the relationship between interaction and
function which might give clues about the organization of the
proteins.
Abstract: Classification is one of the primary themes in
computational biology. The accuracy of classification strongly
depends on quality of a dataset, and we need some method to
evaluate this quality. In this paper, we propose a new graphical
analysis method using 'Membership-Deviation Graph (MDG)' for
analyzing quality of a dataset. MDG represents degree of
membership and deviations for instances of a class in the dataset. The
result of MDG analysis is used for understanding specific feature and
for selecting best feature for classification.
Abstract: The paper attempts to elucidate the columnar structure
of the cortex by answering the following questions. (1) Why the
cortical neurons with similar interests tend to be vertically arrayed
forming what is known as cortical columns? (2) How to describe the
cortex as a whole in concise mathematical terms? (3) How to design
efficient digital models of the cortex?
Abstract: The paper discusses the mathematics of pattern
indexing and its applications to recognition of visual patterns that are
found in video clips. It is shown that (a) pattern indexes can be
represented by collections of inverted patterns, (b) solutions to
pattern classification problems can be found as intersections and
histograms of inverted patterns and, thus, matching of original
patterns avoided.
Abstract: Bioinformatics and computational biology involve
the use of techniques including applied mathematics,
informatics, statistics, computer science, artificial intelligence,
chemistry, and biochemistry to solve biological problems
usually on the molecular level. Research in computational
biology often overlaps with systems biology. Major research
efforts in the field include sequence alignment, gene finding,
genome assembly, protein structure alignment, protein structure
prediction, prediction of gene expression and proteinprotein
interactions, and the modeling of evolution. Various
global rearrangements of permutations, such as reversals and
transpositions,have recently become of interest because of their
applications in computational molecular biology. A reversal is
an operation that reverses the order of a substring of a permutation.
A transposition is an operation that swaps two adjacent
substrings of a permutation. The problem of determining the
smallest number of reversals required to transform a given
permutation into the identity permutation is called sorting by
reversals. Similar problems can be defined for transpositions
and other global rearrangements. In this work we perform a
study about some genome rearrangement primitives. We show
how a genome is modelled by a permutation, introduce some
of the existing primitives and the lower and upper bounds
on them. We then provide a comparison of the introduced
primitives.