Abstract: DNA barcodes are a good source of the information needed
to classify living species. The classification problem has to be
supported by reliable methods and algorithms. To analyze species
regions or entire genomes, sequence similarity methods become
necessary. A large set of sequences can be compared simultaneously
using multiple sequence alignment, which is known to be NP-complete.
However, all the methods in use are still computationally very
expensive and require significant computational infrastructure. Our
goal is to build predictive models that are highly accurate and
interpretable. In fact, our method makes it possible to avoid the
complex problem of form and structure in different classes of
organisms. The empirical data and their classification performance
are compared with other methods. In this study, we present our
system, which consists of three phases. The first phase, called
transformation, is composed of three sub-steps: Electron-Ion
Interaction Pseudopotential (EIIP) coding of the DNA barcodes,
Fourier transform, and power spectrum signal processing. The second
phase is an approximation, carried out with Multi Library Wavelet
Neural Networks (MLWNN). Finally, the third phase, the
classification of the DNA barcodes, is realized by applying a
hierarchical classification algorithm.
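The transformation phase can be illustrated with a minimal sketch, assuming the commonly cited EIIP values for the four nucleotides and a plain discrete Fourier transform. The function names `eiip_encode` and `power_spectrum` are hypothetical, and the authors' own preprocessing (windowing, normalization, handling of ambiguous bases) is not specified in the abstract and may differ.

```python
import cmath

# Commonly cited EIIP values for the four nucleotides (assumed here;
# the abstract does not list the values the authors used).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(sequence):
    """Map each nucleotide to its EIIP value.

    Raises KeyError for symbols outside A, C, G, T (e.g. N).
    """
    return [EIIP[base] for base in sequence.upper()]


def power_spectrum(signal):
    """Return |X[k]|^2 for k = 0..n-1 using a direct O(n^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        x_k = sum(
            value * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, value in enumerate(signal)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum


# Usage: encode a short (made-up) barcode fragment and take its spectrum.
fragment = "ATGGCCATTGTAATGGGCCGC"
spectrum = power_spectrum(eiip_encode(fragment))
```

The direct DFT keeps the sketch dependency-free; for real barcode lengths an FFT routine would be the usual choice.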
Abstract: Bacterial strains capable of degrading malathion were
isolated from domestic sewage by an enrichment culture
technique. Three bacterial strains were screened and identified as
Acinetobacter baumannii (AFA), Pseudomonas aeruginosa (PS1),
and Pseudomonas mendocina (PS2) based on morphological and
biochemical identification and 16S rRNA sequence analysis.
Acinetobacter baumannii AFA was the most efficient
malathion-degrading bacterium, so it was used for further
biodegradation study. AFA was able to grow in mineral salt medium
(MSM) supplemented with malathion (100 mg/l) as the sole carbon
source, and within 14 days 84% of the initial dose was degraded by
the isolate, as measured by high-performance liquid chromatography.
Strain AFA could also degrade
other organophosphorus compounds including diazinon, chlorpyrifos
and fenitrothion. The effects of different culture conditions, such
as inoculum density, other carbon or nitrogen sources, temperature
and shaking, on the degradation of malathion were examined.
Degradation of malathion and bacterial cell growth were accelerated
when culture media were supplemented with yeast extract, glucose
and citrate. The optimum conditions for malathion degradation by
strain AFA were an inoculum density of 1.5 × 10^12 CFU/ml at 30°C
with shaking. Specific polymerase chain reaction (PCR) primers were
designed manually using multiple sequence alignment of the
corresponding carboxylesterase enzymes of Acinetobacter species.
The sequencing result of the amplified PCR product and phylogenetic
analysis showed a low degree of homology with the other
carboxylesterase enzymes of Acinetobacter strains, so we suggest
that this enzyme is a novel esterase. The isolated bacterial strains
may have a potential role in the bioremediation of malathion
contamination.
Abstract: A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D, G, A, U, C}, where the letter D represents one or more hypothetical bases with nonspecific pairing. We hypothesized that the high degeneracy of a primeval genetic code with five bases, together with the gradual origin and improvement of a primitive DNA repair system, could have made possible the transition from the ancient to the modern genetic code. Our results suggest that the Watson-Crick base pairing and the nonspecific base pairing of the hypothetical ancestral base D, which are used to define the sum and product operations, are sufficient to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. In addition, the Fourier spectrum of the extended DNA genome sequences derived from multiple sequence alignment suggests that the so-called period-3 property of present coding DNA sequences could also have existed in ancient coding DNA sequences.
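To make the algebraic setting concrete, here is a small sketch of arithmetic over the five-element Galois field GF(5) applied to base triplets. The labeling D=0, G=1, A=2, U=3, C=4 is an assumption taken from the order in which the alphabet is listed, and the function names are hypothetical; the operation tables actually defined in the paper may differ. Under this assumed labeling, Watson-Crick partners (G/C and A/U) are additive inverses of each other.

```python
# Assumed labeling of the extended alphabet by elements of GF(5):
# D=0, G=1, A=2, U=3, C=4 (taken from the listed order, not confirmed).
BASES = "DGAUC"
TO_INT = {base: index for index, base in enumerate(BASES)}


def add_base(x, y):
    """Sum of two bases: integer addition modulo 5."""
    return BASES[(TO_INT[x] + TO_INT[y]) % 5]


def mul_base(x, y):
    """Product of two bases: integer multiplication modulo 5."""
    return BASES[(TO_INT[x] * TO_INT[y]) % 5]


def add_codon(c1, c2):
    """Component-wise sum of two triplets in the space (GF(5))^3."""
    return "".join(add_base(x, y) for x, y in zip(c1, c2))


def scale_codon(scalar, codon):
    """Multiply every component of a triplet by a scalar base."""
    return "".join(mul_base(scalar, x) for x in codon)


# Usage: a codon plus its base-wise Watson-Crick complement gives the
# zero vector DDD under the assumed labeling.
zero = add_codon("AUG", "UAC")
```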
Abstract: Proteins or genes that have similar sequences are likely to perform the same function. One of the most widely used techniques for sequence comparison is sequence alignment. Sequence alignment allows mismatches and insertions/deletions, which represent biological mutations. Sequence alignment is usually performed on only two sequences. Multiple sequence alignment is a natural extension of two-sequence alignment. In multiple sequence alignment, the emphasis is on finding an optimal alignment for a group of sequences. Several applicable techniques were examined in this research, from traditional methods such as dynamic programming to widely used stochastic optimization methods such as Genetic Algorithms (GAs) and Simulated Annealing. A framework combining a Genetic Algorithm and Simulated Annealing is presented to solve the Multiple Sequence Alignment problem. The Genetic Algorithm phase explores new regions of the solution space, while Simulated Annealing acts as an alignment improver for any near-optimal solution produced by the GA.
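The dynamic programming approach to two-sequence alignment mentioned above can be sketched as a Needleman-Wunsch style global alignment score. This is an illustration only: the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions, and the abstract does not state which scoring scheme the authors used.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b.

    score[i][j] holds the best score for aligning a[:i] with b[:j].
    The scoring parameters are illustrative defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # match or mismatch
                score[i - 1][j] + gap,               # gap in b (deletion)
                score[i][j - 1] + gap,               # gap in a (insertion)
            )
    return score[-1][-1]
```

The table has one dimension per sequence, so the same exact recurrence grows exponentially with the number of sequences, which is why stochastic methods such as GAs and Simulated Annealing are attractive for the multiple-sequence case.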
Abstract: Background: DIALIGN is a DNA/protein alignment tool
for performing pairwise and multiple alignments through the
comparison of gap-free segments (fragments) between sequence
pairs. An alignment of two sequences is a chain of fragments, i.e.,
local gap-free pairwise alignments, with the highest total score.
Method: A new approach is defined in this article, which relies on
using three-dimensional fragments (i.e., local three-way alignments)
in the alignment process instead of two-dimensional ones. These
three-dimensional fragments are gap-free alignments consisting of
equal-length segments belonging to three distinct sequences.
Results: The obtained results showed good improvements over the
performance of DIALIGN.
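As a rough illustration of what a three-dimensional fragment is, the sketch below scores gap-free triples of equal-length segments and searches for the best one by brute force. The sum-of-pairs match count and the fixed fragment length are assumptions made for the example; DIALIGN's actual fragment weights are probability-based, and the article's scoring and search strategy are not described in the abstract. Both function names are hypothetical.

```python
from itertools import combinations


def fragment_score(segments):
    """Sum-of-pairs match count for equal-length, gap-free segments."""
    length = len(segments[0])
    if any(len(segment) != length for segment in segments):
        raise ValueError("segments of a gap-free fragment must have equal length")
    return sum(
        sum(1 for x, y in zip(first, second) if x == y)
        for first, second in combinations(segments, 2)
    )


def best_three_way_fragment(s1, s2, s3, length):
    """Exhaustively try every start triple (i, j, k).

    Returns (score, (i, j, k)) for the first best-scoring fragment,
    or (-1, None) if some sequence is shorter than `length`.
    """
    best = (-1, None)
    for i in range(len(s1) - length + 1):
        for j in range(len(s2) - length + 1):
            for k in range(len(s3) - length + 1):
                score = fragment_score(
                    (s1[i:i + length], s2[j:j + length], s3[k:k + length])
                )
                if score > best[0]:
                    best = (score, (i, j, k))
    return best
```

The exhaustive search is cubic in the sequence lengths and is only meant to show the object being scored, not a practical algorithm.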
Abstract: Multiple sequence alignment is a fundamental part of
many bioinformatics applications such as phylogenetic analysis.
Many alignment methods have been proposed. Each method gives a
different result for the same data set, and consequently generates a
different phylogenetic tree. Hence, the chosen alignment method
affects the resulting tree. However, in the literature there is no
evaluation of multiple alignment methods based on the comparison of
their phylogenetic trees. This work evaluates the following eight
aligners: ClustalX, T-Coffee, SAGA, MUSCLE, MAFFT, DIALIGN,
ProbCons and Align-m, based on their phylogenetic trees (test trees)
produced on a given data set. The Neighbor-Joining method is used
to estimate trees. Three criteria, namely the dNNI, the dRF and the
Id_Tree, are established to test the ability of different alignment
methods to produce a test tree closer to the reference one
(true tree). Results show that the method which produces the most
accurate alignment gives the nearest test tree to the reference tree.
MUSCLE outperforms all aligners with respect to the three criteria
and for all datasets, performing particularly well when sequence
identities are within 10-20%. It is followed by T-Coffee at lower
sequence identity (30%), tree scores of all methods
become similar.
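Assuming dRF denotes the Robinson-Foulds distance (the abstract does not define it), a comparison between a test tree and a reference tree can be sketched as follows. Trees are represented as nested tuples of leaf names, which is a choice made for this example; the function names are hypothetical.

```python
def leaf_set(tree):
    """All leaf names under a node (a leaf is a string, a node a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaves induced by internal edges.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so the result does not depend on the rooting.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if reference in below else below
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return below

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Usage: a reference tree and a test tree on five taxa.
reference_tree = ((("A", "B"), "C"), ("D", "E"))
test_tree = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(reference_tree, test_tree)
```

A distance of zero means the test tree has the same topology as the reference tree; larger values mean more disagreeing splits.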