Multiple Sequence Alignment Using Optimization Algorithms

Proteins or genes that have similar sequences are likely to perform the same function. One of the most widely used techniques for sequence comparison is sequence alignment. Sequence alignment allows mismatches and insertion/deletion, which represents biological mutations. Sequence alignment is usually performed only on two sequences. Multiple sequence alignment, is a natural extension of two-sequence alignment. In multiple sequence alignment, the emphasis is to find optimal alignment for a group of sequences. Several applicable techniques were observed in this research, from traditional method such as dynamic programming to the extend of widely used stochastic optimization method such as Genetic Algorithms (GAs) and Simulated Annealing. A framework with combination of Genetic Algorithm and Simulated Annealing is presented to solve Multiple Sequence Alignment problem. The Genetic Algorithm phase will try to find new region of solution while Simulated Annealing can be considered as an alignment improver for any near optimal solution produced by GAs.





References:
[1] N. M. Luscombe, D. Greebaum, M. Gerstein, (2001). What is
bioinformatics? A proposed definition and overview of the field.
Department of Molecular Biophysics and Biochemistry, Yale University,
USA.
[2] R. Musick, T. Slezak, T. Crithlow, (2001). An Overview of
Bioinformatics Research at Lawrence Livermore National Laboratory.
Joint Genome Institute / Computing Application Organization .Lawrence
Livermore National Laboratory.
[3] Online Lectures on Bioinformatics (September 2003) (Online),
Available: http://lectures.molgen.mpg.de/Biol/index.html.
[4] P. C. Turner, A. G. McLennan, A. D Bates, M. R. H. White, (1997).
Nucleic Acid Structure. Instant Notes in Molecular Biology C1:31-35.
Bios Scientific Publishers Limited, Liverpool.
[5] M. Tompa .(2000). Lecture Notes on Biological Sequence Analysis.
Technical Report #2000-06-01.Department of Computer Science and
Engineering, University of Washington.
[6] W. Zhong(2003). Using Travel Salesman Problem Algorithms To
determine Multiple Sequence Alignment Orders. Master Thesis,
University of Georgia: Athens, Georgia.
[7] C. Karostensky, G. Gonnet. Near Optimal Multiple Sequence Alignment
using a Traveling Salesman Problem approach. Swiss Federal Institute
of Technology, Institute of Scientific Computing. ETH Zurich,
Switzerland (2000).
[8] R. Choudry, (1999). Application of Evolutionary Algorithms for Multiple
Sequence Alignment. Stanford University.
[9] D. J. Lipman, S. F. Altschul, J. D. Kececioglun, (1989). A tool for
Multiple Sequence Alignment. Vol. 86.pp 4412-4415, Biochemistry.
Proc. Natl. Acad. Sci. USA.
[10] M. Gribskov, J. Devereux, (1991). Similarity and Homology. Sequence
Analysis Primer.3: 89-157. Stockton Press. NY.
[11] C. Notredame, (2002). Recent Progress in Multiple Sequence Alignment:
A Survey., Pharmacogenomics, 3(1):131-144.
[12] D. Higgins, (1999). Multiple Sequence Alignment. Genetics
Databases.9:165-183 Academic Press, London, UK .
[13] J. D. Thompson, F. Plewniak, and O. Poch. (1999). A comprehensive
comparison of multiple sequence alignment programs. Nucleic Acids
Research, 27(13):2682-2690.
[14] Introduction to hidden markov models (June 2003) (Online), Available: www.scs.leeds.ac.uk/scsonly/teachingmaterials/HiddenMarko
vModels/html dev/main.html.
[15] Introduction to HMM (April 2003) (Online) Available:
http://www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/Hidd
enMarkovModels.html.
[16] S. R. Eddy, (1998). Profile Hidden Markov Models. Bioinformatics
Review. Vol. 14. 9: 755-763. Oxford University Press.
[17] P. Baldi, S. Brunak, Y. Chauven, A. Krogh (1997). Hidden Markov
Model For Human Genes: Periodic Pattern in Exon Sequence.
Theoretical and Computational Methods in Genome Research, 2: 15-32.
Plenum Press, New York.
[18] S. R. Eddy, (1996). Multiple Alignment Using Hidden Markov Models.
Dept. of Genetics, Washington University School of Medicine St. Louis,
US.
[19] M. Gen, R. Cheng. (2000). Genetic Algorithms and Engineering
Optimization. John Wiley & Sons: Canada.
[20] D. Beasley, D. R. Bull, R. R. Martin, (1993). An Overview of Genetic
Algorithms: Part 1, Fundamentals. Inter-University Committee on
Computing. 15(2) 58-69. [21] CCS 501 Neural Network and Genetic
Algorithm http:// office1.cs.usm.my.
[21] C. Notredame, and D. G. Higgins, (1996). SAGA: sequence alignment
by genetic algorithm, Nuc. Acids Res., 24(8), 1515-1524.
[22] M. Isokawa, M. Wayama, T. Shimizu, (1996). Multiple Sequence
Alignment using Genetic Algorithm. Department of Information Science,
Faculty of Science, Hirosaki University, Japan.
[23] J. T. Horng, C. N. Lin, B. H. Yang, and C. Y. Kao, (2001). A genetic
algorithm for multiple sequence alignment. In Poster in German
Conference on Bioinformatics.
[24] S. Kirkpatrick, C. D. J. Gellat, and M. P. Vecchi, (1983). Optimization
by Simulated Annealing, Science, 220 pp. 671-680.