Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture

The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.




References:
[1] El-Ghazawi T., Carlson W., Sterling T, Yelick K.: UPC: Distributed
Shared Memory Programming. Book, John Wiley and Sons Inc.,
NewYork. ISBN: 0-471-22048-5, 2005.
[2] Yap T. K., FriederO., Martino R. L.: Parallel Computation in Biological
Sequence Analysis. IEEE Transactions on Parallel and Distributed
Systems, Vol. 9, N 3, pp. 283-294, 1998.
[3] Needleman, S. B., Wunsch C. D: A general method applicable to the
search for similarities in the amino acid sequence of two proteins. Journal
of Molecular Biology, Vol. 47, pp. 443-453, 1970.
[4] Smith W., Waterman M.: Identification of Common Molecular Subsequences.
Journal of Molecular Biology Vol. 147, pp. 195-197, 1981.
[5] Gaber J.: Complexity Measure Approach for Partitioned Shared Memory
Model, Application to UPC. Research report RR-10-04. Universite de
Technologie de Belfort-Montbeliard, 2004.
[6] Lu M., Lin L.: Parallel algorithms for the Longest Common Subsequence
Problem. IEEE Transaction on Parallel and Distributed System, Vol.5,
pp.835-848, 1994.
[7] Valiant L.G.: A bridging model for parallel computation. Communication
of the ACM, Vol. 33, N 8, pp. 103-111, 1990.
[8] http://www.sun.com/servers/coolthreads/t1000/benchmarks.jsp
[9] Garcia T., Myoupo J-F., Seme D.: A Coarse-Grained Multi-computer
algorithm for the longest common subsequence problem. 11-th Euromicro
Conference on Parallel Distributed and Network based Processing, 2003.
[10] Alves C. E. R., Cceres E. N., Dehne F.: Parallel Dynamic Programming
for Solving the String Editing Problem on a CGM/BSP. In proceeding of
ACM SPAA-02, pp. 275-281, 2002.
[11] Alves C. E. R. Cceres E. N., Dehne F. , Song S. W.: A Parallel Wavefront
Algorithm for Efficient Biological Sequence Comparison. International
Conference on Computational Science and its Applications, Montreal,
Canada, May 18-21, Lecture Notes in Computer Science, V. 2668, pp.
249-258, 2003.
[12] Nicholas P. P.: Searching Biological Sequence Databases Using Distributed
Adaptive Computing. Master thesis, Master of Science in Computer
Engineering, Virginia Polytechnic Institute and State University,
2003, available at http://scholar.lib.vt.edu/theses/
[13] Chen Y., Wan A., Liu W.: A fast Parallel Algorithm for Finding
the Longest Common Sequence of Multiple Bio-sequences. Symposium
of Computations in Bioinformatics and Bioscience (SCBB06). In conjunction
with the International Multi-Symposiums on Computer and
Computational Sciences 2006 (IMSCCS06), 2006.
[14] Bader D.A., Madduri K.,: A Graph-Theoretic Analysis of the Human
Protein-Interaction Network Using Multicore Parallel Algorithms. Sixth
IEEE International Workshop on High Performance Computational Biology
(HiCOMB-07), 2007.
[15] Bader D. A., Kanade V., Madduri K.: SWARM, A Parallel Programming
Framework for Multicore Processors. First Workshop on Multithreaded
Architectures and Applications (MTAAP-07), 2007.
[16] Voss G., Schrder A., Mller-Wittig W. Schmidt B.: Biological
Sequence Alignment on Graphics Processing Units. Available at
http://www.ntu.edu.sg/home/asbschmidt/paper/BioGPU.pdf
[17] Kayi A., Yao Y., El-Ghazawi T., Newby G.: Experimental Evaluation
ofEmerging Multi-core Architectures, Workshop on Performance
Modeling, Evaluation, and Optimisation of Ubiquitous Computing and
Networked Systems (PMEO07) IPDPS07 Proceedings, pp. 1-6, 2007.
[18] Chen W-Y., Bonachea D., Duell J., Husbands P., Iancu C., Yelick K.:
A Perfor- mance Analysis of the Berkley UPC Compiler. In Annual
International Conference on Supercomputing (ICS), 2003.
[19] Cantonnet F., El Ghazawi T., Lorenz P., Gaber J.: Fast Address Translation
Tech- niques for Distributed Shared Memory Compilers. International
Parallel and Dis- tributed Processing Symposium IPDPS06, 2006.
[20] Bakhouya M., Gaber J., El-Ghazawi T.: Towards a Complexity Model
for Design and Analysis of PGAS-Based Algorithms, HPCC 2007 Proceedings,
LNCS 4782 Springer, ISBN 978-3-540-75443-5, pp.672-682,
2007.