A Practical Distributed String Matching Algorithm Architecture and Implementation
Traditional parallel single string matching algorithms
are always based on PRAM computation model. Those algorithms
concentrate on the cost optimal design and the theoretical speed.
Based on the distributed string matching algorithm proposed by
CHEN, a practical distributed string matching algorithm architecture
is proposed in this paper. And also an improved single string matching
algorithm based on a variant Boyer-Moore algorithm is presented. We
implement our algorithm on the above architecture and the
experiments prove that it is really practical and efficient on distributed
memory machine. Its computation complexity is O(n/p + m), where n
is the length of the text, and m is the length of the pattern, and p is the
number of the processors.
[1] Zheng Liu, Xin Chen, James Borneman and Tao Jiang, "A fast algorithm
for approximate string matching on gene sequences," in Symposium. 16th
Annu. Combinatorial Pattern Matching, LNCS, Springer-Verlag, vol.
3537, pp. 79-90, June 2005.
[2] N. Tuck, T. Sherwood, B. Calder and G. Varghese, "Deterministic
memory-efficient string matching algorithms for intrusion detection," in
Proc. IEEE INFOCOM, vol. 4, pp. 2628-2639, March 2004.
[3] D. E. Knuth, J. H. Morris, and V. R. Pratt, "Fast pattern matching in
strings," SIAM Journal on Computing, vol. 6, pp. 323-350, 1977.
[4] R. S. Boyer and J. S. Moore, "A fast string searching algorithm,"
Communications of the ACM, vol. 20, pp. 762-772, 1977.
[5] R. N. Horspool, "Practical fast searching in strings," Software - Practice
and Experience, vol. 10, pp. 501-506, 1980.
[6] D. M. Sunday, "A very fast substring search algorithm," Communications
of the ACM, vol. 33, pp. 132-142 ,1990.
[7] D. Cantone and S. Faro, "Fast-Search: A new efficient variant of the
Boyer-Moore string matching algorithm," in Proc. Second International
Workshop on Experimental and Efficient Algorithms, LNCS,
Springer-Verlag, Vol. 2647, pp. 47-58, May 2003.
[8] Z. Galil, "Optimal parallel algorithms for string matching," in Proc. 16th
Annu. ACM symposium on Theory of computing, pp. 240-248, 1984.
[9] U. Vishkin, "Optimal parallel matching in strings," Information and
control, vol. 67, pp. 91-113, 1985.
[10] Y. Takefuji, T. Tanaka, and K. C. Lee, "A parallel string search
algorithm", IEEE Trans. Systems, Man and Cybernetics, vol. 22, pp.
332-336, March-April 1992.
[11] CHEN Guo-liang, LIN-Jie, and GU Nai-jie, "Design and analysis of
string matching algorithm on distributed memory machine," Journal of
Software, vol. 11, pp. 771-778, 2000.
[12] C. Charras and T. Lecroq, "Exact string matching algorithms,"
Laboratoire d'Informatique de Rouen Université de Rouen. Available:
http://www-igm.univ-mlv.fr/~lecroq/string/
[13] Z. Galil, "On improving the worst case running time of the Boyer-Moore
string matching algorithm," Communications of the ACM, vol. 22, pp.
505-508, 1979.
[14] R. Cole. "Tight bounds on the complexity of the Boyer-Moore string
matching algorithm," SIAM Journal on Computing, vol. 23, pp.
1075-1091, 1994.
[15] GU Nai-jie, LI Wei and LIU Jing, "Fibonacci series-based multicast
algorithm," Chinese Journal of Computers, vol. 25, pp. 365-372, 2002.
[1] Zheng Liu, Xin Chen, James Borneman and Tao Jiang, "A fast algorithm
for approximate string matching on gene sequences," in Symposium. 16th
Annu. Combinatorial Pattern Matching, LNCS, Springer-Verlag, vol.
3537, pp. 79-90, June 2005.
[2] N. Tuck, T. Sherwood, B. Calder and G. Varghese, "Deterministic
memory-efficient string matching algorithms for intrusion detection," in
Proc. IEEE INFOCOM, vol. 4, pp. 2628-2639, March 2004.
[3] D. E. Knuth, J. H. Morris, and V. R. Pratt, "Fast pattern matching in
strings," SIAM Journal on Computing, vol. 6, pp. 323-350, 1977.
[4] R. S. Boyer and J. S. Moore, "A fast string searching algorithm,"
Communications of the ACM, vol. 20, pp. 762-772, 1977.
[5] R. N. Horspool, "Practical fast searching in strings," Software - Practice
and Experience, vol. 10, pp. 501-506, 1980.
[6] D. M. Sunday, "A very fast substring search algorithm," Communications
of the ACM, vol. 33, pp. 132-142 ,1990.
[7] D. Cantone and S. Faro, "Fast-Search: A new efficient variant of the
Boyer-Moore string matching algorithm," in Proc. Second International
Workshop on Experimental and Efficient Algorithms, LNCS,
Springer-Verlag, Vol. 2647, pp. 47-58, May 2003.
[8] Z. Galil, "Optimal parallel algorithms for string matching," in Proc. 16th
Annu. ACM symposium on Theory of computing, pp. 240-248, 1984.
[9] U. Vishkin, "Optimal parallel matching in strings," Information and
control, vol. 67, pp. 91-113, 1985.
[10] Y. Takefuji, T. Tanaka, and K. C. Lee, "A parallel string search
algorithm", IEEE Trans. Systems, Man and Cybernetics, vol. 22, pp.
332-336, March-April 1992.
[11] CHEN Guo-liang, LIN-Jie, and GU Nai-jie, "Design and analysis of
string matching algorithm on distributed memory machine," Journal of
Software, vol. 11, pp. 771-778, 2000.
[12] C. Charras and T. Lecroq, "Exact string matching algorithms,"
Laboratoire d'Informatique de Rouen Université de Rouen. Available:
http://www-igm.univ-mlv.fr/~lecroq/string/
[13] Z. Galil, "On improving the worst case running time of the Boyer-Moore
string matching algorithm," Communications of the ACM, vol. 22, pp.
505-508, 1979.
[14] R. Cole. "Tight bounds on the complexity of the Boyer-Moore string
matching algorithm," SIAM Journal on Computing, vol. 23, pp.
1075-1091, 1994.
[15] GU Nai-jie, LI Wei and LIU Jing, "Fibonacci series-based multicast
algorithm," Chinese Journal of Computers, vol. 25, pp. 365-372, 2002.
@article{"International Journal of Information, Control and Computer Sciences:58377", author = "Bi Kun and Gu Nai-jie and Tu Kun and Liu Xiao-hu and Liu Gang", title = "A Practical Distributed String Matching Algorithm Architecture and Implementation", abstract = "Traditional parallel single string matching algorithms
are always based on PRAM computation model. Those algorithms
concentrate on the cost optimal design and the theoretical speed.
Based on the distributed string matching algorithm proposed by
CHEN, a practical distributed string matching algorithm architecture
is proposed in this paper. And also an improved single string matching
algorithm based on a variant Boyer-Moore algorithm is presented. We
implement our algorithm on the above architecture and the
experiments prove that it is really practical and efficient on distributed
memory machine. Its computation complexity is O(n/p + m), where n
is the length of the text, and m is the length of the pattern, and p is the
number of the processors.", keywords = "Boyer-Moore algorithm, distributed algorithm,parallel string matching, string matching.", volume = "1", number = "10", pages = "3177-5", }