Optimal All-to-All Personalized Communication in All-Port Tori

All-to-all personalized communication, also known as complete exchange, is one of the most dense communication patterns in parallel computing. In this paper, we propose new indirect algorithms for complete exchange on all-port ring and torus. The new algorithms fully utilize all communication links and transmit messages along shortest paths to completely achieve the theoretical lower bounds on message transmission, which have not be achieved among other existing indirect algorithms. For 2D r × c ( r % c ) all-port torus, the algorithm has time complexities of optimal transmission cost and O(c) message startup cost. In addition, the proposed algorithms accommodate non-power-of-two tori where the number of nodes in each dimension needs not be power-of-two or square. Finally, the algorithms are conceptually simple and symmetrical for every message and every node so that they can be easily implemented and achieve the optimum in practice.





References:
[1] P.K. McKinley, Y.J. Tsai, and D.F. Robinson, "A Survey of Collective
Communication in Wormhole-Routed Massively Parallel Computers,"
Technical Report, MSU-CPS-94-35, Michigan State University, June
1994.
[2] S. Sur, H.W. Jin, and D.K. Panda, "Efficient and scalable all-to-all
personalized exchange for infiniband-based clusters," Proc. 2004 Int-l
Conf. on Parallel Processing (ICPP-04), pp. 275-282, Aug. 2004.
[3] C.C. Lam, C.H. Huang, and P. Sadayappan, "Optimal Algorithms for
All-to-All Personalized Communication on Rings and Two Dimensional
Tori," Journal of Parallel and Distributed Computing, No. 43, pp. 3-13,
1997.
[4] Y.C. Tseng and S.K.S. Gupta, "All-to-All Personalized Communication
in a Wormhole-Routed Torus", IEEE Trans. Parallel and Distributed
Systems, vol. 7, no. 5, pp. 498-505, May. 1996.
[5] Y.C. Tseng, T.H. Lin, S.K.S. Gupta and D.K. Panda,
"Bandwidth-Optimal Complete Exchange on Wormhole- Routed 2D/3D
torus Networks: A Diagonal-Propagation Approach," IEEE Trans.
Parallel and Distributed Systems, vol. 8, no. 4, pp. 380-396, Apr. 1997.
[6] Y.J. Suh and S. Yalamanchili, "All-to-All Communication with
Minimum Start-Up Costs in 2D/3D Tori and Meshes," IEEE Trans.
Parallel and Distributed Systems, vol. 9, no. 5, pp. 442-458, May 1998.
[7] Y.C. Tseng, S.Y. Ni, and J.P. Sheu, "Toward Optimal Complete
Exchange on Wormhole-Routed Tori," IEEE Trans. Computers, vol. 48,
no. 10, pp. 1065-1082, Oct. 1999.
[8] GU Naijie, "Efficient Indirect All-to-All Personalized Communication on
Rings and 2-D Tori," Journal of Computer Science and Technology, vol.
16, no. 5, pp. 480-483, Sept. 2001.
[9] N.S. Sundar, D.N. Jayasimha, D.K. Panda, and P. Sadayappan, "Hybrid
Algorithms for Complete Exchange in 2D Meshes," IEEE Trans. Parallel
and Distributed Systems, vol. 12, no. 12, pp. 1201-1218, Dec. 2001.
[10] Y.J. Suh and K.G. Shin, "All-to-All Personalized Communication in
Multidimensional Torus and Mesh Networks," IEEE Trans. Parallel and
Distributed Systems, vol. 12, no. 1, pp. 38-59, Jan. 2001.
[11] D.S. Scott, "Efficient All-to-All Communication Patterns in Hypercube
and Mesh Topologies," Proc. Sixth Conf. Distributed Memory
Concurrent Computers, pp. 398- 403, Portland, OR, April 1991.
[12] S.C. Chau and A.W.C. Fu, "An optical multistage interco- nnection
network for optimal all-to-all personalized exchange," Proc. fourth Int-l
Conf. Parallel and Distributed Computing, Applications and
Technologies (PDCAT'2003), pp. 292-295, 27-29 Aug. 2003.
[13] Y.Y. Yang and J.C. Wang, "Near-Optimal All-to-All Broadcast in
Multidimensional All-Port Meshes and Tori," IEEE Trans. Parallel and
Distributed Systems, vol. 13, no. 2, pp. 128-141!Feb. 2002.
[14] J.P. Jung and I. Sakho, "A methodology for devising optimal all-port
all-to-all broadcast algorithms in 2-dimensional tori," 28th Annual IEEE
Int-l Conf. Local Computer Networks (LCN'03), pp. 558-566, Oct. 2003.