SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Topology-Custom UGAL Routing on Dragonfly

Authors: Md Shafayat Rahman (Florida State University), Saptarshi Bhowmik (Florida State University), Yevgeniy Ryasnianskiy (Florida State University), Xin Yuan (Florida State University), Michael Lang (Los Alamos National Laboratory)

Abstract: The Dragonfly network has been deployed in current-generation supercomputers and will be used in next-generation supercomputers. Universal Globally Adaptive Load-balance routing (UGAL) is the state-of-the-art routing scheme for Dragonfly. In this work, we show that the performance of conventional UGAL can be further improved on many practical Dragonfly networks, especially those with a small number of groups, by customizing the paths used in UGAL for each topology. We develop a scheme to compute a custom set of paths for each topology and compare the performance of our topology-custom UGAL routing (T-UGAL) with that of conventional UGAL. Our evaluation with different UGAL variations and different topologies demonstrates that, by customizing the routes, T-UGAL offers significant improvements over UGAL on many practical Dragonfly networks, in terms of both latency under low load and throughput under high load.
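To make the idea of "customizing the paths used in UGAL" concrete, the following is a minimal sketch of a UGAL-style path choice. It is not the authors' implementation. It assumes the commonly described UGAL rule of comparing queue occupancy multiplied by hop count for a minimal path against one randomly sampled non-minimal candidate. The names `Path`, `ugal_choose`, and `bias` are hypothetical, and the candidate set is simply passed in, since the abstract does not describe how T-UGAL computes it.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical path summary used only for this illustration."""
    hops: int       # number of links the path traverses
    queue_len: int  # locally observed occupancy of the path's first-hop queue


def ugal_choose(minimal, candidates, rng, bias=0):
    """Choose between the minimal path and one sampled non-minimal candidate.

    Assumed rule: estimate the delay of each path as queue_len * hops and
    keep the minimal path unless the sampled candidate looks cheaper.
    `bias` is an assumed tuning offset that favors the minimal path.
    """
    if not candidates:
        return minimal
    non_minimal = rng.choice(candidates)
    min_cost = minimal.queue_len * minimal.hops
    non_min_cost = non_minimal.queue_len * non_minimal.hops
    return minimal if min_cost <= non_min_cost + bias else non_minimal


if __name__ == "__main__":
    rng = random.Random(0)
    minimal = Path(hops=3, queue_len=10)
    # A topology-custom candidate set would be a chosen subset of the
    # non-minimal paths; the values here are made up.
    custom_candidates = [Path(hops=5, queue_len=2), Path(hops=4, queue_len=1)]
    print(ugal_choose(minimal, custom_candidates, rng))
```

In this sketch, the only difference between conventional UGAL and a topology-custom variant is which paths appear in `candidates`: the former would draw from all non-minimal paths, the latter from a set selected for the specific topology.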


