SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

HyperX Topology: First At-Scale Implementation and Comparison to the Fat-Tree

Authors: Jens Domke (RIKEN Center for Computational Science (R-CCS), RIKEN), Satoshi Matsuoka (RIKEN Center for Computational Science (R-CCS), RIKEN), Ivan Radanov Ivanov (Tokyo Institute of Technology), Yuki Tsushima (Tokyo Institute of Technology), Tomoya Yuki (Tokyo Institute of Technology), Akihiro Nomura (Tokyo Institute of Technology), Shin'ichi Miura (Tokyo Institute of Technology), Nic McDonald (Hewlett Packard Enterprise), Dennis Lee Floyd (Hewlett Packard Enterprise), Nicolas Dubé (Hewlett Packard Enterprise)

Abstract: The de facto standard topology for modern HPC systems and data centers is the Folded Clos network, commonly known as the Fat-Tree. The number of network endpoints in these systems is steadily increasing. Switch radix is not keeping pace, which forces longer paths in these multi-level trees and will limit gains for latency-sensitive applications. Additionally, today's Fat-Trees force the extensive use of active optical cables, which carries a prohibitive cost structure at scale.

To tackle these issues, researchers have proposed various low-diameter topologies, such as Dragonfly. Another novel option, so far studied only theoretically, is the HyperX. We built the world's first 3 Pflop/s supercomputer with two separate networks, a 3-level Fat-Tree and a 12x8 HyperX. This dual-plane system allows us to perform a side-by-side comparison using a broad set of benchmarks. We show that the HyperX, together with our novel communication-pattern-aware routing, can challenge the performance of, or even outperform, traditional Fat-Trees.
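To make the "low-diameter" property concrete, the following is a minimal sketch of a 12x8 HyperX switch graph, not the authors' implementation. It assumes a regular HyperX in which every switch is directly linked to all other switches that share its coordinate in all but one dimension, with a single link per switch pair. The function names `hyperx_neighbors` and `diameter` are hypothetical and chosen for illustration only.

```python
from collections import deque
from itertools import product


def hyperx_neighbors(coord, shape):
    """Yield every switch that differs from `coord` in exactly one dimension.

    Assumption: each dimension is fully connected (regular HyperX).
    """
    for dim, size in enumerate(shape):
        for value in range(size):
            if value != coord[dim]:
                yield coord[:dim] + (value,) + coord[dim + 1:]


def diameter(shape):
    """Return the longest shortest path, in switch-to-switch hops.

    Runs a breadth-first search from every switch in the lattice.
    """
    switches = list(product(*(range(size) for size in shape)))
    worst = 0
    for start in switches:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in hyperx_neighbors(current, shape):
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


shape = (12, 8)                                      # the 12x8 HyperX from the abstract
num_switches = shape[0] * shape[1]                   # switches in the lattice
links_per_switch = sum(size - 1 for size in shape)   # inter-switch links per switch
hops = diameter(shape)                               # worst-case inter-switch hops
```

Under these assumptions, any two switches differ in at most two coordinates, so the worst-case path needs one hop per dimension. Actual deployments may use multiple links per switch pair or other variations that this sketch does not model.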
