Workshop: RDMA-Based Library for Collective Operations in MPI
Abstract: In most MPI implementations, abstraction layers separate the collective operation algorithms from the communication primitives, which hinders the optimization of these algorithms with network acceleration technologies such as RDMA. Open UCX is an RDMA-based point-to-point communication library that can reduce the latency between processes in MPI applications, particularly in large-scale systems. This paper presents the design and implementation of a library for MPI collective operations that extends Open UCX. Our approach is transparent to MPI applications and can reduce the latency of repeated calls to such operations by an average of 8% for relatively small message sizes and by as much as 90% for larger messages.