Parallel I/O in Practice
TimeSunday, 17 November 20198:30am - 5pm
DescriptionI/O on HPC systems is a black art. This tutorial sheds light on the state-of-the-art in parallel I/O and provides the knowledge necessary for attendees to best leverage I/O resources available to them. We cover the entire I/O software stack including storage and parallel file systems at the lowest layer, the role of burst buffers (NVRAM), intermediate layers (such as MPI-IO), and high-level I/O libraries (such as HDF5). We emphasize ways to use these interfaces that result in high performance and tools for generating insight into these stacks. Benchmarks on real systems are used throughout to show real-world results.
In the first third of the tutorial we cover the fundamentals of parallel I/O. We discuss storage technologies, both present and near-future. Our parallel file systems material covers general concepts and gives examples from Lustre, GPFS, PanFS, HDFS, Ceph, and Cloud Storage. Our second third takes a more application-oriented focus. We examine the upper library layers of the I/O stack, covering MPI-IO, Parallel netCDF, and HDF5. We discuss interface features, show code examples, and describe how application calls translate into PFS operations. Finally we discuss tools for capturing and understanding I/O behavior.