SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

LPCC: Hierarchical Persistent Client Caching for Lustre

Authors: Yingjin Qian (DataDirect Networks (DDN)), Xi Li (DataDirect Networks (DDN)), Shuichi Ihara (DataDirect Networks (DDN)), Andreas Dilger (Whamcloud Inc), Carlos Thomaz (DataDirect Networks (DDN)), Shilong Wang (DataDirect Networks (DDN)), Wen Cheng (Huazhong University of Science and Technology), Chunyan Li (Huazhong University of Science and Technology), Lingfang Zeng (Huazhong University of Science and Technology), Fang Wang (Huazhong University of Science and Technology), Dan Feng (Huazhong University of Science and Technology), Tim Suesst (Johannes Gutenberg University Mainz), Andre Brinkmann (Johannes Gutenberg University Mainz)

Abstract: Most high-performance computing (HPC) clusters today use a global parallel file system to enable high data throughput. The parallel file system is typically centralized, and its storage media are physically separated from the compute cluster. Compute nodes, as clients of the parallel file system, are often additionally equipped with SSDs, but this node-internal storage is rarely well integrated into the I/O and compute workflows.

In this paper, we propose a hierarchical Persistent Client Caching (LPCC) mechanism for the Lustre file system. LPCC integrates with the Lustre HSM solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, while maintaining a global unified namespace across the entire Lustre file system for distributed persistent client caching. The evaluation results presented in this paper show LPCC's advantages for various workloads, even enabling speed-ups that are linear in the number of clients for several real-world scenarios.

