WORKS19 Keynote: Priority Research Directions for In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources

SC19 Proceedings

Abstract: Scientific computing will increasingly incorporate tasks that must be managed alongside the main simulation or experimental tasks: ensemble analysis, data-driven science, artificial intelligence, machine learning, surrogate modeling, and graph analytics, all nontraditional applications unheard of in HPC just a few years ago. Many of these tasks will need to execute concurrently with simulations and experiments, that is, in situ, sharing the same computing resources.

There are two primary, interdependent motivations for processing and managing data in situ. The first is the need to decrease data volume: the in situ methodology can help manage large data from computations and experiments by minimizing data movement, saving storage space, and boosting resource efficiency, often while simultaneously increasing scientific precision. The second is that the in situ methodology can enable scientific discovery from a broad range of data sources (HPC simulations, experiments, scientific instruments, and sensor networks) across a wide range of computing platforms: leadership-class HPC, clusters, clouds, workstations, and embedded devices at the edge.

The successful development of in situ data management capabilities can potentially benefit real-time decision making, design optimization, and data-driven scientific discovery. This talk will feature six priority research directions that highlight the components and capabilities needed for in situ data management to succeed across a wide variety of applications: making in situ data management more pervasive, controllable, composable, and transparent, coordinating it more closely with the software stack, and developing a diversity of fundamentally new data algorithms.

Back to The 14th Workshop on Workflows in Support of Large-Scale Science (WORKS19) Archive Listing
