Presentation

Tutorial: Node-Level Performance Engineering

Presenters

Georg Hager

Gerhard Wellein

Event Type

Tutorial

Time

Sunday, 17 November 2019, 8:30am - 5pm

Location

302

Description

The advent of multi- and manycore chips has further widened the gap between peak and application performance for many scientific codes. This trend is accelerating as we move from petascale to exascale. Paradoxically, bad node-level performance helps to "efficiently" scale to massive parallelism, but at the price of increased overall time to solution. If the user cares about time to solution on any scale, optimal performance on the node level is often the key factor. We convey the architectural features of current processor chips, multiprocessor nodes, and accelerators, as far as they are relevant for the practitioner. Peculiarities like SIMD vectorization, shared vs. separate caches, bandwidth bottlenecks, and ccNUMA characteristics are introduced, and the influence of system topology and affinity on the performance of typical parallel programming constructs is demonstrated. Performance engineering and performance patterns are suggested as powerful tools that help the user understand the bottlenecks at hand and assess the impact of possible code optimizations. A cornerstone of these concepts is the roofline model, which is described in detail, including useful case studies, limits of its applicability, and possible refinements.
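As a rough illustration of the roofline model mentioned above, the sketch below computes its basic performance bound: the minimum of the peak floating-point rate and the product of computational intensity and memory bandwidth. The function name, the machine numbers, and the choice of loop are assumptions made for this example only; they are not taken from the tutorial material.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Upper performance bound in GFlop/s under the basic roofline model.

    peak_gflops:             peak floating-point rate of the node (GFlop/s)
    bandwidth_gbs:           attainable memory bandwidth (GByte/s)
    intensity_flop_per_byte: computational intensity of the loop (flop/byte)
    """
    return min(peak_gflops, intensity_flop_per_byte * bandwidth_gbs)


# Hypothetical machine (assumed numbers): 1000 GFlop/s peak, 100 GByte/s bandwidth.
PEAK = 1000.0
BANDWIDTH = 100.0

# Double-precision vector triad a[i] = b[i] + c[i] * d[i]:
# 2 flops per iteration, 3 loads + 1 store of 8 bytes each = 32 bytes
# (write-allocate traffic is ignored in this simple count).
triad_intensity = 2 / 32

bound = roofline(PEAK, BANDWIDTH, triad_intensity)
limit = "memory bandwidth" if bound < PEAK else "peak performance"
print(f"triad bound: {bound} GFlop/s, limited by {limit}")
```

With these assumed numbers the loop is bound by memory bandwidth at a small fraction of peak, which is the kind of gap between peak and application performance the description refers to.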

Presenters

Georg Hager

University of Erlangen-Nuremberg

Erlangen Regional Computing Center

Gerhard Wellein

University of Erlangen-Nuremberg

Erlangen Regional Computing Center