BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163559Z
LOCATION:605
DTSTART;TZID=America/Denver:20191117T113000
DTEND;TZID=America/Denver:20191117T120000
UID:submissions.supercomputing.org_SC19_sess109_ws_exampi103@linklings.com
SUMMARY:Node-Aware Improvements to Allreduce
DESCRIPTION:Workshop\n\nNode-Aware Improvements to Allreduce\n\nBienz\, Ol
 son\, Gropp\n\nThe MPI_Allreduce collective operation is a core kern
 el of many parallel codebases\, particularly for reductions over a sin
 gle value per process.  The commonly used allreduce recursive-doubl
 ing algorithm obtains the lower bound message count\, yielding optimal
 ity for small reduction sizes based on node-agnostic performance mod
 els.  However\, this algorithm yields duplicate messages between se
 ts of nodes.  Node-aware optimizations in MPICH remove duplicate mess
 ages through use of a single master process per node\, yielding a lar
 ge number of inactive processes at each inter-node step.  In this pa
 per\, we present an algorithm that uses the multiple processes avail
 able per node to reduce the maximum number of inter-node messages co
 mmunicated by a single process\, improving the performance of allred
 uce operations\, particularly for small message sizes.\n\nTag: Worksh
 op Reg Pass\, Exascale\, MPI\, Networks\, Parallel Programming Langu
 ages\, Libraries\, and Models\n\nRegistration Category: Workshop Re
 g Pass\, Exascale\, MPI\, Networks\, Parallel Programming Languag
 es\, Libraries\, and Models
URL:https://sc19.supercomputing.org/presentation/?id=ws_exampi103&sess=ses
 s109
END:VEVENT
END:VCALENDAR