BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163603Z
LOCATION:502-503-504
DTSTART;TZID=America/Denver:20191117T170000
DTEND;TZID=America/Denver:20191117T173000
UID:submissions.supercomputing.org_SC19_sess101_ws_dls120@linklings.com
SUMMARY:Strategies to Deploy and Scale Deep Learning on the Summit Superco
 mputer
DESCRIPTION:Workshop\n\nStrategies to Deploy and Scale Deep Learning on th
 e Summit Supercomputer\n\nYin, Gahlot, Laanait, Maheshwari, Morrison...\n\
 nThe rapid growth and wide applicability of Deep Learning (DL) frameworks 
 pose challenges to computing centers which need to deploy and support the
  software, and also to domain scientists who have to keep up with the syst
 em environment and scale up scientific exploration through DL. We offer re
 commendations for deploying and scaling DL frameworks on the Summit superc
 omputer, currently atop the Top500 list, at the Oak Ridge National Laborat
 ory Leadership Computing Facility (OLCF). We discuss DL software deploymen
 t in the form of containers, and compare performance of native-built frame
 works and containerized deployment. Software containers show no noticeable
  negative performance impact, exhibit faster Python loading times, and p
 romise easier maintenance. To explore strategies for scaling up DL model t
 raining campaigns, we assess DL compute kernel performance, discuss and re
 commend I/O data formats and staging, and identify communication needs for
  scalable message exchange for DL runs at scale. We recommend that users t
 ake a step-wise tuning approach beginning with algorithmic kernel choice, 
 node I/O configuration, and communications tuning as best practice. We
  present a baseline example with 87% scaling efficiency for a DL run of
  ResNet50 on 1024 nodes (6144 V100 GPUs).\n\nTag: Workshop Reg Pass, Deep
  Learning, Scientific Computing\n\nRegistration Category: Workshop Reg
  Pass, Deep Learning, Scientific Computing
URL:https://sc19.supercomputing.org/presentation/?id=ws_dls120&sess=sess10
 1
END:VEVENT
END:VCALENDAR

