BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163600Z
LOCATION:607
DTSTART;TZID=America/Denver:20191118T115000
DTEND;TZID=America/Denver:20191118T121000
UID:submissions.supercomputing.org_SC19_sess124_ws_lasalss103@linklings.co
 m
SUMMARY:Optimization of a Solver for Computational Materials and Structure
 s Problems on NVIDIA Volta and AMD Instinct GPUs
DESCRIPTION:Workshop\n\nOptimization of a Solver for Computational Materia
 ls and Structures Problems on NVIDIA Volta and AMD Instinct GPUs\n\nZubair
 \, Warner\, Wagner\n\nThe Scalable Implementation of Finite Elements by NA
 SA (ScIFEN) is a software package developed to solve complex computation
 al materials and structures problems using the finite element meth
 od (FEM). In this paper\, we describe optimization techniques to spe
 ed up the linear solver computation that occurs within the ScIFEN appl
 ication. We consider GPUs from two different vendors\, NVIDIA and AM
 D\, as our target platforms for optimization and highlight difference
 s in performance and optimization techniques. The NVIDIA Volta V100 G
 PU is used in the Summit system deployed at Oak Ridge National Labora
 tory\, and the new exascale system\, Frontier\, will use AMD Radeon In
 stinct GPUs. We evaluated the performance of various optimization tec
 hniques on test matrices\, ranging in size from 100K to 4M\, that are rep
 resentative of ScIFEN applications. The linear solver computation is mem
 ory-bound on both GPUs. Our experiments show that on the NVIDIA GPU w
 e obtained up to 79% of the theoretical peak bandwidth\, while the AM
 D GPU achieved 59%. Overall\, the NVIDIA V100 GPU outperforms the AMD MI
 25 GPU. We observed an overall speedup of up to 37X on an NVIDIA V100 co
 mpared to an Intel Skylake 12-core machine. The solver for a 4M-degre
 e-of-freedom system took under 2.5 seconds.\n\nTag: Workshop Reg Pas
 s\, Algorithms\, Scalable Computing\n\nRegistration Category: Worksh
 op Reg Pass\, Algorithms\, Scalable Computing
URL:https://sc19.supercomputing.org/presentation/?id=ws_lasalss103&sess=se
 ss124
END:VEVENT
END:VCALENDAR

