BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163557Z
LOCATION:603
DTSTART;TZID=America/Denver:20191118T143000
DTEND;TZID=America/Denver:20191118T150000
UID:submissions.supercomputing.org_SC19_sess122_ws_pmbsf101@linklings.com
SUMMARY:CUDA Flux: A Lightweight Instruction Profiler for CUDA Application
 s
DESCRIPTION:Workshop\n\nCUDA Flux: A Lightweight Instruction Profiler for 
 CUDA Applications\n\nBraun, Fröning\n\nGPUs are powerful, massively parall
 el processors, which require a vast amount of thread parallelism to keep t
 heir thousands of execution units busy, and to tolerate latency when acces
 sing their high-throughput memory system. Understanding the behavior of ma
 ssively threaded GPU programs can be difficult, even though recent GPUs pr
 ovide an abundance of hardware performance counters, which collect statist
 ics about certain events. Profiling tools that assist the user in such ana
 lysis for their GPUs, like NVIDIA's nvprof and CUPTI, are state-of-the-art
 . However, instrumentation based on reading hardware performance counters
  can be slow, in particular when the number of metrics is large. Furthermo
 re, the results can be inaccurate as instructions are grouped to match the
  available set of hardware counters.\n\nTag: Workshop Reg Pass, Benchmarks
 , Performance, Scientific Computing, Simulation\n\nRegistration Category: 
 Workshop Reg Pass, Benchmarks, Performance, Scientific Computing, Simulati
 on
URL:https://sc19.supercomputing.org/presentation/?id=ws_pmbsf101&sess=sess
 122
END:VEVENT
END:VCALENDAR

