BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.com//scipy-2026//speaker//VKG8RE
BEGIN:VTIMEZONE
TZID:CST
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10;UNTIL=20061029T080000Z
TZNAME:CST
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
END:STANDARD
BEGIN:STANDARD
DTSTART:20071104T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:CST
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000402T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=4;UNTIL=20060402T090000Z
TZNAME:CDT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
END:DAYLIGHT
BEGIN:DAYLIGHT
DTSTART:20070311T030000
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:CDT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-scipy-2026-77PGCB@pretalx.com
DTSTART;TZID=CST:20260715T143500
DTEND;TZID=CST:20260715T150500
DESCRIPTION:Your GPU is fast\, so why does your Python code still feel slow
 ? This talk shows a practical\, Python-first profiling workflow with Nsigh
 t Systems\, Nsight Compute\, and NVTX for CuPy\, Numba\, PyTorch\, JAX\, a
 nd CUDA extensions. We will use timelines to find launch overhead\, hidden
  synchronizations\, and host-device copies\, then drill into kernel bottle
 necks like memory throughput and occupancy. You will leave with a repeatab
 le loop for turning profiles into measurable speedups.
DTSTAMP:20260622T121535Z
LOCATION:Johnson Great Room
SUMMARY:Profiling Python GPU Code - Bryce Adelstein Lelbach\, Bradley Dice
URL:https://pretalx.com/scipy-2026/talk/77PGCB/
END:VEVENT
END:VCALENDAR
