EuroSciPy 2026

Deploying and debugging GPU accelerated Python workloads
2026-07-23 , Room 1.19 (Ground Floor, Shannon)

Leveraging GPU acceleration is now a common necessity for scaling Python projects. NVIDIA GPUs offer unmatched speed and efficiency for data processing and model training, significantly reducing the time and cost associated with these tasks. GPU acceleration is already baked into many projects, or available via plugins. You can use PyData libraries including pandas, Polars and NetworkX and get the benefits of GPU acceleration without rewriting your code.
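As a minimal sketch of the "no rewrite" idea, the snippet below assumes the RAPIDS cuDF package and its `cudf.pandas` accelerator mode, neither of which is named in this abstract, so treat both as assumptions. If the package or a GPU is not available, the same pandas code simply runs on the CPU.

```python
# Try to switch on the cudf.pandas accelerator before pandas is imported.
# Assumption: RAPIDS cuDF is installed and an NVIDIA GPU is visible.
try:
    import cudf.pandas

    cudf.pandas.install()
    gpu_enabled = True
except Exception:  # package missing, driver missing, no GPU, ...
    gpu_enabled = False

import pandas as pd

# Ordinary pandas code: nothing below is GPU specific.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
totals = df.groupby("key")["value"].sum()

print("GPU acceleration enabled:", gpu_enabled)
print(totals)
```

The pandas code is identical in both cases; only the optional import at the top changes where it runs.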

However, integrating GPUs into our workflow can be a new challenge: we need to learn about installation, dependency management, and deployment in the Python ecosystem. When writing code, we also need to monitor performance, leverage hardware effectively, and debug when things go wrong.

This is where RAPIDS and its tooling ecosystem come to the rescue. RAPIDS is a collection of open source software libraries for executing end-to-end data pipelines on NVIDIA GPUs using familiar PyData APIs.

In this tutorial we will cover:
- Answers to questions like: “Where do I get a GPU?”, “How do I run a container on a VM with a GPU?”, “How do I install GPU packages into an existing environment?”, “What if I use uv pip?”, “What about conda?”, as well as follow-along examples to get a GPU up and running.
- Troubleshooting and monitoring: examples of performance analysis, diagnostics, and debugging, with a showcase of diagnostic tools like nvdashboard, nvtop, nsys, pynvml, etc.
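To give a flavour of the monitoring side, here is a small sketch that reads memory use and utilization through pynvml. The helper name `gpu_snapshot` and the dictionary keys are hypothetical, chosen for illustration; the function returns an empty list when pynvml or an NVIDIA driver is not available.

```python
def gpu_snapshot():
    """Return one dict per visible NVIDIA GPU, or [] if NVML is unavailable."""
    try:
        import pynvml  # optional dependency, assumed to be installed

        pynvml.nvmlInit()
    except Exception:  # pynvml missing, or no NVIDIA driver on this machine
        return []
    try:
        stats = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            memory = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            stats.append(
                {
                    "index": index,
                    "memory_used_mib": memory.used // 2**20,
                    "memory_total_mib": memory.total // 2**20,
                    "gpu_util_percent": util.gpu,
                }
            )
        return stats
    finally:
        # Always release NVML once it has been initialised.
        pynvml.nvmlShutdown()


for gpu in gpu_snapshot():
    print(gpu)
```

Calling this before and after a workload is a quick way to see whether GPU memory and utilization actually change.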


Audience

This is a hands-on tutorial. Participants should ideally have some experience using Python, pandas and scikit-learn. We'll use cloud-based VMs, so familiarity with the cloud and resource creation is helpful but not required. No prior GPU knowledge is needed.

To maximize the tutorial's relevance, we will provide participants with the opportunity to submit their specific environment configurations ahead of time. Submissions received with adequate notice (between tutorial acceptance and conference date) will be integrated into the tutorial examples, allowing participants to see their real-world use cases addressed.

Key takeaways for participants will be:

  • An understanding of the GPU Python software stack from driver through core libraries to high-level Python libraries
  • How they can use their preferred tooling and package managers to install all the components they need
  • How to monitor their GPUs and understand how well they are using their hardware
  • How to attach debuggers to their GPU code or record traces and profiles for debugging later
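
One way to picture the stack described in the first takeaway is to check which layers are present on a machine. The sketch below is illustrative only: the helper name `describe_gpu_stack` and the default package list (`cupy`, `cudf`, `cuml`) are assumptions, not part of the tutorial material. It only checks whether each piece can be found, not whether it works.

```python
import importlib.util
import shutil


def describe_gpu_stack(packages=("cupy", "cudf", "cuml")):
    """Report which layers of the GPU Python stack can be found locally."""
    # Driver layer: the driver install normally ships the nvidia-smi CLI.
    report = {"nvidia-smi on PATH": shutil.which("nvidia-smi") is not None}
    # Library layer: look for each package without importing it.
    for name in packages:
        found = importlib.util.find_spec(name) is not None
        report[f"python package: {name}"] = found
    return report


for layer, present in describe_gpu_stack().items():
    print(f"{layer}: {'found' if present else 'missing'}")
```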

Notes

This is a hands-on tutorial. We expect the audience to follow along actively with the material, which includes exercises to complete during the session.

Format

In the session we will walk through the material as a lecture while students follow along on their own VMs, so the whole thing is an interwoven mix of lecture and exercises. We want students to be as hands-on as possible to get a deep understanding of the software environment they are setting up.

We also want students to direct the material so that it is as close as possible to their real-world use cases. We will give students an opportunity ahead of the conference to tell us about their software environments so that we can tailor the material to them. Some sections will also have a "choose your own adventure" style, where we can put more emphasis on one tool over another depending on who is in the room. For example, when covering package managers we will have material for pip, conda, uv and pixi, but we will survey the room and then cover the ones most relevant to the audience.

Internet requirements

Participants will be given access to a cloud VM which they will access via SSH or the Jupyter web UI. Both of these have very low bandwidth requirements, but will require an active connection.

Outline

  • 0 mins - Intro and Setup
      - Introduce common libraries and tools where GPUs are leveraged in Python
      - Show some quick demos of how to run Python code that uses the GPU
  • 15 mins - How do I get a GPU?
      - Give participants access to cloud GPU resources
      - While the VMs start we will talk about alternatives
      - Everyone has their own tool and vendor preferences but the principles are the same
  • 30 mins - An exploration of Python package managers
      - pip, conda, uv and pixi are just some of the popular package managers
      - You can install GPU software with all of them, but there are differences and nuances
      - Participants will set up various GPU Python environments with these tools to get an understanding of the need-to-know differences
  • 50 mins - Monitoring
      - Running GPU accelerated Python code is just like running normal code
      - We will see how to verify our code really is being GPU accelerated
      - Participants will run examples and gather metrics on utilization and memory use with various tools
  • 70 mins - Debugging
      - Once you run some code you need to understand how it is performing
      - When your code crashes you need to debug and inspect what went wrong
      - GPU libraries can abstract complexity away, making them harder to debug
      - Participants will install and use various debugging tools to explore how to debug GPU accelerated code examples
  • 90 mins - Close
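
As a rough sketch of the kind of instrumentation that makes recorded traces easier to read, the example below wraps a step in a named range. It assumes the optional `nvtx` Python package, which is not named in this abstract, and falls back to a no-op context manager when it is missing. The helper names `trace_range` and `timed_step` are hypothetical.

```python
import contextlib
import time

try:
    import nvtx  # optional; assumed to be installed on the tutorial VMs

    def trace_range(name):
        # Named range that shows up on the timeline of an nsys capture.
        return nvtx.annotate(message=name)

except ImportError:

    def trace_range(name):
        # No-op fallback so the script still runs without nvtx.
        return contextlib.nullcontext()


def timed_step(name, func, *args):
    """Run func(*args) inside a named range and return (result, seconds)."""
    start = time.perf_counter()
    with trace_range(name):
        result = func(*args)
    return result, time.perf_counter() - start


result, seconds = timed_step(
    "sum-of-squares", lambda n: sum(i * i for i in range(n)), 1000
)
print(f"result={result} in {seconds:.6f}s")
```

Running a script like this under `nsys profile python script.py` (the script name is a placeholder) should show the named range alongside GPU activity in the recorded trace.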

Expected audience expertise: Domain: some
Expected audience expertise: Python: some
Supporting material: Supporting material
Your relationship with the presented work/project: Original author or co-author, Active contributor, Developed the presented feature, Maintainer of the presented library/project, Developed original workshop or study course

Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU accelerated open source Python tools which mimic APIs from the PyData stack, including those of NumPy, pandas and scikit-learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source Kubernetes Python framework kr8s in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.