State of In-Browser ML: WebAssembly, WebGPU, and the Modern Stack PyCon DE & PyData 2026

State of In-Browser ML: WebAssembly, WebGPU, and the Modern Stack
2026-04-15 10:15, Platinum [2nd Floor]

What if you could run real data and ML workflows right in your browser, sandboxed, with no installation and without sending your data anywhere? Such an approach has clear benefits: it is easy to distribute, safer by default, and scales with almost no infrastructure cost, because the computation runs on each user's device.

This talk is a pragmatic overview of the current in-browser ML stack. We’ll cover what workflows are realistic today (from training of traditional ML models to on-device LLM inference), how packaging/loading works, and the constraints one should be aware of. By the end of the talk you will have a clear sense of when in-browser ML is a good fit, and when it isn’t.

Over the last few years, the tooling has matured enough to make "ML in a tab" worth taking seriously. Today, you can execute Python code in a sandboxed environment, ship interactive demos as a single URL, and even run LLM inference entirely on-device, without installations, servers, or sending data anywhere. In this talk, we will give a practical overview of the current in-browser ML stack, focusing on what is realistically possible today and the practical limits you still have to design around.

We will start with interactive environments such as JupyterLite and explain how they work under the hood via Pyodide: what it means to run CPython compiled to WebAssembly, how the filesystem and networking model differ from "normal" Python, and what that implies for performance, I/O, and package support.
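To make the networking difference concrete, here is a minimal sketch of code that behaves sensibly both in Pyodide and on regular CPython. It is an illustration rather than material from the talk: the helper names `running_in_browser` and `read_text` are hypothetical, and the only assumptions are that Pyodide reports `sys.platform` as `"emscripten"` and provides `pyodide.http.open_url` for synchronous, browser-backed requests.

```python
import sys
import urllib.request


def running_in_browser() -> bool:
    """Return True when the interpreter is an Emscripten/WebAssembly build such as Pyodide."""
    return sys.platform == "emscripten"


def read_text(url: str) -> str:
    """Fetch a text resource (hypothetical helper), choosing a backend by platform."""
    if running_in_browser():
        # Only available inside Pyodide: the request goes through the browser,
        # so it is subject to the page's same-origin/CORS rules.
        from pyodide.http import open_url

        return open_url(url).read()
    # On regular CPython the standard library talks to the network directly.
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

The import of `pyodide.http` is deferred into the browser branch so that the same module still imports cleanly on a desktop interpreter, where that package does not exist.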

We will then move from notebooks to applications with PyScript, showing how the same building blocks can be used to create shareable browser-based tools. We will also briefly cover the lower-level approach: using Pyodide directly and orchestrating it with JavaScript for granular control over loading, packaging, and data interchange.

Finally, we will cover in-browser inference workflows for both traditional and deep learning models (via ONNX), and LLMs (via wllama and WebLLM), and discuss how WebGPU can accelerate these pipelines.

By the end of the talk, attendees will have a clear overview of the in-browser ML ecosystem and the practical intuition to decide whether it is the right choice for their next project.

Target Audience:
This talk is relevant for a broad audience. However, at least intermediate knowledge of ML and familiarity with the Python ML ecosystem are required.

Outline:
- Introduction + Motivating examples [4 min]
- Running Python in WebAssembly [6 min]
  - Overview of Pyodide [2 min]
  - Package management [3 min]
  - Runtime and memory constraints [1 min]
- Overview of interactive dev environments / JupyterLite [4 min]
- Building applications with PyScript and direct Pyodide bindings [7 min]
- On-device ML inference using ONNX, WebGPU, WebLLM, and wllama [5 min]
- Q&A [4 min]

Expected audience expertise in your talk's domain: Intermediate
Expected audience expertise in Python: Intermediate

Oleh Kostromin

I am a Data Scientist primarily focused on Deep Learning and MLOps. In my spare time, I contribute to several open-source Python libraries.

Iryna Kondrashchenko

Iryna is a data scientist and co-founder of DataForce Solutions GmbH. At DataForce, the team is building LUML, an open-source, end-to-end AIOps platform that lets teams track experiments, version models, deploy, and monitor—all in one place.
