Machine Learning in the Browser: Fast Iteration with ONNX & WebAssembly
2025-10-01, Louis Armand 1 - Est

Deploying ML models doesn’t have to mean spinning up servers and writing backend code. This talk shows how to run machine learning inference directly in the browser—using ONNX and WebAssembly—to go from prototype to interactive demo in minutes, not weeks.


Deploying a machine learning model usually means provisioning infrastructure, setting up APIs, dealing with authentication, cloud costs, and so on. And sometimes, all you wanted was to show someone your idea.

This talk is about skipping all of that.

We’ll walk through how to run ML model inference directly in the browser, using ONNX Runtime Web, WebAssembly, and WebGPU when needed. The result: secure, privacy-respecting, and lightning-fast deployments—served as simple static web pages.

We’ll cover:
- How to convert and export models to ONNX from your favorite frameworks, such as scikit-learn and TensorFlow
- How to use onnxruntime-web to run models client-side with WASM or WebGPU
- A hands-on demo: turning a notebook model into a web-based ML app with no infrastructure
- How this approach accelerates iteration, demo cycles, and stakeholder feedback
- Why ONNX helps bridge experiments, demos, and production
- Extra scenarios: edge ML, offline use, and hardware-light deployments

You’ll leave with:
- A deploy-in-minutes workflow for ML demos and prototypes
- Tools to tighten the loop between modeling and feedback
- A mental model for using ONNX to unify the path from experimentation to shipping

This talk is for: ML engineers and data scientists who want to move faster, share working demos without blockers, and cut their time-to-feedback from weeks to hours.

Forget staging environments. Your next ML prototype could just be a shareable URL.

Romain Clement is a software engineer with over a decade of experience spanning data engineering, applied mathematics, and machine learning. Since 2018, he’s worked as an independent consultant, helping data teams streamline and productionize their workflows—bringing software engineering best practices into data science, MLOps, and beyond.

He’s an active open-source contributor, with personal projects and community involvement in ecosystems like Datasette. A regular speaker since 2019 and organizer of the Grenoble Python Meetup, he enjoys sharing pragmatic tools and techniques that make data work actually work.

Find out more on romain-clement.net
