2025-09-12, Ballroom 1
I’ve used pandas for years, but as my data grew, my local workflows started to slow down. Joins got sluggish, memory errors showed up, and simple tasks became harder to manage. That’s when I found DuckDB, a fast, in-process SQL engine that brought the speed and flexibility I was missing.
This talk isn’t about replacing pandas. It’s about knowing when to reach for something different. I’ll share how DuckDB helped streamline my workflow, with real examples, side-by-side comparisons, and a quick intro to DuckLake, a SQL-based Lakehouse format that fits naturally into modern Python analytics.
This talk picks up where that shift away from a pandas-only workflow began. I’ll walk through the challenges I hit with pandas and how DuckDB gradually became a core part of my workflow. We’ll look at how it handles real data tasks, complements the tools you already use, and simplifies analysis. You’ll also see how local-first tools like DuckDB and DuckLake can streamline data work without extra infrastructure.
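To give a flavour of the side-by-side comparisons the talk covers, here is a minimal sketch (the DataFrame, columns, and query are illustrative, not taken from the talk) of DuckDB querying a pandas DataFrame in-process:

```python
import duckdb
import pandas as pd

# An ordinary in-memory pandas DataFrame (illustrative data).
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [120.0, 55.5, 80.0, 210.0],
})

# DuckDB can scan the DataFrame directly by its variable name, so the
# aggregation runs in the in-process SQL engine without copying the data
# into a separate database.
top_spenders = duckdb.sql("""
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spent DESC
""").df()  # .df() converts the result back to a pandas DataFrame

print(top_spenders)
```

Because DuckDB scans the DataFrame in place and hands the result back as pandas, SQL and DataFrame code can sit side by side in the same script, which is the "complements existing tools" angle rather than a replacement.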
By the end of the session, you’ll:
- Recognize where pandas starts to struggle and why it’s okay to want more
- See how DuckDB can supercharge your workflow without changing everything
- Leave with real-world tips for faster analytics, no cluster required (a quick sketch follows below)
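To make the "no cluster required" point concrete, here is a minimal sketch, assuming some local Parquet files (the path and column names are hypothetical), of querying them straight from a Python process:

```python
import duckdb

# DuckDB runs inside the Python process: no server, cluster, or
# ingestion step is needed to query files already on disk.
daily_totals = duckdb.sql("""
    SELECT order_date,
           COUNT(*)    AS orders,
           SUM(amount) AS revenue
    FROM 'data/orders_*.parquet'   -- hypothetical path; glob patterns are supported
    GROUP BY order_date
    ORDER BY order_date
""").df()

print(daily_totals.head())
```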
Ankur is a Senior Data & Cloud Engineer at Innablr, working at the intersection of cloud infrastructure and modern data platforms. Based in Melbourne, he helps teams design efficient pipelines and scale analytics using tools like DuckDB, Databricks, and dbt.
Outside of work, he’s an anime nerd and a firm believer that ducks are objectively the coolest animals. Ankur thinks good tooling should feel like magic, bad tooling should be deleted, and Jupyter notebooks should come with a warning label.