PyCon AU 2025

Life Beyond Pandas: Workflows with DuckDB, Daft, Polars, and DataFusion
2025-09-12, Ballroom 1

In this talk, we’ll explore four modern data engines (Daft, DuckDB, Polars, and DataFusion) that offer varying levels of out-of-core execution. These engines let you work with datasets larger than memory without rewriting everything for a distributed system.

They’re fast, expressive, and, above all, Pythonic—though SQL support might still be the deciding factor for many workflows.

Rather than comparing which engine is best (they’re all open source, and good ideas tend to spread quickly), this talk will highlight exciting recent developments. We’ll showcase, for example, how one workload saw a 2× performance boost in under a year, and how open table formats are bringing cloud data warehouse capabilities to local Python workflows.

Mimoune "Mim" Djouallah, a former Construction Planner, has been a member of the Microsoft Fabric CAT team since December 2023. Holding a BSc in Civil Engineering, he has been deeply involved with the Power BI stack since 2016. An early adopter of Fabric, Mim actively blogs about Python notebooks, GIS, and various data engines that integrate with OneLake.