PyCon DE & PyData 2026

Jonas Böer

Data Engineer at inovex since 2022, full-time software engineer since 2018, coder for as long as I can remember. With experience working on data warehouses and machine learning applications, from small-scale tests up to international deployments, I enjoy eliminating bugs and bottlenecks, getting cool systems online, and writing beautiful code. Still proud of the time a colleague complained that, because of me, deploying to production had become too easy and was no longer a thrilling adventure.


Session

04-16
16:25
30min
Rediscovering single-node processing: When does it make sense to move from Spark to Polars?
Jonas Böer

As data engineers, we are used to spinning up a Spark cluster every time we want to process data, and to handling the overhead that comes with such a mighty framework. But is this really necessary? In this talk, I will argue that single-node processing with Polars is in many cases easier and cheaper. I will compare a typical ETL and feature engineering task in Spark and in Polars and offer a pragmatic opinion on when to use one or the other.

PyData: Data Handling & Data Engineering
Helium [3rd Floor]