PyCon DE & PyData 2025

Analyze data easily with duckdb - and the implications on data architectures
2025-04-24 , Zeiss Plenary (Spectrum)

duckdb is increasingly becoming a universal tool for accessing and analyzing data. In this talk I will show with slides and a live demo what duckdb is capable of and dive deeper into how it will influence modern data architectures.


duckdb - a lightweight database with a focus on data analysis and a fast query engine that can be used in a variety of ways:
- Analyze data, stored on your own hard drive or somewhere on the Internet, in the browser with SQL? No problem
- Quickly check all the JSON files in S3 using SQL? Nothing could be easier
- A huge Parquet file, bigger than my working memory, that I have to analyze locally? Easy!
- Read CSV from blob storage, process it and save it in a Postgres database? Just one command

duckdb is developing more and more into a universal tool for accessing and analyzing data.

In this talk I will show with slides and a live demo why it is so popular and why it belongs in the toolbox of every data scientist, ML engineer or data engineer.

But I will not stop at the useful tooling. I will dive deeper into the implications for data and software architectures that follow from the rise of embedded OLAP systems like duckdb. I will especially focus both on moving the data closer to the user for faster analytics and on accessing data without the explicit need to move it.

What you learn and see can be used immediately in your day-to-day work.


Expected audience expertise: Domain:

Novice

Expected audience expertise: Python:

Novice

Matthias Niehoff works as Head of Data and Data Architect for codecentric AG and supports customers in the design and implementation of data architectures. His focus is on the necessary infrastructure and organization to help data and ML projects succeed.