2025-10-30, Tutorial Room
Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. The client's environment throws mysterious errors. Sound familiar?
This hands-on workshop teaches you to build modern reproducible workflows using a holistic framework that addresses real challenges teams face when sharing code, collaborating on research, or deploying data pipelines.
You'll gain practical experience with:
* Managing Python environments and dependencies with modern tooling
* Version-controlling your code and data
* Storing your parameters in configuration files
* Containerizing your application for easy deployment
* Applying best practices for team collaboration
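To illustrate the configuration-file idea from the list above, here is a minimal sketch of moving hardcoded parameters out of the script and into a config file. The filename and parameter names are hypothetical, and JSON is used only because it is in the standard library; the workshop's actual format may differ.

```python
import json
from pathlib import Path

# Hypothetical parameters that would otherwise be hardcoded in the script.
DEFAULTS = {"learning_rate": 0.01, "n_estimators": 100, "random_seed": 42}

def load_params(path="params.json"):
    """Load analysis parameters from a config file, falling back to defaults."""
    config = Path(path)
    if config.exists():
        # Values in the file override the defaults, key by key.
        return {**DEFAULTS, **json.loads(config.read_text())}
    return dict(DEFAULTS)

params = load_params()
print(params)
```

Because the parameters now live outside the code, a colleague can rerun the analysis with different settings by editing one file instead of hunting through the source.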
The workshop is ideal for data scientists, researchers, and Python developers with intermediate experience who are tired of "works on my machine" syndrome. You'll gain hands-on experience with modern tools and practices that make Python workflows reproducible, maintainable, and easy to share, all while applying them to simple data science tasks.
This interactive 90-minute workshop provides hands-on experience building reproducible workflows using modern Python tools. Through practical exercises, you'll transform a typical "script chaos" project into a fully reproducible pipeline applicable to any data-driven Python discipline.
What We'll Build Together: Starting with a working but messy data analysis project, we'll systematically add reproducibility layers. You'll set up dependency management, use Git to version-control the code, use DVC to version-control the data and outputs, move parameters into configuration files, and finally containerize everything with Docker. Each module includes guided coding exercises where you'll apply concepts immediately.
Key Modules:
1. Modern Dependency Management (20 min): Hands-on with uv/pixi, creating lock files, managing Python versions
2. Code & Configuration Versioning (20 min): Git for version control of the source code
3. Data Pipeline Versioning (20 min): DVC setup, pipeline definitions, experiment tracking
4. Hidden Reproducibility Challenges (10 min): Randomness and human error
5. Production Deployment (20 min): Docker for Python, artifact registries, deployment reproducibility
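The randomness pitfall from Module 4 can be sketched in a few lines: without a fixed seed, a "reproducible" pipeline silently produces different numbers on every run. A minimal illustration, assuming a hypothetical sampling step (the seed value 42 is arbitrary):

```python
import random

def sample_rows(n_rows, k, seed=42):
    """Draw a reproducible sample of row indices by seeding a local RNG."""
    # A local generator avoids hidden global state that other code might reset.
    rng = random.Random(seed)
    return rng.sample(range(n_rows), k)

# Same seed, same "random" sample -- on any machine, on any run.
assert sample_rows(1000, 5) == sample_rows(1000, 5)
```

Using a dedicated `random.Random` instance (rather than the module-level functions) keeps the seed's effect scoped to this one step, which is one way to guard against the human-error side of the same module.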
This workshop focuses on the tools and practices that make models and analyses reproducible and well-documented. The underlying methods will be introduced only briefly for context, while most of the time will be devoted to practical, hands-on work with the tooling.
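A small habit that complements the documentation practices described above: record the interpreter and platform alongside every result, so an output can later be traced to the environment that produced it. A minimal sketch using only the standard library (the snapshot fields shown are illustrative, not part of the workshop material):

```python
import json
import platform
import sys

def environment_snapshot():
    """Capture basic interpreter and platform facts to store next to outputs."""
    return {
        "python": sys.version.split()[0],           # e.g. "3.12.1"
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
    }

# Save the snapshot beside the results so a colleague can compare setups.
print(json.dumps(environment_snapshot(), indent=2))
```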
Prerequisites:
* A laptop (admin/root privileges are needed to install the necessary tooling)
* Basic knowledge of Python syntax
The necessary tooling can be installed during the workshop.
Aris Nivorlis is a geophysics researcher and data steward at Deltares, where he uses data and tooling to answer complex questions about the subsurface.
He is passionate about promoting good practices in data management and scientific coding, helping teams build sustainable and reproducible workflows.
Outside of work, Aris is actively involved in the European Python community, contributing to the organization and support of conferences and community initiatives.
When he's not at his computer, you’ll likely find him dancing salsa.
Dr. Nikolaos Chatzis is a graduate Geologist and holds a Master's Degree in Applied Geophysics and Seismology from the School of Geology at Aristotle University of Thessaloniki (A.U.Th.). In March 2024, he was awarded a PhD by the Department of Geology at A.U.Th. His research interests span various geophysical and seismological topics. He has participated in numerous fieldwork projects and has extensive experience in the use and installation of modern digital recording instruments and in signal processing using various programming languages.