2025-10-30, Tutorial Room
Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. The client's environment throws mysterious errors. Sound familiar?
This hands-on workshop teaches you to build modern reproducible workflows using a holistic framework that addresses real challenges teams face when sharing code, collaborating on research, or deploying data pipelines.
You'll gain practical experience with:
* Managing Python environments and dependencies with modern tooling
* Version-controlling your code and data
* Storing your parameters in configuration files
* Containerizing your application for easy deployment
* Applying best practices for team collaboration
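To illustrate the configuration-file idea from the list above, here is a minimal sketch of moving hardcoded parameters out of the script and into a config file. The filename and parameter names are hypothetical, and JSON is used only because it is in the standard library; the workshop's actual format may differ.

```python
import json
from pathlib import Path

# Hypothetical parameters that would otherwise be hardcoded in the script.
DEFAULTS = {"learning_rate": 0.01, "n_estimators": 100, "random_seed": 42}

def load_params(path="params.json"):
    """Load analysis parameters from a config file, falling back to defaults."""
    config = Path(path)
    if config.exists():
        # Values in the file override the defaults, key by key.
        return {**DEFAULTS, **json.loads(config.read_text())}
    return dict(DEFAULTS)

params = load_params()
print(params)
```

Because the parameters now live outside the code, a colleague can rerun the analysis with different settings by editing one file instead of hunting through the source.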
The workshop is ideal for data scientists, researchers, and Python developers with intermediate experience who are tired of "works on my machine" syndrome. You'll gain hands-on experience with modern tools and practices that make Python workflows reproducible, maintainable, and easy to share, all while applying them to simple data science tasks.
This interactive 90-minute workshop provides hands-on experience building reproducible workflows using modern Python tools. Through practical exercises, you'll transform a typical "script chaos" project into a fully reproducible pipeline applicable to any data-driven Python discipline.
What We'll Build Together: Starting with a working but messy data analysis project, we'll systematically add reproducibility layers. You'll set up dependency management, use Git to version-control the code, use DVC to version-control the data and outputs, move parameters into configuration files, and finally containerize everything with Docker. Each module includes guided coding exercises where you'll apply concepts immediately.
Key Modules:
1. Modern Dependency Management (20 min): Hands-on with uv/pixi, creating lock files, managing Python versions
2. Code & Configuration Versioning (20 min): Git for version control of the source code
3. Data Pipeline Versioning (20 min): DVC setup, pipeline definitions, experiment tracking
4. Hidden Reproducibility Challenges (10 min): Randomness and human error
5. Production Deployment (20 min): Docker for Python, artifact registries, deployment reproducibility
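The randomness pitfall from Module 4 can be sketched in a few lines: without a fixed seed, a "reproducible" pipeline silently produces different numbers on every run. A minimal illustration, assuming a hypothetical sampling step (the seed value 42 is arbitrary):

```python
import random

def sample_rows(n_rows, k, seed=42):
    """Draw a reproducible sample of row indices by seeding a local RNG."""
    # A local generator avoids hidden global state that other code might reset.
    rng = random.Random(seed)
    return rng.sample(range(n_rows), k)

# Same seed, same "random" sample -- on any machine, on any run.
assert sample_rows(1000, 5) == sample_rows(1000, 5)
```

Using a dedicated `random.Random` instance (rather than the module-level functions) keeps the seed's effect scoped to this one step, which is one way to guard against the human-error side of the same module.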
This workshop focuses on the tools and practices that make models and analyses reproducible and well-documented. The underlying methods will be introduced only briefly for context, while most of the time will be devoted to practical, hands-on work with the tooling.
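A small habit that complements the documentation practices described above: record the interpreter and platform alongside every result, so an output can later be traced to the environment that produced it. A minimal sketch using only the standard library (the snapshot fields shown are illustrative, not part of the workshop material):

```python
import json
import platform
import sys

def environment_snapshot():
    """Capture basic interpreter and platform facts to store next to outputs."""
    return {
        "python": sys.version.split()[0],           # e.g. "3.12.1"
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
    }

# Save the snapshot beside the results so a colleague can compare setups.
print(json.dumps(environment_snapshot(), indent=2))
```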
Prerequisites:
* A laptop (admin/root privileges are needed to install the necessary tooling)
* Basic knowledge of Python syntax
The necessary tooling can be installed during the workshop.
Aris Nivorlis is a geophysics researcher and data steward at Deltares, where he uses data and tooling to answer complex questions about the subsurface.
He is passionate about promoting good practices in data management and scientific coding, helping teams build sustainable and reproducible workflows.
Outside of work, Aris is actively involved in the European Python community, contributing to the organization and support of conferences and community initiatives.
When he's not at his computer, you’ll likely find him dancing salsa.
Dr. Nikolaos Chatzis is a graduate Geologist and holds a Master's Degree in Applied Geophysics and Seismology from the School of Geology at Aristotle University of Thessaloniki (A.U.Th.). In March 2024, he was awarded a PhD by the Department of Geology at A.U.Th. His research interests span various geophysical and seismological topics. He has participated in numerous fieldwork projects and has extensive experience in the use and installation of modern digital recording instruments and in signal processing using various programming languages.