PyCon UK 2022

Getting started with Apache Airflow
2022-09-17 , Room A

Learning Apache Airflow may seem daunting for those adventuring in the data world. This workshop aims to save engineers and scientists time. By the end of this session, attendees will have written two workflows that solve practical problems, run them locally and deployed them to a production-like environment.


Have you wondered what to do when cron jobs are not enough? Have you heard about Apache Airflow but don't know where to start? Come and join us!

This session aims to give a smooth introduction to the open-source data orchestration platform, teaching critical concepts through practical examples. We'll be encouraging best practices in writing workflows with Python, including using the TaskFlow API, dynamic tasks and the open-source Astro Python SDK library.

Some of the topics covered:

  • Building attractive DAGs (Directed Acyclic Graphs)
  • How to not miss the schedule
  • Running your DAGs locally
  • Deploying Airflow
  • Troubleshooting
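
As a rough illustration of the DAG idea in the first topic, the sketch below models a workflow as a directed acyclic graph using only Python's standard library (`graphlib`, Python 3.9+). It is not Airflow code and does not use Airflow's API; the task names and the `execution_order` helper are hypothetical and exist only to show why a workflow must be acyclic before a valid run order can be derived.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "validate": {"extract"},
    "load": {"clean", "validate"},
}


def execution_order(graph):
    """Return one valid run order for the tasks in `graph`.

    Raises graphlib.CycleError if the graph contains a cycle,
    i.e. if it is not a DAG.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
print(order)

# A cyclic "workflow" has no valid order, so it is rejected.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cycle detected: not a valid DAG")
```

Every task appears after the tasks it depends on, while independent tasks such as `clean` and `validate` may come in either order, which is what leaves an orchestrator free to run them in parallel.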

The examples used in the session are based on open datasets.


Is your proposal suitable for beginners?: yes

Tatiana is a Staff Software Engineer at Astronomer and builds open-source authoring tools on top of Apache Airflow.
In 2002 she started studying Computer Engineering at Unicamp, Brazil. Her first job was building 3D visualisation software that helped surgeons plan complex medical procedures. She learned how to develop scalable software while working for Globo between 2010 and 2014. She moved to the UK to build educational applications at Education First and later became a Principal Data Engineer at the BBC, where she helped the company build its machine learning platform using open-source tools.

Kaxil is a committer and PMC member of the Apache Airflow project. He is currently the Director of the Airflow Engineering Team at Astronomer.

He was instrumental in adding DAG Serialization, Scheduler HA support and Secrets Backends to Airflow, and in releasing Airflow 2.0.

He holds a Master's in Data Science & Analytics from Royal Holloway, University of London. He started as a Data Scientist and then gained experience in the Data Engineering, Big Data and DevOps space. He began working on Airflow in 2017 while working at Data Reply as a Big Data consultant and became a PMC member in 2018.