2021-10-21, Workshops
Live Stream: https://youtu.be/qWvJSIgOcPU
Airflow 2.0 brought many changes under the hood. This workshop gives an overview of the major updates in Airflow 2.0 compared with Airflow 1.x, explains Airflow's main components and how they work together, and includes a hands-on demo of implementing and managing an end-to-end Machine Learning pipeline. Without a pipeline in place, managing the multiple stages of a Machine Learning project in production can be difficult. The workshop shows how Airflow simplifies the process and management of Python-based ML projects.
Prerequisites
- Install Docker Desktop (with at least 3 GB of memory allocated)
- Start the Docker engine
- Clone the workshop repo with: git clone https://github.com/pycon-ml/airflow_workshop.git
- Run docker-compose pull inside the repo folder airflow_workshop
Agenda
05 min: Introduction
05 min: Major changes in Airflow 2.0
05 min: Prerequisites setup overview
10 min: Walkthrough of different backend components
10 min: Different stages of a DAG file – steps and operators
10 min: Dynamic DAG creation to improve parallelism
15 min: How to trigger Airflow DAG runs
15 min: Debug and clear Airflow task errors
10 min: Overview of production-level Airflow-based architecture
05 min: Wrap-up and questions
Alen Jacob is a Machine Learning Engineer at H&M and has a Master's degree in Computational Linguistics.
Scott Zhou is a competence lead for Machine Learning Engineers at H&M and a Machine Learning Engineer himself.
Lini Jose is a Machine Learning Engineer at H&M.
Nitin Bisht is a Software Engineer at H&M.