Airflow on Kubernetes: Containerizing your Workflows
07-07, 11:00–11:45 (US/Pacific), NYC Meetup [Session starts: Tuesday 07.07 12pm (Tuesday 07.07 9am PDT)]

I have been one of the engineers at Nielsen Digital leading our migration of ETLs to Airflow on Kubernetes. This talk will teach you the ins and outs of Airflow on Kubernetes, from deploying Airflow to best practices for DAG development in a containerized environment. Airflow on Kubernetes will ease your Airflow DAG development, minimize infrastructure costs, avoid wasted resources, and provide tasks with the optimal infrastructure to run on, all through Kubernetes features within Airflow.


At Nielsen Digital we have been moving our ETLs to containerized environments managed by Kubernetes, and some of them are now running in this environment in production. To do this we used the following technologies: Helm to easily deploy Airflow onto Kubernetes; Airflow's Kubernetes Executor to take full advantage of Kubernetes features; and Airflow's Kubernetes Pod Operator to execute the containerized tasks within our DAGs. We also used Terraform to automate much of the deployment process. Lastly, we used Kubernetes features to gain much more fine-grained control of Airflow's infrastructure. Join me in this talk for an in-depth look at how we used these technologies, why we chose them, and the results so far. I will also briefly go over some features coming in Airflow 2.0 that we are considering using in our workflows.

I graduated from Virginia Tech in the spring of 2019 with a major in Computer Science. I have been working at Nielsen for about a year as a software engineer in their Emerging Technologies Program. I work with Nielsen Digital's Site Reliability Engineering team and Collections Platform team. During my time here I have taken a deep dive into Kubernetes so that we can more easily create, maintain, and deploy our workflows, while also gaining much more control over our resources to reduce cloud infrastructure costs. I am passionate about the movement to cloud native services on Kubernetes and am determined to contribute to it. I have actively contributed to Airflow's open source stable Helm chart and plan on contributing to more open source projects in the future.