Airflow as an Elastic ETL Tool
07-13, 23:00–23:45 (US/Pacific), Tokyo Meetup, [Sessions start: Tuesday 14.07 1 pm (Monday 13.07 9 pm PDT)]

In search of a better, more modern, and simpler method of managing ETL processes and merging them with various AI and ML tasks, we landed on Airflow. We envisioned a new, user-friendly interface that leverages dynamic DAGs and reusable components to build an ETL tool that requires virtually no training.

We built several template DAGs and Airflow connectors for typical data sources such as SQL Server, then built a modern interface on top that brings together ETL build, scheduling, and execution capabilities. Acknowledging that Airflow is designed for task orchestration, we expanded our infrastructure to use Kubernetes and Docker for elastic computing. Key to our solution is the ability to create ETLs using only open source tools, while executing on par with or faster than commercial solutions, through an interface so simple that an ETL can be created in seconds.
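
As a rough illustration of the template idea described above, the sketch below expands a small configuration record into an extract, transform, load chain. It is not the speakers' implementation and deliberately avoids importing Airflow so that it runs anywhere. Every name in it (TaskSpec, build_etl_tasks, the config keys, the connection IDs, the container image) is a hypothetical chosen for illustration. In an Airflow deployment, each TaskSpec would presumably map to an operator and each chain to a dynamically generated DAG.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskSpec:
    """Hypothetical, framework-neutral description of one pipeline step."""
    task_id: str
    kind: str  # "extract", "transform", or "load"
    params: Dict[str, str] = field(default_factory=dict)
    upstream: List[str] = field(default_factory=list)


def build_etl_tasks(config: Dict[str, str]) -> List[TaskSpec]:
    """Expand one ETL config into an extract -> transform -> load chain."""
    name = config["name"]
    extract = TaskSpec(
        task_id=f"{name}_extract",
        kind="extract",
        params={"conn_id": config["source_conn"], "sql": config["query"]},
    )
    transform = TaskSpec(
        task_id=f"{name}_transform",
        kind="transform",
        # Assumed: the transform step runs in a container image.
        params={"image": config.get("image", "etl-worker:latest")},
        upstream=[extract.task_id],
    )
    load = TaskSpec(
        task_id=f"{name}_load",
        kind="load",
        params={"conn_id": config["target_conn"], "table": config["target_table"]},
        upstream=[transform.task_id],
    )
    return [extract, transform, load]


# Hypothetical configs, such as a UI might emit when a user defines an ETL.
configs = [
    {
        "name": "orders",
        "source_conn": "mssql_sales",
        "query": "SELECT * FROM dbo.Orders",
        "target_conn": "warehouse",
        "target_table": "stg_orders",
    },
    {
        "name": "customers",
        "source_conn": "mssql_sales",
        "query": "SELECT * FROM dbo.Customers",
        "target_conn": "warehouse",
        "target_table": "stg_customers",
        "image": "etl-worker:py38",
    },
]

# One task chain per config; the template is written once and reused.
pipelines = {cfg["name"]: build_etl_tasks(cfg) for cfg in configs}

for pipeline_name, tasks in pipelines.items():
    print(pipeline_name, "->", [task.task_id for task in tasks])
```

The point of the sketch is that a new ETL requires only a new config record, not new pipeline code, which is what would let an interface create one in seconds.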

Over the past decade, Hendrik has served 15 of the top 20 US technology firms, helping organizations capitalize on their data assets. Hendrik is currently Director of Analytics at Optum, part of UnitedHealth Group. At Optum, Hendrik's team leads research and innovation to identify data solutions that help make the healthcare system work better for everyone.