Machine Learning with Apache Airflow
2020-07-17, Warsaw Meetup [Sessions start: Friday 17.07, 6 pm (Friday 17.07, 9 am PDT)]

As data science grows in popularity, companies find themselves in need of a single common language that can connect their data science and data infrastructure teams. Data scientists want rapid iteration, infrastructure engineers want monitoring and security controls, and product owners want their solutions deployed in time for quarterly reports. This talk will discuss how to build an Airflow-based data platform that can take advantage of popular ML tools (Jupyter, TensorFlow, Spark) while creating an easy-to-manage, easy-to-monitor ecosystem for data infrastructure and support teams.

In this talk, we will take an idea from a single-machine Jupyter Notebook, to a cross-service Spark + TensorFlow pipeline, to a canary-tested, production-ready model served on Google Cloud Functions. We will show how Apache Airflow can connect all layers of a data team to deliver rapid results.

Daniel Imberman is a full-time Apache Airflow committer, a digital nomad, and constantly in search of the perfect bowl of ramen. Daniel received his BS/MS from UC Santa Barbara in 2015 and has worked for data platform teams ranging from early-stage startups to large corporations like Apple and Bloomberg LP.
