Building Reusable and Trustworthy ELT Pipelines (A Templated Approach)
Wednesday, July 15, 09:00–09:45 (US/Pacific) · Amsterdam Meetup, session starts 18:00 local time (9:00 AM PDT)

To improve the automation of data pipelines, I propose a universal approach to ELT pipelines that optimizes for data integrity, extensibility, and speed of delivery. The workflow is built with open-source tools and standards such as Apache Airflow, Singer, Great Expectations, and dbt.


Templating ETLs is challenging! Creating and maintaining data pipelines in production takes hard work: managing bugs in code and bad data.

I would like to propose a data pipeline pattern that simplifies building pipelines while optimizing for data integrity and observability. The workflow is built with open-source tools like Singer, Great Expectations, and dbt.
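
To make the shape of the workflow concrete, here is a minimal sketch of the three stages wired together as an Airflow DAG in a Write-Audit-Publish order. The tap and target configs, the checkpoint name, and the dbt project path are hypothetical stand-ins, and an Airflow 2-style import is assumed.

```python
# A minimal sketch of the extract -> audit -> publish flow as an Airflow DAG.
# Tap/target configs, checkpoint name, and dbt project path are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="elt_write_audit_publish",
    start_date=datetime(2020, 7, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Write: a Singer tap piped into a Singer target loads the raw data.
    extract_load = BashOperator(
        task_id="extract_load",
        bash_command="tap-postgres --config tap.json | target-snowflake --config target.json",
    )

    # Audit: validate the freshly loaded data with a Great Expectations checkpoint.
    audit = BashOperator(
        task_id="audit",
        bash_command="great_expectations checkpoint run raw_orders_checkpoint",
    )

    # Publish: run dbt transformations only if the audit task succeeded.
    publish = BashOperator(
        task_id="publish",
        bash_command="dbt run --project-dir /opt/dbt",
    )

    extract_load >> audit >> publish
```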

Goals:

  • Make ELT simple and fast to implement
  • Validate your assumptions about the data before you make it available for use (see the sketch after this list)
  • Let analysts and data scientists make pain-free contributions to the ELT using SQL
  • Generate data documentation and failure logs for quick recovery, and fix outages in your pipeline
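
The second goal is the heart of the audit step: encode your assumptions about the data as expectations, and refuse to publish anything that violates them. Below is a minimal sketch with Great Expectations; the in-memory orders extract and its column names are hypothetical.

```python
# A minimal sketch of validating assumptions about freshly loaded data
# with Great Expectations before publishing it downstream.
import great_expectations as ge
import pandas as pd

# Hypothetical extract; in practice this comes from the extract/load step.
raw = pd.DataFrame({"order_id": [1, 2, 3], "amount": [9.99, 24.50, 5.00]})
orders = ge.from_pandas(raw)

# Encode assumptions about the data as expectations.
orders.expect_column_values_to_not_be_null("order_id")
orders.expect_column_values_to_be_unique("order_id")
orders.expect_column_values_to_be_between("amount", min_value=0)

# Validate; only let the data reach production if every expectation holds.
results = orders.validate()
if not results["success"]:
    raise ValueError("Audit failed: data stays out of production until fixed.")
```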

Target Audience:

  • Approachable to any level of developer
  • Novice data practitioners interested in starting an ELT workflow and learning about the different tools in the ecosystem
  • Intermediate+ developers interested in supercharging their pipelines with the Write-Audit-Publish pattern (sketched below) and reducing pipeline debt
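
For context on that last point: Write-Audit-Publish means new data is written to a staging table, audited there, and only published to production tables once the checks pass. Here is a minimal sketch in plain SQL over SQLite; the table and column names are hypothetical, and a real warehouse would use its own staging and swap mechanics.

```python
# A minimal sketch of the Write-Audit-Publish pattern using plain SQL
# through a DB-API connection. Table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect("warehouse.db")

# Demo setup only: pretend orders_raw was just loaded by the EL step.
conn.execute("CREATE TABLE IF NOT EXISTS orders_raw (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders_raw VALUES (1, 9.99), (2, 24.50)")

# Write: load new rows into a staging table, never directly into prod.
conn.execute("DROP TABLE IF EXISTS orders_staging")
conn.execute("CREATE TABLE orders_staging AS SELECT * FROM orders_raw")

# Audit: check assumptions against staging before anyone can query it.
null_ids = conn.execute(
    "SELECT COUNT(*) FROM orders_staging WHERE order_id IS NULL"
).fetchone()[0]
if null_ids:
    raise ValueError(f"Audit failed: {null_ids} rows with NULL order_id")

# Publish: swap the audited table into place for downstream consumers.
conn.execute("DROP TABLE IF EXISTS orders")
conn.execute("ALTER TABLE orders_staging RENAME TO orders")
conn.commit()
```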