2021-10-21 – Data
Live broadcast: https://www.youtube.com/watch?v=gwLJZVoXWlg
Apache Airflow has become one of the most popular data tools. Due to its high complexity, it can be challenging for teams and companies to use well. For example: how do you effectively construct an orchestration architecture across diverse cloud platforms, accelerate your engineering and machine learning workloads at scale, and smartly decouple your Python codebase for proper testing and easy maintenance?
In this session, we will present five recipes that have helped us scale data jobs and strengthen our position as a data-driven, AI-leading company.
Key takeaways:
- Orchestrate Architecture
- Auto-Build Airflow DAG (see the sketch after this list)
- Data Quality
- Auto-Cost Evaluation
- Auto-Cataloging
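To make the "Auto-Build Airflow DAG" takeaway concrete, here is a minimal sketch of the general pattern: generating DAGs from declarative config instead of hand-writing each one. The config shape, job names, and commands below are hypothetical illustrations, not the specific recipe from the session.

```python
# Minimal sketch (assuming Airflow 2.x): auto-building DAGs from config.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical job specs; in practice these might come from YAML files
# or a metadata store rather than a hard-coded dict.
JOB_SPECS = {
    "sales_ingest": {"schedule": "@daily", "command": "python ingest.py sales"},
    "sales_report": {"schedule": "@daily", "command": "python report.py sales"},
}

def build_dag(dag_id: str, spec: dict) -> DAG:
    """Build one DAG with a single task from a declarative job spec."""
    dag = DAG(
        dag_id=dag_id,
        schedule_interval=spec["schedule"],
        start_date=datetime(2021, 1, 1),
        catchup=False,
    )
    BashOperator(task_id="run", bash_command=spec["command"], dag=dag)
    return dag

# Register one DAG per spec at module level, which is where the
# Airflow scheduler discovers DAG objects.
for job_id, job_spec in JOB_SPECS.items():
    globals()[job_id] = build_dag(job_id, job_spec)
```

With a pattern like this, adding a new scheduled job becomes a config change rather than new boilerplate Python, which is one way to keep a growing Airflow codebase testable and maintainable.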
Qiang is a data engineer leading a team that builds future data platforms across the entire fashion value chain, from design and production to customer experience. Qiang has over eight years of experience creating enterprise analytics products with Python, Spark, Airflow, and cloud services (AWS, GCP, and Databricks).
In the past, Qiang has also shared his understanding of data and AI at global summits (Data+AI Summit Europe 2020, Data Innovation Summit 2021, PyCon SE 2021, etc.). In addition, Qiang is a fashion lover and a part-time fashion designer.
Hey, I am Dahmane Sheikh, a data engineer at heart. I am passionate about data analytics and engineering projects that drive digital transformation in organizations.
I am also the founder of Analytics Minded, a company that helps organizations leverage big data through a fusion of art and engineering to develop robust and scalable solutions.
Grzegorz Skibinski has around ten years of experience in various data-related roles. Across four countries and several businesses, he has amassed broad knowledge of every step of the data product supply chain, from quick ad-hoc solutions to sustainable data infrastructure.
His programming journey started with Java, then moved on to Python and various data-crunching technologies. Grzegorz has worked with and managed SQL and NoSQL databases, structured and unstructured alike. He specializes in PySpark, pandas, and Dask solutions, and has worked with a variety of schedulers, with Airflow most recently becoming his favorite.
On top of that, Grzegorz is a ninja: he has practiced ninjutsu and various other martial arts since he was seven years old.