Daniel Ortiz

Daniel Ortiz is a Senior Software Engineer at Bloomberg, where he extensively uses Python and a broad range of data technologies to build scalable systems for orchestration, analytics, and workflow automation.

He has a background spanning full-stack application development and deep experience in data infrastructure and architecture. He enjoys working across the stack, from back-end systems to user-facing components, and is strongly focused on delivering maintainable, high-impact solutions.

Daniel holds a bachelor’s degree in computer science from the University of Toronto and a master’s degree in applied computing from the University of London.


Session

17/10
16:00
120 minutes
Orchestrating Data Pipelines in Python: From Generation to Quality
Daniel Ortiz, Juan Aragón

Working with data goes far beyond simply generating it. It involves tracking its origin, maintaining its integrity, and selecting the right tools for each stage of your workflow. With the rapid evolution of data tools, staying current can be challenging. Fortunately, Python offers a robust and accessible collection of tools, libraries, and frameworks that can make your life easier.

In this workshop, we’ll introduce Dagster, a Python-based orchestration framework designed specifically to help manage data assets. Dagster provides native support for metadata, lineage, and versioning, and includes a powerful UI that brings clarity and structure to your workflows. We’ll also explore how you can integrate orchestration workflows with other popular Python libraries -- such as pandas, Pandera, and Soda-core -- to create efficient, end-to-end pipelines.

Whether you're experienced in data pipelining or are simply curious about learning more, this session will cover how to:

  • Manage orchestration and asset definitions within a unified repository
  • Use pandas to define and transform data assets
  • Apply Pandera to enforce data contracts and catch schema issues early
  • Integrate automated quality control for ongoing data quality monitoring and management

By the end of our session, you’ll walk away with a practical understanding of how these open-source tools can be used together to help you build more maintainable data pipelines within a Python-native environment.

Data Science and Data Engineering
Workshop 05, E45 A109