06/12/2025 – Main Stream, Language: English
Behind every great data product is a perfectly balanced recipe of tools and methodologies, and Python is the master chef that can bring them all together.
Whether you're a data engineer, analyst, or developer, you’ll walk away with a practical blueprint for building scalable, modern data workflows with just the right ingredients, and no risk of burning the cake.
Today’s data teams are under pressure to deliver insights quickly, accurately, and at scale, without getting lost in complexity. Python makes this possible by bringing ingestion, transformation, orchestration, and visualization together in a single, streamlined workflow.
In this talk, we’ll walk through building a complete sales analytics pipeline for an e-commerce business, showing how Python’s rich ecosystem enables rapid, reliable, and scalable data delivery.
The Pipeline Journey:
Ingestion
Extract raw sales data from multiple sources using Python’s powerful data-access libraries, then validate and clean it for smooth downstream processing.
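A minimal sketch of the ingest-and-validate step, using only the standard library. The column names, the sample rows, and the cleaning rule (drop records with a missing amount) are illustrative assumptions, not part of the talk:

```python
import csv
import io

# Hypothetical raw export; real pipelines would read from an API, a file,
# or a database instead of an inline string.
RAW_CSV = """order_id,product,amount,date
1001,keyboard,49.90,2025-06-01
1002,mouse,,2025-06-02
1003,monitor,199.00,2025-06-03
"""

def ingest(raw: str) -> list[dict]:
    """Read raw sales rows and drop records that fail basic validation."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw)):
        if not row["amount"]:          # reject rows with a missing amount
            continue
        row["amount"] = float(row["amount"])  # normalize types early
        rows.append(row)
    return rows

clean = ingest(RAW_CSV)
print(len(clean))  # 2 rows survive validation
```

Validating and typing the data at the boundary keeps every downstream step simpler.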
Storage
Store your data in File Storage + DuckDB, a high-performance OLAP database that integrates seamlessly with Python, offering enterprise-grade analytics.
Transformation
Combine dbt for modular SQL transformations with Polars for lightning-fast DataFrame operations. Model essential KPIs with accuracy and speed.
Orchestration
Automate with Dagster or Airflow. We’ll explore DAG structure, dependency handling, and robust error recovery.
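The core idea behind both Dagster and Airflow, run each step only after its dependencies have finished, can be illustrated with the standard library alone. The step names below mirror the pipeline in this talk; real orchestrators add scheduling, retries, and error recovery on top of this ordering:

```python
from graphlib import TopologicalSorter

# Each key depends on the steps in its set; this is the DAG structure
# an orchestrator would manage for you.
dag = {
    "ingest": set(),
    "store": {"ingest"},
    "transform": {"store"},
    "dashboard": {"transform"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['ingest', 'store', 'transform', 'dashboard']
```

Declaring dependencies rather than a fixed schedule is what lets the orchestrator parallelize independent steps and restart only the failed branch.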
Visualization
Present insights in an interactive Streamlit dashboard, where users can filter by product, channel, or date and see results instantly.
Key Takeaways:
- A proven blueprint for modern Python-powered data pipelines.
- A production-ready reference architecture you can adapt immediately.
- Practical guidance on choosing the right tools for your team and workload.
Giulia Silvestro is a highly skilled data professional specializing in big data engineering, data processing pipelines, and advanced analytics solutions. Currently a Big Data Engineer at Agile Lab, she focuses on designing and optimizing data architectures that power scalable, high-performance applications.
