PyCon DE & PyData 2025

Scalable Python and SQL Data Engineering without Migraines
2025-04-24, Europium2

This session is for data and ML engineers with a basic understanding of data engineering and Python. It shows how to use Python code in Snowflake Notebooks to create data pipelines easily; by the end, you will know how to build and process such pipelines with Python yourself.


Data loading processes are complex and take effort to organize: different tools are often involved, and seamless processing is not guaranteed. Learn how to create pipelines efficiently and easily with Python in Snowflake Notebooks. Create and monitor tasks to load data continuously. Use third-party data directly to extend the data model without copying it. Harness the power of Python to quickly calculate values and write efficient stored procedures.
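
All Python snippets below assume an active Snowpark session. Inside a Snowflake Notebook one is already available; outside a notebook you can build one yourself. The connection parameters in this sketch are placeholders:

    # Inside a Snowflake Notebook, the session is already created for you:
    from snowflake.snowpark.context import get_active_session
    session = get_active_session()

    # Outside a notebook, build one explicitly (placeholder credentials):
    from snowflake.snowpark import Session
    session = Session.builder.configs({
        "account": "<account_identifier>",
        "user": "<user>",
        "password": "<password>",
        "role": "SYSADMIN",
        "warehouse": "COMPUTE_WH",
        "database": "PIPELINE_DB",
        "schema": "RAW",
    }).create()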

In this session you will see how to (illustrative code sketches of these steps follow the list):
- Load Parquet data to Snowflake using schema inference
- Set up access to Snowflake Marketplace data
- Create a Python UDF to convert temperature
- Create a data engineering pipeline with Python stored procedures to incrementally process data
- Orchestrate the pipelines with tasks
- Monitor the pipelines with Snowsight
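
A minimal sketch of the first two steps, assuming the Parquet files already sit on a stage (here called @raw_stage) and that a weather listing from the Snowflake Marketplace has been added through Snowsight; all object names are illustrative:

    # Read Parquet files from a stage; Snowpark infers column names and types
    # from the file metadata, so no explicit schema definition is needed.
    raw_df = session.read.parquet("@raw_stage/weather/")
    print(raw_df.schema)

    # Create the target table if it does not exist and load the files via COPY INTO.
    raw_df.copy_into_table("RAW_WEATHER")

    # Marketplace data appears as a read-only shared database once the listing
    # is added in Snowsight: query it in place, no copying required.
    weather = session.table("WEATHER_SHARE.PUBLIC.HISTORY_DAY")
    weather.limit(5).show()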

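The temperature UDF is a plain Python function registered with Snowpark; a sketch, assuming a stage @udf_stage exists for the permanent function (the name F_TO_C is reused in the pipeline sketch further down):

    from snowflake.snowpark.types import FloatType

    def fahrenheit_to_celsius(temp_f: float) -> float:
        return (temp_f - 32) * 5.0 / 9.0

    # Register as a permanent UDF so it can be called from SQL and from later
    # pipeline steps, not just from this notebook session.
    session.udf.register(
        func=fahrenheit_to_celsius,
        name="F_TO_C",
        return_type=FloatType(),
        input_types=[FloatType()],
        is_permanent=True,
        stage_location="@udf_stage",
        replace=True,
    )

    # Usable from Snowpark DataFrames and from plain SQL alike:
    session.sql("SELECT F_TO_C(82.4) AS temp_c").show()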

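The incremental pipeline step, its orchestration, and a monitoring query might look as follows. This is a sketch under assumptions: a stream RAW_WEATHER_STREAM captures new rows on the raw table, the target table and warehouse names are illustrative, and the task schedule is arbitrary:

    from snowflake.snowpark import Session

    def process_weather_increment(session: Session) -> str:
        # Incremental upsert: the stream only exposes rows added since the last
        # run, and consuming it in a DML statement advances its offset.
        session.sql("""
            MERGE INTO DAILY_WEATHER t
            USING RAW_WEATHER_STREAM s
              ON t.STATION_ID = s.STATION_ID AND t.OBS_DATE = s.OBS_DATE
            WHEN MATCHED THEN UPDATE SET t.TEMP_C = F_TO_C(s.TEMP_F)
            WHEN NOT MATCHED THEN INSERT (STATION_ID, OBS_DATE, TEMP_C)
              VALUES (s.STATION_ID, s.OBS_DATE, F_TO_C(s.TEMP_F))
        """).collect()
        return "increment processed"

    # Register as a permanent stored procedure so a task can call it.
    session.sproc.register(
        func=process_weather_increment,
        name="PROCESS_WEATHER_INCREMENT",
        is_permanent=True,
        stage_location="@sproc_stage",
        packages=["snowflake-snowpark-python"],
        replace=True,
    )

    # Orchestrate: a task calls the procedure every hour; tasks start suspended
    # and must be resumed once.
    session.sql("""
        CREATE OR REPLACE TASK PROCESS_WEATHER_TASK
          WAREHOUSE = COMPUTE_WH
          SCHEDULE = '60 MINUTE'
        AS CALL PROCESS_WEATHER_INCREMENT()
    """).collect()
    session.sql("ALTER TASK PROCESS_WEATHER_TASK RESUME").collect()

    # Monitoring: besides the task graph views in Snowsight, run history can be
    # queried directly.
    session.sql("""
        SELECT name, state, scheduled_time, completed_time
        FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
        ORDER BY scheduled_time DESC
    """).show()
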
Expected audience expertise (Domain): Intermediate

Expected audience expertise (Python): Intermediate

Dirk Jung has more than 20 years of experience in the IT industry. In his position as Senior Solution Engineer at Snowflake Computing, he supports companies in building modern data and analysis platforms in the cloud. In his professional career, he has held various positions at SAS Institute, Blue Yonder and Datameer, among others. He specializes in business intelligence, predictive analytics and data warehousing.