Large Scale Feature Engineering and Datascience with Python & Snowflake
04-17, 11:40–12:25 (Europe/Berlin), B07-B08

Snowflake as a data platform is the core data repository of many large organizations.
With the introduction of Snowflake's Snowpark for Python, Python developers can now collaborate and build on that same platform inside a secure Python sandbox, which gives them dynamic scalability and elasticity as well as security and compliance.

In this talk I'll explain the core concepts of Snowpark for Python and how they can be used for large scale feature engineering and data science.


This talk is for technical people who would like a deep dive into how Snowflake enables large scale feature engineering and data science via Snowpark for Python.
During this talk we'll explore Snowflake's Python capabilities using a simple machine learning use case.

After this talk you will:

  • know how Snowpark avoids data movement and keeps existing security & governance intact,
  • understand the concept of the Snowpark DataFrame API and how it enables accelerated performance compared to standard pandas DataFrames,
  • know how to distribute hyperparameter tuning and the training of multiple models,
  • understand the concept of vectorized user-defined functions (UDFs) and how they can be used to perform large scale model inference.
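
To make the first two points more concrete, here is a minimal sketch of the idea behind a lazily evaluated DataFrame: operations are only recorded, and the whole pipeline is turned into one SQL statement that can run where the data lives. The `LazyFrame` class, the `ORDERS` table and its column names are hypothetical and exist only for this illustration; this is plain Python, not the Snowpark API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not Snowpark)."""

    table: str
    columns: tuple = ("*",)
    predicates: tuple = ()

    def filter(self, predicate: str) -> "LazyFrame":
        # Only record the condition; no data is read or moved here.
        return LazyFrame(self.table, self.columns, self.predicates + (predicate,))

    def select(self, *columns: str) -> "LazyFrame":
        # Only record the projection; the existing filters are kept.
        return LazyFrame(self.table, tuple(columns), self.predicates)

    def to_sql(self) -> str:
        # Translate the recorded steps into a single SQL statement.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql


# Hypothetical table and column names, used only for illustration.
orders = LazyFrame("ORDERS")
features = orders.filter("AMOUNT > 100").select("CUSTOMER_ID", "AMOUNT")
print(features.to_sql())
```

Nothing is fetched while the pipeline is being built. Only the final statement would be sent to the database, which is why the data can stay in place and remain under the platform's existing access controls.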

Expected audience expertise: Domain

Intermediate

Expected audience expertise: Python

Intermediate

Public link to supporting material

https://github.com/michaelgorkow/snowpark_pycon2023

Abstract as a tweet

Learn how Snowpark for Python enables large scale feature engineering and data science!

Michael is Field CTO for Data Science at Snowflake, where he helps organisations implement state-of-the-art machine learning solutions. As a data science professional, he is passionate about sharing with others how to go beyond standard use cases and implement machine learning techniques for big data.
He is based out of Munich, Germany.