Large Scale Feature Engineering and Datascience with Python & Snowflake
2023-04-17, B07-B08

Snowflake as a data platform is the core data repository of many large organizations.
With the introduction of Snowflake's Snowpark for Python, Python developers can now collaborate and build on a single platform inside a secure Python sandbox that offers dynamic scalability and elasticity as well as security and compliance.

In this talk, I'll explain the core concepts of Snowpark for Python and how they can be used for large-scale feature engineering and data science.


This talk is for technical people who would like a deep dive into how Snowflake enables large-scale feature engineering and data science via Snowpark for Python.
During this talk we'll explore Snowflake's Python capabilities using a simple machine learning use case.

After this talk you will:

  • know how Snowpark avoids data movement and keeps existing security & governance intact,
  • understand the concept of the Snowpark DataFrame API and how it enables accelerated performance compared to standard pandas DataFrames,
  • know how to distribute hyperparameter tuning and the training of multiple models,
  • understand the concept of vectorized user-defined functions (UDFs) and how they can be used to perform large-scale model inference.
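
To make the DataFrame point above more concrete, here is a minimal sketch of the general idea behind a lazily evaluated DataFrame: operations are only recorded and later compiled into a single SQL statement, so the work can run where the data lives instead of pulling rows into local memory. This is a toy model in plain Python, not the Snowpark API; the `LazyFrame` class, the `TRANSACTIONS` table and its column names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (NOT the Snowpark API)."""

    table: str
    columns: tuple = ("*",)
    predicates: tuple = ()

    def select(self, *cols: str) -> "LazyFrame":
        # Only record the projection; no data is read or moved here.
        return LazyFrame(self.table, cols, self.predicates)

    def filter(self, condition: str) -> "LazyFrame":
        # Only record the predicate; evaluation is deferred.
        return LazyFrame(self.table, self.columns, self.predicates + (condition,))

    def to_sql(self) -> str:
        # Compile all recorded operations into one SQL statement that a
        # warehouse could execute next to the data.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql


# Hypothetical table and column names, for illustration only.
features = (
    LazyFrame("TRANSACTIONS")
    .filter("AMOUNT > 100")
    .select("CUSTOMER_ID", "AMOUNT")
)
query = features.to_sql()
print(query)
```

Nothing is computed while the chain of `filter` and `select` calls is built; only the final compiled statement would be sent to the warehouse. A real DataFrame API of this kind tracks far more (types, joins, aggregations), so this sketch only conveys the deferred-execution principle.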

Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Intermediate

Public link to supporting material:

https://github.com/michaelgorkow/snowpark_pycon2023

Abstract as a tweet:

Learn how Snowpark for Python enables large scale feature engineering and data science!

Michael is Field CTO for Datascience at Snowflake, where he helps organisations implement state-of-the-art machine learning solutions. As a data science professional, he is passionate about sharing with others how to go beyond standard use cases and implement machine learning techniques for big data.
He is based in Munich, Germany.