2024-04-23, A03-A04
A tutorial session on how to build scientific packages for numerical calculus and algorithms in Python and Rust. It walks through the process of packaging with a modern tool stack, introduces the concept of vectorization for efficient computation in Python in the context of classical Machine Learning, and shows how the package can be optimized with extensions written in Rust.
The Rust programming language has gained a lot of attention in recent years and has slowly begun to infiltrate the Python ecosystem, where a growing number of tools and libraries, such as Ruff and Polars, are implemented in it. Unlike Python, Rust is a systems language optimized for performance and memory safety, and some consider it the spiritual successor of C++. Despite its steep learning curve, it is an excellent candidate for extending Python and its ecosystem when performance matters, since it is both modern and memory-safe.
This session demonstrates the path of creating a scientific package in Python (following best practices and using modern tools) and gradually migrating parts of it to Rust for additional performance gains. The use case is a naive, from-scratch implementation of the "Expectation maximization for Gaussian Mixture Models" algorithm, a relatively simple yet efficient machine learning method. The session addresses the following points: how to build a Python package with a modern tool set, how to translate a numerical algorithm into vectorized Python, and how to optimize the package with a performant Rust implementation of the critical parts. Prior knowledge of Rust or the algorithm is not required. Note that the goal is not to learn Rust in this single session (that requires at least three days) but rather to provide a high-level overview of what makes this language so great and well suited for extending Python.
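To give a flavor of what "vectorized Python" means for this use case, here is a minimal sketch of one expectation-maximization iteration for a Gaussian mixture model written with NumPy. It is not taken from the session repository: the function names `e_step` and `m_step`, the use of full covariance matrices, and the small regularization constant are assumptions made for illustration only.

```python
import numpy as np


def e_step(data, means, covs, weights):
    """Responsibilities of each component for each sample, shape (n, k)."""
    n_dim = data.shape[1]
    # Differences between every sample and every component mean: (n, k, d)
    diff = data[:, None, :] - means[None, :, :]
    # Squared Mahalanobis distances for all sample/component pairs: (n, k)
    precisions = np.linalg.inv(covs)  # stacked inverse, shape (k, d, d)
    maha = np.einsum("nkd,kde,nke->nk", diff, precisions, diff)
    # Log of the Gaussian normalization constant per component: (k,)
    _, logdet = np.linalg.slogdet(covs)
    log_norm = -0.5 * (n_dim * np.log(2.0 * np.pi) + logdet)
    # Weighted log densities; subtract the row maximum before exponentiating
    # so the normalization stays numerically stable
    log_prob = np.log(weights) + log_norm - 0.5 * maha
    log_prob -= log_prob.max(axis=1, keepdims=True)
    resp = np.exp(log_prob)
    return resp / resp.sum(axis=1, keepdims=True)


def m_step(data, resp, reg=1e-6):
    """Re-estimate means, covariances and weights from responsibilities."""
    n_k = resp.sum(axis=0)  # effective number of samples per component: (k,)
    weights = n_k / data.shape[0]
    means = (resp.T @ data) / n_k[:, None]  # (k, d)
    diff = data[:, None, :] - means[None, :, :]  # (n, k, d)
    # Responsibility-weighted outer products, summed over samples: (k, d, d)
    covs = np.einsum("nk,nkd,nke->kde", resp, diff, diff) / n_k[:, None, None]
    # Assumed regularization to keep the covariances invertible
    covs += reg * np.eye(data.shape[1])
    return means, covs, weights


# Usage on synthetic data: two well-separated clusters in two dimensions
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(-5.0, 1.0, size=(200, 2)),
    rng.normal(5.0, 1.0, size=(200, 2)),
])
means = data[[0, -1]].copy()  # one starting point taken from each cluster
covs = np.stack([np.eye(2), np.eye(2)])
weights = np.array([0.5, 0.5])
for _ in range(20):
    resp = e_step(data, means, covs, weights)
    means, covs, weights = m_step(data, resp)
```

Neither step contains an explicit Python loop over samples or components. Broadcasting and `np.einsum` push that work into compiled code, which is the kind of baseline one would typically measure before moving the critical parts to Rust.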
Participants are advised to clone the repository below and follow the installation instructions beforehand to avoid long download times during the session.
https://github.com/StefanUlbrich/PyCon2024
Intermediate
Expected audience expertise: Python: Intermediate
Abstract as a tweet (X) or toot (Mastodon): A tutorial session on how to build scientific packages for numerical calculus and algorithms in Python and Rust.
I am a researcher and programmer with a passion for geometrical methods, especially folding and unfolding, in machine learning and robotics. My interests are computational cognition, robotics, bioinformatics, machine learning and data science. I am an experienced Pythonista and Rustacean.