EuroSciPy 2024

Combining Python and Rust to create Polars Plugins
2024-08-27 , Room 5

Polars is a dataframe library taking the world by storm. It is fast and memory-efficient, and comes with a clean and expressive API. Sometimes, however, the built-in API isn't enough. And that's where its killer feature comes in: plugins. You can extend Polars, and solve practically any problem.

No prior Rust experience is required, but intermediate Python or general programming experience is. By the end of the session, you will know how to write your own Polars Plugin! This talk is aimed at data practitioners.


Have you ever had the experience of needing to write a really custom function? Did you end up using a custom Python lambda function and waiting endlessly whilst your code executed?

Learn how to put an end to that!

This tutorial is aimed at advanced dataframe users who want to go beyond what Polars offers them. The structure will be:
- 5 minutes motivation: example of a custom function which is painfully slow
- 30 minutes: the bare minimum Rust you need to know in order to write a Polars plugin
- 20 minutes: let's get something running! Starting from a cookiecutter template, let's glue pieces together and get a simple "pig-latinnifier" running
- 25 minutes: customising the basic "pig-latinnifier" to implement that same custom function as a plugin
- 5 minutes: let's glue things together, run the plugin, and observe how much faster it is!
- 5 minutes: assorted requests / Q&A

This may look ambitious. However, I have taught Polars Plugins professionally and have given talks about the topic before, so I'm confident that it is doable.
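For a sense of the starting point, the kind of slow, element-by-element custom function mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the pig-latin rule is a simplified one (move the first character to the end and append "ay"), and `apply_elementwise` is a hypothetical stand-in for a per-row Python callback such as the one Polars runs through `map_elements`. It is not the tutorial's actual code.

```python
from typing import Callable, List, Optional


def pig_latinnify(word: str) -> str:
    """Simplified pig latin (assumed rule): move the first character to the end, append 'ay'."""
    if not word:
        return word
    return word[1:] + word[0] + "ay"


def apply_elementwise(
    values: List[Optional[str]], func: Callable[[str], str]
) -> List[Optional[str]]:
    # Hypothetical stand-in for a per-row Python callback over a string column.
    # None represents a missing value and is passed through untouched.
    return [None if v is None else func(v) for v in values]


words = ["hello", "polars", None, ""]
result = apply_elementwise(words, pig_latinnify)
print(result)
```

Each value here makes a round trip through the Python interpreter, which is the cost a compiled plugin is meant to avoid.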

By the end of the session, attendees will know how to write their own Polars Plugin. This talk is aimed at data practitioners who have experience with Python and data analysis (however, no prior Rust experience is required!).

If you want to follow the tutorial on your own laptop, then you will need to come prepared with the following installed:
- Rust (see https://rustup.rs/)
- an IDE, ideally with rust-analyzer installed
- a Python 3.9+ virtual environment, in which you should install Polars and Maturin

If you can follow the instructions at https://github.com/MarcoGorelli/cookiecutter-polars-plugins, you'll be off to a flying start!


Abstract as a tweet:

Rust? Python? Combine them and create a Polars Plugin!

Category [Data Science and Visualization]:

Data Analysis and Data Engineering

Expected audience expertise: Domain:

some

Expected audience expertise: Python:

some

Public link to supporting material:

https://marcogorelli.github.io/polars-plugins-tutorial/

Project Homepage / Git:

https://github.com/MarcoGorelli/polars-plugins-tutorial

Marco is a core dev of pandas and Polars and works at Quansight Labs as a Senior Software Engineer. He wrote the first Polars Plugins Tutorial, and he consults for and trains clients professionally on Polars, including Polars Plugins.

He has a background in Mathematics and holds an MSc from the University of Oxford, and was one of the prize winners in the M6 Forecasting Competition (2nd place overall Q1).
