EuroSciPy 2024

Introduction to Polars: Fast and Readable Data Analysis
2024-08-27, Room 6

Polars is a new, powerful library for analyzing structured data. The library focuses on processing speed and a consistent and intuitive API. This tutorial will help you get started with Polars by showing you how to read and write data and how to manipulate it with Polars' expression syntax. You'll also learn why the lazy API is key to Polars' efficiency.


Polars is a new, lightning-fast library for analyzing structured data. The library focuses on processing speed and a consistent and intuitive API. Its syntax supports transformations like selection, filtering, and aggregation through dedicated, composable expressions. Polars supports lazy evaluation out of the box, backed by an advanced query planner.

In this tutorial, you'll learn how you can manipulate your data with Polars. You'll start by reading existing data into a Polars DataFrame and learn how to use tidy principles to organize your analysis workflow. After learning the basics of Polars, you'll start exploring Polars' lazy interface, which is where the library really shines.

With the lazy API, queries are only executed when the results are needed. This can improve performance significantly, because Polars can optimize the whole query before running it, for example by pushing filters and column selections down toward the data source. Throughout the tutorial, you'll gain experience working lazily. You'll learn how to inspect the optimized query plan and how to play to the library's strengths.

This tutorial is for anyone curious about Polars. You don't need previous experience with other libraries like pandas, but if you have used pandas before, you'll learn how Polars differs and how the two libraries can work together.


Abstract as a tweet:

Get started with Polars, the new and lightning-fast library for manipulating structured data.

Category [Data Science and Visualization]:

Data Analysis and Data Engineering

Expected audience expertise: Domain:

some

Expected audience expertise: Python:

some

Public link to supporting material:

https://github.com/gahjelle/polars_introduction

Project Homepage / Git:

https://github.com/gahjelle/polars_introduction

Geir Arne teaches Python at Real Python. He has a background in mathematics and has worked with data analysis in different fields, such as electricity markets, satellite geodesy, and computer vision. In his spare time, Geir Arne enjoys hammock camping, square roots, and aimless forest wandering.