Pandas 2.0 and beyond
2023-08-16 , Aula

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas project is working on


The pandas 2.0 release is targeted for the first quarter of 2023. This is a major milestone for the pandas project, and this talk will start with an overview of this release. Pandas 2.0 includes some new (experimental) features, but mostly means enforcing deprecations that have been accumulated in the 1.x series, along with some necessary breaking changes.

But that doesn’t mean there are no interesting features to talk about! The main part of the presentation will showcase some new features, both already released as opt-in features or to come in future releases.
Support for non-nanosecond resolution datetimes, allowing time spans ranging over a billion of years. Improved support for nullable data types, including easy opt-in options for I/O functions. Experimental integration with pyarrow to back columns of a DataFrame (beyond the string dtype).
A major change that is under way is a change to the copy and view semantics of operations in pandas (related to the well-known (or hated) SettingWithCopyWarning). This is already available as an experimental opt-in to test and use the new behaviour, and will probably be a highlight of pandas 3.0.


Abstract as a tweet:

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas

Category [Data Science and Visualization]:

Data Analysis and Data Engineering

Expected audience expertise: Domain:

some

Expected audience expertise: Python:

none

Project Homepage / Git:

https://github.com/pandas-dev/pandas/

I am a core contributor to Pandas and Apache Arrow, and maintainer of GeoPandas. I did a PhD at Ghent University and VITO in air quality research and worked at the Paris-Saclay Center for Data Science. Currently, I work at Voltron Data, contributing to Apache Arrow, and am a freelance teacher of python (pandas).

This speaker also appears in:

I am a core contributor to pandas. I earned a PhD in Mathematics from Michigan State University studying Arithmetic Geometry and am now a Director of Data Science at 84.51° specializing in large scale optimization problems.