Pandas 2.0 and beyond
2023-04-17, Kuppelsaal

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas project is working on.


The pandas 2.0 release is targeted for the first quarter of 2023. This is a major milestone for the pandas project, and this talk will start with an overview of this release. Pandas 2.0 includes some new (experimental) features, but mostly means enforcing deprecations that have been accumulated in the 1.x series, along with some necessary breaking changes.

But that doesn’t mean there are no interesting features to talk about! The main part of the presentation will showcase some new features, some already released as opt-in features and others to come in future releases.
These include support for non-nanosecond resolution datetimes, which allows time spans of billions of years; improved support for nullable data types, including easy opt-in options for I/O functions; and experimental integration with pyarrow to back the columns of a DataFrame (beyond the string dtype).
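
As a rough illustration of these three features, here is a minimal sketch assuming pandas 2.0 or later. The sample data and variable names are made up for this example, and the pyarrow part only runs if pyarrow happens to be installed.

```python
import io

import numpy as np
import pandas as pd

# 1. Non-nanosecond datetimes: second resolution is preserved, so dates far
#    outside the range of datetime64[ns] can be stored.
dates = pd.Series(np.array(["1000-01-01", "3000-01-01"], dtype="datetime64[s]"))
print(dates.dtype)

# 2. Nullable dtypes in I/O: opt in through the dtype_backend keyword.
#    The CSV content here is hypothetical sample data.
csv_data = io.StringIO("a,b\n1,x\n,y\n")
df_nullable = pd.read_csv(csv_data, dtype_backend="numpy_nullable")
print(df_nullable.dtypes)  # column "a" keeps integers alongside a missing value

# 3. pyarrow-backed columns (experimental); skipped if pyarrow is missing.
try:
    import pyarrow  # noqa: F401
    arrow_ints = pd.Series([1, 2, None], dtype="int64[pyarrow]")
    print(arrow_ints.dtype)
except ImportError:
    arrow_ints = None
```

With the default backend, the missing value in column `a` would force a float column; with `numpy_nullable` it stays an integer column that holds `pd.NA`.
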
A major change under way concerns the copy and view semantics of operations in pandas, which is related to the well-known (or hated) SettingWithCopyWarning. The new behaviour is already available as an experimental opt-in for testing, and will probably be a highlight of pandas 3.0.
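
The opt-in copy and view behaviour can be tried out with a short sketch like the following. It assumes a pandas 2.x installation where the `mode.copy_on_write` option is available; the DataFrame contents are invented for illustration.

```python
import pandas as pd

# Opt in to the experimental Copy-on-Write behaviour (assumes pandas 2.x).
pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Selecting a column gives an object that behaves like a copy:
# modifying it never changes the parent DataFrame.
subset = df["a"]
subset.iloc[0] = 100

print(df)      # the original DataFrame is left untouched
print(subset)  # only the derived Series carries the new value
```

Under these semantics, any object derived from another behaves as a copy, so there is no ambiguity left for a SettingWithCopyWarning to flag.
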


Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Intermediate

Abstract as a tweet:

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas project is working on.

See also: Slides

I am a core contributor to pandas and Apache Arrow, and a maintainer of GeoPandas. I did a PhD at Ghent University and VITO in air quality research and worked at the Paris-Saclay Center for Data Science. Currently, I work at Voltron Data, contributing to Apache Arrow, and am a freelance teacher of Python (pandas) at Ghent University.


I have been a member of the pandas core team since early 2021 and a regular contributor to pandas since early 2020. I currently work at Coiled as a Senior Software Engineer. I hold a Master's degree in Mathematics and am currently studying towards a Software Engineering degree.
