EuroSciPy 2024

NumPy's new DType API and 2.0 transition
2024-08-29 , Room 5

NumPy 2 made significant changes to its API, which required many downstream libraries and users to adapt.
One of the larger new features is that the new DType API is now public. This C API allows more powerful user-defined DTypes, of which the new StringDType is an example. In the first part, I will give a brief overview of this API.

Since many downstream projects needed to adapt and publish new versions, in the second part I will recap the past and current difficulties of transitioning to NumPy 2. This part of the session will be a forum for open discussion to gauge the challenges users have faced in making this transition.


One of the new features of NumPy 2.0 is a new variable-length StringDType. This DType was written using the new DType C API, which was previously available only experimentally.
In this talk I will introduce the new concepts for creating user DTypes.
What is the C API to construct such a new DType and what are the most important methods that need to be implemented?
The StringDType and similar experimental DTypes in https://github.com/numpy/numpy-user-dtypes will serve as examples for this.

In the second part, I will recap how long maintainers needed to release downstream libraries compatible with NumPy 2.
Unfortunately, availability of downstream libraries may not be a good indicator of end-user difficulties, which are harder to predict.
Undoubtedly, this transition is still ongoing in some parts of the ecosystem and this session is a good opportunity to discuss it.

How hard was it to adapt to the Python API changes, the promotion changes due to NEP 50, and the requirement to recompile with NumPy 2?
Discussing these will help us make future decisions about similar breaking changes.
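
To make the NEP 50 promotion change concrete, here is a minimal sketch of the kind of behaviour difference users may run into. The comments describe NumPy 2 behaviour; the variable names are only illustrative.

```python
import numpy as np

# Under NEP 50 (NumPy 2), Python scalars are "weak": the NumPy operand
# determines the result dtype, and the scalar's value no longer matters.
f32_result = np.float32(3) + 3.0      # float32 in NumPy 2, float64 in NumPy 1.x

u8_array = np.array([1, 2], dtype=np.uint8)
u8_result = u8_array + 2              # stays uint8

# A Python integer that does not fit the array's dtype is an error in
# NumPy 2, where NumPy 1.x silently upcast the result instead.
try:
    u8_array + 300
    overflow_raised = False
except OverflowError:
    overflow_raised = True
```

Code that relied on the old value-based upcasting is one example of where adapting to NumPy 2 may need an explicit cast.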


Abstract as a tweet:

NumPy 2: A brief introduction to the new DType API and an open discussion about the NumPy 2 transition challenges.

Category [Scientific Applications]:

Other

Expected audience expertise: Domain:

none

Expected audience expertise: Python:

expert

Sebastian has been a NumPy developer for about 10 years now. After a PhD in physics, he worked as a postdoc at the Berkeley Institute for Data Science on NumPy, funded by grants from the Alfred P. Sloan Foundation and the Gordon and Betty Moore Foundation. Since 2022 he has been a software engineer at NVIDIA, where he continues to contribute to NumPy.
