Joris Van den Bossche
I am a core contributor to pandas and Apache Arrow, and a maintainer of GeoPandas. I completed a PhD in air quality research at Ghent University and VITO, and then worked at the Paris-Saclay Center for Data Science. Currently, I work at Voltron Data, contributing to Apache Arrow, and I am a freelance teacher of Python (pandas) at Ghent University.
Session
04-24
13:00
90min
A deep dive into the Arrow Columnar format with pyarrow and nanoarrow
Joris Van den Bossche, Raúl Cumplido, Alenka Frim
Apache Arrow has become a de facto standard for efficient in-memory columnar data representation. You might have heard about Arrow or may already be using it, but do you understand the format and why it's so useful? This tutorial will dive deep into the details of the Arrow columnar format, the different types and buffer layouts, and explore those details interactively using the pyarrow and nanoarrow libraries.
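To give a flavor of what "buffer layouts" means, here is a minimal sketch of how a nullable 32-bit integer column is laid out in the Arrow columnar format: a validity bitmap (one bit per value, least-significant bit first, 1 meaning "valid") plus a contiguous buffer of little-endian values. This is a hand-rolled illustration in plain Python, not the pyarrow or nanoarrow API and not the tutorial's own material. The helper names `build_int32_layout` and `read_int32_layout` are hypothetical, and the sketch omits the buffer padding and alignment that real implementations apply.

```python
import struct


def build_int32_layout(values):
    """Build (validity_bitmap, data_buffer) for a list of ints and None.

    Hypothetical helper for illustration only; real Arrow buffers are
    also padded and aligned, which this sketch skips.
    """
    # One bit per slot, rounded up to whole bytes.
    bitmap = bytearray((len(values) + 7) // 8)
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            # Arrow uses least-significant-bit numbering: slot i lives in
            # byte i // 8, bit i % 8. A set bit marks a valid (non-null) slot.
            bitmap[i // 8] |= 1 << (i % 8)
        # Null slots still occupy 4 bytes; their content is unspecified in
        # the format, so this sketch simply writes 0 there.
        data += struct.pack("<i", 0 if value is None else value)
    return bytes(bitmap), bytes(data)


def read_int32_layout(bitmap, data, length):
    """Decode the two buffers back into a Python list."""
    result = []
    for i in range(length):
        is_valid = (bitmap[i // 8] >> (i % 8)) & 1
        if is_valid:
            result.append(struct.unpack_from("<i", data, 4 * i)[0])
        else:
            result.append(None)
    return result


bitmap, data = build_int32_layout([1, None, 3])
print(bitmap)      # validity bitmap bytes
print(len(data))   # 3 slots * 4 bytes each
print(read_int32_layout(bitmap, data, 3))
```

In pyarrow, the buffers backing an array can be inspected with `Array.buffers()`, which is the kind of interactive exploration the abstract describes.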
PyData: PyData & Scientific Libraries Stack
A03-A04