PyConDE & PyData Berlin 2024

Alenka Frim

My software development journey started with open source and the Apache Arrow project. More specifically, I started by contributing to the Arrow R package in 2021. After that, I contributed to other open source projects connected to the Python dataframe API standard while at Quansight, and became an Apache Arrow committer in 2022 after being a regular contributor to Apache Arrow (Python) since 2021. I am currently working at Voltron Data as a Software Engineer.


Session

04-24
13:00
90min
A deep dive into the Arrow Columnar format with pyarrow and nanoarrow
Joris Van den Bossche, Raúl Cumplido, Alenka Frim

Apache Arrow has become a de facto standard for efficient in-memory columnar data representation. You might have heard about Arrow or already be using it, but do you understand the format and why it’s so useful? This tutorial will dive deep into the details of the Arrow columnar format, covering the different types and buffer layouts, and explore those details interactively using the pyarrow and nanoarrow libraries.

PyData: PyData & Scientific Libraries Stack
A03-A04