Apache Arrow: connecting and accelerating dataframe libraries across the PyData ecosystem
PyCon DE & PyData Berlin 2023


2023-04-19 14:00–14:30, B05-B06

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing, and is becoming the de facto standard for tabular data. This talk gives an overview of recent developments both in Apache Arrow itself and in its adoption across the PyData ecosystem (and beyond), and shows how it can improve your day-to-day data analytics workflows.

The Apache Arrow (https://arrow.apache.org/) project specifies a standardized, language-independent columnar memory format for tabular data. Because every implementation agrees on the same in-memory layout, it enables shared computational libraries, zero-copy shared memory, and efficient (inter-process) communication without serialization overhead. Today Apache Arrow is supported by many programming languages and projects, and is becoming the de facto standard for tabular data.

But what does that mean in practice? The Python bindings, PyArrow, offer a growing set of tools, and a growing number of projects use (Py)Arrow to accelerate both data interchange and actual data processing. This talk gives an overview of recent developments both in Apache Arrow itself and in its adoption across the PyData ecosystem (and beyond), and shows how it can improve your day-to-day data analytics workflows.

Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Intermediate

Abstract as a tweet:

Connecting and accelerating dataframe libraries across the PyData ecosystem with Apache Arrow. Learn about the recent developments in Arrow and its adoption, and how it can improve your day-to-day data analytics workflows.

Public link to supporting material:

https://arrow.apache.org/
