PyCon DE & PyData 2025

Einat Orr

Dr. Einat Orr has 20+ years of experience building R&D organizations and leading the technology vision at multiple companies, most recently Similarweb, which went public on the NYSE last May. She currently serves as co-founder and CEO of Treeverse, the company behind lakeFS, an open source platform that delivers a Git-like experience to object-storage-based data lakes. She received her PhD in Mathematics from Tel Aviv University, in the field of optimization in graph theory.


LinkedIn

linkedin.com/in/einatorr

X / Twitter

https://x.com/einatorr


Session

04-24
15:00
45min
Distributed file-systems made easy with Python's fsspec
Einat Orr

The cloud native revolution has impacted all aspects of engineering, and data engineering is not exempt. One of the ongoing challenges in data engineering is working with storage, both local and distributed cloud native. In this talk we’ll explore working with distributed file systems in Python through an intro to fsspec: a popular Python library that is well-positioned to address the growing challenge of interacting with storage systems of different kinds in a consistent way.
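As a rough illustration of what "a consistent way" can look like, here is a minimal sketch (not taken from the talk) that runs the same helper against two fsspec backends: the built-in in-memory filesystem and the local filesystem. The helper name `write_and_read` and the file names are hypothetical; a cloud backend such as S3 would need its own fsspec implementation package installed and is not shown here.

```python
import os
import tempfile

import fsspec


def write_and_read(fs, path, payload):
    """Write bytes to `path` on any fsspec filesystem, then read them back."""
    with fs.open(path, "wb") as f:
        f.write(payload)
    with fs.open(path, "rb") as f:
        return f.read()


# Two different backends, obtained by protocol name.
mem_fs = fsspec.filesystem("memory")   # in-process, nothing touches disk
local_fs = fsspec.filesystem("file")   # the local filesystem

# Hypothetical paths used only for this sketch.
mem_path = "/fsspec_demo_hello.txt"
local_path = os.path.join(tempfile.mkdtemp(), "hello.txt")

# The same helper works unchanged against both backends.
mem_result = write_and_read(mem_fs, mem_path, b"hello fsspec")
local_result = write_and_read(local_fs, local_path, b"hello fsspec")
```

The calling code only depends on the shared filesystem interface (`open`, `exists`, `ls`, and so on), so swapping the storage backend is mostly a matter of changing the protocol passed to `fsspec.filesystem`.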

We’ll show hands-on examples of using fsspec with some of the most popular data tools in the Python community: pandas, TensorFlow and PyArrow. We’ll also demonstrate a real-world implementation of fsspec and how it provides easy extensibility through open source tooling.
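To give a flavor of that integration, here is a small sketch (an assumption about the kind of example involved, not the speaker's own code) in which pandas reads and writes through fsspec simply by being handed a URL with a protocol prefix. It uses fsspec's in-memory filesystem so it runs without cloud credentials; the DataFrame contents and the file name are made up for illustration.

```python
import pandas as pd

# Made-up sample data for the sketch.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# With fsspec installed, pandas routes URLs with a protocol prefix through it.
# "memory://" targets fsspec's in-process filesystem.
url = "memory://fsspec_demo_table.csv"

df.to_csv(url, index=False)
round_trip = pd.read_csv(url)
```

Pointing the same two calls at a different backend would, under the same assumption that the matching fsspec implementation is installed, only require a different URL plus any credentials passed via `storage_options`.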

You’ll come away from this session with a better understanding of how to implement and extend fsspec to work with different cloud native storage systems.

PyData: Data Handling & Engineering
Titanium3