Einat Orr
Dr. Einat Orr has 20+ years of experience building R&D organizations and leading the technology vision at multiple companies, the latest being Similarweb, that IPO in NYSE last May. Currently she serves as Co-founder and CEO of Treeverse, the company behind lakeFS, an open source platform that delivers a git-like experience to object-storage based data lakes. She received her PhD. in Mathematics from Tel Aviv University, in the field of optimization in graph theory.
Session
The cloud native revolution has impacted all aspects of engineering, and data engineering is not exempt. One of the ongoing challenges in the data engineering world remains the local and distributed cloud native storage. In this talk we’ll explore working with distributed file systems in Python, through an intro to fsspec: a popular python library that is well-positioned to address the growing challenge of interacting with storage systems of different kinds in a consistent way.
In this talk we’ll show hands-on examples of working with fsspec with some of the most popular data tools in the Python community: Pandas, Tensorflow and PyArrow. We’ll demonstrate a real world implementation of fsspec and how it provides easy extensibility through open source tooling.
You’ll come away from this session with a better understanding for how to implement and extend fsspec to work with different cloud native storage systems.