EuroSciPy 2025

Francesc Alted

I am a curious person who studied Physics and Applied Maths. I spent over a year at CERN for my MSc in High Energy Physics. However, I found maths and computer science equally fascinating, so I left academia to pursue these fields. Over the years, I developed a passion for handling large datasets and using compression to enable their analysis on commodity hardware accessible to everyone.

I am the CEO of ironArray SLU and also lead the Blosc Development Team. I am very excited to be working on an easy and effective way to share Blosc2 datasets over the network via Caterva2 and Cat2Cloud, a software-as-a-service offering that we are introducing.

As an Open Source believer, I started the PyTables project more than 20 years ago. In my 25 years in this business, I have started several other useful open source projects such as Blosc, Caterva2 and Btune; those efforts won me two prizes that mean a lot to me.

You can learn more about what I am working on by reading my latest blog posts.


Your pronouns

Him

Affiliation

ironArray SLU (https://ironarray.io)

Position / Job

CEO

Homepage

https://blosc.org

X handle

@FrancescAlted

GitHub/GitLab profile URL

https://github.com/FrancescAlted

LinkedIn

https://www.linkedin.com/in/francesc-alted-74a7481/


Sessions

08-18
15:30
90min
Compress, Compute, and Conquer: Python-Blosc2 for Efficient Data Analysis
Francesc Alted, Luke Shaw

Have you ever experienced the frustration of not being able to analyze a dataset because it's too large to fit in memory? Or perhaps you've encountered the memory wall, where computation is hindered by slow memory access? In this hands-on tutorial, you'll learn how to overcome these common challenges using Python-Blosc2.

Python-Blosc2 (https://www.blosc.org/python-blosc2/) is a high-performance, multi-threaded, multi-codec array container, with an integrated compute engine that allows you to compress and compute on large datasets efficiently. You'll gain practical experience with Python-Blosc2's latest features, including its seamless integration with NumPy and the broader Python data ecosystem. Through guided exercises, you'll discover how to tackle data challenges that exceed your available RAM while maintaining high performance.

By the end of this tutorial, you'll be able to implement Python-Blosc2 in your own workflows, dramatically increasing your ability to process large datasets on standard hardware. Participants should have basic familiarity with NumPy and Python data processing.
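
As a rough illustration of the idea behind a compressed, chunked container, the sketch below stores data as independently compressed chunks and computes a sum by decompressing one chunk at a time. It deliberately uses `zlib` from the Python standard library, not Python-Blosc2, and every name in it (`CHUNK_ITEMS`, `compress_chunks`, `chunked_sum`) is hypothetical and chosen only for this example; the Python-Blosc2 API and performance characteristics are different.

```python
# Illustrative sketch only: uses zlib from the standard library, NOT Python-Blosc2.
import struct
import zlib

CHUNK_ITEMS = 100_000  # hypothetical chunk size, chosen for illustration


def compress_chunks(n_items, chunk_items=CHUNK_ITEMS):
    """Yield zlib-compressed chunks holding the integers 0..n_items-1 as int64."""
    for start in range(0, n_items, chunk_items):
        stop = min(start + chunk_items, n_items)
        # Pack this slice of values as little-endian 64-bit integers.
        raw = struct.pack(f"<{stop - start}q", *range(start, stop))
        yield zlib.compress(raw)


def chunked_sum(chunks):
    """Sum all values while holding only one decompressed chunk at a time."""
    total = 0
    for blob in chunks:
        raw = zlib.decompress(blob)
        total += sum(struct.unpack(f"<{len(raw) // 8}q", raw))
    return total


chunks = list(compress_chunks(1_000_000))
compressed_bytes = sum(len(c) for c in chunks)
raw_bytes = 1_000_000 * 8  # size of the same data stored uncompressed
total = chunked_sum(chunks)
```

Because each chunk is compressed on its own, the computation never needs the whole uncompressed dataset in memory at once, which is the general principle that lets compressed containers work with data larger than RAM.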

Computational Tools and Scientific Python Infrastructure
Large Room
08-21
15:30
20min
Python-Blosc2: Compress Better, Compute Bigger!
Francesc Alted, Luke Shaw

Have you ever experienced the frustration of not being able to analyze a dataset because it's too large to fit in memory? Or perhaps you've encountered the memory wall, where computation is hindered by slow memory access? These are common challenges in data science and high-performance computing.

Python-Blosc2 (https://www.blosc.org/python-blosc2/) is a high-performance, multi-threaded, multi-codec array container, with an integrated compute engine that allows you to compress and compute on large datasets efficiently. In this talk, we will explore the latest features of Python-Blosc2, including its seamless integration with NumPy and the wider Python data ecosystem, and show how it can help you tackle data challenges that exceed your available RAM while maintaining high performance.

Computational Tools and Scientific Python Infrastructure
Large Room