Enhancing Machine Learning Workflows with skore
2025-10-01, Gaston Berger

Discover how skore, a young open-source Python library, can elevate your machine learning projects by integrating recommended practices and helping you avoid common pitfalls. This talk will introduce skore's key features and demonstrate how it can streamline your model evaluation and diagnostics processes.


Topic and Relevance:
The talk will focus on skore, an open-source Python library designed to help data scientists apply recommended practices and avoid common methodological pitfalls when working with scikit-learn. Skore is particularly relevant for data scientists and machine learning practitioners who use scikit-learn and want to improve their model evaluation, inspection, and diagnostics workflows. This is the current scope; we hope it will be broader by September but, as suggested in the guidelines, we are not counting on it. The talk also promotes more guided data science, with less tedious boilerplate code thanks to integrated one-liners, in the same spirit as skrub (https://skrub-data.org/stable/).

Audience:
This talk is aimed at data scientists, machine learning engineers, and anyone interested in improving their machine learning workflows. No prior knowledge of skore is required (obviously).

Type of Talk:
The talk will be informative and will include both slides and notebooks to provide practical, real-life examples. The notebooks will be prepared beforehand; there will be no live coding. The tone will be engaging and educational, aimed at providing actionable insights that attendees can apply to their own projects. We will also try to interact with the audience, although given the room size this may be limited to poll questions answered by a show of hands.

Takeaways:
By the end of the talk, attendees will:

  1. Understand the key features of skore.
  2. Learn how to use skore for automated model evaluation and diagnostics.
  3. Gain insights into best practices for avoiding common methodological pitfalls in machine learning.
  4. Be equipped to integrate skore into their existing machine learning workflows for more robust and effective model development.

Outline:

  1. Hello! (2 min)
    1. who the speakers are,
    2. what Probabl is (the goal is not to give a business intro, but rather to let people understand where skore comes from)
  2. Introduction to skore (3 min):
    • Brief overview of skore and its mission.
      • Skore has two parts: a commercial product called skore hub and a free, open-source library called skore lib. This talk focuses on skore lib.
    • Why skore is a valuable addition to the scikit-learn ecosystem.
  3. Feature focuses (core of the talk):
    1. Evaluating a model (6 min)
      1. elements about metrics and evaluation graphs
    2. Comparing models (6 min)
      1. how the various models tried in the previous phase can be compared efficiently, taking several metrics into account
    3. Inspecting a model before deployment (6 min)
      1. before deployment, one should be able to gain deep insight into the chosen model.

For each feature focus, we will show how skore helps data scientists at that stage of their workflow, with a few theoretical elements and mostly code examples.

All three parts will be demoed on the same end-to-end use case to ensure consistency.

The three features already exist in skore. They will most likely evolve substantially by September.

  4. Takeaways, opening & conclusion (2 min)
  5. Q&A (5 min)

Currently Product Engineer at Probabl, Marie is also co-organizer of Women in Machine Learning and Data Science Paris.