EuroSciPy 2025

Loïc Estève

Loïc has a particle physics background, which is how he discovered Python towards the end of his PhD.

He is a scikit-learn and joblib core contributor and has been involved in a number of Python open-source projects over the past 10 years, including Pyodide, dask-jobqueue, sphinx-gallery and nilearn.


Affiliation

:probabl.

Position / Job

Open-Source engineer

GitHub/GitLab profile URL

https://github.com/lesteve

LinkedIn

https://www.linkedin.com/in/loicesteve/


Session

08-20
14:05
30min
PyPI in the face: running jokes that PyPI download stats can play on you
Loïc Estève

We all love to tell stories with data and we all love to listen to them. Wouldn't it be great if we could also draw actionable insights from these nice stories?

As scikit-learn maintainers, we would love to use PyPI download stats and other proxy metrics (website analytics, GitHub repository statistics, etc.) to help inform some of our decisions, such as:
- how do we increase user awareness of best practices (please use Pipeline and cross-validation)?
- how do we advertise our recent improvements (use HistGradientBoosting rather than GradientBoosting, TunedThresholdClassifier, PCA and a few other models can run on GPU)?
- do users care more about new features from recent releases or consolidation of what already exists?
- how long should we support older versions of Python, NumPy or SciPy?

In this talk we will highlight a number of lessons learned while trying to understand the complex reality behind these seemingly simple metrics.

Telling nice stories is not always hard, but grasping the reality behind these metrics is often tricky.

Computational Tools and Scientific Python Infrastructure
Small room