Loïc Estève
Loïc has a particle physics background, which is how he discovered Python towards the end of his PhD.
He is a scikit-learn and joblib core contributor and has been involved in a number of Python open-source projects over the past 10 years, including Pyodide, dask-jobqueue, sphinx-gallery and nilearn.
:probabl.
Position / Job – Open-Source engineer
Session
We all love to tell stories with data and we all love to listen to them. Wouldn't it be great if we could also draw actionable insights from these nice stories?
As scikit-learn maintainers, we would love to use PyPI download stats and other proxy metrics (website analytics, GitHub repository statistics, etc.) to help inform some of our decisions, such as:
- how do we increase user awareness of best practices (please use Pipeline and cross-validation)?
- how do we advertise our recent improvements (use HistGradientBoosting rather than GradientBoosting; TunedThresholdClassifierCV; PCA and a few other models can run on GPU)?
- do users care more about new features from recent releases or about consolidation of what already exists?
- how long should we support older versions of Python, numpy or scipy?
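To make the last question concrete, here is a minimal sketch of how such a proxy metric could be computed. It assumes the download stats have already been exported as (Python version, download count) pairs. The function name `version_share` and the sample numbers are hypothetical placeholders, not real PyPI data.

```python
from collections import Counter


def version_share(downloads):
    """Return each Python version's share (0 to 1) of total downloads.

    `downloads` is an iterable of (python_version, count) pairs, for
    example one pair per day or per package release (assumed layout).
    """
    totals = Counter()
    for python_version, count in downloads:
        totals[python_version] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {version: count / grand_total for version, count in totals.items()}


# Placeholder numbers for illustration only, NOT real download stats.
sample = [("3.9", 100), ("3.10", 300), ("3.9", 100), ("3.12", 500)]
shares = version_share(sample)

# Print versions from most to least downloaded.
for version, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"Python {version}: {share:.0%}")
```

A share computed this way is only a proxy: raw download counts also include automated installs such as continuous integration jobs, so a low share does not necessarily mean few human users.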
In this talk we will highlight a number of lessons learned while trying to understand the complex reality behind these seemingly simple metrics.
Telling nice stories is not always hard, but grasping the reality behind these metrics is often tricky.