2024-09-26, Gaston Berger
Machine learning practitioners build predictive models from "noisy" data, resulting in uncertain predictions. But what does "noise" mean in a machine learning context?
Over years of developing scikit-learn and exchanging ideas with both applied data scientists and academics, I have progressively refined my understanding of topics such as the fundamental sources of "noisy" data, how to evaluate and improve the design of predictive models with respect to ranking power and probabilistic calibration, and how to handle predictive uncertainty and turn probabilistic forecasts into optimal decisions with respect to an application-specific utility function.
The goal of this keynote is to share what I have learned with you and, hopefully, to help you reflect on how to quantify the value of predictions and make better use of scikit-learn and other predictive modeling tools.
Slides: https://docs.google.com/presentation/d/1EBCSCDQ3nTPaKZGx9ZLWXfvkD1Y-ODo9j_ETAnx5zLQ/edit?usp=sharing
Olivier is an Open Source Fellow at probabl and a core contributor to the scikit-learn machine learning library.