2026-07-22 –, Room 1.38 (Ground Floor, Turing)
Class imbalance is a common challenge in real-world machine learning. This course explores why standard approaches fail and how to build reliable classifiers using scikit-learn's calibration and threshold-tuning tools.
We cover practical solutions including resampling strategies, probabilistic calibration with CalibratedClassifierCV, and decision threshold optimization using TunedThresholdClassifierCV. You'll learn to evaluate models appropriately with calibration curves and confusion matrices.
The course also addresses prevalence shift, that is, the situation where the class proportions in your training data don't reflect those of the target population. We demonstrate weight-based training corrections and post-hoc probability adjustments applicable to any binary classifier.
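One common way to express both corrections is through the ratio of target to training class priors. The sketch below assumes that only the class prevalence changes between training and deployment (the class-conditional feature distributions stay the same) and that the input probabilities are already calibrated. The function names `adjust_for_prevalence` and `prevalence_class_weights` are hypothetical helpers written for this example and are not scikit-learn APIs; the course may present the corrections differently.

```python
import numpy as np


def adjust_for_prevalence(proba, train_prevalence, target_prevalence):
    """Post-hoc correction: re-weight P(y=1 | x) by the ratio of class priors.

    Hypothetical helper; assumes calibrated inputs and a pure prior shift.
    """
    proba = np.asarray(proba, dtype=float)
    pos = proba * (target_prevalence / train_prevalence)
    neg = (1.0 - proba) * ((1.0 - target_prevalence) / (1.0 - train_prevalence))
    return pos / (pos + neg)


def prevalence_class_weights(train_prevalence, target_prevalence):
    """Weight-based correction: per-class weights that make the weighted
    training set match the target prevalence (hypothetical helper)."""
    return {
        0: (1.0 - target_prevalence) / (1.0 - train_prevalence),
        1: target_prevalence / train_prevalence,
    }


# A model trained on balanced data (50% positives), deployed where
# positives make up only 10% of the population.
adjusted = adjust_for_prevalence([0.5, 0.9], 0.5, 0.1)
weights = prevalence_class_weights(0.5, 0.1)
print(adjusted)
print(weights)
```

The weights dictionary can be passed as `class_weight` to scikit-learn estimators that accept that parameter, or mapped to per-sample values for `sample_weight`; the post-hoc function only needs predicted probabilities, which is why it applies to any binary classifier that outputs them.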
Guillaume is an open-source software engineer working at :probabl. He is a core maintainer of the scikit-learn and imbalanced-learn libraries.
I'm an open-source software developer with a background in computational linguistics and a contributor to scikit-learn.