Mones Raslan PyCon DE & PyData 2026

Mones Raslan

Session

How to compare apples with oranges: Proper evaluation of article-level demand forecasts

How do you evaluate performance when you predict more than 10 million time series each day? For a single time series, a good plot can be worth more than a thousand metrics, but with large-scale machine learning models implemented in LightGBM and PyTorch, we have to resort to meaningful aggregations. We will share lessons from the past 2 years of deploying and operating our article-level demand forecasting models in the pricing department of Zalando.
This talk moves beyond basic metrics to showcase the pitfalls of aggregated error measures and the best practices we’ve developed to keep our stakeholders informed and our models accurate.

PyData: Machine Learning & Deep Learning & Statistics

Titanium [2nd Floor]
