PyData London 2026

Don’t Call It “The Forecast”: Designing Prediction Systems at Scale
2026-06-07, Hardwick Hub

Sailors avoid the word ‘rope’. Once it has a job, it becomes a line with a specific name: halyard, sheet or warp. In forecasting, we often do the opposite: projections, baselines, scenarios and targets all end up being called ‘the forecast’.

In practice, forecasts live in a high-dimensional space. They vary by origin date, prediction horizon, scenario assumptions, uncertainty representation, reconciliation level and decision context. Treating them as a single artefact creates ambiguity, semantic drift and misaligned expectations.

In this talk, I’ll show how we reframed forecasting at Spotify as a structured prediction problem rather than simply a modelling task. I’ll cover practical design patterns for representing forecast objects across multiple origins and scenarios, handling probabilistic outputs, implementing hierarchical reconciliation and tracking lineage and versioning in Python-based systems.

Aimed at data scientists and ML engineers working with production systems, this talk offers a framework for thinking about forecast dimensionality and concrete implementation patterns you can apply in your own forecasting platforms.


Sailors avoid the word ‘rope’. Once it has a purpose, it earns a precise name. A halyard is not a sheet, and neither is simply ‘rope’. The distinction enables coordination at scale.

In many forecasting systems, we lack this semantic discipline. We refer to ‘the forecast’ as if it were a single, well-defined object. In practice, forecasts vary along multiple dimensions: origin date, prediction horizon, scenario assumptions, uncertainty representation, reconciliation level and decision context. A baseline projection is not a stress scenario. A probabilistic model output is not a financial guide. Collapsing these into a single artefact creates ambiguity, semantic drift and brittle workflows.
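
As an illustration only, and not the system described in the talk, the dimensions listed above could be made explicit with a small immutable key type. Every name here (`ForecastKey`, `Scenario`, `Purpose` and the field names) is a hypothetical choice made for this sketch.

```python
from dataclasses import dataclass, replace
from datetime import date
from enum import Enum


class Scenario(Enum):
    BASELINE = "baseline"
    STRESS = "stress"


class Purpose(Enum):
    MODEL_OUTPUT = "model_output"
    FINANCIAL_GUIDE = "financial_guide"


@dataclass(frozen=True)
class ForecastKey:
    """Hypothetical identifier naming every dimension of one forecast value."""

    origin: date        # date the forecast was made
    target: date        # date being predicted
    scenario: Scenario  # scenario assumptions
    quantile: float     # uncertainty representation, e.g. 0.5 for the median
    level: str          # reconciliation level, e.g. "country" or "global"
    purpose: Purpose    # decision context

    def __post_init__(self) -> None:
        # Reject keys that cannot describe a real forecast.
        if self.target < self.origin:
            raise ValueError("target date precedes origin date")
        if not 0.0 < self.quantile < 1.0:
            raise ValueError("quantile must lie strictly between 0 and 1")

    @property
    def horizon_days(self) -> int:
        """Prediction horizon, derived from origin and target."""
        return (self.target - self.origin).days


baseline = ForecastKey(
    origin=date(2026, 1, 1),
    target=date(2026, 1, 31),
    scenario=Scenario.BASELINE,
    quantile=0.5,
    level="country",
    purpose=Purpose.MODEL_OUTPUT,
)
# Same dates and quantile, different scenario: a distinct object, not "the forecast".
stress = replace(baseline, scenario=Scenario.STRESS)
```

Because the dataclass is frozen, keys are hashable and can index a dictionary or be compared directly, so a baseline projection and a stress scenario can never be silently confused.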

This talk frames forecasting as a data modelling and systems design problem. Drawing on production systems built at Spotify, I’ll describe how we:

  • Represent forecasts as structured, high-dimensional objects
  • Encode forecast dimensions explicitly in Python data models
  • Manage multiple origins and scenarios reproducibly
  • Implement hierarchical reconciliation workflows
  • Track forecast lineage and versioning across runs to ensure reproducibility and auditability
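
The last two points can be sketched in a few lines. The example below assumes the simplest reconciliation scheme, bottom-up summation, and a run identifier derived by hashing the run configuration. The talk does not state which reconciliation method or lineage scheme is used, and the data, column names and helper functions are all hypothetical.

```python
import hashlib
import json

import pandas as pd

# Hypothetical country-level forecasts for a single origin and target date.
base = pd.DataFrame(
    {
        "region": ["EU", "EU", "NA", "NA"],
        "country": ["SE", "DE", "US", "CA"],
        "value": [10.0, 20.0, 30.0, 5.0],
    }
)


def reconcile_bottom_up(df: pd.DataFrame) -> pd.DataFrame:
    """Build region and global totals by summing country-level forecasts."""
    country = df.assign(level="country", node=df["country"])
    region = (
        df.groupby("region", as_index=False)["value"]
        .sum()
        .rename(columns={"region": "node"})
        .assign(level="region")
    )
    total = pd.DataFrame(
        {"level": ["global"], "node": ["ALL"], "value": [df["value"].sum()]}
    )
    columns = ["level", "node", "value"]
    return pd.concat(
        [country[columns], region[columns], total[columns]], ignore_index=True
    )


def run_id(config: dict) -> str:
    """Deterministic identifier: the same configuration always yields the same id."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


reconciled = reconcile_bottom_up(base)
# Stamp every row with the run that produced it, so results can be traced back.
reconciled["run_id"] = run_id(
    {"origin": "2026-01-01", "scenario": "baseline", "model_version": "0.1.0"}
)
```

Bottom-up summation guarantees that every level adds up by construction. Other reconciliation methods trade that simplicity for accuracy at the aggregate levels.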

The emphasis is on architecture and implementation rather than modelling technique. Examples draw from Python-based production workflows, including Pandas data structures, schema design and storage patterns.
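
One possible Pandas layout, shown here as an assumption rather than the schema used in the talk, is a long-format table in which every forecast dimension is a level of a `MultiIndex`. The column names and numbers below are invented for illustration.

```python
import pandas as pd

# Hypothetical forecasts: (origin, target, scenario, quantile, value).
rows = [
    ("2026-01-01", "2026-03-01", "baseline", 0.1, 90.0),
    ("2026-01-01", "2026-03-01", "baseline", 0.5, 100.0),
    ("2026-01-01", "2026-03-01", "baseline", 0.9, 110.0),
    ("2026-01-01", "2026-03-01", "stress", 0.1, 60.0),
    ("2026-01-01", "2026-03-01", "stress", 0.5, 75.0),
    ("2026-01-01", "2026-03-01", "stress", 0.9, 95.0),
    ("2026-02-01", "2026-03-01", "baseline", 0.1, 95.0),
    ("2026-02-01", "2026-03-01", "baseline", 0.5, 102.0),
    ("2026-02-01", "2026-03-01", "baseline", 0.9, 108.0),
]
df = pd.DataFrame(rows, columns=["origin", "target", "scenario", "quantile", "value"])
df["origin"] = pd.to_datetime(df["origin"])
df["target"] = pd.to_datetime(df["target"])

# verify_integrity=True raises if two rows share the same combination of
# dimensions, so each forecast value has exactly one unambiguous address.
forecasts = df.set_index(
    ["origin", "target", "scenario", "quantile"], verify_integrity=True
).sort_index()

# Select the baseline median across all origins: the remaining index levels
# (origin, target) show how the view of the same target date changed over time.
median_baseline = forecasts.xs(
    ("baseline", 0.5), level=("scenario", "quantile")
)
```

With this layout, a question such as "which forecast?" has to be answered by naming index levels, which is the discipline the talk argues for.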

Outline:
- 0–5 min: The rope -> line analogy and the forecasting naming problem
- 5–15 min: The dimensional structure of forecasts
- 15–30 min: Implementation patterns in Python
- 30–35 min: Organisational impact and lessons learned
- 35–40 min: Q&A

Target audience: Data scientists and ML engineers building or maintaining production forecasting systems.

Takeaway: Making forecast dimensions explicit enables prediction systems that scale, reduce semantic drift and support clearer decisions.

Thomas Ogden is a Senior ML Engineer in Financial Engineering at Spotify. He builds tools, mostly with probabilistic machine learning on sequences and graphs. He once did a PhD in Quantum Optics theory and still thinks about physics a lot.