PyCon DE & PyData 2025

Benchmarking Time Series Foundation Models with sktime
2025-04-23, Helium3

Recent time series foundation models such as LagLlama, Chronos, Moirai, and TinyTimeMixer promise zero-shot forecasting for arbitrary time series, that is, accurate forecasts without any training on the target data. However, these performance claims are difficult to verify: public benchmark datasets may have been part of the training data, and only the pretrained weights are available to the user.

Therefore, performance in specific use cases must be verified on the use case data itself, to ensure a reliable assessment of forecasting performance. sktime allows users to easily produce a performance benchmark of any collection of forecasting models, foundation models, simple baselines, or custom methods, on their internal use case data.


In recent years, time series foundation models have emerged, with the potential to change time series forecasting. For example, models such as LagLlama, Chronos, Moirai, and TinyTimeMixer promise zero-shot forecasting for arbitrary time series. In addition, sktime has started to unify the interfaces of the various foundation models to make them easy to use.
However, it is still unclear whether these time series foundation models add value in a given forecasting application, so benchmarking is necessary. In sktime, we have implemented a benchmarking module that enables easy comparison of time series foundation models on custom datasets and with arbitrary metrics.
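
To make the idea of benchmarking on custom data with arbitrary metrics concrete, here is a minimal plain-Python sketch of a rolling-origin comparison. It deliberately does not use sktime's benchmarking module or its API: the function names (`rolling_origin_benchmark`, `naive_last`, `seasonal_naive`, `mae`), the toy series, and the fold layout are all assumptions made for illustration. A foundation model would enter the comparison as one more callable with the same `(history, horizon)` signature.

```python
from statistics import mean


def naive_last(history, horizon):
    """Baseline: repeat the last observed value for every step ahead."""
    return [history[-1]] * horizon


def seasonal_naive(history, horizon, period=4):
    """Baseline: repeat the values observed one season (period) earlier."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rolling_origin_benchmark(series, forecasters, metrics, initial_window, horizon):
    """Score every forecaster with every metric, averaged over expanding-window folds.

    Hypothetical helper for illustration only, not part of sktime.
    """
    results = {}
    for name, forecast in forecasters.items():
        fold_scores = {metric_name: [] for metric_name in metrics}
        # Move the cutoff forward by one horizon per fold; the training
        # window expands, and the test window never overlaps the history.
        for cutoff in range(initial_window, len(series) - horizon + 1, horizon):
            history = series[:cutoff]
            actual = series[cutoff:cutoff + horizon]
            predicted = forecast(history, horizon)
            for metric_name, metric in metrics.items():
                fold_scores[metric_name].append(metric(actual, predicted))
        results[name] = {m: mean(scores) for m, scores in fold_scores.items()}
    return results


# Toy "use case" data with a strict period of 4 (assumed for illustration).
series = [10, 20, 30, 40] * 5

results = rolling_origin_benchmark(
    series,
    forecasters={"naive_last": naive_last, "seasonal_naive": seasonal_naive},
    metrics={"mae": mae},
    initial_window=8,
    horizon=4,
)
print(results)
```

On this toy series the seasonal baseline is exact while the last-value baseline is not, which shows why simple baselines belong in any such comparison: a foundation model has to beat them on the use case data to justify its cost.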

Our talk will outline how sktime’s benchmarking module works and how users can use it to evaluate time series foundation models. 
We will show how to combine the benchmarking module with the time series foundation models.
We will show the results of a small benchmarking study using time series foundation models and statistical time series models. 
We will outline our roadmap for time series foundation models. 

sktime is developed by an open community with the aim of ecosystem integration in a commercially neutral, charitable space. We welcome contributions or donations and seek to provide opportunities for anyone worldwide.


Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Novice

Public link to supporting material, e.g. videos, Github, etc.:

https://www.sktime.net/en/stable/get_started.html

I completed my PhD in deep-learning-based time series forecasting in 2023 at the Karlsruhe Institute of Technology. In sktime, I focus on forecasting methods (mainly deep-learning-based ones) and on implementing pipelines.