Foundational Models for Time Series Forecasting: are we there yet?
2024-09-26, Louis Armand 1 - Est

Transformers are everywhere: NLP, computer vision, sound generation, and even protein folding. Why not in forecasting? After all, what ChatGPT does is predict the next word. Why isn't this architecture state-of-the-art in the time-series domain?

In this talk, you will learn how Amazon's Chronos and Salesforce's Moirai, two transformer-based forecasting models, work, which datasets were used to train them, and how to evaluate them to see whether they are a good fit for your use case.
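To give a flavour of what we will cover, below is a minimal sketch of generating a forecast with Chronos, assuming the ChronosPipeline interface published in the amazon-science/chronos-forecasting repository; the checkpoint name, context values, and horizon are illustrative:

```python
import torch
from chronos import ChronosPipeline  # pip install chronos-forecasting

# Load a pretrained Chronos checkpoint (the model size is illustrative).
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cpu",
    torch_dtype=torch.float32,
)

# Historical observations of a single series (illustrative values).
context = torch.tensor([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])

# Sample probabilistic forecasts for the next 12 steps.
samples = pipeline.predict(context, prediction_length=12)

# Collapse the sample dimension into a point forecast.
median_forecast = samples.quantile(0.5, dim=1)  # shape: [n_series, horizon]
print(median_forecast)
```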


This is a pragmatic talk for generative AI and ML practitioners alike, eager to understand how researchers managed to adapt transformer-based architectures to forecasting problems, and how to evaluate new models for specific use cases.

The first part of the talk explains how researchers discretised time series into a finite "dictionary" of tokens, and the theoretical limitations of this approach, such as handling series with different sampling frequencies during training.
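To make the idea concrete, here is a minimal sketch of such a discretisation, with mean scaling followed by uniform binning; the bin count and value range are illustrative choices, not the exact settings of any published model:

```python
import numpy as np

def tokenize_series(values: np.ndarray, n_bins: int = 1024,
                    low: float = -15.0, high: float = 15.0) -> np.ndarray:
    """Illustrative tokenisation: mean-scale a series, then quantise
    it into a finite vocabulary of bin indices ("words")."""
    # Mean scaling makes series of different magnitudes comparable.
    scale = np.mean(np.abs(values)) or 1.0
    scaled = values / scale
    # Uniform bin edges over a clipped range; each bin index is a token.
    edges = np.linspace(low, high, n_bins + 1)
    tokens = np.clip(np.digitize(scaled, edges) - 1, 0, n_bins - 1)
    return tokens

series = np.array([10.0, 12.0, 9.5, 14.0, 11.0])
print(tokenize_series(series))
```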

We will then draw a parallel with Large Language Model (LLM) scaling laws to evaluate the data strategies used to train such universal forecasting models. In other words, we will ask whether there is enough publicly available time-series data to train foundational models, and how this affects their evaluation.
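As a back-of-the-envelope illustration of the kind of reasoning involved, one can apply the Chinchilla rule of thumb of roughly 20 training tokens per model parameter (Hoffmann et al., 2022); note that treating one time-series observation as one token is itself an assumption:

```python
# Data requirement under the Chinchilla heuristic (~20 training tokens
# per parameter). Equating one observation with one token is an assumption.
params = 200e6                      # a hypothetical 200M-parameter forecaster
observations_needed = 20 * params   # ~4 billion observations
print(f"~{observations_needed:.1e} observations")
```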

Finally, we will show how we benchmarked these models against a robust set of baseline models, and how our experiment compares to the publicly available results.
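As one example of the evaluation machinery involved, here is a sketch of the Mean Absolute Scaled Error (MASE), a standard metric that compares a forecaster against a seasonal-naive baseline; the seasonality parameter and the numbers below are illustrative:

```python
import numpy as np

def mase(y_true: np.ndarray, y_pred: np.ndarray,
         y_train: np.ndarray, seasonality: int = 1) -> float:
    """Mean Absolute Scaled Error (Hyndman & Koehler, 2006).

    Scales the forecast error by the in-sample error of a seasonal-naive
    forecast, so values below 1 beat that baseline.
    """
    forecast_error = np.mean(np.abs(y_true - y_pred))
    naive_error = np.mean(np.abs(y_train[seasonality:] - y_train[:-seasonality]))
    return float(forecast_error / naive_error)

# Illustrative numbers only.
train = np.array([100.0, 102.0, 98.0, 105.0, 103.0, 107.0])
actual = np.array([110.0, 108.0])
predicted = np.array([106.0, 109.0])
print(mase(actual, predicted, train))
```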

No specific prior knowledge is required to attend. Although forecasting practitioners stand to gain the most from this talk, any machine learning or generative AI practitioner can follow along and take away the key conclusions.

Outline
Minutes 1-3. Problem statement: challenges in adapting transformers to the time-series domain.
Minutes 3-10. How Chronos and Moirai are implemented.
Minutes 10-15. Do we have enough data to train universal forecasters?
Minutes 15-25. Lessons learnt from evaluating transformer-based models for a specific use case.
Minutes 25-30. Wrap up and Q&A.

ML Engineer @ xtream

Data Scientist @ xtream