sktime - python toolbox for time series: next-generation AI – deep learning and foundation models
2024-09-26 , Louis Armand 1 - Est

sktime is a widely used scikit-learn compatible library for learning with time series. sktime is easily extensible by anyone, and interoperable with the pydata/numfocus stack.

This talk presents progress, challenges, and newest features off the press, in extending the sktime framework to deep learning and foundation models.

Recent progress in generative AI and deep learning is leading to a rapidly growing number of popular “next generation AI” models for time series tasks such as forecasting, classification, and segmentation.

Particular challenges of the new AI ecosystem are inconsistent formal interfaces, different deep learning backends, and vendor-specific APIs and architectures which do not match sklearn-like patterns well – every practitioner who has tried to use at least two such models at the same time (outside sktime) will have their own painful memories.

We show how sktime brings its unified interface architecture for time series modelling to the brave new AI frontier, using novel design patterns that build on ideas from Hugging Face and scikit-learn, to provide modular, extensible building blocks with a simple specification language.


sktime’s technical mission is to integrate the ecosystem of time series modelling behind a simple, unified, scikit-learn-like first-order modelling language. After a years-long journey of integration and stringent architecture work, sktime provides unified interfaces for hundreds of algorithms for tasks such as time series forecasting or classification, which can be used as building blocks in a highly composable interface.

sktime compatible algorithms – 1st party in the package itself, or 3rd party in private extensions – follow consistent, exchangeable interfaces, which are kept robust and extensible through strong API checks and testing tools.
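
As a rough illustration of what "consistent, exchangeable interfaces" with API checks can look like, here is a minimal sketch in plain Python. It is not sktime's actual code: the names BaseForecaster, LastValueForecaster, MeanForecaster and check_forecaster are hypothetical and exist only for this example, and the contract is deliberately reduced to fit/predict on lists of floats.

```python
from abc import ABC, abstractmethod
from statistics import mean


class BaseForecaster(ABC):
    """Hypothetical base class holding the shared fit/predict contract."""

    def fit(self, y):
        # Common boilerplate lives in the base class, so every
        # estimator behaves the same way from the user's side.
        self._y = list(y)
        self._fit(self._y)
        self._is_fitted = True
        return self

    def predict(self, fh):
        # fh is the forecasting horizon: a list of steps ahead.
        if not getattr(self, "_is_fitted", False):
            raise RuntimeError("call fit before predict")
        return self._predict(list(fh))

    @abstractmethod
    def _fit(self, y): ...

    @abstractmethod
    def _predict(self, fh): ...


class LastValueForecaster(BaseForecaster):
    """Repeats the last observed value."""

    def _fit(self, y):
        self._last = y[-1]

    def _predict(self, fh):
        return [self._last for _ in fh]


class MeanForecaster(BaseForecaster):
    """Repeats the mean of the observed values."""

    def _fit(self, y):
        self._mean = mean(y)

    def _predict(self, fh):
        return [self._mean for _ in fh]


def check_forecaster(forecaster):
    """Toy API check: one prediction per horizon step."""
    y_pred = forecaster.fit([1.0, 2.0, 3.0]).predict(fh=[1, 2])
    assert len(y_pred) == 2
    return True


# Any estimator following the contract can be swapped in unchanged.
y = [10.0, 12.0, 14.0]
for forecaster in (LastValueForecaster(), MeanForecaster()):
    check_forecaster(forecaster)
    print(type(forecaster).__name__, forecaster.fit(y).predict(fh=[1, 2, 3]))
```

The point of the sketch is the design choice rather than the models: because user-facing methods live in one base class and estimators only implement the private hooks, a single check function can test 1st party and 3rd party estimators alike.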

While sktime, as a framework and marketplace, is nowadays a go-to place for the “canon” of statistical, econometric, and ML models, a new, rapidly evolving “wild west” has emerged, with a growing wave of time series foundation models being released, and renewed interest in deep learning based algorithms.

In the talk, we will discuss specific challenges that need to be addressed at the framework layer, at the estimator-specific dependency management layer, and in relation to the availability of algorithms and project governance:
- The need to cover pre-training and fine-tuning, which implies handling pre-trained weights (e.g., from Hugging Face), as well as extending sklearn-like interfaces to fine-tuning-style updates and extending task models, for example to global forecasting
- The need to cover back-end abstractions for data containers, deep learning frameworks, and possibly vendor-specific APIs; this implies interfaces and layers usually not found in scikit-learn-like libraries, and patterns that require closer integration across packages
- As an explicitly neutral, non-commercial library, dealing with extreme commercial pressures and related behavioural patterns, e.g., companies wanting to establish “their algorithm” and trying to nudge users towards – and lock them in to – proprietary APIs rather than framework APIs
- The interaction with open research and data ecosystems, as foundation models require closer integration with data corpora for training and tuning
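
To make the first challenge in the list above concrete, the following is a toy sketch of how pre-trained weights, a fine-tuning style update, and global forecasting (predicting a series not seen in training) might fit together in one interface. Everything here is an assumption for illustration: GlobalDriftForecaster, pretrain, fine_tune and the rate parameter are hypothetical names, the single "weight" is just an average step change, and none of this is sktime's actual API.

```python
from statistics import mean


class GlobalDriftForecaster:
    """Hypothetical global forecaster with one 'weight': the average step change."""

    def __init__(self, drift=0.0):
        # `drift` stands in for pre-trained weights that a real model
        # would load from a model hub or a checkpoint file.
        self.drift = drift

    @classmethod
    def pretrain(cls, panel):
        """Learn the drift from a panel (a list of series)."""
        steps = [b - a for series in panel for a, b in zip(series, series[1:])]
        return cls(drift=mean(steps))

    def fine_tune(self, panel, rate=0.5):
        """Move the current drift part of the way towards the new panel's drift."""
        target = type(self).pretrain(panel).drift
        self.drift += rate * (target - self.drift)
        return self

    def predict(self, y, fh):
        """Forecast a series that need not have been seen in training."""
        return [y[-1] + self.drift * h for h in fh]


# Pre-train on a corpus, then forecast an unseen series ("zero-shot").
corpus = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
model = GlobalDriftForecaster.pretrain(corpus)
print(model.predict([5.0, 6.0], fh=[1, 2]))

# Fine-tune on domain data, then forecast the same series again.
model.fine_tune([[0.0, 3.0, 6.0]])
print(model.predict([5.0, 6.0], fh=[1, 2]))
```

Note how predict takes the series to forecast as an argument instead of relying on what was passed during training; this is the part that departs from the classical fit-then-predict pattern.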

We think the above pose a new set of fundamental challenges, similar to the initial gap that sktime had to bridge, between a world without a comprehensive framework for time series ML, and the current state in which sktime exists (and in which multiple projects take inspiration from its design ideas).

Our talk outlines new features, the current roadmap, and partial progress towards building a framework in which next generation algorithms not only coexist with classical ML, but become fully interoperable, comparable, and composable.

sktime is developed by an open community, with aims of ecosystem integration in a commercially neutral, charitable space. We welcome contributions or donations, and seek to provide opportunities for anyone worldwide.

I completed my PhD in deep learning based time series forecasting in 2023 with the Karlsruhe Institute of Technology. In sktime, I am focusing on forecasting methods (mainly deep learning based ones) and implementing pipelines.