2020-07-29 –, Red Track
TemporalGPs.jl provides a single-function API to make inference in Gaussian processes (GPs) from Stheno.jl dramatically more efficient in the time-series setting. It makes it feasible to perform (almost) exact inference with tens of millions of data points quickly on a laptop, and plays nicely with StaticArrays.jl and Zygote.jl.
Gaussian processes (GPs) are flexible probabilistic models for real-valued functions, and are a standard tool for nonlinear regression. Their default implementation involves a costly Cholesky decomposition, yielding computational cost that is cubic in N (the dataset size).
There is an increasingly broad literature on techniques that cast certain classes of GPs for time series (almost exactly) as linear stochastic differential equations (SDEs). This exposes a Markov structure that can be exploited so that the cost of inference scales linearly in N.
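To make the cubic-versus-linear contrast concrete, here is a minimal sketch in plain Python. It is not TemporalGPs.jl's implementation, and every function name in it is illustrative. It assumes the Matérn-1/2 (Ornstein-Uhlenbeck) kernel, chosen because its state-space form is one-dimensional and exact, and it assumes the input times are sorted in increasing order. The log marginal likelihood is computed twice: once with a dense Cholesky factorisation, and once with a scalar Kalman filter that makes a single pass over the data.

```python
import math
import random


def ou_kernel(t1, t2, variance, lengthscale):
    """Matern-1/2 (Ornstein-Uhlenbeck) kernel: variance * exp(-|t1 - t2| / lengthscale)."""
    return variance * math.exp(-abs(t1 - t2) / lengthscale)


def dense_logpdf(ts, ys, variance, lengthscale, noise):
    """Log marginal likelihood via a dense Cholesky factorisation: O(N^3) time."""
    n = len(ts)
    # Covariance of the noisy observations: K + noise * I.
    A = [[ou_kernel(ts[i], ts[j], variance, lengthscale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Cholesky factor L with A = L L^T (this is the cubic-cost step).
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Solve L z = y by forward substitution.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    half_log_det = sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * sum(v * v for v in z) - half_log_det - 0.5 * n * math.log(2 * math.pi)


def kalman_logpdf(ts, ys, variance, lengthscale, noise):
    """The same quantity via a scalar Kalman filter over sorted times: O(N) time."""
    log_lik = 0.0
    mean, var = 0.0, variance  # stationary prior on the first latent value
    for k in range(len(ts)):
        if k > 0:
            # Exact discretisation of the OU SDE between consecutive times.
            a = math.exp(-(ts[k] - ts[k - 1]) / lengthscale)
            mean = a * mean
            var = a * a * var + variance * (1.0 - a * a)
        # Predictive density of the observation, then the usual Kalman update.
        s = var + noise
        resid = ys[k] - mean
        log_lik += -0.5 * (math.log(2 * math.pi * s) + resid * resid / s)
        gain = var / s
        mean += gain * resid
        var *= 1.0 - gain
    return log_lik


if __name__ == "__main__":
    rng = random.Random(0)
    ts = sorted(rng.uniform(0.0, 10.0) for _ in range(200))
    ys = [rng.gauss(0.0, 1.0) for _ in ts]
    print(dense_logpdf(ts, ys, 1.0, 2.0, 0.1))
    print(kalman_logpdf(ts, ys, 1.0, 2.0, 0.1))
```

The two functions should agree up to floating-point error, since for this kernel the state-space form is exact. For other kernels the state is typically a small vector rather than a scalar, and the equivalence may only be approximate, which is presumably what the "(almost)" qualifier refers to.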
TemporalGPs.jl implements the above techniques, and provides a function that takes a Stheno.jl GP and constructs an (almost) equivalent linear SDE, which in turn exposes the same API as a Stheno.jl GP (modulo some restrictions).
The only other publicly available implementation of these techniques is in the famous GPML MATLAB package. Moreover, at the time of writing and to the best of the author's knowledge, there is no publicly available implementation of these techniques that can be used in conjunction with reverse-mode algorithmic differentiation.
In this talk I'll present the user-facing API, show some basic benchmarking and empirical results, briefly discuss some technical challenges, and conclude with the package's future direction.
Will is a PhD student in the Machine Learning Group at the University of Cambridge. His research focuses on the scalability of Gaussian processes, how software should be implemented to make using them a pleasant experience, and how they can be utilised in climate science.