Generative Models with Latent Differential Equations in Julia
2021-07-29, Green

Scientific Machine Learning (SciML) is the branch of scientific computing that combines domain-aware and interpretable models with powerful machine learning techniques. The Julia language has been a key enabler of this burgeoning field, thanks to its unique SciML ecosystem. In this talk, we will present a contribution in this direction: an easy-to-use and flexible implementation of generative latent differential equation models.


Scientific Machine Learning (SciML) is a promising and exciting field that has emerged over the past few years, with particular strength in the Julia community thanks to its thriving SciML ecosystem. It consists of a growing set of diverse tools focused on combining traditional scientific modeling with modern machine learning (ML) techniques. The former is usually based on the long-established field of differential equation (DE) models, while the latter, though more recent, provides powerful general-purpose tools and has achieved remarkable results across many applications.

Both approaches, of course, have their advantages and drawbacks. Traditional modeling is far from trivial: building an adequate model for a given problem usually requires educated guesses and approximations grounded in a deep understanding of the system under study. In practice, it is often only possible to build partial models and to observe an incomplete set of the relevant variables, sometimes even in a different, unknown coordinate system. On the other hand, applying off-the-shelf ML models to scarce, low-quality scientific data can be problematic because of the lack of interpretability of these models and their dependence on large amounts of training data to achieve good generalization.

SciML is a bridge between these two worlds, taking the best from each. A perfect example of such hybrid solutions is Universal Differential Equations [1], where prior scientific insight is used to build some parts of a DE model, and the unknown terms are filled in with neural networks (NN). The DE parameters and NN weights are then jointly optimized using automatic differentiation and sensitivity analysis algorithms. This powerful approach was developed by members of the Julia community and is readily available in the DiffEqFlux.jl package. However, this method only works when one has direct measurements of the state variables of the DE model, which is not always the case.
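To make the idea concrete, here is a minimal sketch of a universal ODE using the DiffEqFlux.jl training interface (FastChain, sciml_train) on a toy two-dimensional system invented purely for illustration; it is not code from our package:

    using DifferentialEquations, DiffEqFlux, Flux

    # Neural network standing in for the unknown part of the dynamics
    nn = FastChain(FastDense(2, 16, tanh), FastDense(16, 2))
    p0 = initial_params(nn)

    # Universal ODE: a known (prior-knowledge) term plus the NN term
    ude(u, p, t) = [-0.1f0 * u[1], -0.2f0 * u[2]] .+ nn(u, p)

    u0 = Float32[1.0, 1.0]
    prob = ODEProblem(ude, u0, (0.0f0, 10.0f0), p0)

    # Toy loss; in practice the solution is compared against measured data
    function loss(p)
        sol = solve(prob, Tsit5(); p = p, saveat = 0.1f0,
                    sensealg = InterpolatingAdjoint())
        sum(abs2, Array(sol))
    end

    # Jointly optimize the NN weights (and any DE parameters)
    result = DiffEqFlux.sciml_train(loss, p0, ADAM(0.01); maxiters = 100)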

There exists a class of approaches that tackles this issue by constructing latent DE models, in which additional NN layers learn transformations from the input space to a latent DE space, usually of lower dimensionality. Examples of this approach are LatentODEs [2,3] and GOKU-nets [4]. Broadly speaking, these models consist of a Variational Autoencoder structure with DEs inside, as sketched below. Their decoders contain the DEs, whose initial conditions (and in some cases, parameters) are sampled from distributions learned by the encoders. In LatentODEs, NNs are used to approximate the latent ODE, while in GOKU-nets, one can use prior knowledge to provide an ODE model for the latent dynamics.
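The forward pass of such a model can be sketched roughly as follows, with made-up layer sizes and a LatentODE-style neural network for the latent dynamics (in a GOKU-net this network would be replaced by a known ODE whose parameters are also inferred by the encoder); again, this is an illustrative sketch, not our package's implementation:

    using Flux, DifferentialEquations

    input_dim, hidden_dim, latent_dim = 784, 64, 4   # hypothetical sizes

    # Encoder: observation -> posterior over the latent initial condition
    encoder  = Chain(Dense(input_dim, hidden_dim, relu))
    enc_μ    = Dense(hidden_dim, latent_dim)
    enc_logσ = Dense(hidden_dim, latent_dim)

    # Latent dynamics approximated by a NN (LatentODE-style)
    latent_nn = Chain(Dense(latent_dim, hidden_dim, tanh), Dense(hidden_dim, latent_dim))
    latent_f(z, p, t) = latent_nn(z)

    # Decoder: latent state -> observation space
    decoder = Chain(Dense(latent_dim, hidden_dim, relu), Dense(hidden_dim, input_dim))

    function reconstruct(x, tspan, saveat)
        h = encoder(x)
        μ, logσ = enc_μ(h), enc_logσ(h)
        z0 = μ .+ exp.(logσ) .* randn(Float32, latent_dim)   # reparameterization trick
        prob = ODEProblem(latent_f, z0, tspan)
        z = Array(solve(prob, Tsit5(); saveat = saveat))      # latent trajectory, columns = time
        return decoder(z), μ, logσ                            # reconstruction at every time step
    end

Such a model is then trained with the usual VAE objective: a reconstruction term plus KL divergence terms for the sampled initial condition (and, in GOKU-nets, for the ODE parameters).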

Currently, Flux.jl and the SciML ecosystem provide all the functionality needed to build these latent DE models, but doing so by hand can be time-consuming and can present a steep learning curve for people without a background in machine learning. Our goal is to provide a package that makes latent differential equation models readily accessible, with high flexibility in both the architecture and the problem definition.

In this presentation, we will introduce the basic background and concepts behind latent differential equation models, in particular the GOKU-net architecture. We will then show the structure of our implementation through a simple example: given videos of pendulums of different lengths, learn to reconstruct them by passing through their latent DE representation. We anticipate that our presentation will be a user-friendly introduction to latent differential equation models for the Julia community.

Work done in collaboration with:

Jean-Christophe Gagnon-Audet¹*
Mahta Ramezanian¹
Vikram Voleti¹
Irina Rish¹
Pranav Mahajan²
Guillermo Cecchi³
Silvina Ponce Dawson⁴
Guillaume Dumas¹

*creator of the beautiful diagrams that you will see in the presentation
¹ Mila & Université de Montréal
² BITS Pilani
³ IBM Research
⁴ CONICET & University of Buenos Aires

[1] Rackauckas, C., Ma, Y., Martensen, J., Warner, C., Zubov, K., Supekar, R., ... & Edelman, A. (2020). Universal differential equations for scientific machine learning. arXiv preprint arXiv:2001.04385.

[2] Chen, R. T., Rubanova, Y., Bettencourt, J., & Duvenaud, D. (2018). Neural ordinary differential equations. arXiv preprint arXiv:1806.07366.

[3] Rubanova, Y., Chen, R. T., & Duvenaud, D. (2019). Latent ODEs for irregularly-sampled time series. arXiv preprint arXiv:1907.03907.

[4] Linial, O., Eytan, D., & Shalit, U. (2020). Generative ODE Modeling with Known Unknowns. arXiv preprint arXiv:2003.10775.

PhD student in Physics at the University of Buenos Aires.