In this talk we will show you how to boost your latent SciML model, significantly increasing its performance and data efficiency.
In particular, we will present GOKU-UI, an evolution of the continuous-time generative model GOKU-net, which incorporates attention mechanisms and a novel multiple shooting training strategy in the latent space. On both simulated and empirical brain data, it achieves enhanced performance in reconstruction and forecast tasks while effectively capturing brain dynamics.
Scientific Machine Learning (SciML) is a burgeoning field that synergistically combines domain-aware, interpretable models with agnostic machine learning techniques, often yielding increased interpretability, generalizability, and data efficiency [1, 2, 3]. Latent Ordinary Differential Equations (Latent ODEs) [4, 5] are VAE-like generative models that encode time series data into a latent space governed by a differential equation parametrized by a neural network. Building on Latent ODEs, Linial et al. [6] introduced GOKU-nets (Generative ODE Modeling with Known Unknowns), whose fundamental difference from the former is the inclusion of a predefined differential equation structure as a prior for the latent dynamics. Evaluated against LSTM and Latent ODE baselines on pendulum videos and cardiovascular system modeling, GOKU-nets achieved better reconstruction and forecasting, required less training data, and offered greater interpretability.
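To make the architecture concrete, here is a minimal sketch of the GOKU-net idea in Julia (Flux + OrdinaryDiffEq): an encoder infers both the latent initial state and the parameters of a known differential equation, the equation is integrated forward, and a decoder maps the latent trajectory back to observation space. The pendulum dynamics, layer sizes, and names are illustrative assumptions, and the variational sampling and sequence encoders of the actual model are omitted for brevity.

```julia
# Minimal sketch of the GOKU-net idea (illustrative, not the authors' implementation).
using Flux, OrdinaryDiffEq

obs_dim, hidden_dim = 20, 64
latent_dim, n_params = 2, 2                      # e.g. a pendulum: state (θ, ω), params (ω₀², γ)

state_enc = Chain(Dense(obs_dim, hidden_dim, relu), Dense(hidden_dim, latent_dim))
param_enc = Chain(Dense(obs_dim, hidden_dim, relu), Dense(hidden_dim, n_params, softplus))
decoder   = Chain(Dense(latent_dim, hidden_dim, relu), Dense(hidden_dim, obs_dim))

# Known dynamical prior: a damped pendulum with parameters p = (ω₀², γ).
function pendulum!(du, u, p, t)
    du[1] = u[2]
    du[2] = -p[1] * sin(u[1]) - p[2] * u[2]
end

function forward(x, tspan, saveat)
    z0 = state_enc(x)                            # latent initial condition
    p  = param_enc(x)                            # "known unknowns": the ODE parameters
    sol = solve(ODEProblem(pendulum!, z0, tspan, p), Tsit5(); saveat = saveat)
    return map(decoder, sol.u)                   # reconstructed observations at each saved time
end

x̂ = forward(randn(Float32, obs_dim), (0f0, 2f0), 0.1f0)
```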
Leveraging the power of the Julia SciML ecosystem and Flux, we not only implemented GOKU-nets in Julia, broadening the original model's scope to other classes of differential equations, but also introduced two key enhancements: (1) attention mechanisms in the part of the model that infers the parameters of the differential equation, and (2) a novel training strategy based on the multiple shooting technique [7, 8] applied in the latent space.
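As a rough illustration of the second enhancement, the sketch below shows what a multiple-shooting objective in the latent space could look like: the trajectory is split into short segments, each integrated from its own inferred latent initial state, and a continuity penalty ties consecutive segments together. The function is generic over the latent dynamics f!; the names, the quadratic penalty, and the segment handling are assumptions, not the exact formulation used in GOKU-UI.

```julia
using OrdinaryDiffEq

# Sketch of a multiple-shooting loss in the latent space (illustrative).
# f!            : latent dynamics, in-place ODE right-hand side
# z0s[i]        : inferred latent initial state of segment i
# ts_segments[i]: time points of segment i (its last point coincides with
#                 the first point of segment i + 1)
# targets[i]    : latent states the segment should reproduce (latent_dim × length(ts))
function multiple_shooting_loss(f!, p, z0s, ts_segments, targets; λ = 1.0f0)
    recon, continuity = 0f0, 0f0
    prev_end = nothing
    for (z0, ts, target) in zip(z0s, ts_segments, targets)
        prob = ODEProblem(f!, z0, (ts[1], ts[end]), p)
        sol  = solve(prob, Tsit5(); saveat = ts)
        recon += sum(abs2, reduce(hcat, sol.u) .- target)
        if prev_end !== nothing
            continuity += sum(abs2, z0 .- prev_end)   # consecutive segments must join up
        end
        prev_end = sol.u[end]
    end
    return recon + λ * continuity
end
```

Because each segment is short, gradients never have to flow through one long integration, which smooths the loss landscape and stabilizes training [7, 8].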
These modifications led to a significant increase in performance on both reconstruction and forecasting tasks, as demonstrated by our evaluation on simulated and empirical data. Our GOKU-nets with Ubiquitous Inference (GOKU-UI) outperformed all baseline models on synthetic datasets even when trained on a dataset 16 times smaller, underscoring their remarkable data efficiency. Furthermore, when applied to empirical human brain data, with stochastic Stuart-Landau oscillators incorporated into its dynamical core, our proposed enhancements markedly increased the model's effectiveness in capturing complex brain dynamics. GOKU-UI achieved a reconstruction error five times lower than the baselines, and the multiple shooting method reduced the GOKU-nets' prediction error for future brain activity up to 15 seconds ahead. By training GOKU-UI on resting-state fMRI data, we encoded whole-brain dynamics into a latent representation, learning a low-dimensional dynamical system that could offer insights into brain function and open avenues for practical applications such as the classification of mental states or psychiatric conditions. Ultimately, our research provides further impetus for the field of Scientific Machine Learning, showcasing the potential for advancements when established scientific insights are interwoven with modern machine learning.
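For reference, the stochastic Stuart-Landau (Hopf normal form) oscillator mentioned above is commonly written as below; the coupling between brain regions is omitted and the parameter names are generic, so this may differ in detail from the exact formulation used in the paper.

```latex
% Stochastic Stuart--Landau oscillator for node j (generic form; inter-node
% coupling omitted, parameter names illustrative).
\begin{equation}
  \frac{dz_j}{dt} = \left(a_j + i\,\omega_j\right) z_j
                    - \lvert z_j \rvert^{2} z_j + \beta\,\eta_j(t),
  \qquad z_j \in \mathbb{C},
\end{equation}
% a_j: bifurcation parameter, \omega_j: intrinsic frequency,
% \eta_j(t): Gaussian white noise of amplitude \beta.
```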
During this presentation, attendees will develop a comprehensive understanding of both the foundational GOKU-net model and its advanced iteration, GOKU-UI. We will showcase through experimental evidence the performance improvements these advancements bring, while also explaining the rationale and specific details behind the enhancements. We hope that attendees will learn from our experience, enabling them to implement similar strategies to raise the Ki of their own SciML models.
Paper: https://openreview.net/forum?id=uxNfN2PU1W
Code: https://github.com/gabrevaya/TMLR_GOKU-UI
[1] Baker, Nathan, et al. Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence. USDOE Office of Science (SC), Washington, DC (United States), 2019.
[2] Rackauckas, Christopher, et al. "Universal differential equations for scientific machine learning." arXiv preprint arXiv:2001.04385 (2020).
[3] Shen, Chaopeng, et al. "Differentiable modelling to unify machine learning and physical models for geosciences." Nature Reviews Earth & Environment 4.8 (2023): 552-567.
[4] Chen, Ricky TQ, et al. "Neural ordinary differential equations." Advances in neural information processing systems 31 (2018).
[5] Rubanova, Yulia, Ricky TQ Chen, and David K. Duvenaud. "Latent ordinary differential equations for irregularly-sampled time series." Advances in neural information processing systems 32 (2019).
[6] Linial, Ori, et al. "Generative ODE modeling with known unknowns." Proceedings of the Conference on Health, Inference, and Learning. 2021.
[7] Ribeiro, Antônio H., et al. "On the smoothness of nonlinear system identification." Automatica 121 (2020): 109158.
[8] Turan, Evren Mert, and Johannes Jäschke. "Multiple shooting for training neural differential equations on time series." IEEE Control Systems Letters 6 (2021): 1897-1902.
I am a Physics PhD student at the University of Buenos Aires, in constant collaboration with Mila and IBM. My research is centered on Scientific Machine Learning (SciML), specifically on how to effectively incorporate prior knowledge into time series machine learning models through differential equations in order to enhance performance, interpretability, and data efficiency.
I've been applying my models to decode and predict human brain dynamics, but these general methods apply to many other domains, such as wearables, biomarkers, digital twins, climate, and finance, and to other applications involving complex, high-dimensional time series data.
I am currently in the last months of my PhD program and open to interesting job opportunities.