2026-08-12, Room 3
The SINDBAD framework, with its Sindbad.jl, SindbadTEM, OmniTools.jl, TimeSamplers.jl, and ErrorMetrics.jl packages, offers a user‑friendly, Julia‑based system for terrestrial model–data integration. It enables scalable, differentiable experiments across spatial and temporal scales, supporting next‑generation understanding of vegetation–water–carbon interactions.
The Julia ecosystem continues to expand with powerful, composable tools for scientific computing, data analysis, and simulation. In this talk, we introduce the Strategies to Integrate Data and Biogeochemical Models (SINDBAD) model–data‑integration framework, which comprises five complementary packages that together form a lightweight but versatile foundation for the terrestrial ecosystem modeling community. The packages are intentionally designed to be user‑friendly for researchers with deep Earth‑system domain expertise but limited software‑engineering background, while fully leveraging the computational strengths of the Julia programming language. This lowers the barrier to entry for the next generation of Earth‑system modellers.
Sindbad.jl provides the umbrella framework for building and executing terrestrial ecosystem modeling experiments. It emphasizes modularity, reproducibility, and clarity, enabling users to carry out scientific analyses across spatial and temporal scales. SindbadTEM implements the core formulations for major ecosystem processes of the water and carbon cycles, and can be independently integrated into other Earth‑system modeling systems.
OmniTools.jl complements this by offering a curated collection of general‑purpose utilities, ranging from filesystem helpers to data‑structure conveniences, that are useful beyond modeling applications. TimeSamplers.jl implements allocation‑free resampling of N‑dimensional arrays using a date‑indexed view. ErrorMetrics.jl provides a small but robust and extensible set of performance and accuracy metrics commonly used in model–data‑integration approaches.
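To make the roles of temporal resampling and error metrics concrete, here is a minimal, language‑neutral sketch written in Python. It is not the TimeSamplers.jl or ErrorMetrics.jl API: the function names `monthly_means` and `nse` and the synthetic data are assumptions made for illustration, and the grouping shown here allocates freely, unlike the allocation‑free views described above. It aggregates a daily series to monthly means using dates as the grouping index, then scores a simulation against observations with the Nash–Sutcliffe efficiency, a metric widely used in hydrology.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(dates, values):
    """Average values by (year, month), using the dates as the grouping index."""
    groups = defaultdict(list)
    for d, v in zip(dates, values):
        groups[(d.year, d.month)].append(v)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - (squared error) / (variance of obs about its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Synthetic daily data for January to March 2020 (31 + 29 + 31 = 91 days).
start = date(2020, 1, 1)
dates = [start + timedelta(days=i) for i in range(91)]
obs = [float(d.month) for d in dates]   # hypothetical "observations"
sim = [o + 0.5 for o in obs]            # hypothetical "simulation" with a constant bias

obs_m = monthly_means(dates, obs)
sim_m = monthly_means(dates, sim)
score = nse(list(obs_m.values()), list(sim_m.values()))
print(score)
```

A perfect simulation gives a score of 1, and a simulation no better than the observed mean gives 0, which is why metrics of this kind are convenient as cost functions in model–data integration.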
Together, this set of SINDBAD packages enables the construction of modeling experiments that span spatial and temporal scales, especially benefiting from full differentiability enabled by Julia. Traditionally, such models have been limited to narrow ranges of scales; SINDBAD in Julia helps overcome this constraint.
We use this opportunity to demonstrate how SINDBAD can be applied to understand interactions among vegetation, water, and carbon cycle processes across scales. To do so, we create different realizations of the framework with varying levels of process complexity and coupling. These realizations are parameterized with different assumptions that lead to distinct model formulations and responses, each constrained by observational data appropriate to its scale:
- a global‑scale model focused on vegetation’s structural influence on the water cycle;
- a regional‑scale model with physiological coupling of water and carbon cycles, emphasizing interannual variability of vegetation fraction;
- an ecosystem‑scale model with a prognostic carbon cycle that allows additional constraints from Earth‑observation data;
- a hybrid approach that combines machine learning with physically based modeling toward a global parameterization: a neural network predicts the spatial variability of parameters from local ecosystem properties, yielding global parameter fields within an end‑to‑end learning system.
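The end‑to‑end idea behind the hybrid realization can be sketched with a deliberately tiny example, written in Python and not taken from SINDBAD. Everything in it is an assumption for illustration: a single linear unit stands in for the neural network, a one‑parameter runoff coefficient stands in for the process model, and the site data are synthetic. What it shows is the structure of the approach: a learned mapping predicts a model parameter from a local site property, the process model turns that parameter into a simulation, and the mismatch with observations is propagated back through the process model to update the mapping.

```python
def predict_param(w, b, feature):
    """Stand-in for the neural network: one linear unit mapping a site property to a parameter."""
    return w * feature + b


def process_model(param, forcing):
    """Toy process model: runoff as a coefficient times precipitation."""
    return [param * f for f in forcing]


def train(sites, lr=0.1, epochs=2000):
    """Gradient descent on the mean squared error, differentiating through the process model."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        n = 0
        for feature, forcing, obs in sites:
            sim = process_model(predict_param(w, b, feature), forcing)
            for s, o, f in zip(sim, obs, forcing):
                dloss_dparam = 2.0 * (s - o) * f   # chain rule through the process model
                grad_w += dloss_dparam * feature   # chain rule through the linear unit
                grad_b += dloss_dparam
                n += 1
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Synthetic sites: (site property, precipitation forcing, observed runoff).
# Observations are generated with the hypothetical "true" mapping param = 0.5 * property + 0.1.
forcing = [1.0, 2.0, 0.5]
sites = [
    (x, forcing, process_model(0.5 * x + 0.1, forcing))
    for x in (0.2, 0.5, 0.8, 1.0)
]

w, b = train(sites)
print(w, b)
```

In this toy setting the gradients are written out by hand; the appeal of a differentiable framework is that automatic differentiation supplies them for process models far too complex to differentiate manually.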
At the global scale, we find that incorporating observation‑based vegetation indices into a simple hydrological model improves simulations of monthly runoff and terrestrial water storage variations. At the regional scale, using vegetation‑fraction data from geostationary satellites in a photosynthetically coupled water–carbon model significantly improves simulations of gross primary productivity variability. At the ecosystem scale, a model linking C–H₂O fluxes and states, by prognostically coupling primary productivity, transpiration, and root allocation, benefits further from remote‑sensing observations of carbon states, even beyond eddy‑covariance constraints. These examples demonstrate that, with appropriate observational constraints, an across‑scale approach supports hypothesis testing for terrestrial C–H₂O processes.
However, direct comparisons across scales reveal that model–observation discrepancies at a given scale are often quantitatively comparable to differences among observational products themselves. To address this, we implement the hybrid modeling experiment and show that such workflows perform comparably to in‑situ parameter inversions, though their ability to generalize parameters remains limited, in particular for ecosystem processes with sparse observational constraints.
Dr. Sujan Koirala is a Research Project Group Leader at the Max Planck Institute for Biogeochemistry in Jena, where he leads work on data‑driven and process‑based modeling of the global carbon and water cycles. His research spans machine learning, terrestrial ecosystem modeling, climate extremes, groundwater resources, and large‑scale hydrology. He leads the development of the SINDBAD model–data‑integration framework and contributes to major community efforts such as FLUXCOM and ESMValTool. Dr. Koirala holds a PhD in Civil Engineering from The University of Tokyo and has extensive experience in high‑performance computing, scientific software development, and mentoring early‑career researchers.
I am a physicist by training, based at the Max Planck Institute for Biogeochemistry in Jena, Germany, where I study global biogeochemical cycles in the Earth system using remote‑sensing, meteorological, and other data sets.
My first commit to my first Julia package dates back to 2012, and since then I have authored and contributed to packages in the Julia geodata and processing ecosystem, including NetCDF.jl, Zarr.jl, DiskArrays.jl, YAXArrays.jl, EarthDataLab.jl, and others. Some may know me by my GitHub handle, @meggart.
I am a scientist at the Max Planck Institute for Biogeochemistry, advancing Earth system models through hybrid modeling that integrates process-based models with machine learning. Through open, reproducible research and compelling visualizations, I bridge the gap between cutting-edge research and societal impact.