2024-07-10 –, Else (1.3)
Julia's SciML is a large ecosystem of numerical solvers, modeling tools, utility libraries, and much more. What guiding principles make a library "a SciML library"? In this talk we will go into detail about the contribution and code practices of the SciML organization. We will highlight how we've made the interfaces more concrete, along with the continuous benchmarking practices, high-level error messages, and other improvements seen throughout the last few years.
The focus of this talk is not on speed and numerics but on infrastructure: how it is done and why. SciML's contribution format is codified in the SciML Style and COLPRAC, which specify a repeatable format for large-scale, reproducible numerical ecosystems in Julia. We want to share our processes and practices so that other Julia organizations can adopt the policies and procedures we are using to scale.
The first discussion is on documentation. The SciML documentation differs greatly from standard Julia package documentation because it documents hundreds of packages and tries not to silo them but to emphasize how they should be used together. We will discuss some of the high-level design principles involved, how MultiDocumenter.jl is used to piece it all together, some of the major wins seen with this approach, and some of the downsides.
Next we will focus on the testing infrastructure of the SciML organization. The SciML organization might have some of the most expensive and comprehensive testing on GitHub for open-source projects in general, and we will discuss why this is required for detailed studies of numerical solvers. We will discuss the various kinds of tests that are used, including ones not standard in other repos, like downstream testing and downgrade testing, all to build a more stable system. All of this discussion is easily repeatable by audience members by simply going to one of the repos and copying the .github/workflows scripts!
Next we will talk about error messages and interfaces. But Julia doesn't have interfaces? Well, in SciML we have, over the years, built many interface packages to thoroughly document our interfaces and requirements, making it easy to statically check for high-level correctness directly from type definitions. We will describe why ArrayInterface.jl, StaticArrayInterface.jl, and RecursiveArrayTools.jl exist, interesting questions such as "if you have an array A, how do you make the best Jacobian matrix for it?", and the new and improved error messages that SciML throws when an interface incompatibility is detected. You will likely learn a lot about the edge cases of generic programming and how the SciML libraries have been made to support "any wild inputs", but importantly, you will understand how that has evolved from the wild west into a maintainable and testable system with strong guarantees on generic behavior.
Finally, we will discuss the continuous benchmarking practices of the SciML ecosystem. We believe benchmarks are never complete, they are always evolving, and they are not a tool to win arguments but to improve code. Thus what really matters is making benchmarks that evolve over time so they continue to help you learn more about what the right strategies should be. We will describe how the SciML generic interfaces facilitate easier benchmarking and where we are currently lacking.
Dr. Chris Rackauckas is the VP of Modeling and Simulation at JuliaHub, the Director of Scientific Research at Pumas-AI, Co-PI of the Julia Lab at MIT, and the lead developer of the SciML Open Source Software Organization. His work in mechanistic machine learning is credited with the 15,000x acceleration of NASA Launch Services simulations and recently demonstrated a 60x-570x acceleration over Modelica tools in HVAC simulation, earning Chris the US Air Force Artificial Intelligence Accelerator Scientific Excellence Award. See more at https://chrisrackauckas.com/. He is the lead developer of the Pumas project and has received a top presentation award at every ACoP in the last 3 years for improving methods for uncertainty quantification, automated GPU acceleration of nonlinear mixed effects modeling (NLME), and machine learning assisted construction of NLME models with DeepNLME. For these achievements, Chris received the Emerging Scientist award from ISoP.