JuliaCon 2026

Gradients aren't always great -- a case study with MixedModels.jl
2026-08-12, Room 2

The advent of convenient automatic differentiation has made gradient-based optimization the default strategy for many challenging problems and has revolutionized statistical practice.
At the same time, MixedModels.jl uses a gradient-free approach to optimization and remains best in class for linear mixed models.
Using MixedModels.jl as a case study, we will explore the tradeoffs of using the gradient and why gradient-free approaches remain relevant even in a world of easy autodiff.


The developers of MixedModels.jl are often asked why they don't make the package even faster by using gradient-based optimization and GPUs, which have fueled many recent advances in statistics and machine learning.
In this talk, we'll focus on the first aspect, the use of the gradient in the optimization, and use MixedModels.jl as a case study to discuss why gradients, even with modern automatic differentiation, may not provide much benefit and may even be slower than gradient-free approaches.
We'll look at why evaluating the gradient itself can be expensive enough that gradient-based optimization takes dramatically longer per step.
We'll also discuss how a particular objective function can be "compatible" with a particular gradient-free optimizer in a way that yields very rapid convergence, so that the gradient-free approach may not require substantially more iterations than a gradient-based one.
We'll explore these properties with examples from our attempts to take advantage of the gradient in MixedModels.jl.
We'll see why we haven't (yet) moved to gradient-based approaches and, more generally, that gradient-free approaches still have a role to play, even in a world of convenient, accessible autodiff.
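For concreteness, here is a minimal sketch of the kind of gradient-free fit the talk is about, using the sleepstudy dataset bundled with MixedModels.jl (the default derivative-free optimizer, BOBYQA via NLopt, may change across versions):

```julia
using MixedModels

# Fit a linear mixed model; MixedModels.jl minimizes the profiled objective
# with a derivative-free algorithm, so no gradient of the objective is evaluated.
sleepstudy = MixedModels.dataset(:sleepstudy)
model = fit(MixedModel,
            @formula(reaction ~ 1 + days + (1 + days | subj)),
            sleepstudy)

# The optimization summary records which optimizer was used and how many
# objective evaluations convergence required.
model.optsum
```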

Phillip is a neuroscientist and contributor to the MixedModels.jl ecosystem.