2020-07-31, Green Track
The ChainRules project allows package authors to write rules for custom sensitivities (sometimes called custom adjoints) in a way that is not dependent on any particular autodiff (AD) package.
It allows authors of AD packages to access a wealth of prewritten custom sensitivities, saving them the effort of writing them all out themselves.
ChainRules is the successor to DiffRules.jl and is the native rule system currently used by ForwardDiff2 and Zygote, and soon by ReverseDiff.
A perhaps counterintuitive requirement for differentiable programming is that hand-coded derivative rules be easy to write. You might think: “I thought the whole point of differentiable programming was to use AD, so I didn’t have to write all these derivatives by hand.” Indeed you don’t have to, but that doesn’t mean you shouldn't be allowed to, and it doesn’t mean you can't gain advantages from doing so. Custom sensitivities allow programmers to insert domain knowledge that no autodiff system could ever figure out. Furthermore, custom rules let you work around bugs in the AD system and fix performance issues.
So being able to write custom rules is important, and writing each rule once, rather than once per AD system, is a win for deduplicating effort.
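As a minimal sketch of what such a hand-written rule can look like, the block below defines a reverse-mode rule for a toy function. The function `softplus` and the variable names are hypothetical and chosen only for illustration; the `rrule` and `NoTangent` names assume ChainRulesCore 1.x, and earlier releases spelled some of these differently.

```julia
using ChainRulesCore

# Hypothetical function, used only for illustration.
softplus(x::Real) = log1p(exp(x))

# A reverse-mode rule: return the primal result and a pullback closure.
function ChainRulesCore.rrule(::typeof(softplus), x::Real)
    y = softplus(x)
    # Domain knowledge: the derivative of log(1 + exp(x)) is the logistic function.
    # NoTangent() marks that the function object itself has no derivative.
    softplus_pullback(ȳ) = (NoTangent(), ȳ / (1 + exp(-x)))
    return y, softplus_pullback
end

# Any AD package that consumes ChainRules could call the rule like this.
y, pullback = rrule(softplus, 0.0)
_, x̄ = pullback(1.0)
```

Because the rule only depends on ChainRulesCore, the package defining `softplus` does not need to know which AD system will eventually use it.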
A secondary advantage of ChainRules is that it provides a set of differential types to be used by AD systems. The differential types provided by ChainRules are very expressive, more expressive in fact than any current AD system requires. These types allow ChainRules to act as a lingua franca between AD systems. If some property of your system makes it advantageous to differentiate one part with ForwardDiff2 (forward mode), another part with Zygote (source-code-transformation reverse mode), and another with Nabla (overloading, tape-based reverse mode), then you can, and each part can understand the derivative types returned by the others.
The ChainRules project has 3 packages:
- ChainRulesCore.jl: the minimum stuff required to implement custom rules for your package. Think of it like RecipesBase for Plots.jl. It should be used by all packages wanting to support rules.
- ChainRules.jl: a repository of rules for functions defined in Base and the Standard Libraries. This was separated out from ChainRulesCore to minimize load time. It should be used by AD packages wanting to consume rules.
- ChainRulesTestUtils.jl: robust testing utilities based on finite differencing. It's a test-time dependency for packages defining rules.
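The division of labour between the packages can be sketched as follows: a package defines its rules against ChainRulesCore and checks them in its test suite with ChainRulesTestUtils. The function `cube` is hypothetical, and the `test_frule` and `test_rrule` names assume ChainRulesTestUtils 1.x; treat this as an illustration, not the packages' definitive workflow.

```julia
using ChainRulesCore
using ChainRulesTestUtils

# Hypothetical function, used only for illustration.
cube(x::Real) = x^3

# Forward-mode rule: propagate the input perturbation Δx alongside the result.
function ChainRulesCore.frule((_, Δx), ::typeof(cube), x::Real)
    return cube(x), 3x^2 * Δx
end

# Reverse-mode rule: return the result and a pullback for the output sensitivity.
function ChainRulesCore.rrule(::typeof(cube), x::Real)
    cube_pullback(ȳ) = (NoTangent(), 3x^2 * ȳ)
    return cube(x), cube_pullback
end

# Each call compares the hand-written rule against a finite-difference estimate.
test_frule(cube, 1.5)
test_rrule(cube, 1.5)
```

Only the last two lines need ChainRulesTestUtils, which is why it can stay a test-time dependency.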
This talk will cover:
- An introduction to AD, including terminology such as pullback, custom sensitivity etc.
- The details of the use and design of the ChainRules packages
- An explanation of some of the open questions in autodiff and our resolutions to them including: natural vs structural derivatives, mutating reverse-mode AD, chunked AD / change of basis, one-to-one vs many-to-many relationships between differential and primal types.
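For readers new to the terminology in the first item, one common convention for a real-valued function with Jacobian $J$ is sketched below. The symbols $\dot{x}$ and $\bar{y}$ are notation assumed here for illustration; the talk itself may use different symbols or a more general definition.

```latex
\begin{align*}
  y &= f(x), & J &= \left.\frac{\partial f}{\partial x}\right|_{x} \\
  \text{pushforward (forward mode):}\quad \dot{y} &= J\,\dot{x} \\
  \text{pullback (reverse mode):}\quad \bar{x} &= J^{\top}\,\bar{y}
\end{align*}
```

A custom sensitivity is then a hand-written implementation of one of these two maps for a specific $f$.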
I am a research software engineer at Invenia Labs (Cambridge, UK). I help researchers use machine learning, constrained optimization, and, more generally, tools from the technical computing domain to optimize the power grid. I get to do all the best parts of being a software developer and all the best parts of being a researcher; it's great. I am a long-term contributor to the open-source JuliaLang ecosystem. I am passionate about building the tools to do research better.