JuliaCon 2020 (times are in UTC)

Automatic gradient and scale for high dimensional optimization

Optimization and machine learning both require tuning model parameters to fit data. Large-scale problems are typically solved with a variant of gradient descent in which the gradient is computed automatically, but such limited information about the function's behavior slows progress. I will describe the mathematics and code for extracting both the gradient and a "scale" (an upper bound on the diagonal quadratic remainder in a Taylor series expansion) and show how having both quantities available enhances optimization.
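
As a rough illustration of how such a scale bound can be used (my own sketch, not the talk's implementation): suppose an oracle returns the value f(x), the gradient g, and a scale vector s satisfying |f(x + δ) − f(x) − gᵀδ| ≤ (1/2) Σᵢ sᵢ δᵢ². Minimizing the resulting upper-bound model gives the per-coordinate step δᵢ = −gᵢ/sᵢ, which is guaranteed not to increase f. A minimal Julia sketch, where `value_grad_scale` is a hypothetical oracle:

    # Sketch of descent using a per-coordinate scale bound.
    # `value_grad_scale(x)` is assumed to return f(x), the gradient g, and a
    # scale vector s with |f(x + δ) - f(x) - g'δ| <= (1/2) * sum(s .* δ .^ 2).
    function scaled_descent(value_grad_scale, x; iters = 100)
        for _ in 1:iters
            _, g, s = value_grad_scale(x)
            # Minimizing the upper-bound model f(x) + g'δ + (1/2) Σᵢ sᵢ δᵢ²
            # gives δᵢ = -gᵢ/sᵢ, decreasing f by at least (1/2) Σᵢ gᵢ²/sᵢ.
            x = x .- g ./ s
        end
        return x
    end

    # Example: f(x) = Σᵢ cᵢ xᵢ² has exact diagonal curvature, so sᵢ = 2cᵢ
    # and the method reaches the minimizer in a single step.
    c = [1.0, 10.0, 100.0]
    oracle(x) = (sum(c .* x .^ 2), 2 .* c .* x, 2 .* c)
    scaled_descent(oracle, [1.0, 1.0, 1.0])

Because the step minimizes an upper bound on f rather than a local model of it, no step-size search is needed: each iteration is a guaranteed descent step, which is the kind of benefit the abstract alludes to.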