Optimization and machine learning both involve tuning models to fit data. Large-scale problems are typically solved with a variant of gradient descent in which the gradient is computed automatically, but such limited information about the function's behavior slows progress. I will describe the mathematics and code for extracting both the gradient and a "scale" (an upper bound on the diagonal quadratic remainder in a Taylor series expansion), and how having both quantities available enhances optimization.
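To make the idea of a "scale" concrete, here is a minimal illustrative sketch (not the talk's actual implementation; the function name and variables are hypothetical). If `g` is the gradient and `s[i]` is an upper bound on the diagonal quadratic remainder in coordinate `i`, then the bound f(x + δ) ≤ f(x) + g'δ + (1/2)·Σᵢ sᵢ δᵢ² is minimized coordinate-wise by δᵢ = -gᵢ/sᵢ, with a guaranteed decrease of at least (1/2)·Σᵢ gᵢ²/sᵢ:

```julia
# Illustrative sketch only: derive a safe step from the gradient and a
# per-coordinate scale (curvature upper bound).
function scaled_step(g::AbstractVector, s::AbstractVector)
    δ = -g ./ s                                 # coordinate-wise minimizer of the bound
    predicted_decrease = sum(abs2, g ./ sqrt.(s)) / 2   # guaranteed decrease (1/2)Σ gᵢ²/sᵢ
    return δ, predicted_decrease
end

# Example with made-up numbers:
g = [0.4, -1.2]       # gradient
s = [2.0, 8.0]        # scale: per-coordinate remainder bounds
δ, drop = scaled_step(g, s)
```

Unlike a fixed learning rate, such a step adapts automatically to the curvature of each coordinate, which is one way access to the scale can speed up optimization.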

Tim Holy
I am the Alan A. and Edith L. Wolff Professor of Neuroscience at Washington University in St. Louis.
I contribute to Julia and its package ecosystem, particularly in developer tools and image processing.