Constrained Control with Neural Feedback Policies in DiffEqFlux JuliaCon 2021 (times are UTC)

Constrained Control with Neural Feedback Policies in DiffEqFlux

We explore the neural ODE approach to solving constrained nonlinear optimal feedback control problems using DiffEqFlux.
The original controls of the problem definition are substituted by the output of a neural network feedback policy whose parameters become the new control variables.
State constraints are enforced with barrier methods, while control constraints are enforced by the saturation of the last activation of the neural controller.
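
One way to write down the problem summarized above is sketched below. The notation is an assumption made for illustration and is not taken from the talk: x is the state, \pi_\theta the neural feedback policy with weights \theta, \Phi and L the terminal and running costs, h(x) \ge 0 the state constraints, \beta_\delta a barrier function with weight \alpha, and \sigma a bounded final activation mapping into (0, 1).

```latex
\begin{aligned}
\min_{\theta}\quad & J(\theta) = \Phi\bigl(x(t_f)\bigr)
  + \int_{t_0}^{t_f} \Bigl[ L\bigl(x(t), \pi_\theta(x(t))\bigr)
  + \alpha\, \beta_\delta\bigl(h(x(t))\bigr) \Bigr]\, dt \\
\text{s.t.}\quad & \dot{x}(t) = f\bigl(x(t), \pi_\theta(x(t))\bigr), \qquad x(t_0) = x_0, \\
& \pi_\theta(x) = u_{\min} + (u_{\max} - u_{\min})\, \sigma\bigl(\mathrm{NN}_\theta(x)\bigr).
\end{aligned}
```

In this form the only decision variables are the weights \theta, the control bounds hold by construction of \pi_\theta, and the state constraints enter only through the penalty term inside the integral.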

We explore the differentiable control approach implemented in DiffEqFlux for solving general constrained nonlinear optimal control problems with neural network controllers.
The original control variables of the problem are replaced by the output of an embedded neural controller, turning the weights of the neural network into the time-independent optimization variables.
The controller is of closed-loop form: it depends on time only through its feedback dependence on the state at each time point.
Leveraging DiffEqFlux, we use adjoint sensitivity analysis to estimate the gradient of the Bolza-type cost functional with respect to the parameters of the neural network; first-order optimization methods then iteratively approximate an optimal solution.
We emphasize the relation of this problem setting to the origins of the backpropagation algorithm through the Kelley-Bryson gradient procedure, and to control vector iteration, its continuous-time counterpart in the numerical optimal control literature.
The ability of the technique to handle state constraints is explored through relaxed logarithmic barrier functions that act as a running penalty functional, while the control constraints are enforced naturally by the saturation of the final activation function of the neural network policy.
The effectiveness of the technique is showcased with nonlinear control problems in bioprocesses, including setpoint tracking objectives and economic running costs.
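
The two constraint-handling mechanisms described above can be illustrated with a minimal, library-free Python sketch. This is not the DiffEqFlux code from the talk. The function names, the default `delta`, the quadratic form of the relaxation, and the use of `tanh` as the saturating activation are all assumptions chosen for illustration.

```python
import math


def relaxed_log_barrier(z, delta=0.1):
    """Relaxed log barrier for a state constraint written as z >= 0.

    Assumption: quadratic relaxation. For z > delta this is the usual
    -log(z); for z <= delta it switches to a quadratic that matches the
    value and slope of -log(z) at z = delta, so the penalty stays finite
    even when the constraint is violated (z < 0).
    """
    if z > delta:
        return -math.log(z)
    return 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - math.log(delta)


def saturated_control(raw, u_min, u_max):
    """Map an unbounded network output into [u_min, u_max] with tanh.

    Assumption: tanh is the final activation; any bounded activation
    rescaled to the control range would serve the same purpose.
    """
    center = 0.5 * (u_max + u_min)
    half_range = 0.5 * (u_max - u_min)
    return center + half_range * math.tanh(raw)
```

In a setting like the one described, the barrier value would be added to the running cost at each time point, and `raw` would be the last-layer pre-activation of the policy evaluated at the current state.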

Ilya Orson Sandoval

I am a PhD student at Imperial College working at the intersection of reinforcement learning, optimal control and process systems engineering. Previously, I worked in data science and software engineering within the energy and food industries in Mexico. I have a background in physics.
