Abstract
The time-invariant LQ (linear-quadratic) optimal regulator problem is studied from the viewpoint of learning control based on a gradient method, and fast-converging learning is discussed.
A key relation is that several important sensitivity functions of the performance function of the unknown system are given by inner products of response signals. It is also shown that the matrix Riccati difference equation can be rewritten in terms of these sensitivity functions.
A fast-converging learning algorithm that optimizes the state feedback gains for the unknown system is derived on the basis of these relations. Its convergence condition and convergence speed are given by those of the Riccati difference equation.
In a numerical simulation of a discrete-time second-order system, the system is typically stabilized within several steps of the system motion, and the gain is nearly optimized within 20 to 40 steps, even if the original system is unstable.
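For orientation, the recursion whose convergence governs the learning algorithm can be sketched as follows. This is only the standard model-based Riccati difference equation for a system x(k+1) = A x(k) + B u(k) with stage cost x'Qx + u'Ru and feedback u = -K x. It is not the learning algorithm itself, which works on response signals of an unknown system. The matrices A, B, Q, R and the function name are hypothetical choices made for illustration and are not taken from the simulation reported here.

```python
import numpy as np

def riccati_gain_iteration(A, B, Q, R, steps):
    """Iterate the standard Riccati difference equation from P = Q.

    Returns the final P and the list of feedback gains K (u = -K x)
    computed at each step.
    """
    P = Q.copy()
    gains = []
    for _ in range(steps):
        S = R + B.T @ P @ B                      # input-weighted curvature
        K = np.linalg.solve(S, B.T @ P @ A)      # gain for the current P
        P = Q + A.T @ P @ A - A.T @ P @ B @ K    # Riccati difference update
        gains.append(K)
    return P, gains

# Hypothetical second-order plant with one unstable open-loop eigenvalue (1.1).
A = np.array([[1.1, 0.2],
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)            # assumed state weight
R = np.array([[1.0]])    # assumed input weight

P, gains = riccati_gain_iteration(A, B, Q, R, steps=40)
K = gains[-1]

open_loop_radius = max(abs(np.linalg.eigvals(A)))
closed_loop_radius = max(abs(np.linalg.eigvals(A - B @ K)))
print("open-loop spectral radius:  ", open_loop_radius)
print("closed-loop spectral radius:", closed_loop_radius)
```

Comparing the two printed spectral radii shows whether the gain obtained after 40 iterations stabilizes this illustrative plant. The step count is chosen only to echo the range quoted above and says nothing about the learning algorithm's own behavior.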