
Lecture 17

4.4 Time-optimal control problems

Goal: illustrate the use of the Maximum Principle, but also reveal fundamental theoretical features
of these problems (bang-bang controls, connections with Lie brackets).
We will go from more specific to more general.

Example 8 (Soft landing [Sussmann, Handout 4, p. 7]). Bring a car to rest at the origin in minimal
time using bounded acceleration (parking problem).

ẍ = u,   −1 ≤ u ≤ 1

State-space equations:

ẋ1 = x2
ẋ2 = u

where x1(0), x2(0) are given. It is clear that an optimal control exists because:

the system is controllable (well, this is tricky for |u| ≤ 1, but clear here), so we can hit 0;

the control is bounded, so we cannot do the transfer arbitrarily fast.

So we expect that there exists a control u* that achieves the transfer in some optimal time t*.
A more formal discussion of existence comes later.
Let us use the Maximum Principle to narrow down the candidates. L ≡ 1, H = p0 + p1 x2 + p2 u,
p0 = const, and the adjoint equations are

ṗ1 = −Hx1|* = 0
ṗ2 = −Hx2|* = −p1

So p1 ≡ const and p2(t) = −p1 t + p2(0), a linear function of time. By the Hamiltonian
maximization condition:

u*(t) = sgn(p2(t)) =   1   if p2(t) > 0
                      −1   if p2(t) < 0          (4.9)
                       ?   if p2(t) = 0

(when p2 = 0, H doesn't depend on u, and so the Maximum Principle gives no info about u*).
If p2 = 0 at isolated points, no problem. Can p2 be identically 0 over some time interval? Suppose
p2 ≡ 0 on some [t1, t2]. Then ṗ2 = −p1 ≡ 0, so (p1, p2) ≡ 0. But we saw earlier that this is impossible,
because H ≡ 0 (free terminal time) would then imply p0 = 0, and the Maximum Principle says that
p0, p1, p2 cannot all vanish simultaneously. So, (4.9) defines u* uniquely everywhere except where p2
crosses the value 0.
How many such crossings can we have? p2(t) is linear, so at most 1!

Optimal control patterns: u* = 1, then −1 (full acceleration, followed by full braking). Or the other
way around. Or just 1, or just −1. This depends on the initial condition. (Note that if x1(0) < 0 then
we have to go right, and if x1(0) > 0 then we have to go left; this symmetry accounts for double the
number of cases.)
The property that u* only takes the extreme values ±1 is intuitively natural and important. Such
controls are called bang-bang.
Is this enough to uniquely determine the optimal control?
Yes. Let's plot solutions of ẍ = u for u = ±1. For u = 1: ẋ = t + a and x = ½t² + at + b (a, b are
constants). It is easy to see that x = ½ẋ² + c, where c is a constant depending on a, b (c = b − a²/2). This is a
parabola in the (x, ẋ)-plane. Similarly, for u = −1 we have x = −½ẋ² + c.
We need to hit 0, and only two arcs among these two families of trajectories do that. (Can view this as
flowing backward from all points on the switching curve, covering R².) Discuss what the car is doing. Switch
once, at the switching curve.
Figure 4.15: Bang-bang time-optimal control of the double integrator: (a) trajectories for u = 1, (b) trajectories
for u = −1, (c) the switching curve

In this case we have a complete description of x*, and u* is given by a state feedback. (This is piecewise
smooth regular feedback synthesis.)
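To make the synthesis concrete, here is a minimal simulation sketch (our own illustration, not part of the lecture). It assumes the switching function s(x) = x1 + ½ x2|x2|, which vanishes exactly on the switching curve of Figure 4.15(c); the function names, step size, and tolerances below are ours.

import numpy as np

def u_feedback(x1, x2):
    """Bang-bang feedback: s(x) = x1 + 0.5*x2*|x2| vanishes on the switching curve."""
    s = x1 + 0.5 * x2 * abs(x2)
    if abs(s) < 1e-9:              # (numerically) on the switching curve: ride it to the origin
        return -np.sign(x2)
    return -np.sign(s)

def simulate(x0, dt=1e-3, t_max=20.0):
    """Forward-Euler simulation of x1' = x2, x2' = u until the state is near the origin."""
    x = np.array(x0, dtype=float)
    t = 0.0
    while t < t_max and np.linalg.norm(x) > 1e-2:
        u = u_feedback(x[0], x[1])
        x += dt * np.array([x[1], u])
        t += dt
    return t, x

for x0 in [(1.0, 0.0), (-2.0, 1.0), (0.5, -1.5)]:
    t, x = simulate(x0)
    print(f"x0={x0}: |x| ~ {np.linalg.norm(x):.1e} at t ~ {t:.2f}")

With a crude Euler step the trajectory chatters slightly once it reaches the switching curve; that is a discretization artifact, not a property of the optimal feedback.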

For what general classes of systems do we have:

the bang-bang property? Let's see this next;

regular feedback synthesis? This is hard, and we won't go into it ([Sussmann's survey]).

Exercise 14 (due Apr 4) Repeat all the above steps for the modified problem in which we want
x(t*) = 0 but without any constraints on the final velocity (slam into a wall). Still −1 ≤ u ≤ 1, still a
minimal-time problem. Use the Maximum Principle to find the optimal control. Discuss whether it's
bang-bang. Address feedback synthesis (describe the optimal trajectories in the (x, ẋ)-plane).

4.4.1 The bang-bang principle

ẋ = Ax + Bu,   x ∈ Rⁿ,  u ∈ U ⊂ Rᵐ

LTI, but generalization to LTV is straightforward [Knowles].
Control set:
U = {u ∈ Rᵐ : −1 ≤ ui ≤ 1, i = 1, …, m}

The hypercube is a natural generalization of the previous example, and reasonable (independent constraints on
the actuators). One can also consider a more general convex set; the bang-bang principle still holds, but the
formulas for u* are more explicit in the case of the hypercube.
Objective: steer x from a given initial state x0 to a given target state x1 in minimal time.
We assume throughout that there exists some control u that drives x0 to x1 .
This is a kind of controllability assumption (but remember that U is bounded).

Hamiltonian: H = p0 + pᵀ(Ax + Bu).


Adjoint equation: ṗ = −Aᵀp.

The Maximum Principle: u*(t) = arg max_{u∈U} p*ᵀ(t)Bu, i.e.,

⟨p*(t), Bu*(t)⟩ ≥ ⟨p*(t), Bu(t)⟩   ∀t, ∀u(t) ∈ U

or

⟨p*(t), B(u*(t) − u(t))⟩ = Σ_{i=1}^{m} ⟨p*(t), bᵢ (uᵢ*(t) − uᵢ(t))⟩ ≥ 0

where b1, …, bm are the columns of B. So, to maximize H we must have

uᵢ*(t) = sgn(⟨p*(t), bᵢ⟩)   ∀i

(otherwise, we could produce a u(t) by changing u*(t) in the component(s) for which the above formula fails,
and thereby increase H, contradicting the maximization condition). We see that uᵢ* = ±1 almost everywhere,
unless ⟨p*(t), bᵢ⟩ ≡ 0 on some interval of time for some i.
From ṗ* = −Aᵀp* we have p*(t) = e^{−Aᵀt} p*(0) = (e^{−At})ᵀ p*(0). Thus if ⟨p*, bᵢ⟩ ≡ 0, then
⟨p*(0), e^{−At} bᵢ⟩ ≡ 0. Differentiating repeatedly at t = 0 (in fact, if this is 0 on some interval then it is 0
everywhere):

⟨p*(0), bᵢ⟩ = ⟨p*(0), Abᵢ⟩ = ⋯ = ⟨p*(0), A^{n−1} bᵢ⟩ = 0
i.e., p*(0) is orthogonal to bᵢ, Abᵢ, …, A^{n−1}bᵢ. As we know, p*(0) ≠ 0 (see Example 6).
This will not happen if (A, bᵢ) is a controllable pair (controllability with respect to the i-th input
channel).
So, we have the bang-bang property if (A, bᵢ) is controllable ∀i. Such systems are called normal.
Then u* is unique up to a set of measure 0, takes values in the vertices of the hypercube U, and actually has a
finite number of switches (by analyticity).
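A small sketch of this normality test (our own helper names, not from the lecture): check the Kalman rank condition channel by channel. The example matrices are just the double integrator of Example 8.

import numpy as np

def controllability_matrix(A, b):
    """Stack [b, Ab, ..., A^(n-1) b] column by column."""
    cols = [b]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

def is_normal(A, B, tol=1e-10):
    """Normal means (A, b_i) is a controllable pair for every column b_i of B."""
    n = A.shape[0]
    return all(
        np.linalg.matrix_rank(controllability_matrix(A, B[:, i]), tol=tol) == n
        for i in range(B.shape[1])
    )

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])
print(is_normal(A, B))                   # True: time-optimal controls are bang-bang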

4.4.2 Singular optimal controls and Lie brackets

Does the bang-bang principle also hold for nonlinear systems?


Not always.

Example 9

ẋ1 = 1 − x2²
ẋ2 = u                  (4.10)

−1 ≤ u ≤ 1. Go from (0, 0) to (1, 0) in minimal time.


The optimal control is u*(t) ≡ 0.
(The transfer time is 1; any other control would make x2 deviate from 0, which would slow down the
growth of x1.)
Note that 0 is an interior point of the control set [−1, 1], so the bang-bang principle doesn't hold.
We suspect that some function that determines the sign of u* vanishes here; let's take a look.
H = p0 + p1(1 − x2²) + p2 u,  ṗ1 = 0,  ṗ2 = 2p1 x2,  u* = sgn(p2). Now the canonical equations are more
coupled than before.
But the optimal trajectory is contained in the x1-axis ⇒ x2* ≡ 0 ⇒ ṗ2 ≡ 0. So if p2(1) = 0 then
p2 ≡ 0. Fixed endpoint ⇒ no constraints on p*(t*), so this is allowed. Thus the above optimal control
doesn't contradict the Maximum Principle.
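A quick numerical sanity check of this claim (purely illustrative; the helper names are ours): the singular control u ≡ 0 transfers (0, 0) to x1 = 1 in time ≈ 1, while a sample bang-bang control takes longer.

import numpy as np

def transfer_time(u_func, dt=1e-4, t_max=5.0):
    """Integrate system (4.10) from (0, 0) and return the first time x1 reaches 1."""
    x1, x2, t = 0.0, 0.0, 0.0
    while x1 < 1.0 and t < t_max:
        u = u_func(t)
        x1 += dt * (1.0 - x2**2)
        x2 += dt * u
        t += dt
    return t

print(transfer_time(lambda t: 0.0))                        # ~ 1.00 (the singular control)
print(transfer_time(lambda t: 1.0 if t < 0.5 else -1.0))   # ~ 1.08 (a bang-bang control, slower)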

Optimal controls that are not bang-bang are called singular. Here we have a singular arc.

Lecture 18

More generally: consider the affine control system

ẋ = f(x) + Σ_{i=1}^{m} gᵢ(x)uᵢ,   x ∈ Rⁿ,  −1 ≤ uᵢ ≤ 1,  i = 1, …, m

or, in matrix form,

ẋ = f(x) + G(x)u,   u ∈ Rᵐ

with cost

J(u) = ∫_{t0}^{t1} L(x) dt

Hamiltonian:

H = p0 L + ⟨p, f + Σᵢ gᵢ uᵢ⟩

The Maximum Principle:

uᵢ* = sgn(⟨p*(t), gᵢ(x*(t))⟩)

Call the above inner product φᵢ(t), the switching function (always evaluated along the optimal trajectory
x*(t), p*(t)).
To investigate the bang-bang property, we need to study the zeros of the functions φᵢ(t).
To simplify calculations, let's assume that m = 1 (only one u, one φ) and L ≡ 1 (the time-optimal
problem as before). Then

φ(t) = ⟨p*(t), g(x*(t))⟩

ẋ* = f + gu*

ṗ* = −Hx|* = −(fx)ᵀ|* p* − (gx)ᵀ|* p* u*

We'll omit the |* from now on; it is understood.

Let's compute the derivative of φ(t):

φ̇ = ⟨ṗ*, g⟩ + ⟨p*, gx (f + gu*)⟩
  = −⟨(fx)ᵀp*, g⟩ − ⟨(gx)ᵀp*, g⟩u* + ⟨p*, gx f⟩ + ⟨p*, gx g⟩u*
  = ⟨p*, gx f − fx g⟩

(the two terms involving u* cancel, since ⟨(gx)ᵀp*, g⟩ = ⟨p*, gx g⟩).

gx f − fx g is a new vector field; interesting, what is it?

Example 10 Let f and g be linear vector fields: f(x) = Ax, g(x) = Bx. This corresponds to a bilinear
(not linear!) system ẋ = Ax + Bxu. We have fx = A, gx = B, and

gx f − fx g = BAx − ABx = (BA − AB)x = [B, A]x

[B, A] is the commutator, or Lie bracket, of the matrices B and A.

A Lie bracket of general vector fields is defined as

[f, g](x) := gx(x)f(x) − fx(x)g(x)

(here gx and fx are the Jacobian matrices).

Note the sign difference.
Meaning of Lie bracket (justification of the term commutator):
Figure 4.16: Geometric interpretation of the Lie bracket: flowing from x0 along f, then g, then −f, then −g,
each for time ε, ends at a point x(4ε) whose gap from x0 is ε²[f, g](x0) to leading order

In particular, if the flows of f and g commute (i.e., we come back to x0), then [f, g] = 0.
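For readers who want to experiment, here is a small symbolic sketch of the bracket computation (our own helper, written with sympy, not part of the lecture); it checks the bilinear case of Example 10 against the matrix commutator.

import sympy as sp

x1, x2 = sp.symbols("x1 x2")
x = sp.Matrix([x1, x2])

def lie_bracket(f, g):
    # [f, g](x) := g_x(x) f(x) - f_x(x) g(x), with f_x, g_x the Jacobians
    return g.jacobian(x) * f - f.jacobian(x) * g

# Example 10: f = Ax, g = Bx gives [f, g](x) = (BA - AB)x = [B, A]x.
A = sp.Matrix([[0, 1], [0, 0]])
B = sp.Matrix([[0, 0], [1, 0]])
f, g = A * x, B * x
print(lie_bracket(f, g) - (B * A - A * B) * x)   # prints the zero vector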


The equation

φ̇(t) = ⟨p*(t), [f, g](x*(t))⟩

reveals a fundamental relation between Lie brackets and optimal control (read Sussmann). Recall:

φ(t) = ⟨p*(t), g(x*(t))⟩

So to have a singular optimal trajectory, we must have

φ ≡ 0  ⇔  p*(t) ⊥ g(x*(t))

and

φ̇ ≡ 0  ⇔  p*(t) ⊥ [f, g](x*(t))

In the multiple-input case, φ̇ also has terms involving [gᵢ, gⱼ].


But higher derivatives of φ must also vanish.
What is φ̈? (Ask them to guess.)
Rather than differentiate again, let's examine our derivation of φ̇. For any function h(x), we can
show by the same calculation that

d/dt ⟨p*, h(x*)⟩ = ⟨p*, [f, h](x*)⟩ + ⟨p*, [g, h](x*)⟩u*

We had h = g, so [g, h] was 0. This time we have h = [f, g], so

φ̈ = ⟨p*, [f, [f, g]](x*)⟩ + ⟨p*, [g, [f, g]](x*)⟩u*

(iterated Lie brackets!) If φ ≡ 0, then ⟨p*, g(x*)⟩ = 0, ⟨p*, [f, g](x*)⟩ = 0, ⟨p*, [f, [f, g]](x*)⟩ +
⟨p*, [g, [f, g]](x*)⟩u* = 0, and so on.

Recall that p*(t) ≠ 0 for free-time problems with L ≠ 0.
We see that for n = 2, we can rule out singularity if g and [f, g] are linearly independent along the
optimal trajectory. If this is not the case, then the first two equations can hold, and then

u* = − ⟨p*, [f, [f, g]](x*)⟩ / ⟨p*, [g, [f, g]](x*)⟩

is potentially a singular optimal control. However, it must satisfy the control constraints. If the constraint
is |u| ≤ 1, and if we assume, e.g., that

[g, [f, g]](x) = α(x)g(x) + β(x)[f, g](x) + γ(x)[f, [f, g]](x)   ∀x

where |γ(x)| < 1, then the above u* would not be admissible (on a singular arc p* annihilates the first two
terms, so u* = −1/γ(x), whose magnitude exceeds 1).


But then we have to consider the case where the third-order brackets vanish too and look at the third
derivative of φ, and so on. This is the game one plays to derive conditions guaranteeing the bang-bang
property [Sussmann].
Do Lie brackets explain all results we derived earlier (before discussing Lie brackets)?

Linear systems:
f = Ax, g = b (single input). Then [f, g] = −Ab, [f, [f, g]] = A²b, [g, [f, g]] = 0, [f, [f, [f, g]]] = −A³b, … If
rank(b, Ab, …, A^{n−1}b) = n then φ cannot be identically 0, so controllability comes out and confirms our
earlier result.

Singular example:
System (4.10):

f(x) = (1 − x2², 0)ᵀ,   g(x) = (0, 1)ᵀ

fx = [0  −2x2; 0  0],   so   [f, g] = gx f − fx g = −fx g = (2x2, 0)ᵀ

We see that on the x1-axis (x2 = 0), g and [f, g] are not linearly independent! So we need to look at
higher-order brackets.
Next, let's calculate [f, [f, g]]. (We didn't do this in class.)

[f, g]x = [0  2; 0  0]

[f, [f, g]] = [f, g]x f − fx [f, g] = (0, 0)ᵀ

If u* ≡ 0, then the formula for φ̈ implies φ̈ ≡ 0. All subsequent brackets are 0, so we are done:
φ is indeed identically 0.
So, Lie brackets indeed contain all information about singularities.
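These bracket computations for system (4.10) can be checked symbolically (again our own sketch, using sympy):

import sympy as sp

x1, x2 = sp.symbols("x1 x2")
x = sp.Matrix([x1, x2])

def bracket(f, g):
    # [f, g] = g_x f - f_x g, as defined above
    return g.jacobian(x) * f - f.jacobian(x) * g

f = sp.Matrix([1 - x2**2, 0])   # system (4.10)
g = sp.Matrix([0, 1])

fg = bracket(f, g)
print(fg.T)                     # [2*x2, 0]: vanishes on the x1-axis, as claimed
print(bracket(f, fg).T)         # [0, 0]:   [f, [f, g]] is identically zero
print(bracket(g, fg).T)         # [2, 0]:   enters the formula for the second derivative of phi multiplied by u* = 0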

Singular controls are not necessarily complex: u* ≡ 0 in the example is quite simple.
For time-optimal problems in the plane, for ẋ = f + gu with −1 ≤ u ≤ 1, one can show that all optimal
trajectories are concatenations of a finite number of bang trajectories (u = 1 or u = −1) and nice
(analytic) singular arcs.

Is this true in general (for n > 2, or for not necessarily time-optimal problems, even in R²)?
Example: Fuller's problem.
Studied by Fuller and others since 1960 [Jurdjevic], [Ryan, IJC, 1987], [Sussmann's survey].

ẋ1 = x2
ẋ2 = u

As before, −1 ≤ u ≤ 1. Goal: bring x from a given x0 to the origin (0, 0) and minimize the cost

J = ∫_{t0}^{tf} x1²(t) dt,   tf free

(tf can also be fixed and large enough [Jurdjevic]; this doesn't really matter, since we can rest at 0 once we get
there).
(If we had J = ∫ x2² dt then we would have no Fuller phenomenon [Jurdjevic].)

H = p0 x1² + p1 x2 + p2 u

The Maximum Principle: u* = sgn(p2), as before.

Adjoint equations (normalizing p0 to −1):

ṗ1 = 2x1
ṗ2 = −p1

so things are more complicated than before (e.g., than in the soft landing example), because x and p are
coupled.
To have a singular trajectory we must have

p2 ≡ 0 ⇒ p1 ≡ 0 ⇒ x1 ≡ 0 ⇒ x2 ≡ 0

(trivial).
Let's then look at possible bang-bang trajectories: u = ±1. Integrating (say for u = 1): x2 = t + a, x1 =
½t² + at + b,

p1 = (1/3)t³ + at² + 2bt + c
p2 = −(1/12)t⁴ − (1/3)at³ − bt² − ct − d

where a, b, c, d are constants. Switching takes place on the surface {p2 = 0}.
The above formulas imply the following interesting fact: there is no solution that goes from this
switching surface to (x1, x2, p1, p2) = (0, 0, 0, 0) in any time t > 0.
So, bang-bang controls are also excluded. And if we are on {p2 = 0}, then to get a singular trajectory
we must first hit (0, 0, 0, 0), and no finite concatenation of constant controls can achieve this!
So what actually happens?
The only answer is that we have an infinite number of switches, with an accumulation point. This
is called Zeno behavior.

Figure 4.17: Optimal trajectory for Fuller's problem, shown in the (x1, x2)-plane together with the switching curve

An infinite number of switches in finite time! The switching intervals decrease in geometric progression. It
turns out that such a switching strategy can reach the origin in finite time, if the switchings occur on the
right curve. Switching curve: {x1 + β|x2|x2 = 0}, with β ≈ 0.445.
Note that this behavior is somewhat related to the time-optimal problem for the same double integrator,
but the nature of the switching is drastically different (there, we had β = 1/2).
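A rough simulation sketch of this synthesis (our own construction, using the approximate value β ≈ 0.445 from above): applying the feedback u = −sgn(x1 + β x2|x2|) to the double integrator and recording the switch times shows the intervals between switches shrinking by a roughly constant ratio.

import numpy as np

BETA = 0.445   # approximate Fuller constant quoted above

def fuller_u(x1, x2):
    """Bang-bang feedback across the curve {x1 + beta*x2*|x2| = 0}."""
    return -np.sign(x1 + BETA * x2 * abs(x2))

def switch_times(x0=(1.0, 0.0), dt=1e-5, t_max=4.0):
    x1, x2 = x0
    t, u_prev, switches = 0.0, fuller_u(*x0), []
    while t < t_max:
        u = fuller_u(x1, x2)
        if u != u_prev and u != 0:
            switches.append(t)
            u_prev = u
        x1, x2 = x1 + dt * x2, x2 + dt * u   # double integrator, forward Euler
        t += dt
    return np.array(switches)

gaps = np.diff(switch_times())
print(np.round(gaps[:6], 4))                # switching intervals...
print(np.round(gaps[1:6] / gaps[:5], 3))    # ...shrink by a roughly constant ratio

Only the first few ratios are meaningful: later intervals are limited by the integration step, and near the origin the simulation just chatters.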
Relation of Fuller's problem to time-optimal problems:

Consider the parameterized class of problems ẍ = u, J = ∫_{t0}^{tf} |x(t)|^α dt, where α is a nonnegative real
number. The goal is the same: transfer x0 to 0 and minimize J. For α = 0 we recover the time-optimal case.
For α = 2 we recover Fuller's problem. It turns out that there exists a value α* ≈ 0.35 (a bifurcation
value) such that:

for α ∈ [0, α*] the optimal control is bang-bang with at most 1 discontinuity (switch);
for α > α* we have Zeno behavior (Fuller's phenomenon).

Exercise 15 (due Apr 4) Give an example of a system of the form

ẋ = f(x) + g(x)u,   x ∈ Rⁿ, n > 2,  −1 ≤ u ≤ 1

such that the problem of transferring a given initial state x0 to 0 in minimal time has a solution u* which
involves infinitely many switchings between u = 1 and u = −1. Hint: base this on Fuller's example.

This shows that Theorem 8.4.1 from Sussmann's survey cannot be extended to R³.
Fuller's phenomenon is observed in other problems too, e.g., Dubins' car [Agrachev-Sachkov].

4.5 Existence of optimal controls

Let's now examine an issue that we've been dodging for a while.
Perron's paradox:
Reference: [Young]

Let n be the largest positive integer.
Claim: n = 1.
Proof. Otherwise, n² > n, a contradiction.
This is silly, of course, because the largest positive integer doesn't exist. But what does this paradox
mean in our context?
Finding the largest positive integer is an optimization problem. The above argument says: n is
optimal ⇒ n = 1. I.e., n = 1 is a necessary condition for optimality! (So is the Maximum Principle.)
Thus a necessary condition can be useless, even misleading, unless we know the solution exists.
Assume we have some optimal control problem, with system ẋ = f(x, u), cost J(u), and target set
S ⊂ R × Rⁿ. How can we ensure the existence of an optimal control? First, we must obviously assume
that there exists at least one control u that drives x from x0 at t0 to the target set; otherwise the
problem is ill-posed.
This is a kind of controllability assumption (nontrivial to check, especially for bounded U), but we
usually tacitly assume it.
Is this enough? No!

Example 11 ẋ = u, x, u ∈ R; problem: transfer x from x0 = 0 to x1 = 1 in minimal time.


What is u*? It doesn't exist: t* = 0 is impossible, but any t > 0 can be achieved.

In the above example, the problem was that the control set U was unbounded, which led to arbitrarily
fast transfer.
Would boundedness of U be enough to fix it?

Example 12 Same system, same goal, but u ∈ U := [0, 1).


What is t*? t* = 1 cannot be achieved, but we can transfer from 0 to 1 in any time t > 1; again, no
optimal solution.

So if U is not closed we still have a problem. We are beginning to guess that we need both closedness and
boundedness (compactness).
But note that existence of optimal controls has to do not just with U, but with the state trajectories
produced by controls in U. It turns out that this issue is closely related to reachable sets.

Lecture 19

Denote by Rt(x0) the set of states reachable from x0 using all admissible controls u (t is an arbitrary fixed
time). The solution of an optimal control problem is related to reachable sets; we saw this in the proof
of the Maximum Principle: y*(t*) must be on the boundary of Rt*(y0), because if it were in the interior
then we could decrease the cost.

(Though in the proof we didn't work with Rt* directly, but with infinitesimal directions, the
reason being that Rt* can be complicated.)
And this boundary point must belong to Rt*! (More accurate phrasing: it should be a boundary
point of Rt* that actually belongs to Rt*.)
Or consider a time-optimal problem:

Figure 4.18: Propagation of the reachable sets Rt(x0) from x0 toward xf



xf is a boundary point of Rt*(x0); otherwise we could reach it sooner. (See the web handout on
the bang-bang principle.)
We can interpret the above two examples in terms of this property:

In Example 11, Rt(x0) is the entire line R (not bounded) ∀t, so x = 1 couldn't be on the boundary for
any t.

In Example 12, R1(x0) = [0, 1), and x = 1 is outside it (the boundary point is not included in R1(x0)), so we
couldn't reach it. And for any t > 1, x = 1 is already in the interior of Rt(x0). The issue is that R1(x0) is not
closed.

(One can generalize this discussion to target sets instead of fixed terminal points, even moving target
sets.)
So, a refined guess is that for existence of optimal controls, we should have compactness of the
reachable set (rather than of the control set). As we said, reachable sets can be quite complicated, but there
exists a general result:

Theorem 8 (Filippov) Given ẋ = f(t, x, u), assume:

on a given interval [t0, tf], solutions exist ∀u;

∀(t, x), the set {f(t, x, u) : u ∈ U} is compact and convex.

Then Rt(x0) is compact ∀t ∈ [t0, tf].

Note: the 1st condition is not implied by the 2nd. Example: ẋ = x² + u.
We give no proof but make some comments:

Filippov's theorem actually makes a stronger statement (from which compactness of Rt follows):
the set of all trajectories of the system, as a subset of C([t0, tf], Rⁿ), is compact in the topology induced
by the sup norm ‖·‖₀, i.e., the topology of uniform convergence. (Compact ⇒ closed and bounded ⇒ for
each t its slice is also closed and bounded; not hard to prove. Or use sequential compactness.)

Convexity is not necessary to have boundedness of Rt (though it's not easy to prove boundedness
without convexity [Lin-Sontag-Wang]), but convexity is crucial for closedness of Rt. The argument
showing closedness relies on the separation property of convex sets ([Bressan's online lecture notes],
on the class website).
Begin optional:
We can see the role of the convexity assumption from the following example:

ẋ = u,   u ∈ {−1, 1}

The trajectory x ≡ 0 is the uniform limit of a sequence of system trajectories (switching between ±1 faster
and faster), but it is not a valid trajectory of the system ⇒ the set of trajectories is not closed.
Here the sets Rt(x0) are actually compact, but we can build a better example:

ẋ1 = u
ẋ2 = x1²

u ∈ {−1, 1}, x0 = (0, 0). Using the above controls, we can reach points arbitrarily close to (0, 0),
of the form (0, ε), but (0, 0) ∉ Rt because ε > 0 always (x1 cannot remain at 0).
If we convexify the problem by letting U := co{−1, 1} = [−1, 1], then we get a larger reachable
set which will now be closed by the above theorem.
In fact, this larger reachable set will be exactly the closure of the original one, i.e., the original
one is dense in the new one.
This is a special case of the general relaxation theorem about the density of the set of solutions of the
original system in the set of solutions of the convexified (relaxed) system.
For affine systems, one can see Filippov's Theorem more directly: the set of admissible controls with values
in U, as a subset of L∞[0, T], is weakly compact, and the map u ↦ x(T) is continuous from this weak topology
to the norm topology on states, so Rt(x0) is compact as the image of a compact set [Sontag].
End optional

Filippov's Theorem applies to useful classes of systems:

Nonlinear systems affine in controls

ẋ = f(x) + G(x)u

with u ∈ U compact and convex. For each x, the set {f + Gu : u ∈ U} is the image of U under an affine
map, so it is compact and convex. But we need to assume existence of solutions on the time interval
we work on! ẋ = x² + u, u ∈ [−1, 1], blows up in finite time!

Linear systems

ẋ = Ax + Bu

(a special case; it can be LTV too).
In fact, in this case Rt(x0) is also convex, which is easy to show using the variation-of-constants
formula

x(t) = e^{At} x0 + ∫₀ᵗ e^{A(t−s)} B u(s) ds

(it preserves convex combinations).
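A minimal numerical illustration of this convexity (our own sketch, with made-up names): since the discretized endpoint map is affine in the control, the endpoint of a convex combination of two controls equals the same convex combination of their endpoints.

import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator again
B = np.array([[0.0], [1.0]])
x0 = np.array([1.0, -1.0])

def endpoint(u_vals, T=1.0):
    """Integrate x' = Ax + Bu with a piecewise-constant control (forward Euler)."""
    dt = T / len(u_vals)
    x = x0.copy()
    for u in u_vals:
        x = x + dt * (A @ x + B.flatten() * u)
    return x

rng = np.random.default_rng(0)
u1 = rng.uniform(-1, 1, 200)
u2 = rng.uniform(-1, 1, 200)
lam = 0.3
lhs = endpoint(lam * u1 + (1 - lam) * u2)            # endpoint of the mixed control
rhs = lam * endpoint(u1) + (1 - lam) * endpoint(u2)  # mix of the endpoints
print(np.allclose(lhs, rhs))                         # True: R_t(x0) is convex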

OK, how can we use Filippov's Theorem to show existence of optimal controls?
The simplest case is the Mayer problem (terminal cost only). Assume that:

ẋ = f(x, u) satisfies the conditions of Filippov's Theorem (always true for linear systems and
forward-complete affine nonlinear systems);

the cost is J = K(x(tf)), where K is a continuous function and tf is fixed.

Then the problem admits an optimal solution.

Reason: the Weierstrass Theorem!
K is a continuous function on the compact set Rtf(x0), so it has a global minimum (and maximum).
For more general problems, with free terminal time and running cost, we need to work harder (we know
that the set of trajectories is compact but we don't know whether J is a continuous functional on it). However,
there exist useful classes of problems where compactness (actually, below we'll just use closedness) of
reachable sets implies existence of optimal controls. Let's see this for linear time-optimal problems:

ẋ = Ax + Bu

with u ∈ U compact and convex, x0, xf given; problem: transfer x0 to xf in minimal time.

Theorem 9 Assume xf ∈ Rt(x0) for some t. Then there exists a time-optimal control.

Proof. [Sontag], [Knowles] (same argument in both). By assumption, the set {t : xf ∈ Rt(x0)} is
nonempty and bounded from below by 0 (actually it is bounded away from 0 provided xf ≠ x0). So,
t* := inf{t : xf ∈ Rt(x0)} is well defined.

We'll be done if we show that xf ∈ Rt*(x0), i.e., that t* is actually a minimum. This will mean
that there exists a control u* (bang-bang, by the way, if U is a cube or another convex polyhedron) that
transfers x0 to xf in time t*, and by the definition of t* no control does it faster.
(Recall Example 12 at the start of the lecture: we could reach the target in any time t > 1 but couldn't
reach it in t = 1; here this doesn't happen.)
By the definition of infimum, there exists a sequence tk ↘ t* such that xf ∈ Rtk(x0) ∀k. For each k,
xf ∈ Rtk(x0) is achieved by some control uk, so we have

xf = e^{A tk} x0 + ∫₀^{tk} e^{A(tk−s)} B uk(s) ds
We want to use the closedness of reachable sets, but we can't work with Rtk for different k, so let's truncate:

x_f^k := e^{A t*} x0 + ∫₀^{t*} e^{A(t*−s)} B uk(s) ds

These are different points now, obtained by uk acting on the truncated interval [0, t*], but they all
belong to the same set Rt*(x0).
Claim: x_f^k → xf as k → ∞ (with respect to the Euclidean norm on Rⁿ).

In view of the claim and the fact that Rt*(x0) is closed, we have xf ∈ Rt*(x0), as needed.

Exercise 16 (due Apr 11) Prove the claim.

Existence results such as the above theorem are not constructive, but they justify the application of
the Maximum Principle, which, as we have seen, often allows one to actually find the optimal control.
