Skip to content

Commit 522a8d7

Browse files
committed
add documentation for discrete time optimal control problems
1 parent 33fafa1 commit 522a8d7

File tree

2 files changed

+52
-14
lines changed

2 files changed

+52
-14
lines changed

control/optimal.py

Lines changed: 45 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -66,7 +66,7 @@ class OptimalControlProblem():
6666
`(fun, lb, ub)`. The constraints will be applied at each time point
6767
along the trajectory.
6868
terminal_cost : callable, optional
69-
Function that returns the terminal cost given the current state
69+
Function that returns the terminal cost given the final state
7070
and input. Called as terminal_cost(x, u).
7171
trajectory_method : string, optional
7272
Method to use for carrying out the optimization. Currently supported
@@ -287,12 +287,16 @@ def __init__(
287287
# time point and we use a trapezoidal approximation to compute the
288288
# integral cost, then add on the terminal cost.
289289
#
290-
# For shooting methods, given the input U = [u[0], ... u[N]] we need to
290+
# For shooting methods, given the input U = [u[t_0], ... u[t_N]] we need to
291291
# compute the cost of the trajectory generated by that input. This
292292
# means we have to simulate the system to get the state trajectory X =
293-
# [x[0], ..., x[N]] and then compute the cost at each point:
293+
# [x[t_0], ..., x[t_N]] and then compute the cost at each point:
294294
#
295-
# cost = sum_k integral_cost(x[k], u[k]) + terminal_cost(x[N], u[N])
295+
# cost = sum_k integral_cost(x[t_k], u[t_k])
296+
# + terminal_cost(x[t_N], u[t_N])
297+
#
298+
# The actual calculation is a bit more complex: for continuous time
299+
# systems, we use a trapezoidal approximation for the integral cost.
296300
#
297301
# The initial state used for generating the simulation is stored in the
298302
# class parameter `x` prior to calling the optimization algorithm.
@@ -325,8 +329,8 @@ def _cost_function(self, coeffs):
325329
# Sum the integral cost over the time (second) indices
326330
# cost += self.integral_cost(states[:,i], inputs[:,i])
327331
cost = sum(map(
328-
self.integral_cost, np.transpose(states[:, :-1]),
329-
np.transpose(inputs[:, :-1])))
332+
self.integral_cost, states[:, :-1].transpose(),
333+
inputs[:, :-1].transpose()))
330334

331335
# Terminal cost
332336
if self.terminal_cost is not None:
@@ -954,7 +958,22 @@ def solve_ocp(
954958
transpose=None, return_states=True, print_summary=True, log=False,
955959
**kwargs):
956960

957-
"""Compute the solution to an optimal control problem
961+
"""Compute the solution to an optimal control problem.
962+
963+
The optimal trajectory (states and inputs) is computed so as to
964+
approximately mimimize a cost function of the following form (for
965+
continuous time systems):
966+
967+
J(x(.), u(.)) = \int_0^T L(x(t), u(t)) dt + V(x(T)),
968+
969+
where T is the time horizon.
970+
971+
Discrete time systems use a similar formulation, with the integral
972+
replaced by a sum:
973+
974+
J(x[.], u[.]) = \sum_0^{N-1} L(x_k, u_k) + V(x_N),
975+
976+
where N is the time horizon (corresponding to timepts[-1]).
958977
959978
Parameters
960979
----------
@@ -968,7 +987,7 @@ def solve_ocp(
968987
Initial condition (default = 0).
969988
970989
cost : callable
971-
Function that returns the integral cost given the current state
990+
Function that returns the integral cost (L) given the current state
972991
and input. Called as `cost(x, u)`.
973992
974993
trajectory_constraints : list of tuples, optional
@@ -990,8 +1009,10 @@ def solve_ocp(
9901009
The constraints are applied at each time point along the trajectory.
9911010
9921011
terminal_cost : callable, optional
993-
Function that returns the terminal cost given the current state
994-
and input. Called as terminal_cost(x, u).
1012+
Function that returns the terminal cost (V) given the final state
1013+
and input. Called as terminal_cost(x, u). (For compatibility with
1014+
the form of the cost function, u is passed even though it is often
1015+
not part of the terminal cost.)
9951016
9961017
terminal_constraints : list of tuples, optional
9971018
List of constraints that should hold at the end of the trajectory.
@@ -1044,9 +1065,19 @@ def solve_ocp(
10441065
10451066
Notes
10461067
-----
1047-
Additional keyword parameters can be used to fine-tune the behavior of
1048-
the underlying optimization and integration functions. See
1049-
:func:`OptimalControlProblem` for more information.
1068+
1. For discrete time systems, the final value of the timepts vector
1069+
specifies the final time t_N, and the trajectory cost is computed
1070+
from time t_0 to t_{N-1}. Note that the input u_N does not affect
1071+
the state x_N and so it should always be returned as 0. Further, if
1072+
neither a terminal cost nor a terminal constraint is given, then the
1073+
input at time point t_{N-1} does not affect the cost function and
1074+
hence u_{N-1} will also be returned as zero. If you want the
1075+
trajectory cost to include state costs at time t_{N}, then you can
1076+
set `terminal_cost` to be the same function as `cost`.
1077+
1078+
2. Additional keyword parameters can be used to fine-tune the behavior
1079+
of the underlying optimization and integration functions. See
1080+
:func:`OptimalControlProblem` for more information.
10501081
10511082
"""
10521083
# Process keyword arguments
@@ -1116,7 +1147,7 @@ def create_mpc_iosystem(
11161147
See :func:`~control.optimal.solve_ocp` for more details.
11171148
11181149
terminal_cost : callable, optional
1119-
Function that returns the terminal cost given the current state
1150+
Function that returns the terminal cost given the final state
11201151
and input. Called as terminal_cost(x, u).
11211152
11221153
terminal_constraints : list of tuples, optional

doc/optimal.rst

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -65,6 +65,13 @@ can be on the input, the state, or combinations of input and state,
6565
depending on the form of :math:`g_i`. Furthermore, these constraints are
6666
intended to hold at all instants in time along the trajectory.
6767

68+
For a discrete time system, the same basic formulation applies except
69+
that the cost function is given by
70+
71+
.. math::
72+
73+
J(x, u) = \sum_{k=0}^{N-1} L(x_k, u_k)\, dt + V \bigl( x_N \bigr).
74+
6875
A common use of optimization-based control techniques is the implementation
6976
of model predictive control (also called receding horizon control). In
7077
model predictive control, a finite horizon optimal control problem is solved,

0 commit comments

Comments
 (0)