Arxiv Mocp

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/383120089
Multi-Phase Optimal Control Problems for Efﬁcient Nonlinear Model

Predictive Control with acados
Preprint · August 2024

DOI: 10.22541/au.172448762.20363040/v1
CITATIONS READS
0 89
4 authors, including:
Jonathan Frey Gianluca Frison

University of Freiburg University of Freiburg
39 PUBLICATIONS 479 CITATIONS 54 PUBLICATIONS 1,501 CITATIONS
SEE PROFILE SEE PROFILE
Moritz Diehl
University of Freiburg
578 PUBLICATIONS 22,386 CITATIONS
SEE PROFILE
All content following this page was uploaded by Jonathan Frey on 20 August 2024.
The user has requested enhancement of the downloaded file.

Abstract
Computationally efficient nonlinear model predictive control relies on elaborate discrete-time op-
timal control problem (OCP) formulations trading off accuracy with respect to the continuous-time
problem and associated computational burden. Such formulations, however, are in general not easy
to implement within specialized software frameworks tailored to numerical optimal control. This
paper introduces a new multi-phase OCP interface for the open-source software acados allowing to
conveniently formulate such problems and generate fast solvers that can be used for nonlinear model
predictive control (NMPC). While multi-phase OCP (MOCP) formulations occur naturally in many
applications, this work focuses on MOCP formulations that can be used to efficiently approximate
standard continuous-time OCPs in the context of NMPC. To this end, the paper discusses advanced
control parametrizations, such as closed-loop costing and piecewise polynomials with varying degree,
as well as partial tightening and formulations that leverage models of different fidelity. An introduc-
tory example is presented to showcase the usability of the new interface. Finally, three numerical
experiments demonstrate that NMPC controllers based on multi-phase formulations can efficiently
trade-off computation time and control performance. 1
1
Multi-Phase Optimal Control Problems for Efficient
Nonlinear Model Predictive Control with acados
Jonathan Frey1,2 , Katrin Baumgärtner1 , Gianluca Frison1 , and Moritz Diehl1,2
1
Department of Microsystems Engineering (IMTEK), University Freiburg, Germany
2
Department of Mathematics, University Freiburg, Germany
Preprint. Article submitted to journal Optimal Control Applications and

Methods on July 12, 2024
1 Introduction
High-performance algorithms for nonlinear model predictive control (NMPC) have extended its real-
time applicability from chemical processes to a huge variety of application areas[1, 2, 3, 4, 5, 6, 7, 8].
Most NMPC implementations use a discrete-time model with a fixed time step that is equal to the
sampling time. However, more elaborate optimal control problem (OCP) formulations have enormous
potential to reduce the computational burden associated with the online optimization. Nonuniform
time grids, which have been shown to work especially well with accurate cost integration [9], are
an easy first step in this direction. However, more sophisticated OCP formulations, like piecewise
polynomial control parametrizations of varying degree [10], closed-loop-costing formulations [11], or
using models of different fidelity for parts of the horizon, are rarely used in practice, due to the
additional implementation effort. Such formulations require the problem functions as well as input
and state dimensions to vary over the horizon. To this end, this paper presents a new feature of the
open-source software acados [12] which allows for a convenient formulation of multi-phase OCPs via
its Python interface.
We use the term multi-phase OCP (MOCP) to describe OCP formulations that may have struc-
turally different models, constraints and cost formulations on parts of the horizon. In contrast the
term multi-stage was used in previous works but is used in different ways, e.g. in the context of
tree-structured OCPs [13, 14, 15] and to describe OCP formulations which only allow to vary cost
and constraint functions [16]. The proposed multi-phase formulation is similar to the one tackled by
GPOPS-II [17], which is a commercial Matlab software package for solving multi-phase OCPs using
state-of-the-art sparse NLP solvers, like IPOPT [18] and SNOPT [19]. GPOPS-II has been successfully
used for MOCPs [20].
However, the focus of this paper is on MOCP formulations with OCP-structure exploiting algo-
rithms. MOCPs have been previously tackled in the direct multiple shooting package MUSCOD-II
[21], which allows multi-phase formulations with piecewise polynomial control parametrizations and
coupled processes. While MUSCOD-II focused on problems from process control with time scales in
the second to minute range, [21, 22], the acados software package has been successfully deployed on
systems with much shorter time scales [2, 3, 4, 5, 6, 7, 8]. Both MUSCOD-II and acados use a direct
multiple shooting based formulation.
The open-source software package acados implements efficient algorithms for embedded optimal
control [12]. It is written in C and relies on the linear algebra package BLASFEO which provides
performance-optimized routines for small to medium sized matrix operations [23]. The nonlinear
OCPs can be solved with acados using variants of sequential quadratic programming (SQP), the
real-time iteration (RTI) [24] and advanced-step real-time iteration (AS-RTI) scheme [25], as well
as differential dynamic programming (DDP) [26], which was recently added [27]. In acados, the
quadratic subproblems are solved exploiting the block structure of optimal control problems. It has
been shown that OCP structure exploiting solvers can be many times faster compared to general
sparse solvers or dense ones used with condensing [28, 29]. A variety of quadratic programming (QP)
solvers, such as HPIPM, qpOASES, DAQP, OSQP, qpDUNES, [30, 31, 32, 33, 34] are interfaced, which either
tackle the OCP-structured QP directly or after applying full or partial condensing to it [35, 36]. The
∗ The authors want to thank Rudolf Reiter for fruitful discussions.
This research was supported by DFG via Research Unit FOR 2401 and projects 424107692, 504452366, by BMWK
03EN3054B, and by the EU via ELO-X 953348.
2
system dynamics can be handled with integration methods which are able to propagate forward and
adjoint sensitivities efficiently in addition to the nominal result [12, 37].
The remainder of this introduction aims at giving a brief overview on software frameworks for
NMPC other than acados. The tool CasADi [38] offers a convenient symbolic framework, algorithmic
differentiation and interfaces to many state-the-art NLP solvers, like IPOPT [18] and SNOPT [19]. A
variety of software projects specialized on OCP formulations have been developed on top of CasADis
symbolic framework. The package do-mpc [14], provides Python functionality to allow fast prototyp-
ing of NMPC, moving horizon estimation (MHE) and supports tree-structured OCP formulations. In
terms of solvers it relies on the ones available in CasADi. The rockit package [39], allows to conve-
niently formulate OCPs in Python and Matlab and also covers MOCP formulations. In addition to the
CasADi solvers, it also provides interfaces to acados and fatrop [29], which is an NLP solver highly
inspired by IPOPT that exploits the optimal control problem structure. The OpEn [40] software package
allows to conveniently formulate single-phase OCP problems and implements the proximal averaged
Newton-type method for optimal control (PANOC) as a Rust solver and allows to generate solvers
for user defined problems from Python and Matlab using the CasADi symbolics. The GRAMPC [41]
software package can generate embedded solvers from Matlab and uses a gradient-based augmented
Lagrangian method. The commercial solver FORCESPRO [42] implements competitive algorithms for
NMPC which support varying dimensions between the stages [42, 43]. However, benchmark results
created with the academic license of FORCESPRO could not be disclosed in past research [28, 7].
Outline. The paper introduces the multi-phase OCP formulation in Section 2 and discusses how
it can be handled using direct multiple shooting. Section 3 motivates using advanced OCP formula-
tions which may be cast as MOCPs to design efficient NMPC controllers. In particular, it discusses
piecewise polynomial control parameterizations, partial and progressive tightening formulations, and
the use of models of different fidelity within a single OCP. Moreover, it presents and conceptually
extends closed-loop costing formulations. Section 4 discusses the efficient treatment of multi-phase
formulations within the acados software package and presents a tutorial example. Section 5 demon-
strates how NMPC controllers based on the MOCP formulations detailed in Section 3 are able to
trade off computation time and control performance in ways that are not accessible when one is
limited to single-phase OCPs.
2 Multi-Phase Optimal Control Problem Formulations

This section presents a continuous-time multi-phase optimal control problem (MOCP) formulation
in Section 2.1 and describes how it can be discretized using direct multiple shooting in Section 2.2.
2.1 Continuous-time multi-phase optimal control problem

Throughout this paper, we will treat continuous-time multi-phase optimal control problems (MOCP),
which can be stated as
M Z
X tk+1
minimize ℓk (ξk (t), vk (t)) dt + Ek (ξk (tk+1 ), ηk ) (1a)
ξ1 (·),...,ξM (·), tk
v1 (·),...,vM (·), k=1
η1 ,...,ηM
subject to 0 = x̄0 − ξ1 (t1 ), (1b)

0 = fk (t, ξk (t), ξ˙k (t), vk (t)), (1c)
0 ≥ gk (ξk (t), vk (t)), (1d)
0 = Γk (ξk (tk+1 ), ηk ) − ξk+1 (tk+1 ), (1e)
0 ≥ g e (ξM (T )), (1f)
for t ∈ [tk , tk+1 ), k = 1, . . . , M.
The finite time horizon [0, T ] is split into M fixed subintervals [tk , tk+1 ] with t1 = 0 and tM +1 = T .
Each interval defines a phase k ∈ {1, . . . , M }. For each phase k, an implicit ODE fk is given, which
defines the state trajectory ξk : [tk , tk+1 ] → Rnx,k for a given control trajectory vk : [tk , tk+1 ] → Rnv,k
and initial state. The initial state of the first phase is given by x̄0 , while the initial state for the
subsequent phases is given by the transition functions Γk : Rnx,k × Rnη,k → Rnx,k+1 , which map the
terminal state of phase k and discrete decision variables ηk ∈ Rnη,k to the initial state of the next
phase, allowing for the state dimension to vary between the phases. The functions ℓk and Ek define
the path and terminal cost terms for phase k. The functions gk summarize the inequalities imposed
on states and controls in phase k. Additionally, the function g e summarizes the terminal constraints
imposed at the end of the horizon.
3
The transition formulation (1e) is similar to the one in MUSCOD-II [44] with the addition of the
discrete decision variable ηk . Additionally, the formulation in the work by Leineweber et al. [44] con-
siders global variables and the time grid points ti as optimization variables. The global variables are
implemented as separate variables on each shooting interval and are constrained to be equal. Global
variables can be formulated in (1) by state augmentation on the full horizon. In order to have time
grid points as optimization variables, the dynamics can be augmented with a clock state and a speed
of time variable acting as a control. The differences in problem formulation can be motivated by the
different solution methods implemented in acados and MUSCOD-II, respectively. While MUSCOD-
II deploys sparse linear algebra on blocks and allows linear couplings between stages, acados uses
specialized algorithms for purely OCP-structured problems in which the dynamics constraints are the
only coupling between stages.
MOCP formulations occur naturally in different application when formulating OCPs where the
dynamic behavior qualitatively changes at a certain point in time. For example, when considering
walking robots [45], chemical plants, where some amount of substance is added at a certain point
in time, e.g. when considering recycled waste cuts [46], or multi-train scheduling [20]. However, the
main focus of this paper is on MOCP formulations which can be derived to approximate the solution
of a continuous-time infinite-horizon problem in the context of NMPC as discussed next.
2.2 Multi-phase multiple shooting discretization

One can discretize MOCP (1) using direct multiple shooting [47] dividing each time interval [tk , tk+1 ]
into Nk shooting intervals. The resulting nonlinear program (NLP) can be written as
k −1
M NX
X
minimize
x,u,η
Lk,j (xk,j , uk,j , ∆tk,j ) + Ek (xk,Nk , ηk ) (2a)
k=1 j=0
subject to x1,0 = x̄0 , (2b)

xk+1,0 = Γk (xk,Nk , ηk ), k = 1, . . . M − 1, (2c)
xk,j+1 = ϕk,j (xk,j , uk,j ), j = 0, . . . , Nk − 1, k = 1, . . . M , (2d)
0 ≥ gk,j (xk,j , uk,j ), j = 0, . . . , Nk − 1, k = 1, . . . M , (2e)
0 ≥ g e (xM,NM ), (2f)
where x = (x1,0 , . . . , x1,N1 , x2,0 , . . . , xM,NM ), u = (u1,0 , . . . , u1,N1 −1 , u2,0 , . . . , uM,NM −1 ), and η =
(η1 , . . . , ηM ). The discrete values xk,j ∈ Rnx,k represent the values of ξk (·) at the corresponding
shooting nodes. The controls uk,j ∈ Rnu,k act on the jth shooting interval of phase k. The cost func-
tions Lk,j are called stage costs and reflect the cost integrated over the shooting interval [tk,j , tk,j+1 )
with ∆tk,j = tk,j+1 − tk,j . The initial state of the first phase is given by (2b) and for subsequent
stages by (2c). The discrete-time dynamics are described by the functions ϕk,j , which represent an
integration scheme applied to the continuous-time dynamic system in (1c). Dedicated functions that
compute the numerical integration of continuous-time dynamics and the solution sensitivities are an
essential component of for efficient solution of OCPs and available in acados [37].
Finally, the constraint functions gk,j represent the constraints gk on shooting interval [tk,j , tk,j+1 ).
Most commonly, the constraints are only enforced at the initial point of the shooting interval, i.e.
for xk,j , uk,j . However, it is possible to also impose them on intermediate points. Similarly, the
stage cost Lk,j often corresponds to a simple Euler integration of the continuous-time cost ℓk over a
shooting interval. Especially, when using longer intervals, higher order integration of the cost term are
necessary to retain a good approximation quality. The cost integration can be performed efficiently
together with the integration of the dynamics (1c). More details on this can be found in Section 3.4.1.
3 Multi-phase OCP formulations for Model Predictive

Control
This section motivates the use of multi-phase optimal control problem (MOCP) formulations for
NMPC applications, in which an ideal controller would apply the exact solution to a continuous-time
infinite-horizon optimal control problem, which is presented in Section 3.1. Moreover, Section 3.1
introduces the concept of the cost-to-go function, the importance of its approximation in NMPC, and
why MOCP formulations should be considered in this regard.
The following subsections provide examples of optimal control problem formulations which are
suitable to approximate parts of the cost-to-go term corresponding to different parts of the time
horizon in successively more approximate ways. These formulations can be cast as multi-phase
problems and can thus be solved efficiently using acados. In particular, Section 3.2 motivates using
models of different fideltiy for different parts of the horizon, Section 3.3 discusses piecewise polynomial
4
control parameterizations, Section 3.4 discusses closed-loop costing and Section 3.5 presents partial
tightening.
3.1 Continuous-time optimal control problem, model predictive con-

trol and cost-to-go
The standard infinite-horizon continuous-time optimal control problem (OCP) which we aim at ap-
proximating in various ways in this paper can be written as
Z ∞
V (x̄0 ) = min ℓ(x(t), v(t)) dt (3a)
x(·),v(·) 0
s.t. x̄0 = x(0), (3b)
0 = f (x(t), ẋ(t), v(t)), t ∈ [0, ∞), (3c)
0 ≥ g(x(t), v(t)), t ∈ [0, ∞), (3d)
where x : [0, ∞) → Rnx , v : [0, ∞) → Rnv are the state and control trajectories respectively, x̄0 is the
initial state value, the function f describes the implicit system dynamics and g denotes the inequality
constraints. The optimal value of OCP (3) is defined as V (x̄0 ). The function V (·) is called cost-to-go
and corresponds to the value function in the field of reinforcement learning.
The goal of model predictive control (MPC) is to operate a system, which typically applies a
constant control input for a fixed sampling time ∆t. This practical constraint can be formalized as
v(t) = u0 for t ∈ [0, ∆t]. (4)
Regarding OCP (3) from a dynamic programming point of view allows us to split the infinite
horizon in different parts which might be approximated in different ways. Using the principle of
optimality and accounting for (4), we can rewrite (3) as
Z ∆t
minimize ℓ(x(t), u0 ) dt + V (x(∆t))
x(·),u0 0
subject to x̄0 = x(0), (5)
0 = f (x(t), ẋ(t), u0 ), t ∈ [0, ∆t],
0 ≥ g(x(t), v(t)), t ∈ [0, ∆t),
which is again an OCP but with a much shorter horizon of length ∆t. Of course, the whole com-
plexity of the problem is shifted into the minimization of the cost-to-go term with this reformulation.
However, this shows that approximating the cost-to-go is key in the developement of an efficient
NMPC controller.
In order to obtain an OCP formulation for MPC, one typically selects a finite time horizon,
replaces the infinite integral in (3) with a finite one and adds a terminal cost on the terminal state.
This approximation of the cost-to-go together with (5) can be discretized using multiple shooting,
where the first shooting interval typically corresponds to (5).
Additionally, the control trajectory to approximate the closed-loop cost could be parameterized
in different ways. In particular, piecewise polynomial control parameterizations and closed-loop
costing are dicussed in Section 3.3 and Section 3.4. These parameterizations correspond to piecewise
continuous functions, that are not applicable to the real system, given the practical constraint (4).
However, since the cost-to-go is anyway approximated, such control parametrizations can lead to
better approximations of the cost-to-go when using longer intervals.
In the following, we derive and discuss various multiple shooting based approximate versions of
(3) which apply structurally different approximations for parts of the infinite horizon in (3) and can
be be phrased as MOCPs.
3.2 Models of different fidelity

In many physical systems, models of different fidelity are available and choosing an appropriate one
for NMPC might depend on the control frequency, desired time horizon and the available solvers.
The consideration of high-fidelity models is typically beneficial for MPC performance. However, such
models might be computationally demanding and can become unstable for long time horizons, which
are important to consider in OCPs in order to capture the evolution of slow dynamics and economic
cost terms. Multi-phase OCP formulations allow one to choose different models for specific parts of
the horizon, allowing for many more degrees of freedom when deriving a discrete-time finite horizon
problem. Low-fidelty models are typically cheaper to integrate with a certain accuracy, not only due
to their potentially reduced dimensionality, but also because they are usually less stiff and thus can
5
be handled by computationally cheaper integration schemes, such as explicit Runge-Kutta (ERK)
methods. On the other hand, high fidelty models typically comprise both fast and slow modes, which
causes them to be stiff. Such stiff models need to be handled with computationally expensive implicit
Runge-Kutta (IRK) methods or a great number of very small integration steps of ERK methods.
Moreover, some nonlinear constraints, which are expensive to evaluate, could be only included in
a first part of the horizon. This can be motivated by the practical observation that constraints tend
to be active in the first part of the horizon [48]. For example, modelling oscillations within a physical
system can be expensive and only meaningful on short time horizons, as in the application of wind
turbine control [49].
Lastly, a variety of NMPC applications rely on underlying controllers that handle the actuators.
However, it can be beneficial to directly model these actuators to accurately represent their behavior
at least in the first part of the horizon. This can allow one to specify cost and constraints that need
a model of the underlying actuator. Including such considerations can unleash optimality potential
that is inaccessible otherwise.
3.3 Piecewise polynomial control parameterization

While piecewise constant control discretizations are by far the most common parameterization for
multiple shooting based MPC, formulation (2) can accommodate piecewise polynomial control pa-
rameterizations with degrees varying between shooting intervals. The k-th component of the control
trajectory on a shooting interval [τ0 , τ e ] could be parameterized by a polynomial
ndeg
X
vn,k (t) = un,k,i · (t − τ0 )i (6)
i=0
of degree ndeg , such that the discrete control input is given by un = (un,k,i )i=0,...,ndeg , k=1,...,nv . This
parametrization offers more degrees of freedom and can in general better approximate the optimal
continuous-time control trajectory. Higher order control parametrizations could thus allow the use
of longer shooting intervals compared to a piecewise constant parametrization, while maintaining the
approximation quality of the MPC feedback law.
The practical MPC consideration in (4) motivates using a constant control parameterization on
the first interval [0, ∆t]. In order to combine a constant control input on the first shooting interval
with higher order control parameterizations on other shooting intervals, an MOCP formulation is
thus necessary.
For linear parametrizations, simple bounds on the inputs can be satisfied everywhere by enforc-
ing them at the start and end of each control interval. In contrast, for higher order polynomial
parametrizations, even simple control bounds might be violated within the shooting intervals if only
enforced at the boundary points. In order to avoid excessive use of such violations, one possibility is
to enforce the control bounds at npc equidistant points within every shooting interval. For any fixed
point τ̄ ∈ [tn , tn+1 ], we have
ndeg
X
vn,k (τ̄ ) = un,k,i · (τ̄ − tn )i . (7)
i=0
which results in a linear inequality constraint with coefficients (τ̄ − tn )i for every intermediate point
on which a control bound is enforced. Additionally, one could add smooth penalties on violations of
the control bounds and integrate them over the shooting intervals, see Section 3.4.1.
3.4 Cost-to-go approximation via closed-loop costing

The idea of closed-loop costing (CLC) is to apply a simple control law κ : Rnx → Rnv to the nonlinear
system in (3c) and assign the resulting cost term as a terminal cost[50, 51, 52, 11]. This control law
is typically a locally stabilizing linear controller, e.g. corresponding to an LQR controller. In the
Reinforcement Learning and Dynamic Programming community, this idea is more commonly referred
to as rollout of a base policy[53]. The cost-to-go is approximated by the integrated cost associated
with the closed-loop system under policy κ. The closed-loop costing problem corresponding to (3)
6
can be written as
Z T1 Z T clc
min ℓ(x(t), v(t)) dt + ℓ(x(t), κ(x(t))) dt (8a)
x(·),v(·) 0 T1
s.t. 0 = x(0) − x̄0 , (8b)

0 = f (t, x(t), ẋ(t), v(t)), t ∈ [0, T1 ), (8c)
0 = f (t, x(t), ẋ(t), κ(x(t))), t ∈ [T1 , T clc ), (8d)
0 ≥ g(x(t), v(t)), t ∈ [0, T1 ], (8e)
0 ≥ g(x(t), κ(x(t))), t ∈ [T1 , T clc ), (8f)
0 = g e (x(T clc )), (8g)
where the first part of the horizon [0, T1 ] is the so-called control horizon and the latter part the
simulation horizon or closed-loop costing horizon. Note that the control input is only defined on
the control horizon, i.e. v : [0, T1 ] → Rnv and the control dimension is zero on the CLC horizon.
A discrete-time result showing that the stability region grows when increasing the CLC horizon is
presented in the work by Magni and colleagues [54]. Note that the constraints are still imposed on
the CLC horizon and violations correspond to infinite cost values [50].
Closed-loop costing problems can be expressed using two phases. In the terminal phase, the con-
trols are replaced by the control law in the cost and dynamics expressions, resulting in a phase k with
nu,k = 0. In literature, single shooting has been the method of choice to handle the CLC horizon
due to its superior computational efficiency in the case nu = 0. However, the beneficial convergence
properties also make multiple shooting an attractive option in acados: the partial condensing algo-
rithm provided by HPIPM allows to condense blocks of an arbitrary number of stages, and this can
be exploited to condense away all shooting nodes in the CLC horizon, while possibly retaining them
in the control horizon. This makes the computational efficiency of multiple shooting similar to the
one of single shooting, as the underlying QP solver does not see the shooting intervals corresponding
to the CLC horizon. For the experiments in Section 5.1, it was necessary to use CLC with multiple
shooting to achieve convergence. To the best of our knowledge, the approximate infinite horizon
corresponding to the closed-loop costing phase was only implemented with a single shooting interval
in previous works [50, 51, 52, 11, 55].
3.4.1 Practical treatment of constraints with closed-loop costing

In practical OCP formulations state constraints are often replaced by penalties in the cost function
to avoid infeasible OCPs [56]. Especially, when using longer shooting intervals, it is very important
to accuaretely integrate the cost term, including penalties [9].
Accurate cost integration together with a propagation of the cost gradient and a Gauss-Newton
Hessian can be performed efficiently. This is realized by the Gauss-Newton Runge-Kutta (GNRK)
integrators and implemented in acados [9]. They can effectively handle L2 constraint penalties. This
implementation has been extended to treat more general convex-over-nonlinear cost terms of the form
ℓ(x(t), v(t)) = ϕ(r(x(t), v(t))) (9)
with a smooth convex function ϕ and a nonlinear function r. The new implementation is able to
integrate such cost terms together with their generalized Gauss-Newton (GGN) Hessian [57, 58].
This is required to handle more general smooth penalties effectively, such as the squashed barrier
closed-loop costing formulation described in Section 3.4.2.
Despite the reformulation of state constraints as penalties, even simple control bounds can render
OCP (8) infeasible if the policy κ(·) would choose controls that violate those bounds on the CLC
horizon. This would implicitly define a terminal region constraint. If one wants to avoid that, a blunt,
but practical approach would be to not impose the control bounds on the CLC horizon, when using a
linear control law κ. Another option is to replace control bounds on the CLC horizon with additional
penalties. A third option is to combine the penalty approach with ”squashing”, as detailed next.
3.4.2 Squashed closed-loop costing

The idea of squashed CLC is to create a CLC OCP which is aware of the control bounds on the CLC
horizon. This can be achieved by defining the CLC control law as a squashed version of the linear
control law, ensuring that control bounds are always satisfied. A function σ : R → (−1, 1) is called
squashing function, if it is twice continuously differentiable, strictly monotonically increasing, odd,
and satisfies limz→±∞ σ(z) = ±1. Additionally, it is important, that ∂σ ∂z
(0) = 1, such that a linear
control law is locally not changed. One example is σ(z) = tanh(z), which was also used in previous
works[59].
7
In particular, a control variable ui with symmetric bounds [−ūi , ūi ], can be reformulated using a
new variable v and replacing ui by the expression ūi · tanh( ūvi ) in the OCP.
It is recommended to use squashing together with barrier penalties such that an iterative solver
does not go arbitrarily close to the squashed boundaries and gets stuck there[59]. A function β :
R → R≥0 ∪ {∞} is called barrier function if it is twice continuously differentiable, strictly convex
and limz→±1 β(z) = ∞. In particular, β(z) = − log(1 + z) − log(−z + 1) is such a function and
is used in this paper. The squashed LQR CLC variant with a progressively increasing barrier, can
be interpreted in the framework of progressive tightening [60], which offers both, numerical benefits
and closed loop stability guarantees for constrained nonlinear MPC. It is also possible to extend
this concept to one-sided constraints by using a smooth-max function instead of the sigmoid as a
squashing function and a one sided barrier.
3.5 Partial tightening

The partial tightening concept was introduced by Zanelli et al.[61] as a strategy to reduce the com-
putational cost of solving NMPC problems. To this end, an approximate formulation is introduced
which divides the horizon into two phases. On the second phase, the tightened horizon, the constraints
are tightened, i.e. replaced by barrier formulations. More precisely, a constraint 0 ≥ gj (x(t), v(t)) as
in (3) is replaced by a logarithmic barrier term
−τ log(gj (x(t), v(t))), (10)
which is added to the cost function with a barrier parameter τ .

The partial tightening formulation allows to derive closed-loop stability guarentees, based on the
property that the stage cost are monotonically increasing with the stage index, [61]. This concept
has been further extended to progressive tightening, which extends the concept to e.g. formulations
with stage-wise increasing barrier parameters τ .
Partial and progressive tightening formulations have been analyzed in the literature regarding
closed-loop stability properties [62, 60] and successfully applied in practical applications such as
collision avoidance for motion planning [63], control of a human-sized quadrotor [64], as well as
voltage control in active distribution networks [65].
The barrier formulation in (10) can be combined with a series of monotonically increasing values
for τ for subsequent stage costs on the tightened horizon to a progressive tightening formulation.
In context of the Real-Time Iteration (RTI), the variables corresponding to this second phase can
be fully eliminated in the preparation phase using a Riccati recursion. In an SQP setting this can be
achieved by performing partial condensing and summarizing all stages of the tightened horizon into
a single block of the reduced QP.
4 Efficient implementation in acados

This section presents how MOCPs can be posed for an efficient solution in acados. Starting from
problem (2), we discuss in Sections 4.1 and 4.2 the treatment of transition stages and how the problem
fits into the internal formulation treated by acados. Section 4.3 presents a tutorial MOCP example
and how it can be formulated for an efficient solution in acados.
4.1 Treating transition stages

In order to incorporate transitions between two phases into an existing SQP software for OCP struc-
tured problems, we first consider the simple case where (2c) boils down to xk+1,0 = xk,Nk , nη,k = 0 and
Ek (·) ≡ 0. These trivial phase transitions can be eliminated by enforcing ϕk,Nk −1 (xk,Nk −1 , uk,Nk −1 ) =
xk+1,0 and removing xk,Nk , ηk from the problem.
In the nontrivial case, we observe that the transition equation (2c) has the same form as the
equation for a shooting gap (2d). We thus call these discrete transition stages. On the discrete
transition stage, the discrete decision variable ηk takes the role of a control input and Ek takes the
role of the stage cost.
As opposed to formulating the transition as a concatenation of a dynamics step and a transition,
the implementation of a transition with a separate discrete transition stage has the following advan-
tages: Both the terminal state of the phase before the transition and the initial state of the phase
after the transition are readily available in the solver and the transition cost Ek fits seamlessly into
an MOCP formulation.
8
2 300
phase 1
standard OCP
phase 2
MOCP N1 = 15 T1 = 0.6
1
p
250 MOCP N1 = 10 T1 = 0.4

MOCP N1 = 5 T1 = 0.2
0
200
cost-to-go V ([p0 , 0])

0
−2
150
v
−4
100
20
0 50
a
−20
−40 0
0.0 0.2 0.4 0.6 0.8 1.0 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0
time t p0
Figure 1: Tutorial example: We consider a double integrator, which is in the second phase
approximated by a single integrator model. The left plot shows the state and control trajec-
tories for the tutorial MOCP. The right plot shows the cost-to-go function as a function of
the initial position p0 for different approximations of the OCP which only uses the double
integrator model.
4.2 Multi-phase multiple shooting formulation in acados

The discrete-time MOCP in (2) can be framed as a regular OCP-structured problem with stage-
varying costs and constraints and where – in contrast to the standard setting – the control and state
dimension might vary stage-wise,
N
X
minimize
x ,...,x ,
Lj (xj , uj , ∆tj ) + E(xN ) (11a)
0 N
u0 ,...,uN −1 j=0
subject to x̄0 = x(0), (11b)

xj+1 = ϕj (xj , uj ), j = 0, . . . , N − 1, (11c)
0 ≥ hj (xj , uj ), j = 0, . . . , N − 1. (11d)
This OCP consists of N stages, which capture both the shooting intervals and the transition stages.
In acados, nontrivial transitions are modeled by adding an extra phase comprised of a single interval.
The transition function Γk is specified using the acados discrete dynamics module and the discrete
decision variable ηk corresponds to the control variable for this phase. The number of stages PMN for
the OCP (11) would thus be N = M − 1+ M
P
k=1 N k if all transitions are nontrivial and N = k=1 Nk
if all transitions are trivial. Note that the stage cost Lj does not depend on the shooting interval
length if j is a transition stage.
A major challenge in tackling OCP (11) with efficient structure-exploiting solvers is that the
dimensions of states, controls and constraints are varying arbitrarily between stages. In acados,
this can be easily achieved by the different modules of the SQP type algorithm. Each stage has
its own module to compute and linearize cost and constraint functions. Each gap constraint (11c)
is associated with a module to evaluate and linearize it. The acados internal OCP-structured QP
subproblem is based on the HPIPM software package [30]. In addition to interior-point solvers for dense
and OCP-structured QP formulations, HPIPM offers efficient routines for transforming OCP-structured
QPs into dense QPs via full condensing or smaller OCP-structured QPs via partial condensing [66].
The HPIPM core algorithms, full and partial condensing, and the Riccati-based QP solver, all support
stage-varying dimensions. These full and partial condensing algorithms allow any dense QP solver,
such as DAQP [32] and qpOASES [31] or HPIPM itself, as well as the OCP-structure exploiting solver in
HPIPM, to be used seamlessly for multi-phase problem formulations in acados.
4.3 Tutorial example of a multi-phase OCP

In this section, we discuss a tutorial example of an MOCP which uses models on distinct parts of
the horizon and detail how to formulate the problem in acados. The code to reproduce the figures
in this section is publicly available1 . We regard a double integrator system which consists of the
1 https://github.com/acados/acados/blob/master/examples/acados_python/mocp_transition_example/main.py
9
state ξ1 = (p, s), with position p and speed s. The control input v1 is the acceleration a and the
continuous-time dynamics are simply given by

ṗ − s
0 = f1 (ξ1 , ξ˙1 , v1 ) = .
ṡ − a
Let us formulate an MOCP with two phases where an approximate model is employed in the second
phase. The approximate model does not consider acceleration and regards the velocity as the control
input instead yielding a one dimensional system, i.e., ξ2 = p, v2 = s. The system dynamics are then
f2 (ξ2 , ξ˙2 , v2 ) = ṗ − s. The transition function is Γk (ξ1 ) = p. We define an MOCP with a time horizon
of T = 1 and the cost functions
ℓ1 (ξ1 , v1 ) = p2 + 0.1s2 + 10−3 a2 , E1 (ξ1 ) = p2 + 0.01s2 ,

ℓ2 (ξ1 , v2 ) = p2 + 0.1s2 , E2 (ξ2 ) = 10p2 .
We impose control bounds as constraints on both phases: In the first phase, we have −50 ≤ a ≤ 50,
in the second phase, we use −5 ≤ s ≤ 5. We associate the first phase with the interval [0, 0.4] and the
second phase with [0.4, 1.0] and divide the both intervals uniformly into 10 and 15 shooting intervals,
respectively. The resulting problem is a QP and solved with a single SQP iteration in acados. The
optimal trajectory for x̄0 = [2, 0]⊤ is visualized in the left plot in Figure 1.
The right plot in Figure 1 shows how MOCPs can approximate the cost-to-go function. All
OCPs use N = 25 shooting intervals and a time horizon of T = 1. The MOCPs use the single
integrator approximate OCP phase as described above on the interval [T1 , T ]. The shooting intervals
are equidistant on both phases with N1 intervals in the first phase and N − N1 in the second phase.
We want to use the tutorial example to demonstrate the formulation of an MOCP with a nontrivial
transition using the new acados interface. Firstly, we need to define the models for all phases, the
double and the single integrator model as well as the transition model. The single integrator model
is defined as an explicit ODE using CasADi as follows
import casadi as ca
def g e t _ s i n g l e _ i n t e g r a t o r _ m o d e l () - > AcadosModel :
model = AcadosModel ()
model . name = ’ s i n g l e _ i n t e g r a t o r ’
model . x = ca . SX . sym ( ’p ’)
model . u = ca . SX . sym ( ’v ’)
model . f_expl_expr = model . u
return model
The transition model is defined as

def g e t _ t r a n s i t i o n _ m o d e l () - > AcadosModel :
model = AcadosModel ()
model . name = ’ t r a n s i t i o n _ m o d e l ’
p = ca . SX . sym ( ’p ’)
v = ca . SX . sym ( ’v ’)
model . x = ca . vertcat (p , v )
model . disc_dyn_expr = p
return model
Let’s assume that we already formulated the OCPs for the individual phases, the one with the single
and double integrator model, which use the established Python interface of acados, more precisely
the AcadosOcp class.
def f o r m u l a t e _ d o u b l e _ i n t e g r a t o r _ o c p () - > AcadosOcp :
...
def f o r m u l a t e _ s i n g l e _ i n t e g r a t o r _ o c p () - > AcadosOcp :
...
A multi-phase OCP can be formulated in acados using the AcadosMultiPhaseOcp class, which is
created by specifying the number of stages per phase.
N_list = [ 10 , 1 , 15 ] # 10 stages for double integrator ,
# 1 stage for transition , 15 stages for single i n t e g r a t o r
ocp = A c a d o s M u l t i p h a s e O c p ( N_list = N_list )
The novel interface allows to define the dynamics, cost and constraints of one phase utilizing the
single-phase OCP formulations, i.e. the AcadosOcp class.
# define phases for single and double i n t e g r a t o r
phase_0 = f o r m u l a t e _ d o u b l e _ i n t e g r a t o r _ o c p ()
ocp . set_phase ( phase_0 , 0 )
phase_2 = f o r m u l a t e _ s i n g l e _ i n t e g r a t o r _ o c p ()
10
# define the t r a n s i t i o n phase and cost
phase_1 = AcadosOcp ()
phase_1 . model = g e t _ t r a n s i t i o n _ m o d e l ()
phase_1 . cost . cost_type = ’ NONLINEAR_LS ’
phase_1 . model . cost_y_expr = phase_1 . model . x
phase_1 . cost . W = np . diag ( [ L2_COST_P , 1e - 1 * L2_COST_V ] )
phase_1 . cost . yref = np . array ( [ 0 . , 0 . ] )
Most solver options can be set in the same way as for the single-phase OCP:
ocp . so lver_op tions . n lp _s ol v er _t yp e = ’ SQP ’
ocp . so lver_op tions . time_steps = np . array ( N_list [ 0 ] * [ 0 . 4 / N_list [ 0 ] ]
+ [ 1 . 0 ] # t r a n s i t i o n stage
+ N_list [ 2 ] * [ 0 . 6 / N_list [ 2 ] ] )
Some additional solver options that can not vary for single-phase OCP problems can be set addition-
ally.
ocp . mocp_opts . i nt eg ra t or _t yp e = [ ’ ERK ’ , ’ DISCRETE ’ , ’ ERK ’]
Finally, an AcadosOcpSolver can be created from the AcadosMultiphaseOcp, just as from a AcadosOcp
object.
a c a d o s _ o c p _ s o l v e r = A ca do s Oc pS ol v er ( ocp )
The interactions with the solver are independent of whether it was created from a single or multi-phase
OCP formulation.
5 Numerical experiments
This section presents three numerical case studies. First, Section 5.1 compares MPC controllers
with different control parameterizations and closed-loop costing variants on an inverted pendulum
test problem. Second, Section 5.2 regards the task of controlling a differential drive robot directly
through the voltages of actuators and compares the performance of controllers based on single- and
multi-phase problem formulations. Third, Section 5.3 shows the efficiency of partial tightening with
real-time iterations qualitatively replicating the benchmark results from the first partial tightening
paper[61]. All experiments have been carried out using acados v0.3.2 via its Python interface on a
Laptop with an Intel i5-8365U CPU, 16 GB of RAM running Ubuntu 22.04.
Note that these experiments only compare acados controllers based on single- and multi-phase
OCP formulations. Comparisons with solvers based on other software packages are out of the scope of
this paper. However, since the computation times of single- and multi-phase OCP formulations within
acados are consistent, results from existing software comparisons can be transferred to multi-phase
problems if all competing solvers support MOCPs.
5.1 Inverted pendulum on cart test problem

We investigate the inverted pendulum on cart problem in the setting from the benchmark presented
by Frey et al. [9] and compare different controllers with different closed-loop costing variants and
control parametrizations. The code of the orignal benchmark has been adapted to incorporate the
new controllers2 .
The differential state of the model is x = [p, θ, s, ω]⊤ with cart position p, cart velocity s, angle
of the pendulum θ and angular velocity ω. The control input v is a force acting on the cart in the
horizontal plane. The system dynamics can be found e.g. in [12]. In our OCP formulation, v is
constrained to be in [−40, 40]. The simulation starts with an initial state x̄0 = [0, π5 , 0, 0]⊤ . The goal
is to drive all states to zero, i.e. the unstable upright position. We formulate the following nonlinear
least-squares cost consisting of quadratic costs on states and controls and a term penalizing a position
p outside of [−1, 1] =: [pmin , pmax ], namely
l(x, u) = x⊤ Qx + v ⊤ Rv + γ · (max(pmin − p, 0))2 + γ · (max(p − pmax , 0)2
where the cost weights are chosen as γ = 5 · 104 , Q = diag(100, 103 , 0.01, 0.01), R = 0.2. The terminal
cost term is set to M pend (x) = x⊤ P x, where P is obtained as solution of the discrete algebraic Riccati
equation with cost and dynamics linearized at the steady state.
The controllers use two different discretization grids, visualized in Figure 2. The plant is simulated
with a time step of ∆t = 0.02s. All controllers use the GNRK cost discretization described in our
2 https://github.com/FreyJo/GNRK_benchmark
11
T1 = 0.3
∆t
Grid A t
0 T = 4.0
∆t
Grid B t
0 T = 4.0
Figure 2: Two time grids considered in the benchmark. Both grids start with an interval
of length ∆t = 0.02. The grids are visualized for N = 10 shooting intervals. For Grid
A, the interval [∆t, T1 ] is divided into N1 = 4 equidistant intervals, and [T1 , T ] into N2 =
N − N1 − 1 = 5 equidistant intervals. For Grid B, the interval [∆t, T ] is divided into N − 1
intervals.
0.0
−0.5
p [m]
−1.0
40
20
ν [N]
−20 LQR-CLC
CLC-SQB
−40
0.0 0.2 0.4 0.6 0.8 1.0 1.2
t[s]
Figure 3: Multiple-shooting with closed-loop costing. The solver iterates after performing 3
iterations on the first problem in the closed-loop scenario in Figure 4 are visualized.
recent paper[9] and Section 3.4.1 with a Gauss-Radau IIA method of order 7 on each shooting
interval. Additionally, we compare various control parameterizations, namely piecewise polynomials
of different degree on all but the first shooting interval [0, ∆t], see Section 3.3, and closed-loop costing
formulation, see Section 3.4. In order to conveniently formulate the models with piecewise polynomial
controls, we added the functionality AcadosModel.reformulate with polynomial control().
An overview of all controller variants considered is presented in Table 1, some of the closed-loop
trajectories are plotted in Figure 4 and some of the open-loop control trajectories corresponding to
the first problem are shown in Figure 5.
5.1.1 Controller variants

The controller labeled IDEAL uses a uniform time discretization corresponding to time steps ∆t with
piecewise constant controls, i.e. no mismatch between the plant and the OCP model. This controller
gives the best closed-loop performance, but has the highest computational complexity. The controller
REF corresponds to the variant which gave the best tradeoff between computation time and closed-
loop performance in the original benchmark[9]. The goal is to derive a discrete-time OCP variant
which uses less shooting intervals than this controller in order to reduce the associated computational
complexity, while attaining similar closed-loop performance.
The variant REF-N10 uses the same kind of grid, but divides the interval [∆t, T ] into 9 instead of
19 shooting intervals. Thus, the computational complexity associated with one iteration is roughly
halved. However, the closed-loop performance is highly degraded. The controller variant PW-LIN-B
has only one difference, which is to use a piecewise linear control parametrization on all but the first
shooting interval. This results in a much better performance compared to REF-N10, with marginally
higher computational burden, indicating that the cost-to-go approximation on this coarse grid is
limited by the control parameterization.
We evaluate two closed-loop costing (CLC) variants. Both CLC controllers use Grid A, where
the second part of the time horizon, [T1 , T ] is implemented with a second phase, see Section 3.4. For
the LQR-CLC controller, the control law is the stabilizing LQR control law κLQR (x) = Kx. The
control input constraints are not enforced on the CLC horizon. This controller results in a closed-loop
performance that is similar to the one of controller REF proposed in the original benchmark[9] at
half the compuation time.
12
Table 1: Closed-loop comparison of different controller variants on the pendulum test prob-
lem with different discretization grids and the control parametrizations, including closed-
loop-costing and piecewise polynomial controls.
variant ID N Grid control parametrization comp. time / iter [ms] rel. suboptimality [%]
IDEAL 200 B pw. constant 3.69 0.00
REF 20 B pw. constant 0.43 3.72
REF-N10 10 B pw. constant 0.21 267.57
PW-LIN-B 10 B pw. polynomials, ndeg = 1, npc = 2 0.23 13.54
PW-CUBIC-B 10 B pw. polynomials, ndeg = 3, npc = 10 0.24 13.59
LQR-CLC 10 A unconstrained LQR CLC 0.22 3.41
CLC-SQB 10 A Squashed + prog. barrier CLC 0.23 5.13
PW-CONST-A 10 A pw. constant 0.22 0.49
PW-LIN-A 10 A pw. polynomials, ndeg = 1, npc = 2 0.23 0.40
PW-QUAD-1 10 A pw. polynomials, ndeg = 2, npc = 4 0.24 0.06
PW-QUAD-2 10 A pw. polynomials, ndeg = 2, npc = 10 0.26 0.15
PW-CUBIC-1 10 A pw. polynomials, ndeg = 3, npc = 4 0.23 0.07
0.0
−0.5
p [m]
−1.0
0.5
θ [rad/s]
0.0
−0.5
25
ν [N]
−25
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0

t [s]
IDEAL REF-N10 CLC-SQB
REF PW-LIN-B PW-CUBIC-3
Figure 4: Closed-loop trajectories corresponding to the benchmark results in Table 1. Values

for the postion p outside of [−1, 1] are heavily penalized and result in high suboptimality,
which is reported in Figure 6 and Table 1.
13
40
30
20
10
ν [N]
−10
REF-N10: Grid B, pw constant
−20 PW-CUBIC-B: Grid B, pw polynomial ndeg = 3, npc = 10
PW-LIN-A: Grid A, pw polynomial ndeg = 1, npc = 2
−30
PW-CUBIC-2: Grid A, pw polynomial ndeg = 3, npc = 6
−40
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
t[s]
Figure 5: Open-loop control trajectories corresponding to the first problem solved in the
closed-loop simulation visualized in Figure 4. Details on the controller variants are in Table 1.
REF PW-LIN-A
Average comp. time per NLP iter. [ms]
0.40 REF-N10 PW-QUAD-1

PW-LIN-B PW-QUAD-2
PW-CUBIC-B PW-CUBIC-1
0.35 LQR-CLC PW-CUBIC-2
CLC-SQB PW-CUBIC-3
PW-CONST-A
0.30
0.25
0.20
10−1 100 101 102
relative suboptimality [%]
Figure 6: Pareto plot comparing the controllers in Table 1 in terms of average computation
time per NLP solver iteration and relative suboptimality with respect to the controller
IDEAL which is not included in this Figure.
14
The other CLC controller is labeled CLC-SQB and uses the same linear feedback law with ad-
ditional squashing to impose the given control limits, κsquash (x) = σ(Kx). Additionally, a barrier
term β(σ(Kx)) is added on the CLC horizon. The resulting controller can therefore not plan with
violations of the control bounds in the CLC horizon. This controller yields slightly worse closed-loop
performance compared to the standard LQR-CLC controller. We attribute this observation to the
fact that the CLC controller with squashing naturally has a more conservative plan, which results in
some additional suboptimality. Figure 3 visualizes intermediate iterates of the controllers LQR-CLC
and CLC-SQB and gives some insights into CLC. It can be seen that the controls on the CLC horizon
are continuous, since κ and the state trajectory are continuous. However, for the intermediate iterates
of the multiple-shooting formulation, the shooting gaps, which are visible in Fig. 3 result in gaps in
the controls within the CLC horizon which are closed at convergence. Lastly, we observe that the
LQR-CLC controller violates the control bounds in its plan, while the squashed variant CLC-SQB
inherently respects them.
As an alternative to the CLC controllers, some variants are included, which use discretization Grid
A, but a standard control horizon on the second part of the grid [T1 , T ] and piecewise polynomial
controls on [∆t, T ] with the control bounds enforced on npc equidistant points on every shooting
interval, see Section 3.3.
The controllers PW-CONST-A and PW-LIN-A exactly enforce the control bounds, are among the
controllers with the lowest computational complexities and result in a relative suboptimality below
0.5 %. In particular, they are able to roughly halve the computational complexity of REF, the best
controller in the original benchmark[9], while reducing the suboptimality by a factor greater than 7.
5.1.2 Discussion
The fact that controller PW-CONST-A has a lower computational burden compared to the CLC
variant is attributed to the fact that the control law κ(·) and its derivatives have to be evaluated
often within every step and Newton iteration of the implicit integrator. The additional computational
burden of a larger input dimension, which is 0 or 1 on the latter part of the horizon, is rather small
for efficient state-of-the-art QP solvers.
There are certainly other examples in which a CLC controller is computationally more attractive.
In addition to the test setting, this of course depends on the underlying software framework for
linearization and solution of the subproblems. In particular, for modern OCP-structure exploiting
software frameworks, the computational burden associated with an extra control variable is less
significant compared to when CLC was introduced in the early 1990s [28].
A similar comparison of CLC controllers with respect to obvious alternatives is not to be found in
the existing literature on CLC[50, 51, 52, 11, 55]. In addition to the LQR-CLC controller described
above, we compare a related variant, which uses N2 = 1, i.e. single shooting on the CLC horizon,
and instead performs five steps of IRK on this interval. This controller variant does not converge in
the first simulation step of the scenario and has a computational complexity similar to LQR-CLC.
This indicates that the multiple shooting CLC implementation proposed in this paper has desirable
properties compared to the single-shooting CLC variants used in previous works.
Overall, the comparison of CLC with respect to nonuniform OCP discretizations shows that the
latter are very competitive. On this test example, the nonuniform grid variant outperforms the
LQR closed-loop costing based controller in terms of closed-loop performance by a factor of 7 while
requiring an equal amount of computational resources.
Finally, we regard controllers with piecewise polynomial control parameterizations of degree ndeg >
1 which enforce the control bounds on npc intermediate points. The open-loop solutions corresponding
to the first problems within the closed-loop simulation are visualized for a few controllers in Figure 5.
We observe small violations of the open-loop control trajectories with respect to the control bounds,
e.g. at t ≈ 0.2 for PW-CUBIC-2 and t ≈ 0.35 for PW-CUBIC-B.
All variants in Figure 5 use 10 shooting intervals. Comparing REF-N10 in Figure 5 with the opti-
mal closed-loop control trajectory in Figure 4, we observe that this planned controls are qualitatively
very different from the optimal ones, resulting in a bad cost-to-go approximation and thus closed-loop
performance. In contrast, PW-CUBIC-2 can capture the qualitative behavior of the optimal closed-
loop control trajectory well. In particular, the relatively short intervals in the first part of the horizon
enable the cubic polynomials to approximate the optimal bang-bang solution in this phase well. In
contrast, the variant PW-CUBIC-B, which uses the same control parametrization as PW-CUBIC-2,
but on grid A, approximates this optimal bang-bang behavior with a slower transition, see Figure 4,
resulting in significantly higher closed-loop cost, see Table 1. A visual comparison of the open-loop
trajectories of PW-CUBIC-2 and PW-LIN-A shows that the optimal bang-bang behavior in the first
part of the horizon is approximated similarly, while on the first long interval, starting at t = 0.3, the
cubic polynomials capture the optimal behavior qualitatively better, as piecewise linear functions do
not have an inflection point.
15
When comparing controllers with the same parametrization, but different grids on which the
control bounds are enforced, i.e. varying npc , we can observe two effects. Firstly, comparing PW-
QUAD-1 and PW-QUAD-2, we see that enforcing control bounds on a finer grid can result in worse
closed loop performance. This can be attributed to the fact that enforcing the control bounds
everywhere using polynomials results in a more conservative approximation of the cost-to-go, since the
planned controls are approximated within the bounds. On the other hand, regarding PW-CUBIC-1,
PW-CUBIC-2, PW-CUBIC-3, we observe that enforcing constraints on a tighter grid can improve
the approximation quality.
Overall, the general trend in our experiments shows that additional degrees of freedom in piecewise
polynomial control parametrizations can result in a strongly improved closed-loop performance, if the
control bounds are enforced on a sufficiently fine grid, which we attribute to a better approximation
of the closed-loop cost. In particular, PW-CUBIC-3 results in better performance than PW-QUAD-2,
which in turn improves on PW-LIN-A.
Of course one needs to be careful, when it comes to drawing general conclusions from the presented
results on piecewise polynomial control parameterizations. The additional computational cost of
polynomial control parameterizations of higher degrees and handling the corresponding bounds, will
be much more significant, with growing nv . In addition to the dimensions, the optimal choice of
control parameterization and where to enforce control bounds depends on the qualitative behavior
of the continuous-time optimal solution (bang-bang or continuous), the time discretization grid, the
available solvers and computational resources. However, the possibility of formulating such problems
within an efficient software package can significantly improve the closed-loop performance of MPC
controllers.
5.2 Differential drive robot with actuator model and economic cost
We consider a differential drive robot and develop an NMPC controller which takes the underlying
actuators into account allowing it to consider their power consumption in the cost function. The code
to reproduce the results presented in this section is publicly available3
5.2.1 Modelling
This subsection describes the differential drive model with and without actuators [67]. For simplicity,
the dynamic model which disregards the actuators is presented first. This model consists of the state
vector xsimple = [px , py , v, θ, ω], where px , py denotes the robots position in x- and y-coordinates, v
the velocity of the robot, θ the heading angle, and ω the angular velocity. The input of this model
are usimple = [τ r , τ l ], the torques applied to the right and left driving wheel, respectively, in Nm. The
evolution of the system is described by the ODE
 
v cos θ
 v sin θ 
ẋsimple =  a1 +mc dω2  , (12)
 
 m+a2 
La3 −mc dωv
I+L2 a2
−τl
where the shorthands a1 = τ r +τ 2
l
, a2 = 2I w
R2
, a3 = τrR are introduced. The model contains the
following parameters, which we assume to be constant: The total mass of the robot m = 220kg, the
mass of the robot without the wheels and the rotating parts of the actuators mc = 200kg, the robot’s
moment of inertia around the center of mass I = 9.6 kg · m2 , the combined moment of inertia about
the wheel’s axis of a driving wheel and the rotating part of the actuator I w = 0.1kg · m2 .
In addition, we consider a more accurate model of the robot which takes the two actuators,
namely a motor on each driving wheel into account. The actuator model consists of the state xact =
[px , py , v, θ, ω, I r , I l ], where I r , I l are the currents in the motor driving the right and left wheel,
respectively, and the control input uact = [V r , V l ], where V r , V l denote the voltages applied to the
motors. The evolution of the states common for both models is given by (12), where the motor
torques τ r , τ l are substituted by τ r = K1 I r and τ l = K2 I l . In addition, the dynamics of the accurate
model are given by
" K1 ψ1 −Ract I r +V r #
I˙r − Lact
= (13)
I˙l − K2 ψ2 −R act I l +V l
Lact
with
ṗx cos θ + ṗy sin θ + Lω
ψ1 = , (14)
R
ṗx cos θ + ṗy sin θ − Lω
ψ2 = . (15)
R
3 https://github.com/FreyJo/ocp_solver_benchmark/blob/main/experiments/actuator_diff_drive.py
16
This accurate model additionally contains the motor constants K1 = 1.0, K2 = 1.0, the coil in-
ductance Lact = 10−4 H, the coil resistance Ract = 0.05Ω as parameters which we assume to be
constant.
5.2.2 Optimal control problem formulation

We want to minimize the following cost term
lact (xact , uact ) = xact ⊤ Qxact + |V r I r | + |V l I l | (16)
with Q = diag(103 , 103 , 10−4 , 1, 10−3 , 0.5, 0.5). Note that the second summand |V r I r | + |V l I l | is
an economic cost term corresponding to the power consumption. It is implemented in acados by
introducing slack variables slow , sup and constraints slow ≤ [V r I r , V l I l ]⊤ ≤ sup and replacing the
term |V r I r | + |V l I l | in the cost with slow,1 + slow,2 + sup,1 + sup,2 . When using the actuator model,
for the full horizon, we use the terminal cost term:
E act (xact ) = xact ⊤ Qxact (17)
We want to investigate approximate MOCP formulations which use the simple differential drive
model on the latter part of the horizon. The cost term in (16) can be approximated using the simple
model by disregarding the economic cost term, i.e. using
lsimple (xsimple , usimple ) = xsimple ⊤ Q̃xsimple + usimple ⊤ R̃usimple (18)
with Q̃ = diag(103 , 103 , 10−4 , 1, 10−3 ), R̃ = diag(0.5, 0.5). As a terminal cost of the first phase with
the actuator model, cf. (2), we define
E trans (xact ) = 0.5∆ttrans · (I r 2 + I l 2 ), (19)
where ∆ttrans denotes the length of the last shooting interval on which the actuator model is used.
This cost term corresponds to the tracking part of the cost in (16) and is needed since I r , I l are not
included in the simple model. When using the simple model at the end of the horizon, we use the
terminal cost term:
E simple (xsimple ) = xsimple ⊤ Q̃xsimple . (20)
For both models, we impose the following path constraints on the state
0 ≤ v ≤ 1, (21)
−0.5 ≤ ω ≤ 0.5. (22)
Additionally, we impose the following constraints on the control inputs when using the actuator model
−10 ≤ V l ≤ 10, (23)

−10 ≤ V r ≤ 10. (24)
Respectively, for the simple model, we impose
−60 ≤ τ r ≤ 60, (25)

−60 ≤ τ l ≤ 60. (26)
5.2.3 Comparing controller variants

We compare different controller variants. All variants use discretization steps of ∆t = 0.1s, the model
dynamics are integrated using an implicit Runge-Kutta method with a Legendre Butcher tableau of
order 6 on each shooting interval. The controllers are based on an acados full-step SQP solver using
a Gauss-Newton Hessian approximation and solving the QPs with HPIPM without condensing.
We consider MOCP variants with N = 40 and N = 50 shooting intervals and vary N act , the
number of shooting intervals on which the actuator model is used, afterwards, we use a transition
stage and N − N act shooting intervals with the simple model. For the single-phase OCP variants, we
vary the number of shooting intervals N , as this formulation does not allow for a transition.
We compare several controllers in a closed-loop simulation of 20s using the actuator model starting
at the initial state x0 = (1, 1, 0, π, 0, 0, 0), exactly integrating the cost in (16) using 10 integrator steps
for a sampling time of 0.1s. In order to evaluate relative suboptimality, a reference controller with
N = 60 shooting intervals of the actuator model is used. Controllers with fully converged SQP and
RTI variants, which only solve a single QP subproblem are evaluated in the same plot.
17
35
OCP N = 10 MOCP N = 50, Nact = 30
OCP N = 20 MOCP N = 50, Nact = 40
30 OCP N = 30 MOCP N = 40, Nact = 10
OCP N = 40 MOCP N = 40, Nact = 20
OCP N = 50 MOCP N = 40, Nact = 30
mean computation time [ms]

25
MOCP N = 50, Nact = 10 SQP
MOCP N = 50, Nact = 20 RTI
20
15
10
0
100 101
Relative suboptimality [%]
Figure 7: Pareto plot comparing different controller variants for directly controlling the
actuators of a differential drive robot with an economic cost. All MOCP variants use N = 50
shooting intervals and N − N act of an approximate model which disregards the actuator
dynamics.
The Pareto plot in Figure 7 visualizes the suboptimality and computation time of the various
variants. We observe that the Pareto front is dominated by the MOCP variants. Note that the
variant OCP N = 50 correspond to the controller MOCP N = 50, N act = 50. The solver variants
OCP N = 30 and MOCP N = 50, N act = 10 with SQP require a similar computation time, however
the MOCP variant results in a three fold lower relative suboptimality. On the other hand, MOCP
with N = 50, N act = 10 delivers a similar suboptimality compared to the variant OCP N = 50,
while only requiring roughly half of the computation time. Overall, the MOCP variants dominate
the Pareto front in large parts and it is of course possible to generate many more combinations using
MOCP formulations or by varying the number of SQP iterations. These results show that deriving
an approximate OCP formulation and using it within an MOCP can result in an NMPC controller
which is able to outperform NMPC controllers that only use a single-phase OCP formulation.
5.3 Partial tightening

This section briefly demonstrates how partial tightening can be efficiently implemented within acados,
qualitatively reproducing the results from the benchmark in the paper that introduced partial tight-
ening [61]. To this end, we regard the inverted pendulum on cart model as in Section 5.1. The cost
function is of linear least-squares form
l(x, v) = x⊤ Qx + v ⊤ Rv, (27)

−3
, R = 5 · 10−4 . For the terminal cost, we use x⊤ P x, where P is

with Q = diag 0.1, 1, 0.1, 2 · 10
the solution of the continuous-time algebraic Riccati equation with cost and dynamics linearized at
the steady-state. The only constraint is that the control input is −12 ≤ v ≤ 12. When using partial
Table 2: Performance overview of different controllers with RTI and partial tighten-
ing. Maximum timings are over the closed-loop simulation in Figure 9 and are given
in [ms] and relative suboptimality is evaluated by comparing with the controller variant
N = 100, Nexact = 100.
computation times
Variant preparation feedback total rel. subopt. %
N = 100, Nexact = 5 0.23 0.32 0.54 21.25
N = 100, Nexact = 10 0.25 0.36 0.59 16.61
N = 100, Nexact = 20 0.24 0.44 0.65 12.25
N = 100, Nexact = 50 0.25 1.15 1.36 5.15
N = 100, Nexact = 100 0.23 2.87 3.05 0.00
N = 50, Nexact = 50 0.10 2.17 2.27 990.43
18
Time taken for each solver (runs: 40)
3.0 preparation
feedback
max time per closed-loop iter. [ms]
2.5
2.0
1.5
1.0
0.5
0.0
5 10 = 20 = 50 = 100 = 50
0, Nexact = 0, Nexact = 0, Nexact 0, Nexact 0, Nexact , Nexact
N = 10 N = 10 N = 10 N = 10 N = 10 N = 50
Figure 8: Maximum timings for the preparation and feedback phase over the closed-loop
simulation shown in Figure 9. The timings are also given in Table 2.
4
10
2
px [m]
F [N]
0
−10
t [s]
4
θ [rad]
2 N = 100, Nexact = 5 N = 100, Nexact = 100

N = 100, Nexact = 20 N = 50, Nexact = 50
0
0 1 2 3 4 5
t [s]
Figure 9: Closed-loop trajectories corresponding to different controller variants with partial

tightening. The total number of shooting intervals N and the ones handled exactly N exact
are varied. The shooting intervals are part of the tightened horizon.
19
tightening, the constraint is replaced with an additional log-barrier term as in (10) corresponding to
this constraint with τ = 5 and a GGN Hessian approximation is used. While previous works only used
custom implementations of partial tightening, the MOCP interface allows a convenient formulation in
established software and the source code to reproduce the results presented in this Section is publicly
available4 .
This benchmark compares controllers in a closed-loop simulation with initial state x0 = (0, π, 0, 0)
for a duration of 5s with a time step of 0.01s. The controllers use different underlying OCP formu-
lations, where the length of the overall horizon and the tightened horizon are varied. The tightened
horizon is implemented as a second phase of an MOCP formulation. The length of a shooting interval
is fixed to 0.01s and the dynamics are discretized using an RK4 integrator. We vary the total number
of shooting intervals N and the number of shooting intervals on which the constraints are formulated
exactly N exact . The second phase contains N −N exact shooting intervals, on which the control bounds
are replaced with logarithmic barriers.
All controllers use the real-time iteration algorithm. The QPs are solved using partial condensing,
such that the blocks corresponding to the tightened horizon are condensed into one block and the
remaining blocks remain uncondensed. In order to perform as many operations as possible in the
preparation phase, acados internally implements functions that assume that only matrices of the QP
are known and a second one that completes the computations once the vector quantities are known.
For the acados module that performs the condensing and QP solution has a split functionality, namely
condense lhs() and condense rhs and solve().
Table 2 gives an overview on the controllers closed-loop performance and the computation times
split into preparation and feedback phase. Figure 9 visualizes the closed-loop trajectories for different
controller variants. Figure 8 and Table 2 show that the preparation time is consistent between all
solver variants with N = 100 and roughly halved for the variant with N = 50. While the controller
with N = 50, N exact = 50 fails at swinging up the pendulum, the other controllers succeed at this
task. Comparing the variants with N = 100, one can see that decreasing N exact results in an increase
in relative suboptimality and a decrease in associated computational complexity, more specifically a
decrease int he computation time of the feedback phase. Overall, the results are very similar to the
ones reported in the original benchmark[61]. Note that in our implementation, the barrier parameter τ
could be easily increased over the shooting intervals, resulting in a progressive tightening formulation
that is not just partial tightening. In summary, partial and progressive tightening formulations
allow one to trade-off computational complexity and closed-loop performance and can be formulated
conveniently as MOCPs.
6 Conclusion
This paper gives an overview on multi-phase OCP (MOCP) formulations and their efficient treatment
using multiple shooting. Several approaches are presented which allow to formulate a continuous-time
OCP in a successively approximate way which require an MOCP formulation. The work provides
an overview on different control parametrizations for use within multiple shooting, such as piecewise
polynomial controls of different degrees and closed-loop costing variants, and motivated their use.
These control parametrizations have been compared on a benchmark example from previous work.
Moreover, we demonstrate the efficiency of NMPC controllers based on an MOCP formulation with
an approximate model in a second phase. Lastly, we show how partial and progressive tightening
OCPs can be phrased as MOCPs. These examples show that the added degrees of freedom allow
one to develop NMPC controllers, which outperform ones limited to single-phase OCP formulations.
While other state-of-the-art software packages are limited to single-phase OCP formulations or use
general purpose NLP solvers instead of structure-exploiting algorithms tailored to OCPs, the new
acados feature allows for both a convenient formulation and the generation of efficient solver for
MOCPs. We believe that this new feature will make efficient solvers for MOCP formulations more
available to NMPC practitioners and spread their use in real-world applications.
References
[1] F. Dabbene, M. Cannon, M. Chamanbaz, A. Isaksson, M. Mammarella, and D. Raimondo,
“Guest editorial special issue on state-of-the-art applications of model predictive control,” IEEE
Transactions on Control Systems Technology, vol. 31, no. 5, pp. 1965–1970, 2023.
[2] S. Hänggi, J. Frey, S. van Dooren, M. Diehl, and C. H. Onder, “A modular approach for diesel
engine air path control based on nonlinear mpc,” IEEE Trans. Control Systems Technology,
2022.
4 https://github.com/FreyJo/ocp_solver_benchmark/blob/main/experiments/zanelli_partial_tightening.py
20
[3] A. Zanelli, J. Kullick, H. Eldeeb, G. Frison, C. Hackl, and M. Diehl, “Continuous control set
nonlinear model predictive control of reluctance synchronous machines,” IEEE Trans. Control
Systems Technology, pp. 1–12, 2021.
[4] B. B. Carlos, T. Sartor, A. Zanelli, M. Diehl, and G. Oriolo, “Least Conservative Linearized
Constraint Formulation for Real-Rime Motion Generation,” in Proc. 21st IFAC World Congr.,
2020, pp. 9519–9525.
[5] B. B. Carlos, T. Sartor, A. Zanelli, G. Frison, W. Burgard, M. Diehl, and G. Oriolo, “An efficient
real-time nmpc for quadrotor position control under communication time-delay,” in 2020 16th
International Conf. Control, Automation, Robotics and Vision (ICARCV), 2020, pp. 982–989.
[6] Y. Gao, F. Messerer, J. Frey, N. van Duijkeren, and M. Diehl, “Collision-free motion planning
for mobile robots by zero-order robust optimization-based mpc,” in Proc. Eur. Control Conf.
(ECC), 2023.
[7] A. Norouzi, S. Shahpouri, D. Gordon, A. Winkler, E. Nuss, D. Abel, J. Andert, M. Shahbakhti,
and C. R. Koch, “Deep learning based model predictive control for compression ignition engines,”
Control Engineering Practice, vol. 127, p. 105299, 2022.
[8] A. Romero, R. Penicka, and D. Scaramuzza, “Time-optimal online replanning for agile quadrotor
flight,” IEEE Robotics and Automation Lett., vol. 7, no. 3, pp. 7730–7737, 2022.
[9] J. Frey, K. Baumgärtner, and M. Diehl, “Gauss-Newton Runge-Kutta integration for efficient
discretization of optimal control problems with long horizons and least-squares costs,” Eur. J.
Control, 2024.
[10] V. Vassiliadis, R. Sargent, and C. Pantelides, “Solution of a class of multistage dynamic opti-
mization problems. 1. Problems without path constraints,” Industrial and Engineering Chemistry
Research, vol. 10, no. 33, pp. 2111–2122, 1994.
[11] L. Magni and R. Scattolini, “Stabilizing model predictive control of nonlinear continuous sys-
tems,” Annual Reviews in Control, vol. 28, pp. 1–11, 2004.
[12] R. Verschueren, G. Frison, D. Kouzoupis, J. Frey, N. van Duijkeren, A. Zanelli, B. Novoselnik,
T. Albin, R. Quirynen, and M. Diehl, “acados – a modular open-source framework for fast
embedded optimal control,” Math. Program. Comput., Oct 2021.
[13] D. Kouzoupis, “Structure-exploiting numerical methods for tree-sparse optimal control prob-
lems,” Ph.D. dissertation, Univ. of Freiburg, 2019.
[14] F. Fiedler, B. Karg, L. Lüken, D. Brandner, M. Heinlein, F. Brabender, and S. Lucia, “do-mpc:
Towards fair nonlinear and robust model predictive control,” Control Engineering Practice, vol.
140, p. 105676, 2023.
[15] S. Lucia, “Robust multi-stage nonlinear model predictive control,” Ph.D. dissertation, TU Dort-
mund, 2014.
[16] [Online]. Available: https://de.mathworks.com/help/mpc/ref/nlmpcmultistage.html
[17] M. A. Patterson and A. V. Rao, “Gpops-ii: A matlab software for solving multiple-phase
optimal control problems using hp-adaptive gaussian quadrature collocation methods and
sparse nonlinear programming,” ACM Trans. Math. Softw., vol. 41, no. 1, oct 2014. [Online].
Available: https://doi.org/10.1145/2558904
[18] A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search
algorithm for large-scale nonlinear programming,” Math. Program., vol. 106, no. 1, pp. 25–57,
2006.
[19] P. Gill, W. Murray, and M. Saunders, “SNOPT: An SQP algorithm for large-scale
constrained optimization,” SIAM Review, vol. 47, no. 1, pp. 99–131, 2005. [Online]. Available:
http://epubs.siam.org/doi/abs/10.1137/S0036144504446096
[20] H. Ye and R. Liu, “A multiphase optimal control method for multi-train control and scheduling
on railway lines,” Transportation Research Part B: Methodological, vol. 93, pp. 377–393, 2016.
[21] D. B. Leineweber, I. Bauer, H. G. Bock, and J. P. Schlöder, “An efficient multiple shooting based
reduced SQP strategy for large-scale dynamic process optimization. Part I: theoretical aspects,”
Computers and Chemical Engineering, vol. 27, pp. 157–166, 2003.
[22] D. B. Leineweber, A. A. S. Schäfer, H. G. Bock, and J. P. Schlöder, “An efficient multiple
shooting based reduced SQP strategy for large-scale dynamic process optimization. part II:
Software aspects and applications,” Computers and Chemical Engineering, vol. 27, pp. 167–174,
2003.
[23] G. Frison, D. Kouzoupis, T. Sartor, A. Zanelli, and M. Diehl, “BLASFEO: Basic linear algebra
subroutines for embedded optimization,” ACM Trans. Math. Softw., vol. 44, no. 4, pp. 42:1–
42:30, 2018.
21
[24] M. Diehl, “Real-time optimization for large scale nonlinear processes,” Ph.D. dissertation, Univ.
of Heidelberg, 2001.
[25] J. Frey, A. Nurkanović, and M. Diehl, “Advanced-step real-time iterations with four levels – new
error bounds and fast implementation in acados,” IEEE Control Systems Letters, 2024.
[26] D. Mayne, “A second-order gradient method for determining optimal trajectories of non-linear
discrete-time systems,” Int. J. Control, vol. 3, no. 1, pp. 85–96, 1966.
[27] D. Kiessling, K. Baumgärtner, J. Frey, W. Decré, J. Swevers, and M. Diehl, “Fast generation of
feasible trajectories in direct optimal control,” IEEE Control Systems Letters, 2024.
[28] D. Kouzoupis, G. Frison, A. Zanelli, and M. Diehl, “Recent advances in quadratic programming
algorithms for nonlinear model predictive control,” Vietnam Journal of Mathematics, vol. 46,
no. 4, pp. 863–882, 2018.
[29] L. Vanroye, A. Sathya, J. De Schutter, and W. Decré, “Fatrop: A fast constrained optimal control
problem solver for robot trajectory optimization and control,” in 2023 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS). IEEE, 2023, pp. 10 036–10 043.
[30] G. Frison and M. Diehl, “HPIPM: a high-performance quadratic programming framework for
model predictive control,” in Proc. IFAC World Congr., Berlin, Germany, July 2020.
[31] H. J. Ferreau, C. Kirches, A. Potschka, H. Bock, and M. Diehl, “qpOASES: a parametric active-
set algorithm for quadratic programming,” Math. Progam. Comput., vol. 6, no. 4, 2014.
[32] D. Arnstrom, A. Bemporad, and D. Axehill, “A dual active-set solver for embedded quadratic
programming using recursive LDLT updates,” IEEE Trans. Automatic Control, 2022.
[33] B. Stellato, T. Geyer, and P. J. Goulart, “High-speed finite control set model predictive control
for power electronics,” IEEE Trans. Automat. Control, vol. 32, no. 5, pp. 4007 – 4020, 2017.
[34] J. V. Frasch, M. Vukov, H. Ferreau, and M. Diehl. (2013) A dual Newton strategy for the efficient
solution of sparse quadratic programs arising in SQP-based nonlinear MPC. Optimization Online
3972.
[35] G. Frison, D. Kouzoupis, J. B. Jørgensen, and M. Diehl, “An efficient implementation of partial
condensing for nonlinear model predictive control,” in Proc. IEEE Conf. Decis. Control (CDC),
2016.
[36] D. Axehill, “Controlling the level of sparsity in MPC,” Systems & Control Lett., vol. 76, pp. 1–7,
2015.
[37] J. Frey, J. De Schutter, and M. Diehl, “Fast integrators with sensitivity propagation for use in
CasADi,” in Proc. Eur. Control Conf. (ECC), 2023.
[38] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi – a software
framework for nonlinear optimization and optimal control,” Math. Program. Comput., vol. 11,
no. 1, pp. 1–36, 2019.
[39] J. Gillis, B. Vandewal, G. Pipeleers, and J. Swevers, “Effortless modeling of optimal control
problems with rockit,” in 39th Benelux Meeting on Systems and Control, Date: 2020/03/10-
2020/03/12, Location: Elspeet, The Netherlands, 2020.
[40] P. Sopasakis, E. Fresk, and P. Patrinos, “OpEn: Code generation for embedded nonconvex
optimization,” in IFAC World Congr.
[41] T. Englert, A. Völz, F. Mesmer, S. Rhein, and K. Graichen, “A software framework for embed-
ded nonlinear model predictive control using a gradient-based augmented lagrangian approach
(GRAMPC),” Optimization and Engineering, vol. 20, no. 3, pp. 769–809, jan 2019.
[42] A. Zanelli, A. Domahidi, J. L. Jerez, and M. Morari, “FORCES NLP: An efficient implementation
of interior-point methods for multistage nonlinear nonconvex programs,” International Journal
of Control, pp. 13–29, 2017.
[43] A. Domahidi, A. Zgraggen, M. N. Zeilinger, M. Morari, and C. N. Jones, “Efficient interior
point methods for multistage problems arising in receding horizon control,” in Proc. IEEE Conf.
Decis. Control (CDC), Maui, HI, USA, December 2012, pp. 668–674.
[44] D. B. Leineweber, I. Bauer, A. A. S. Schäfer, H. G. Bock, and J. P. Schlöder, “An efficient
multiple shooting based reduced SQP strategy for large-scale dynamic process optimization.
(Parts I and II),” Computers and Chemical Engineering, vol. 27, pp. 157–174, 2003.
[45] G. Schultz and K. Mombaur, “Modeling and optimal control of human-like running,”
IEEE/ASME Trans. mechatronics, vol. 15, no. 5, pp. 783–792, 2009.
[46] M. Diehl, A. Schäfer, H. G. Bock, and J. P. Schlöder, “Optimization of multiple-fraction batch
distillation with recycled waste cuts,” AIChE Journal, vol. 48, no. 12, pp. 2869–2874, 2002.
22
[47] H. G. Bock and K. J. Plitt, “A multiple shooting algorithm for direct solution of optimal control
problems,” in Proc. IFAC World Congr. Pergamon Press, 1984, pp. 242–247.
[48] J. Frey, S. D. Cairano, and R. Quirynen, “Active-Set based Inexact Interior Point QP Solver for
Model Predictive Control,” in Proc. IFAC World Congr., 2020.
[49] D. Schlipf, D. J. Schlipf, and M. Kühn, “Nonlinear model predictive control of wind turbines
using lidar,” Wind energy, vol. 16, no. 7, pp. 1107–1129, 2013.
[50] G. De Nicolao, L. Magni, and R. Scattolini, “Stabilizing Receding-Horizon control of nonlinear
time varying systems,” IEEE Trans. Automatic Control, vol. AC-43, no. 7, pp. 1030–1036, 1998.
[51] ——, “Stabilizing nonlinear receding horizon control via a nonquadratic terminal state penalty,”
in Symposium on Control, Optimization and Supervision, CESA’96 IMACS Multiconference,
Lille, 1996, pp. 185–187.
[52] M. Diehl, L. Magni, and G. D. Nicolao, “Efficient NMPC of unstable periodic systems using
approximate infinite horizon closed loop costing,” Annual Reviews in Control, vol. 28, no. 1, pp.
37–45, 2004.
[53] D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming. Belmont, MA: Athena Scientific,
1996.
[54] L. Magni, G. De Nicolao, L. Magnani, and R. Scattolini, “A stabilizing model-based predictive
control for nonlinear systems,” Automatica, vol. 37, no. 9, pp. 1351–1362, 2001.
[55] R. Quirynen, M. Vukov, M. Zanon, and M. Diehl, “Autogenerating microsecond solvers for non-
linear MPC: a tutorial using ACADO integrators,” Optimal Control Applications and Methods,
vol. 36, pp. 685–704, 2014.
[56] J. B. Rawlings, D. Q. Mayne, and M. M. Diehl, Model Predictive Control: Theory, Computation,
and Design, 2nd ed. Nob Hill, 2017.
[57] R. Verschueren, N. van Duijkeren, R. Quirynen, and M. Diehl, “Exploiting convexity in direct
optimal control: a sequential convex quadratic programming method,” in Proc. IEEE Conf.
Decis. Control (CDC), 2016.
[58] K. Baumgärtner and M. Diehl, “The extended Gauss-Newton method for nonconvex loss func-
tions and its application to time-optimal model predictive control,” in Proc. Amer. Control Conf.
(ACC), 2022.
[59] K. Baumgärtner, Y. Wang, A. Zanelli, and M. Diehl, “Fast nonlinear model predictive control
using barrier formulations and squashing with a generalized Gauss-Newton Hessian,” in Proc.
IEEE Conf. Decis. Control (CDC), 2022.
[60] K. Baumgärtner, A. Zanelli, and M. Diehl, “Stability analysis of nonlinear model predictive
control with progressive tightening of stage costs and constraints,” IEEE Control Systems Lett.,
2023.
[61] A. Zanelli, R. Quirynen, G. Frison, and M. Diehl, “A partially tightened real-time iteration
scheme for nonlinear model predictive control,” in Proc. 56th IEEE Conf. Decis. Control, Mel-
bourne, Australia, December 2017.
[62] A. Zanelli, “Inexact methods for nonlinear model predictive control: stability, applications, and
software,” Ph.D. dissertation, Univ. of Freiburg, 2021.
[63] R. Reiter, K. Baumgärtner, R. Quirynen, and M. Diehl, “Progressive smoothing for motion
planning in real-time NMPC,” Proc. Eur. Control Conf. (ECC), 2024.
[64] A. Zanelli, G. Horn, G. Frison, and M. Diehl, “Nonlinear model predictive control of a human-
sized quadrotor,” in Proc. Eur. Control Conf. (ECC), 2018, pp. 1542–1547.
[65] G. Valverde and T. Van Cutsem, “Model predictive control of voltages in active distribution
networks,” IEEE Trans. Smart Grid, vol. 4, no. 4, pp. 2152–2161, 2013.
[66] G. Frison, “Algorithms and methods for high-performance model predictive control,” Ph.D.
dissertation, Tech. Univ. Denmark, 2015.
[67] R. Dhaouadi and A. A. Hatab, “Dynamic modelling of differential-drive mobile robots using
Lagrange and Newton-Euler methodologies: A unified framework,” Advances in Robotics &
Automation, vol. 2, no. 2, pp. 1–7, 2013.
23
View publication stats

Arxiv Mocp

Uploaded by

Copyright:

Available Formats

Arxiv Mocp

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Arxiv Mocp

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Multi-Phase Optimal Control Problems for Efﬁcient Nonlinear Model

Preprint · August 2024

Jonathan Frey Gianluca Frison

SEE PROFILE SEE PROFILE

The user has requested enhancement of the downloaded file.

Preprint. Article submitted to journal Optimal Control Applications and

2 Multi-Phase Optimal Control Problem Formulations

2.1 Continuous-time multi-phase optimal control problem

subject to 0 = x̄0 − ξ1 (t1 ), (1b)

2.2 Multi-phase multiple shooting discretization

subject to x1,0 = x̄0 , (2b)

3 Multi-phase OCP formulations for Model Predictive

3.1 Continuous-time optimal control problem, model predictive con-

v(t) = u0 for t ∈ [0, ∆t]. (4)

3.2 Models of different fidelity

3.3 Piecewise polynomial control parameterization

3.4 Cost-to-go approximation via closed-loop costing

s.t. 0 = x(0) − x̄0 , (8b)

3.4.1 Practical treatment of constraints with closed-loop costing

ℓ(x(t), v(t)) = ϕ(r(x(t), v(t))) (9)

3.4.2 Squashed closed-loop costing

3.5 Partial tightening

−τ log(gj (x(t), v(t))), (10)

which is added to the cost function with a barrier parameter τ .

4 Efficient implementation in acados

4.1 Treating transition stages

250 MOCP N1 = 10 T1 = 0.4

cost-to-go V ([p0 , 0])

4.2 Multi-phase multiple shooting formulation in acados

subject to x̄0 = x(0), (11b)

4.3 Tutorial example of a multi-phase OCP

ℓ1 (ξ1 , v1 ) = p2 + 0.1s2 + 10−3 a2 , E1 (ξ1 ) = p2 + 0.01s2 ,

The transition model is defined as

5.1 Inverted pendulum on cart test problem

l(x, u) = x⊤ Qx + v ⊤ Rv + γ · (max(pmin − p, 0))2 + γ · (max(p − pmax , 0)2

5.1.1 Controller variants

0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0

Figure 4: Closed-loop trajectories corresponding to the benchmark results in Table 1. Values

0.40 REF-N10 PW-QUAD-1

5.2.2 Optimal control problem formulation

lact (xact , uact ) = xact ⊤ Qxact + |V r I r | + |V l I l | (16)

E act (xact ) = xact ⊤ Qxact (17)

lsimple (xsimple , usimple ) = xsimple ⊤ Q̃xsimple + usimple ⊤ R̃usimple (18)

E trans (xact ) = 0.5∆ttrans · (I r 2 + I l 2 ), (19)

E simple (xsimple ) = xsimple ⊤ Q̃xsimple . (20)

−10 ≤ V l ≤ 10, (23)

Respectively, for the simple model, we impose

−60 ≤ τ r ≤ 60, (25)

5.2.3 Comparing controller variants

mean computation time [ms]

5.3 Partial tightening

l(x, v) = x⊤ Qx + v ⊤ Rv, (27)

2 N = 100, Nexact = 5 N = 100, Nexact = 100

Figure 9: Closed-loop trajectories corresponding to different controller variants with partial

View publication stats

You might also like