Controls Engineering in the FIRST Robotics Competition
Graduate-level control theory for high schoolers
Tyler Veness
Copyright © 2017-2024 Tyler Veness
HTTPS://GITHUB.COM/CALCMOGUL/CONTROLS-ENGINEERING-IN-FRC
Licensed under the Creative Commons Attribution-ShareAlike 4.0 International License (the “License”). You may not use this file except in compliance with the License. You may obtain a copy of the License at https://creativecommons.org/licenses/by-sa/4.0/. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Generated from commit d8141f0 made on February 23, 2024. The latest version can
be downloaded from https://file.tavsys.net/control/controls-engineering-in-frc.pdf.
Tree next to Oakes Circle at UCSC
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
2 PID controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Proportional term 14
2.2 Derivative term 14
2.3 Integral term 16
2.4 PID controller definition 19
2.5 Response types 19
2.6 Manual tuning 20
2.7 Limitations 21
3 Application advice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1 Mechanical vs software solutions 23
3.2 Actuator saturation 24
4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.1 Derivatives 27
4.1.1 Power rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.2 Product rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.3 Chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.4 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Integrals 29
4.3 Change of variables 30
4.4 Tables 31
4.4.1 Common derivatives and integrals . . . . . . . . . . . . . . . . . . . . . . . . . 31
5 Linear algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1 Vectors 35
5.2 Linear combinations, span, and basis vectors 36
5.3 Linear transformations and matrices 36
5.4 Matrix multiplication 36
IV System modeling
11 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
11.1 Linear motion 187
11.2 Angular motion 187
11.3 Vectors 188
11.3.1 Basic vector operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
11.3.2 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
11.4 Curvilinear motion 189
11.5 Differential drive kinematics 190
11.5.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
11.5.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.6 Mecanum drive kinematics 193
11.6.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
11.6.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.7 Swerve drive kinematics 196
11.7.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.7.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
V Motion planning
VI Appendices
C Feedforwards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
C.1 QR-weighted linear plant inversion 281
C.1.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
C.1.2 Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
C.2 Steady-state feedforward 283
C.2.1 Continuous case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
C.2.2 Discrete case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
C.2.3 Deriving steady-state input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
D Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
D.1 Linear system zero-order hold 289
D.2 Kalman filter as Luenberger observer 291
D.2.1 Luenberger observer with separate prediction and update . . . . . . . . . . 292
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Online 327
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Flower bush near Spencer’s Market in Santa Maria, CA
Preface
When I was a high school student on FIRST Robotics Competition (FRC) team 3512,
I had to learn control theory from scattered internet sources that either weren’t rigorous
enough or assumed too much prior knowledge. After I took graduate-level control
theory courses from University of California, Santa Cruz for my bachelor’s degree, I
realized that the concepts weren’t difficult when presented well, but the information
wasn’t broadly accessible outside academia.
I wanted to fix the information disparity so more people could appreciate the beauty
and elegance I saw in control theory. This book streamlines the learning process to
make that possible.
I wrote the initial draft of this book as a final project for an undergraduate technical
writing class I took at UCSC in Spring 2017 (CMPE 185). It was a 13-page IEEE-
formatted paper intended as a reference manual and guide to state-space control that
summarized the three graduate controls classes I had taken that year. I kept working
on it the following year to flesh it out, and it eventually became long enough to make
into a proper book. I’ve been adding to it ever since as I learn new things.
I contextualized the material within FRC because it’s always been a significant part of
my life, and it’s a useful application sandbox. I’ve since contributed implementations
of many of this book’s tools to the FRC standard library (WPILib) and maintain them.
Acknowledgements
Thanks to my controls engineering instructors Dejan Milutinović and Gabriel Elkaim
for introducing me to the field of control theory. They taught me what it means to be
a controls engineer, and I appreciated their classes' pragmatic perspective focused on
intuition and application.
Trees by Baskin Engineering building at UCSC
0.1 Prerequisites
Knowledge of basic algebra and complex numbers is assumed. Some introductory
physics and calculus will be taught as necessary.
real systems. The examples from part IV are converted to state-space representation,
implemented, and tested with a discrete controller.
Part II also introduces the basics of nonlinear control system analysis with Lyapunov
functions. It presents an example of a nonlinear controller for a unicycle-like vehicle as
well as how to apply it to a two-wheeled vehicle. Since nonlinear control isn’t the focus
of this book, we mention other books and resources for further reading.
Part III “Estimation and localization” introduces the field of stochastic control theory. The Luenberger observer and the probability theory behind the Kalman filter are taught with several examples of creative applications of Kalman filter theory.
Part IV “System modeling” introduces the basic calculus and physics concepts required
to derive the models used in the previous chapters. It walks through the derivations for
several common FRC subsystems. Then, methods for system identification are discussed for empirically measuring model parameters.
Part V “Motion planning” covers planning how the robot will get from its current state
to some desired state in a manner achievable by its dynamics. It introduces motion
profiles with one degree of freedom for simple maneuvers. Trajectory optimization
methods are presented for generating profiles with higher degrees of freedom.
The appendices provide further enrichment that isn’t required for a passing understand-
ing of the material. This includes derivations for many of the results presented and
used in the mainmatter of the book.
The Python scripts used to generate the plots in the case studies double as reference
implementations of the techniques discussed in their respective chapters. They are
available in this book’s Git repository. Its location is listed on the copyright page.
of nonlinear control apply often, and it isn’t that much of a leap (if Lyapunov stability
isn’t included). The control and estimation chapters cover relevant tools for dealing
with nonlinearities like linearization when appropriate.
feedback control, two of which are for graduate students, in one short book). Please
send us feedback, corrections, or suggestions through the GitHub link listed on the
copyright page. New examples that demonstrate key concepts and make them more
accessible are also appreciated.
I  Fundamentals of control theory
2 PID controllers . . . . . . . . . . . . . . . . . . . . . . 13
3 Application advice . . . . . . . . . . . . . . . . . . 23
4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Road near walking trail off of Rice Ranch Road in Santa Maria, CA
Control systems are all around us and we interact with them daily. A small list of ones
you may have seen includes heaters and air conditioners with thermostats, cruise control
and the anti-lock braking system (ABS) on automobiles, and fan speed modulation
on modern laptops. Control systems monitor or control the behavior of systems like
these and may consist of humans controlling them directly (manual control), or of only
machines (automatic control).
How can we prove closed-loop controllers on an autonomous car, for example, will
behave safely and meet the desired performance specifications in the presence of un-
certainty? Control theory is an application of algebra and geometry used to analyze
and predict the behavior of systems, make them respond how we want them to, and
make them robust to disturbances and uncertainty.
Controls engineering is, put simply, the engineering process applied to control theory.
As such, it’s more than just applied math. While control theory has some beautiful math
behind it, controls engineering is an engineering discipline like any other that is filled
with trade-offs. The solutions control theory gives should always be sanity checked and
informed by our performance specifications. We don’t need to be perfect; we just need
to be good enough to meet our specifications.[1]
well above that which is necessary. Part of the problem is the use of jargon.
While it efficiently communicates ideas to those within the field, new people
who aren’t familiar with it are lost. Therefore, it’s important to define terms
before using them. See the glossary for a list of words and phrases commonly
used in control theory, their origins, and their meaning. Links to the glossary
are provided for certain words throughout the book and will use this color.
[Figure: Block diagram of a feedback loop. The input enters a sum node (the circle), passes through the open-loop gain K to the output branch, and the feedback path returns the output to the sum node.]
The open-loop gain is the total gain from the sum node at the input (the circle) to the
output branch. This would be the system’s gain if the feedback loop was disconnected.
The feedback gain is the total gain from the output back to the input sum node. A sum
node’s output is the sum of its inputs.
Figure 1.3 is a block diagram with more formal notation in a feedback configuration.
[Figure 1.3: Feedback configuration block diagram. The input X enters a sum node, passes through the forward path P1 to the output Y, and P2 feeds Y back to the sum node.]
[Figure: A feedforward controller maps the reference r(t) to a control input u(t) for the plant, which produces the output y(t).]
Controllers which incorporate information fed back from the plant’s output are called
closed-loop or feedback controllers. Figure 1.5 shows a plant with a feedback controller.
Note that the input and output of a system are defined from the plant’s point of view.
The negative feedback controller shown is driving the difference between the reference
and output, also known as the error, to zero.
Figure 1.6 shows a plant with feedforward and feedback controllers.
1.4 Feedforward
Feedback control can be effective for reference tracking (making a system’s output
follow a desired reference signal), but it’s a reactionary measure; the system won’t start
applying control effort until the system is already behind. If we could tell the controller about the desired movement and required input beforehand, the system could react quicker and the feedback controller could do less work. A controller that feeds
information forward into the plant like this is called a feedforward controller.
A feedforward controller injects information about the system’s dynamics (like a model
does) or the desired movement. The feedforward handles parts of the control actions
we already know must be applied to make a system track a reference, then feedback
compensates for what we do not or cannot know about the system’s behavior at runtime.
There are two types of feedforwards: model-based feedforward and feedforward for
unmodeled dynamics. The first solves a mathematical model of the system for the
inputs required to meet desired velocities and accelerations. The second compensates
for unmodeled forces or behaviors directly so the feedback controller doesn’t have to.
Both types can facilitate simpler feedback controllers; we’ll cover examples of each in
later chapters.
2. PID controllers
The proportional term drives the position error to zero, the derivative term drives the velocity error to zero, and the integral term accumulates the area between the setpoint and output plots over time (the integral of position error) and adds the current total to the control input. We'll go into more detail on each of these.
2.1 Proportional term

u(t) = Kp e(t)

where Kp is the proportional gain and e(t) is the error at the current time t.
Proportional gains act like “software-defined springs” that pull the system toward the
desired position. Recall from physics that we model springs as F = −kx where F
is the force applied, k is a proportional constant, and x is the displacement from the
equilibrium point. This can be written another way as F = k(0 − x) where 0 is the
equilibrium point. If we let the equilibrium point be our feedback controller’s setpoint,
the equations have a one-to-one correspondence.
F = k(r − x)
u(t) = Kp e(t) = Kp (r(t) − y(t))
so the “force” with which the proportional controller pulls the system’s output toward
the setpoint is proportional to the error, just like a spring.
2.2 Derivative term

u(t) = Kp e(t) + Kd de(t)/dt (2.2)
where Kp is the proportional gain, Kd is the derivative gain, and e(t) is the error
at the current time t.
[Figure: Block diagram of a PD controller. The error e(t) = r(t) − y(t) feeds a proportional path Kp e(t) and a derivative path Kd de(t)/dt, whose sum is the plant input u(t), producing the output y(t).]
A PD controller has a proportional controller for position (Kp) and a proportional controller for velocity (Kd). The velocity setpoint is implicitly provided by how the position setpoint changes over time. Figure 2.3 shows an example for an elevator.
To prove a PD controller is just two proportional controllers, we will rearrange the
equation for a PD controller.
uk = Kp ek + Kd (ek − ek−1)/dt
where uk is the control input at timestep k and ek is the error at timestep k. ek is defined
as ek = rk − yk where rk is the setpoint and yk is the current output at timestep k.
uk = Kp (rk − yk) + Kd ((rk − rk−1)/dt − (yk − yk−1)/dt)
Notice how (rk − rk−1)/dt is the velocity of the setpoint. By the same reasoning, (yk − yk−1)/dt is the system's velocity at a given timestep. That means the Kd term of the PD controller is driving the estimated velocity to the setpoint velocity.
If the setpoint is constant, the implicit velocity setpoint is zero, so the Kd term slows
the system down if it’s moving. This acts like a “software-defined damper”. These
are commonly seen on door closers, and their damping force increases linearly with
velocity.
2.3 Integral term

u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ

where Kp is the proportional gain, Ki is the integral gain, e(t) is the error at the current time t, and τ is the integration variable.
The integral integrates from time 0 to the current time t. We use τ for the integration
because we need a variable to take on multiple values throughout the integral, but
we can’t use t because we already defined that as the current time.
[Figure: Block diagram of a PI controller. The error e(t) = r(t) − y(t) feeds a proportional path Kp e(t) and an integral path Ki ∫₀ᵗ e(τ) dτ, whose sum is the plant input u(t), producing the output y(t).]
When the system is close to the setpoint in steady-state, the proportional term may be
too small to pull the output all the way to the setpoint, and the derivative term is zero.
This can result in steady-state error, as shown in figure 2.5.
A common way of eliminating steady-state error is to integrate the error and add it to
the control input. This increases the control effort until the system converges. Figure
2.5 shows an example of steady-state error for a flywheel, and figure 2.6 shows how
an integrator added to the flywheel controller eliminates it. However, too high of an
integral gain can lead to overshoot, as shown in figure 2.7.
R There are better approaches to fix steady-state error like using feedforwards or
constraining when the integral control acts using other knowledge of the system.
We will discuss these in more detail when we get to modern control theory.
2.4 PID controller definition

u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ + Kd de(t)/dt

where Kp is the proportional gain, Ki is the integral gain, Kd is the derivative gain, e(t) is the error at the current time t, and τ is the integration variable.
Figure 2.8 shows a block diagram for a system controlled by a PID controller.
[Figure 2.8: Block diagram of a PID controller combining the proportional path Kp e(t), the integral path Ki ∫₀ᵗ e(τ) dτ, and the derivative path Kd de(t)/dt.]
Beware that if Ki is too large, integral windup can occur. Following a large change in setpoint, the integral term can accumulate an error larger than the maximal control input. As a result, the system overshoots and continues to increase until this accumulated error is unwound.
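To make the definition concrete, here's a minimal discrete PID controller sketch in Python with a clamped integral to limit windup. It's illustrative rather than production code (the class name and the clamping scheme are our own choices, not a library API).

import math

class PID:
    """Discrete PID controller with a clamped integral to limit windup."""

    def __init__(self, Kp, Ki, Kd, dt, max_integral=math.inf):
        self.Kp = Kp
        self.Ki = Ki
        self.Kd = Kd
        self.dt = dt
        self.max_integral = max_integral
        self.integral = 0.0
        self.prev_error = 0.0

    def calculate(self, setpoint, measurement):
        error = setpoint - measurement
        # Accumulate the integral of the error, clamped to limit windup
        self.integral += error * self.dt
        self.integral = max(-self.max_integral,
                            min(self.max_integral, self.integral))
        # Backward finite difference estimates the error derivative
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.Kp * error + self.Ki * self.integral + self.Kd * derivative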
2.7 Limitations
PID's heuristic method of tuning is a reasonable choice when there is no a priori knowledge of the system dynamics. However, controllers with much better response can be developed if a dynamical model of the system is known. Furthermore, PID only applies to single-input, single-output (SISO) systems; we'll cover methods for multiple-input, multiple-output (MIMO) control in part II of this book.
Treeline by Crown/Merril bus stop at UCSC
3. Application advice
the endgame starts. In many cases, manual processes can be automated later and given
a manual fallback if the associated software or sensor fails. Be cautious with designs
that require closed-loop control to function.
When should problems be solved in hardware instead of software with clever controls?
Controls can handle disturbances like battery voltage drop or measurement noise, but
there are limits. For example, there’s nothing software can do to work around a drive-
train gearbox seizing up or throwing a chain. Sometimes, you’re better off just fixing
the root cause in hardware.
Design robot mechanisms for controllability. FRC team 971’s “Mechanical Design for
Controllability” seminar goes into more detail. Two of the important takeaways from
it are:
• Reduce gearbox backlash
• Choose motors and gear reductions that provide adequate control authority
Remember, “fix it in software” isn’t always the answer. The remaining chapters of this
book assume you’ve done the engineering analysis and concluded that your chosen de-
sign would benefit from more sophisticated controls, and you have the time or expertise
to make it work.
Let u = k(r − x) be a proportional control law, umax be the saturated control input, u be the uncapped value of u, and kmax be the associated maximum gain. We will now compare the capped and uncapped controllers for the same reference and current state.

umax < u
kmax(r − x) < k(r − x)
kmax < k
For the inequality to hold, kmax must be less than the original value for k. This reduced
gain is evident in a system response when there is a linear change in state instead of
an exponential one as it approaches the reference. This is due to the control effort no
longer following a decaying exponential plot. Once the system is closer to the reference,
the controller will stop saturating and produce realistic controller values again.
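This linear-then-exponential behavior is easy to see in simulation. The following sketch is our own example (a made-up single-state system ẋ = u with made-up gains), comparing a proportional controller with and without a saturation limit:

import numpy as np

dt = 0.005
r = 1.0      # reference
k = 50.0     # proportional gain
u_max = 12.0  # saturation limit

for saturate in (False, True):
    x = 0.0
    history = []
    for _ in range(200):
        u = k * (r - x)
        if saturate:
            u = np.clip(u, -u_max, u_max)  # actuator saturation
        x += u * dt  # integrate ẋ = u with forward Euler
        history.append(x)
    print("saturated" if saturate else "unsaturated", history[::50])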
Hills by northbound freeway between Santa Maria and Ventura
4. Calculus
This book uses derivatives and integrals occasionally to represent small changes in val-
ues over small changes in time and the infinitesimal sum of values over time respec-
tively. The formulas and tables presented here are all you’ll need to carry through with
the math in later chapters.
If you are interested in more information after reading this chapter, 3Blue1Brown does
a fantastic job of introducing them in his Essence of calculus series. We recommend
reading lesson chapters 1 through 3, 5, and 10 through 13 for a solid foundation.
“Essence of calculus”
3Blue1Brown
https://www.3blue1brown.com/topics/calculus
4.1 Derivatives
Derivatives are expressions for the slope of a curve at arbitrary points. Common notations for this operation on a function like f(x) include d/dx f(x) (Leibniz notation) and f′(x) (Lagrange notation).

4.1.1 Power rule

f(x) = xⁿ
f′(x) = nxⁿ⁻¹
4.1.2 Product rule

h(x) = f(x)g(x)
h′(x) = f′(x)g(x) + f(x)g′(x)
4.1.3 Chain rule

h(x) = f(g(x))
h′(x) = f′(g(x)) · g′(x)
For example,

h(x) = (3x + 2)⁵
h′(x) = 5(3x + 2)⁴ · (3)
h′(x) = 15(3x + 2)⁴
4.2 Integrals
The integral is the inverse operation of the derivative and calculates the area under a
curve. Here is an example of one based on table 4.2.
∫ e^{at} dt = (1/a)e^{at} + C
The arbitrary constant C is needed because when you take a derivative, constants are
discarded because vertical offsets don’t affect the slope. When performing the inverse
operation, we don’t have enough information to determine the constant.
However, we can provide bounds for the integration.
∫₀ᵗ e^{at} dt
= [(1/a)e^{at} + C]₀ᵗ
= ((1/a)e^{at} + C) − ((1/a)e^{a·0} + C)
= (1/a)e^{at} + C − (1/a) − C
= (1/a)e^{at} − (1/a)
4.3 Change of variables

Change of variables substitutes a new variable for an expression inside an integral to simplify it. For example, consider ∫ cos(ωt) dt. Let u = ωt.

du = ω dt
dt = (1/ω) du

Substituting,

∫ cos(ωt) dt = ∫ cos(u) (1/ω) du = (1/ω) sin(u) + C = (1/ω) sin(ωt) + C
Another example, which will be relevant when we actually cover state-space notation
(ẋ = Ax + Bu), is a closed-loop state-space system.
ẋ = (A − BK)x + BKr
ẋ = Acl x + Bcl ucl
where Acl = A − BK, Bcl = BK, and ucl = r. Since it matches the form of the
open-loop system, all the same analysis tools will work with it.
4.4 Tables
4.4.1 Common derivatives and integrals
∫ f(x) dx        f(x)        f′(x)
ax               a           0
(1/2)ax²         ax          a
x^{a+1}/(a+1)    x^a         ax^{a−1}
(1/a)e^{ax}      e^{ax}      ae^{ax}
−cos(x)          sin(x)      cos(x)
sin(x)           cos(x)      −sin(x)
cos(x)           −sin(x)     −cos(x)
−sin(x)          −cos(x)     sin(x)

Table 4.2: Common derivatives and integrals
5. Linear algebra
Modern control theory borrows concepts from linear algebra. At first, linear algebra
may appear very abstract, but there are simple geometric intuitions underlying it. We’ll
be linking to 3Blue1Brown’s Essence of linear algebra video series because it’s better
at conveying that intuition than static text.
5.1 Vectors
5.9.2 Operators
Transpose
The T in AT denotes transpose, which flips the matrix across its diagonal such that the
rows become columns and vice versa.
Trace
tr(A) denotes the trace of the square matrix A, which is defined as the sum of the
elements on the main diagonal (top-left to bottom-right).
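As a quick numerical illustration of both operators (the matrix values are our own example):

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(A.T)          # transpose: [[1. 3.], [2. 4.]]
print(np.trace(A))  # trace: 1 + 4 = 5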
Type                   Notation   Eigenvalues of M
Negative definite      M < 0      all λ < 0
Negative semidefinite  M ≤ 0      all λ ≤ 0
Positive semidefinite  M ≥ 0      all λ ≥ 0
Positive definite      M > 0      all λ > 0
Indefinite             —          both λ < 0 and λ > 0 exist

Table 5.1: Eigenvalue distribution and notation for each type of matrix definiteness. Let M be a matrix.

Type                   Definition
Negative definite      xᵀMx < 0 for all x
Negative semidefinite  xᵀMx ≤ 0 for all x
Positive semidefinite  xᵀMx ≥ 0 for all x
Positive definite      xᵀMx > 0 for all x
Indefinite             ∃x, y ∋ xᵀMx < 0 < yᵀMy

Table 5.2: Rigorous definition for each type of matrix definiteness. Let M be a matrix and let x and y be nonzero vectors.
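For a symmetric matrix, definiteness can be checked numerically from its eigenvalues. A small sketch with example values:

import numpy as np

M = np.array([[2.0, -1.0], [-1.0, 2.0]])
eigvals = np.linalg.eigvalsh(M)  # eigenvalues of a symmetric matrix
print(eigvals)                   # [1. 3.] -> all positive, so M is positive definite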
AᵀX + XA − XBR⁻¹BᵀX + Q = 0 (5.1)
Snippet 5.1 computes the unique stabilizing solution to the discrete algebraic Riccati
equation. A robust implementation should also enforce the following preconditions:
1. Q = Qᵀ ≥ 0 and R = Rᵀ > 0.
2. (A, B) is a stabilizable pair (see subsection 6.12.4).
3. (A, C) is a detectable pair where Q = CᵀC (see section 6.12.8).

#include <Eigen/Cholesky>
#include <Eigen/Core>
#include <Eigen/LU>

// Solves the DARE with the structure-preserving doubling algorithm (SDA).
// H_k converges to the unique stabilizing solution.
Eigen::MatrixXd DARE(const Eigen::MatrixXd& A, const Eigen::MatrixXd& B,
                     const Eigen::MatrixXd& Q, const Eigen::MatrixXd& R) {
  Eigen::MatrixXd A_k = A;
  Eigen::MatrixXd G_k = B * R.llt().solve(B.transpose());
  Eigen::MatrixXd H_k;
  Eigen::MatrixXd H_k1 = Q;
  do {
    H_k = H_k1;
    // W = I + G_k H_k
    Eigen::MatrixXd W =
        Eigen::MatrixXd::Identity(H_k.rows(), H_k.cols()) + G_k * H_k;
    auto W_solver = W.lu();
    Eigen::MatrixXd V_1 = W_solver.solve(A_k);  // W⁻¹A_k
    Eigen::MatrixXd V_2 = W_solver.solve(G_k);  // W⁻¹G_k
    G_k += A_k * V_2 * A_k.transpose();
    H_k1 = H_k + A_k.transpose() * H_k * V_1;
    A_k *= V_1;
  } while ((H_k1 - H_k).norm() > 1e-10 * H_k1.norm());
  return H_k1;
}
5.12.1 Jacobian
The Jacobian is the first-order partial derivative of a vector-valued function with respect
to one of its vector arguments. The columns of the Jacobian of f are filled with partial
derivatives of f ’s rows with respect to each of the argument’s elements. For example,
the Jacobian of f with respect to x is
∂f(x, u)/∂x = [∂f(x, u)/∂x1 ⋯ ∂f(x, u)/∂xm] = [∂f1/∂x1 ⋯ ∂f1/∂xm; ⋮ ⋱ ⋮; ∂fm/∂x1 ⋯ ∂fm/∂xm]

∂f1/∂x1 is the partial derivative of the first row of f with respect to the first row of x, and so on for all rows of f and x. This has m² permutations and thus produces a square matrix.
The Jacobian of f with respect to u is

∂f(x, u)/∂u = [∂f(x, u)/∂u1 ⋯ ∂f(x, u)/∂un] = [∂f1/∂u1 ⋯ ∂f1/∂un; ⋮ ⋱ ⋮; ∂fm/∂u1 ⋯ ∂fm/∂un]

∂f1/∂u1 is the partial derivative of the first row of f with respect to the first row of u, and so on for all rows of f and u. This has m × n permutations and can produce a nonsquare matrix if m ≠ n.
5.12.2 Hessian

The Hessian is the second-order partial derivative of a vector-valued function with respect to one of its vector arguments. For example, the Hessian of f with respect to x is

∂²f(x, u)/∂x² = [∂²f(x, u)/∂x1² ⋯ ∂²f(x, u)/∂xm²] = [∂²f1/∂x1² ⋯ ∂²f1/∂xm²; ⋮ ⋱ ⋮; ∂²fm/∂x1² ⋯ ∂²fm/∂xm²]
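Computer algebra systems can compute these directly, which is handy for checking hand derivations. A sketch using SymPy with an example function of our own choosing:

import sympy as sp

x1, x2 = sp.symbols("x1 x2")
x = sp.Matrix([x1, x2])
f = sp.Matrix([x1**2 + x2, sp.sin(x1) * x2])

J = f.jacobian(x)  # 2x2 matrix of partial derivatives
print(J)           # Matrix([[2*x1, 1], [x2*cos(x1), sin(x1)]])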
Theorem 5.12.1  ∂(xᵀAx)/∂x = 2Ax where A is symmetric.

Theorem 5.12.2  ∂((Ax + b)ᵀC(Dx + e))/∂x = AᵀC(Dx + e) + DᵀCᵀ(Ax + b)

Corollary 5.12.3  ∂((Ax + b)ᵀC(Ax + b))/∂x = 2AᵀC(Ax + b) where C is symmetric.

Proof:

∂((Ax + b)ᵀC(Ax + b))/∂x = AᵀC(Ax + b) + AᵀCᵀ(Ax + b)
∂((Ax + b)ᵀC(Ax + b))/∂x = (AᵀC + AᵀCᵀ)(Ax + b)

C is symmetric, so

∂((Ax + b)ᵀC(Ax + b))/∂x = (AᵀC + AᵀC)(Ax + b) = 2AᵀC(Ax + b)
cause its response is hard to tune and some of its destabilizing dynamics aren’t visible
during simulation.
Why use model-based control in FRC? Poor build season schedule management often
leads to the software team:
1. Not getting enough time to verify basic functionality and test/tune feedback con-
trollers.
2. Spending dedicated software testing time troubleshooting mechanical/electrical
issues within recently integrated subsystems instead.
Model-based control (one of the focuses of this book) avoids both problems because
it lets software teams test basic functionality in simulation much earlier in the build
season and tune their feedback controllers automatically.
where ω is the angular velocity and ωmax is the maximum angular velocity. If DC
brushed motors are said to behave linearly, then why is this?
Linearity refers to a system’s equations of motion, not its time domain response. The
equation defining the motor’s change in angular velocity over time looks like
ω̇ = −aω + bV
where ω̇ is the derivative of ω with respect to time, V is the input voltage, and a and
b are constants specific to the motor. This equation, unlike the one shown before, is
actually linear because it only consists of multiplications and additions relating the input
V and current state ω.
Also of note is that the relation between the input voltage and the angular velocity
of the output shaft is a linear regression. You’ll see why if you model a DC brushed
motor as a voltage source and generator producing back-EMF (in the equation above,
bV corresponds to the voltage source and −aω corresponds to the back-EMF). As you
increase the input voltage, the back-EMF increases linearly with the motor’s angular
velocity. If there was a friction term that varied with the angular velocity squared (air
resistance is one example), the relation from input to output would be a curve. Friction
that scales with just the angular velocity would result in a lower maximum angular
velocity, but because that term can be lumped into the back-EMF term, the response
is still linear.
6.3.2 Definition
Below is the continuous version of state-space notation.
ẋ = Ax + Bu (6.1)
y = Cx + Du (6.2)
The change in state and the output are linear combinations of the state vector and the
input vector. The A and B matrices are used to map the state vector x and the input
vector u to a change in the state vector ẋ. The C and D matrices are used to map the
state vector x and the input vector u to an output vector y.
[1] Section 7.3 will explain why the matrix exponential e^{At} shows up here.
So the system tends toward the equilibrium point (i.e., it’s stable) if λj < 0 for all j.
Now let's consider when the eigenvalues are complex numbers. What does that mean for the system response? Let λj = aj + bj i. Each of the exponential terms in the solution can be written as

e^{λj t} = e^{(aj + bj i)t} = e^{aj t} e^{i bj t}

Therefore,

e^{λj t} = e^{aj t}(cos(bj t) + i sin(bj t))

When the eigenvalue's imaginary part bj ≠ 0, it contributes oscillation to the real part's response.
The eigenvalues of A are called poles.[4] Figure 6.1 shows the impulse responses in
the time domain for systems with various pole locations in the complex plane (real
numbers on the x-axis and imaginary numbers on the y-axis). Each response has an
initial condition of 1.
[2] We're handwaving why this is the case, but it's a consequence of e^{At} being diagonalizable.
[3] Euler’sformula may seem surprising at first, but it’s rooted in the fact that complex exponentials are
rotations in the complex plane around the origin. If you can imagine walking around the unit circle traced
by that rotation, you’ll notice the real part of your position oscillates between −1 and 1 over time. That is,
complex exponentials manifest as oscillations in real life.
“What is Euler’s formula actually saying? | Ep. 4 Lockdown live math” (51 minutes)
3Blue1Brown
https://youtu.be/ZxYOEwM6Wbk
[4] This name comes from classical control theory. See subsection E.2.2 for more.
[Figure 6.1: Impulse responses in the time domain for poles at various locations in the complex plane (real axis Re, imaginary axis Im). Responses for left half-plane (LHP) poles decay, responses for poles on the imaginary axis oscillate at constant amplitude, and responses for right half-plane (RHP) poles grow.]
Poles in the left half-plane (LHP) are stable; the system’s output may oscillate but
it converges to steady-state. Poles on the imaginary axis are marginally stable; the
system’s output oscillates at a constant amplitude forever. Poles in the right half-plane
(RHP) are unstable; the system’s output grows without bound.
ẋ = Ax + BK(r − x)
ẋ = Ax + BKr − BKx
ẋ = (A − BK)x + BKr (6.3)
Now for the output equation. Substitute the control law into equation (6.2).
y = Cx + D(K(r − x))
y = Cx + DKr − DKx
y = (C − DK)x + DKr (6.4)
Instead of commanding the system to a state using the vector u directly, we can now
specify a vector of desired states through r and the controller will choose values of u
for us over time that drive the system toward the reference.
The eigenvalues of A − BK are the poles of the closed-loop system. Therefore, the
rate of convergence and stability of the closed-loop system can be changed by moving
the poles via the eigenvalues of A − BK. A and B are inherent to the system, but
K can be chosen arbitrarily by the controller designer. For equation (6.3) to reach
steady-state, the eigenvalues of A − BK must be in the left-half plane. There will be
steady-state error if Ar ̸= 0.
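We can check a candidate gain K numerically: if every eigenvalue of A − BK has a negative real part, the closed loop is stable. A sketch with example matrices of our own:

import numpy as np

A = np.array([[0.0, 1.0], [0.0, -5.0]])
B = np.array([[0.0], [2.0]])
K = np.array([[10.0, 2.0]])

poles = np.linalg.eigvals(A - B @ K)
print(poles)                   # closed-loop poles (here -4 and -5)
print(np.all(poles.real < 0))  # True -> stable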
y = Cx + Du
[5] Observers use a matrix K to steer the state estimate toward the true state in the same way controllers
use a matrix K to steer the current state toward a desired state. We’ll cover this in more detail in part III.
[y; u] = [C; −K]x + [D; 0]u
6.6.5 Examples
Snippet 6.1 shows how one packs together the following augmented matrix in Python using np.block().

[A B; C D]

import numpy as np

# Example system matrices (placeholder values for illustration)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

aug = np.block([[A, B], [C, D]])

Snippet 6.2 shows how one packs together the same augmented matrix in Python using array slices.

import numpy as np

# Reuses A, B, C, D from snippet 6.1
aug = np.zeros((A.shape[0] + C.shape[0], A.shape[1] + B.shape[1]))
aug[: A.shape[0], : A.shape[1]] = A
aug[: A.shape[0], A.shape[1] :] = B
aug[A.shape[0] :, : A.shape[1]] = C
aug[A.shape[0] :, A.shape[1] :] = D
Section 6.7 demonstrates model augmentation for different types of integral control.
ẋI = e = r − Cx
[6]With proper tuning, proportional control should generally be doing much more work than integral
control.
u = K(r − x) − KI xI
u = [K KI]([r; 0] − [x; xI])
ẋ = Ax + B (u + uerror )
ẋ = Ax + Bu + Buerror
ẋ = Ax + Bu + Berror uerror
where Berror is a column vector that maps uerror to changes in the rest of the state
the same way B does for u. Berror is only a column of B if uerror corresponds to an
existing input within u.
Notice how the state is augmented with uerror . With this model, the observer will
estimate both the state and the uerror term. The controller is augmented similarly. r
is augmented with a zero for the goal uerror term.
u = K(r − x) − kerror uerror
u = [K kerror]([r; 0] − [x; uerror])
where kerror is a column vector with a 1 in a given row if uerror should be applied to
that input or a 0 otherwise.
This process can be repeated for an arbitrary error which can be corrected via some
linear combination of the inputs.
6.9 Elevator
This elevator consists of a DC brushed motor attached to a pulley that drives a mass up
or down.
[Figure: Elevator system diagram. The motor circuit (input voltage V, resistance R, current I, back-EMF Vemf, angular velocity ωm) drives a pulley of radius r at angular velocity ωp through a gear ratio G, moving a mass m at linear velocity v.]
ẋ = v (6.5)
v̇ = −(G²Kt)/(Rr²mKv) v + (GKt)/(Rrm) V (6.6)
Augment the matrix equation with the position state x, which has the model equation
ẋ = v. The matrix elements corresponding to v will be 1, and the others will be 0 since
they don’t appear, so ẋ = 0x + 1v + 0V . The existing rows will have zeroes inserted
where x is multiplied in.
[ẋ; v̇] = [0, 1; 0, −(G²Kt)/(Rr²mKv)][x; v] + [0; (GKt)/(Rrm)]V
ẋ = Ax + Bu
y = Cx + Du

x = [x; v] = [position; velocity]    y = x = position    u = V = voltage

A = [0, 1; 0, −(G²Kt)/(Rr²mKv)] (6.7)
B = [0; (GKt)/(Rrm)] (6.8)
C = [1 0] (6.9)
D = 0 (6.10)
This will compensate for unmodeled dynamics such as gravity. However, using a con-
stant voltage feedforward to counteract gravity is preferred over uerror estimation in
this case because it results in a simpler controller with similar performance.
where B is the motor acceleration term from B and uff is the voltage feedforward.

Buff = −(−g)
Buff = g
(GKt)/(Rrm) uff = g
uff = (Rrmg)/(GKt)
6.9.4 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccontrol examples[7] creates and tests a controller for it. Figure 6.3 shows the closed-loop system response.
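For readers who want to experiment without frccontrol, here's a minimal sketch of the same pipeline using NumPy and SciPy. The physical constants are illustrative placeholders, not measurements of a real mechanism.

import numpy as np
import scipy.linalg as la
from scipy.signal import cont2discrete

# Example constants (placeholders): gear ratio, motor params, pulley radius, mass
G, Kt, Kv, R, r, m = 40.0, 0.0184, 49.7, 0.0916, 0.0191, 6.0

A = np.array([[0.0, 1.0], [0.0, -(G**2 * Kt) / (R * r**2 * m * Kv)]])
B = np.array([[0.0], [(G * Kt) / (R * r * m)]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Zero-order hold discretization at a 5 ms sample period
T = 0.005
Ad, Bd, _, _, _ = cont2discrete((A, B, C, D), T, method="zoh")

# LQR gain from the discrete algebraic Riccati equation
Q = np.diag([1.0 / 0.02**2, 1.0 / 0.4**2])  # state tolerances: 2 cm, 0.4 m/s
Rcost = np.array([[1.0 / 12.0**2]])         # input tolerance: 12 V
S = la.solve_discrete_are(Ad, Bd, Q, Rcost)
K = np.linalg.solve(Rcost + Bd.T @ S @ Bd, Bd.T @ S @ Ad)
print(K)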
6.9.5 Implementation
C++ and Java implementations of this elevator controller are available online.[8][9]
[7] https://github.com/calcmogul/frccontrol/blob/main/examples/elevator.py
[8] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/
StateSpaceElevator/cpp/Robot.cpp
[9] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/
wpilibj/examples/statespaceelevator/Robot.java
6.10 Flywheel
This flywheel consists of a DC brushed motor attached to a spinning mass of non-
negligible moment of inertia.
ω̇ = −(G²Kt)/(Kv RJ) ω + (GKt)/(RJ) V

Factor out ω and V into column vectors.

[ω̇] = [−(G²Kt)/(Kv RJ)]ω + [(GKt)/(RJ)]V
[Figure: Flywheel system diagram. The motor circuit (input voltage V, resistance R, current I, back-EMF Vemf, angular velocity ωm) drives a flywheel of moment of inertia J at angular velocity ωf through a gear ratio G.]
ẋ = Ax + Bu
y = Cx + Du

x = ω = angular velocity    y = ω = angular velocity    u = V = voltage

A = −(G²Kt)/(Kv RJ) (6.13)
B = (GKt)/(RJ) (6.14)
C = 1 (6.15)
D = 0 (6.16)
Augment the state with the voltage error uerror for the controller.

x = [ω; uerror]    y = ω    u = V

Aaug = [A B; 0 0]    Baug = [B; 0]    Caug = [C 0]    Daug = D (6.17)
Kaug = [K 1]    raug = [r; 0] (6.18)
This will compensate for unmodeled dynamics such as projectiles slowing down the
flywheel.
6.10.3 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccontrol examples[10] creates and tests a controller for it. Figure 6.5 shows the closed-loop system response.
Notice how the control effort when the reference is reached is nonzero. This is a plant
inversion feedforward compensating for the system dynamics attempting to slow the
flywheel down when no voltage is applied.
[10] https://github.com/calcmogul/frccontrol/blob/main/examples/flywheel.py
6.10.4 Implementation
C++ and Java implementations of this flywheel controller are available online.[11][12]
ω̇ = (GKt)/(RJ) V − (G²Kt)/(Kv RJ) ω
ω̇ = −(G²Kt)/(Kv RJ) ω + (GKt)/(RJ) V
Next, we'll derive the current I as an output.

V = IR + ω/Kv
IR = V − ω/Kv
I = −(1/(Kv R)) ω + (1/R) V
Therefore,
[11] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/
StateSpaceFlywheel/cpp/Robot.cpp
[12] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/
wpilibj/examples/statespaceflywheel/Robot.java
ẋ = Ax + Bu
y = Cx + Du

x = ω = angular velocity    y = I = current    u = V = voltage

A = −(G²Kt)/(Kv RJ) (6.19)
B = (GKt)/(RJ) (6.20)
C = −1/(Kv R) (6.21)
D = 1/R (6.22)
Notice that in this model, the output doesn’t provide any direct measurements of the
state. To estimate the full state (also known as full observability), we only need the
outputs to collectively include linear combinations of every state[13] . We’ll revisit this
in chapter 9 with an example that uses range measurements to estimate an object’s
orientation.
The effectiveness of this model’s observer is heavily dependent on the quality of the
current sensor used. If the sensor’s noise isn’t zero-mean, the observer won’t converge
to the true state.
V = (Vnominal/Vrail) Vcmd (6.23)

where V is the controller's new input voltage, Vcmd is the old input voltage, Vnominal is the rail voltage when effects like voltage drop due to current draw are ignored, and Vrail is the real rail voltage.
[13]While the flywheel model’s outputs are a linear combination of both the states and inputs, inputs don’t
provide new information about the states. Therefore, they don’t affect whether the system is observable.
To drive the model with a more accurate voltage that includes voltage drop, the reciprocal can be used.

V = (Vrail/Vnominal) Vcmd (6.24)
where V is the model’s new input voltage. Note that if both the controller compensation
and model compensation equations are applied, the original voltage is obtained. The
model input only drops from ideal if the compensated controller voltage saturates.
For an explanation of where these equations come from, read section 12.1.
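In code, each compensation is a one-line scaling. A sketch (the function names are our own):

def compensate_controller_voltage(V_cmd, V_rail, V_nominal=12.0):
    """Scale the commanded voltage so the duty cycle accounts for rail droop."""
    return V_cmd * V_nominal / V_rail

def compensate_model_voltage(V_cmd, V_rail, V_nominal=12.0):
    """Scale the voltage applied to the model to include rail droop."""
    return V_cmd * V_rail / V_nominal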
First, we'll solve for dω/dt in terms of V.

V = (J (dω/dt) / Kt) R + ω/Kv
V − ω/Kv = (J (dω/dt) / Kt) R
dω/dt = (Kt/(JR)) (V − ω/Kv)
dω/dt = −(Kt/(JRKv)) ω + (Kt/(JR)) V (6.28)

where −Kt/(JRKv) corresponds to A, ω to x, Kt/(JR) to B, and V to u.
u = K(r − x)
V = Kp (ωgoal − ω)
Closed-loop models have the form ẋ = (A − BK)x + BKr. Therefore, the closed-loop poles are the eigenvalues of A − BK.

ẋ = (A − BK)x + BKr
ω̇ = (−Kt/(JRKv) − (Kt/(JR))Kp) ω + (Kt/(JR))Kp ωgoal
ω̇ = −(Kt/(JRKv) + KtKp/(JR)) ω + (KtKp/(JR)) ωgoal

This closed-loop flywheel model has one pole at −(Kt/(JRKv) + KtKp/(JR)). It therefore only needs one P controller to place that pole anywhere on the real axis. A derivative term
Figure 6.6: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH). Figure 6.7: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH).
Subsection 6.7.2 covers a superior compensation method that avoids zeroes in the con-
troller, doesn’t act against the desired control action, and facilitates better tracking.
[Figure: Single-jointed arm system diagram. The motor circuit (input voltage V, resistance R, current I, back-EMF Vemf, angular velocity ωm) drives an arm of mass m and length l at angular velocity ωarm through a gear ratio G.]
Augment the matrix equation with the angle state θarm , which has the model equation
θ̇arm = ωarm . The matrix elements corresponding to ωarm will be 1, and the others
will be 0 since they don’t appear, so θ̇arm = 0θarm + 1ωarm + 0V . The existing rows
will have zeroes inserted where θarm is multiplied in.
[θ̇arm; ω̇arm] = [0, 1; 0, −(G²Kt)/(Kv RJ)][θarm; ωarm] + [0; (GKt)/(RJ)]V
ẋ = Ax + Bu
y = Cx + Du

x = [θarm; ωarm] = [angle; angular velocity]    y = θarm = angle    u = V = voltage

A = [0, 1; 0, −(G²Kt)/(Kv RJ)] (6.31)
B = [0; (GKt)/(RJ)] (6.32)
C = [1 0] (6.33)
D = 0 (6.34)
This will compensate for unmodeled dynamics such as gravity or other external loading
from lifted objects. However, if only gravity compensation is desired, a feedforward
of the form uf f = Vgravity cos θ is preferred where Vgravity is the voltage required
to hold the arm level with the ground and θ is the angle of the arm with the ground.
Torque is usually written as rF sin θ where θ is the angle between the r and F vectors, but θ in this case is being measured from the horizontal axis rather than the vertical one, so the force vector is π/2 radians out of phase. Thus, an angle of 0 results in the maximum torque from gravity being applied rather than the minimum.

The force of gravity mg is applied at the center of the arm's mass. For a uniform beam, this is halfway down its length, or L/2 where L is the length of the arm.
JBuff = −(L/2)(−mg) cos θ
JBuff = (L/2) mg cos θ

B = (GKt)/(RJ), so

J (GKt)/(RJ) uff = (L/2) mg cos θ
uff = (RJ/(JGKt)) (L/2) mg cos θ
uff = (L/2) (Rmg/(GKt)) cos θ

L/2 can be adjusted according to the location of the arm's center of mass.
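As a quick sketch (the function name is our own, and the constants are assumed to be defined as above):

import math

def arm_gravity_feedforward(theta, L, m, R, G, Kt, g=9.81):
    """Voltage to hold a uniform arm at angle theta (radians from horizontal)."""
    return (L / 2) * (R * m * g) / (G * Kt) * math.cos(theta)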
6.11.4 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccontrol examples[14] creates and tests a controller for it. Figure 6.9 shows the closed-loop system response.
[14] https://github.com/calcmogul/frccontrol/blob/main/examples/single_jointed_arm.py
6.11.5 Implementation
C++ and Java implementations of this single-jointed arm controller are available online.[15][16]
[15] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/
StateSpaceArm/cpp/Robot.cpp
[16] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/
wpilibj/examples/statespacearm/Robot.java
A system is controllable if the controllability matrix

C = [B AB A²B ⋯ A^{n−1}B] (6.37)

has rank(C) = n, where rank is the number of linearly independent rows in a matrix and n is the number of states.

The controllability matrix in equation (6.37) being rank-deficient means the inputs cannot apply transforms along all axes in the state-space; the transformation the matrix represents is collapsed into a lower dimension.
The condition number of the controllability matrix C is defined as σmax(C)/σmin(C), where σmax is the maximum singular value and σmin is the minimum singular value.[17] As this number approaches infinity, one or more of the states becomes uncontrollable. This number can also be used to tell us which actuators are better than others for the given system; a lower condition number means that the actuators have more control authority.
Alternatively,

Wc = ∫₀^∞ e^{Aτ} BBᵀ e^{Aᵀτ} dτ (6.39)
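The rank check is straightforward in NumPy. A sketch with example matrices of our own:

import numpy as np

A = np.array([[0.0, 1.0], [0.0, -5.0]])
B = np.array([[0.0], [2.0]])
n = A.shape[0]

# Controllability matrix [B, AB, ..., A^(n-1) B]
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
print(np.linalg.matrix_rank(ctrb) == n)  # True -> controllable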
6.12.4 Stabilizability
Stabilizability is a weaker form of controllability. A system is considered stabilizable
if one of the following conditions is true:
1. All uncontrollable states can be stabilized
2. All unstable states are controllable
A system is observable if the observability matrix

O = [C; CA; CA²; ⋯; CA^{n−1}] (6.41)

has rank(O) = n, where rank is the number of linearly independent rows in a matrix and n is the number of states.

The observability matrix in equation (6.41) being rank-deficient means the outputs do not contain contributions from every state. That is, not all states are mapped to a linear combination in the output. Therefore, the outputs alone are insufficient to estimate all the states.
The condition number of the observability matrix O is defined as σmax(O)/σmin(O), where σmax is the maximum singular value and σmin is the minimum singular value.[17] As this number approaches infinity, one or more of the states becomes unobservable. This number can also be used to tell us which sensors are better than others for the given system; a lower condition number means the outputs produced by the sensors are better indicators of the system state.
Alternatively,

Wo = ∫₀^∞ e^{Aᵀτ} CᵀC e^{Aτ} dτ (6.43)
6.12.8 Detectability
Detectability is a weaker form of observability. A system is considered detectable if
one of the following conditions is true:
1. All unobservable states are stable
2. All unstable states are observable
Chaparral by Merril Apartments at UCSC
The complex plane discussed so far deals with continuous systems. In decades past,
plants and controllers were implemented using analog electronics, which are continuous
in nature. Nowadays, microprocessors can be used to achieve cheaper, less complex
controller designs. Discretization converts the continuous model we’ve worked with so
far from a differential equation like
ẋ = x − 3 (7.1)

to a difference equation like

xk+1 = xk + (xk − 3)T (7.2)

where xk refers to the value of x at the kth timestep. The difference equation is run with some update period denoted by T, by ∆T, or sometimes sloppily by dt.[1]
While higher order terms of a differential equation are derivatives of the state variable
(e.g., ẍ in relation to equation (7.1)), higher order terms of a difference equation are
delayed copies of the state variable (e.g., xk−1 with respect to xk in equation (7.2)).
[1] The discretization of equation (7.1) to equation (7.2) uses the forward Euler discretization method.
Continuous → Discrete
(0, 0) → (1, 0)
imaginary axis → edge of unit circle
(−∞, 0) → (0, 0)

Figure 7.1: Mapping of complex plane from continuous (left) to discrete (right)
Figure 7.2: Single poles in various locations in the discrete plane. Figure 7.3: Complex conjugate poles in various locations in the discrete plane.
Figure 7.4: The original signal is a 1.5 Hz sine wave, which means its Nyquist frequency
is 1.5 Hz. The signal is being sampled at 2 Hz, so the aliased signal is a 0.5 Hz sine
wave.
For example, the highest frequency humans can hear is 20 kHz, so the corresponding Nyquist frequency is 20 kHz and the minimum sampling frequency is 40 kHz. (44.1 kHz in particular was chosen for unrelated historical reasons.)
Frequencies above the Nyquist frequency are folded down across it. The higher fre-
quency and the folded down lower frequency are said to alias each other.[2] Figure 7.4
demonstrates aliasing.
The effect of these high-frequency aliases can be reduced with a low-pass filter (called
an anti-aliasing filter in this application).
The phase loss from sampling can make a controller that is stable in the continuous domain become unstable in the discrete domain since it can't react as quickly to output changes. Here are a few ways to combat this.
• Run the controller with a high sample rate.
• Design the controller in the analog domain with enough phase margin to compensate for any phase loss that occurs as part of discretization.
• Convert the plant to the digital domain and design the controller completely in the digital domain.
ẋ = Ac x + Bc u + w
y = Cc x + D c u + v
where w is the process noise, v is the measurement noise, and both are zero-mean
white noise sources with covariances of Qc and Rc respectively. w and v are defined
as normally distributed random variables.
w ∼ N (0, Qc )
v ∼ N (0, Rc )
Discretization is done using a zero-order hold. That is, the input is only updated at
discrete intervals and it’s held constant between samples.
R Watch the “Taylor series” video from 3Blue1Brown’s Essence of calculus series
for an explanation of how the Taylor series expansion works.
The definition for the matrix exponential and the approximations below all use the
Taylor series expansion. The Taylor series is a method of approximating a function
like e^t via the summation of weighted polynomial terms like t^k. e^t has the following Taylor series around t = 0.

e^t = Σ_{n=0}^{∞} t^n/n!

Truncating this series yields polynomial approximations of the matrix exponential; for example, the third-order truncation of e^{AT} is

Σ_{k=0}^{3} (AT)^k/k! = I + AT + (1/2)A²T² + (1/6)A³T³
[3] Functions for which their Taylor series expansion converges to and also equals it are called analytic
functions.
Table 7.2 compares discretization methods for the matrix case, and table 7.3 compares
their Taylor series expansions. These use a more complex formula which we won’t
present here.
Method           Conversion to Ad
Zero-order hold  e^{Ac T}
Bilinear         (I + (1/2)Ac T)(I − (1/2)Ac T)⁻¹
Backward Euler   (I − Ac T)⁻¹
Forward Euler    I + Ac T

Table 7.2: Discretization methods (matrix case). The zero-order hold discretization method is exact.
Each of them has different stability properties. The bilinear transform preserves the
(in)stability of the continuous time system.
Method           Taylor series expansion of Ad
Zero-order hold  I + Ac T + (1/2)Ac²T² + (1/6)Ac³T³ + …
Bilinear         I + Ac T + (1/2)Ac²T² + (1/4)Ac³T³ + …
Backward Euler   I + Ac T + Ac²T² + Ac³T³ + …
Forward Euler    I + Ac T

Table 7.3: Taylor series expansions of discretization methods (matrix case). The zero-order hold discretization method is exact.
R Watch the “How (and why) to raise e to the power of a matrix” video from
3Blue1Brown’s Essence of linear algebra series for a visual introduction to the
matrix exponential.
To understand why the matrix exponential is used in the discretization process, consider
the scalar differential equation ẋ = ax. The solution to this type of differential equation
uses an exponential.
ẋ = ax
dx/dt = ax(t)
dx = ax(t) dt
(1/x(t)) dx = a dt
∫₀ᵗ (1/x(t)) dx = ∫₀ᵗ a dt
ln(x(t)) |₀ᵗ = at |₀ᵗ
ln(x(t)) − ln(x(0)) = at − a·0
ln(x(t)) − ln(x₀) = at
ln(x(t)/x₀) = at
x(t)/x₀ = e^{at}
x(t) = e^{at} x₀
This solution generalizes via the matrix exponential to the set of differential equations ẋ = Ax + Bu we use to describe systems.[4]

x(t) = e^{At} x₀ + A⁻¹(e^{At} − I)Bu

where x₀ contains the initial conditions and u is the constant input from time 0 to t. If the initial state is the current system state, then we can describe the system's state over time as

xk+1 = e^{AT} xk + A⁻¹(e^{AT} − I)Buk

or more compactly,

xk+1 = Ad xk + Bd uk

where T is the time between samples xk and xk+1. Theorem 7.3.1 has more efficient ways to compute Ad and Bd.
7.3.3 Definition
The model can be discretized as follows
xk+1 = Ad xk + Bd uk + wk
yk = Cd xk + Dd uk + vk

with covariances

wk ∼ N(0, Qd)
vk ∼ N(0, Rd)

[4] See section D.1 for a complete derivation of the linear system zero-order hold.
Ad = e^{Ac T} (7.4)
Bd = ∫₀ᵀ e^{Ac τ} dτ · Bc = Ac⁻¹(Ad − I)Bc (7.5)
Cd = Cc (7.6)
Dd = Dc (7.7)
Qd = ∫₀ᵀ e^{Ac τ} Qc e^{Acᵀτ} dτ (7.8)
Rd = Rc/T (7.9)
where subscripts c and d denote matrices for the continuous or discrete systems
respectively, T is the sample period of the discrete system, and eAc T is the matrix
exponential of Ac T .
Ad and Bd can be computed in one step as

e^{[Ac, Bc; 0, 0] T} = [Ad, Bd; 0, I]

and Qd can be computed from

Φ = e^{[−Ac, Qc; 0, Acᵀ] T} = [Φ1,1, Φ1,2; 0, Φ2,2]

where Qd = Φ2,2ᵀ Φ1,2 [10].
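The combined matrix exponential above is easy to evaluate with SciPy. A sketch with example matrices of our own:

import numpy as np
from scipy.linalg import expm

A_c = np.array([[0.0, 1.0], [0.0, -5.0]])
B_c = np.array([[0.0], [2.0]])
T = 0.02
n, m = A_c.shape[0], B_c.shape[1]

# exp([[A, B], [0, 0]] T) = [[A_d, B_d], [0, I]]
M = np.block([[A_c, B_c], [np.zeros((m, n)), np.zeros((m, m))]]) * T
phi = expm(M)
A_d = phi[:n, :n]
B_d = phi[:n, n:]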
Now for the output equation. Substitute the control law into equation (7.11).
yk = Cxk + D(K(rk − xk ))
yk = Cxk + DKrk − DKxk
yk = (C − DK)xk + DKrk (7.13)
Instead of commanding the system to a state using the vector uk directly, we can now
specify a vector of desired states through rk and the controller will choose values of
uk for us over time that drive the system toward the reference.
The eigenvalues of A − BK are the poles of the closed-loop system. Therefore, the rate of convergence and stability of the closed-loop system can be changed by moving the poles via the eigenvalues of A − BK. A and B are inherent to the system, but K can be chosen arbitrarily by the controller designer. For equation (7.12) to reach steady-state, the eigenvalues of A − BK must be within the unit circle. There will be steady-state error if Ark ≠ rk.
Since all our applications will be discrete systems, we’ll place poles in the discrete
domain (the z-plane). The s-plane’s LHP maps to the inside of a unit circle (see figure
7.6).
Figure 7.6: Mapping of complex plane from continuous (left) to discrete (right)
Pole placement should only be used if you know what you’re doing. It’s much easier to
let LQR place the poles for you, which we’ll discuss next.
ẋ = ax + bu

where a is a negative constant representing the back-EMF of the motor, x is the angular velocity, b is a positive constant that maps the input voltage to some change in angular velocity (angular acceleration), u is the voltage applied to the motor, and ẋ is the angular acceleration. Discretized, this equation would look like

xk+1 = ad xk + bd uk
If the angular velocity starts from zero and we apply a positive voltage, we’d see the
motor spin up to some constant speed following an exponential decay, then stay at that
speed. If we throw in the control law uk = kp (rk − xk ), we can make the system
converge to a desired state rk through proportional feedback. In what manner can we
pick the constant kp that balances getting to the target angular velocity quickly with
getting there efficiently (minimal oscillations or excessive voltage)?
We can solve this problem with something called the linear-quadratic regulator. We’ll
define the following cost function that includes the states and inputs:
J = Σ_{k=0}^{∞} (Q(rk − xk)² + Ruk²)
We want to minimize this while obeying the constraint that the system follow our fly-
wheel dynamics xk+1 = ad xk + bd uk .
The cost is the sum of the squares of the error and the input for all time. If the controller
gain kp we pick in the control law uk = kp (rk − xk ) is stable, the error rk − xk and
the input uk will both go to zero and give us a finite cost. Q and R let us decide
how much the error and input contribute to the cost (we will require that Q ≥ 0 and
R > 0 for reasons that will be clear shortly[5] ). Penalizing error more will make the
controller more aggressive, while penalizing the input more will make the controller
less aggressive. We want to pick a kp that minimizes the cost.
There’s a common trick for finding the value of a variable that minimizes a function of
that variable. We’ll take the derivative (the slope) of the cost function with respect to
the input uk , set the derivative to zero, then solve for uk . When the slope is zero, the
function is at a minimum or maximum. Now, the cost function we picked is quadratic.
All the terms are strictly positive on account of the squared variables and nonnegative
weights, so our cost is strictly positive and the quadratic function is concave up. The
uk we found is therefore a minimum.
The actual process of solving for uk is mathematically intensive and outside the scope
of this explanation (appendix B references a derivation for those curious). The rest of
this section will describe the more general form of the linear-quadratic regulator and
how to use it.
[5] If we let Q be zero, the error doesn't contribute to the cost, so the optimal solution is to not move. This minimizes the sum of the inputs over time. If we let R be zero, the input doesn't contribute to the cost, so infinite inputs are allowed as they minimize the sum of the errors over time. This isn't useful, so we require that the input be penalized with a nonzero R.
effort costs. This method of controller design uses a quadratic function for the cost-
to-go defined as the sum of the error and control effort over time for the linear system
xk+1 = Axk + Buk .
J = Σ_{k=0}^{∞} (xkᵀQxk + ukᵀRuk)
k=0
where J represents a trade-off between state excursion and control effort with the
weighting factors Q and R. LQR is a control law u that minimizes the cost func-
tional. Figure 7.7 shows the optimal cost-to-go for an elevator model. Pole placement,
on the other hand, will have a cost-to-go above this for an arbitrary state vector (in this
case, an arbitrary position-velocity pair).
The cost-to-go looks effectively constant along the velocity axis because the velocity is contributing much less to the cost-to-go than position.[6] In other words, it's much more expensive to correct for a position error than a velocity error.

[6] While it may not look like it at this scale, the elevator's cost-to-go along both state axes is strictly increasing away from the origin (convex upward). This means there's a global minimum at the origin, and the system is globally stable; no matter what state you start from, you're guaranteed to reach the origin with this controller.
If the system is controllable, the optimal control policy u∗k that drives all the states
to zero is −Kxk . To converge to nonzero states, a reference vector rk can be added
to the state xk .
uk = K(rk − xk ) (7.15)
This means that optimal control can be achieved with simply a set of proportional gains
on all the states. To use the control law, we need knowledge of the full state of the
system. That means we either have to measure all our states directly or estimate those
we do not measure.
See appendix B for how K is calculated. If the result is finite, the controller is guaran-
teed to be stable and robust with a phase margin of 60 degrees [20].
R LQR design’s Q and R matrices don’t need discretization, but the K calculated
for continuous time and discrete time systems will be different. The discrete
time gains approach the continuous time gains as the sample period tends to
zero.
With Bryson's rule, the diagonals of the Q and R matrices are chosen based on the maximum acceptable value for each state and actuator. The nondiagonal elements are zero. The balance between state excursion and control effort penalty (Q and R respectively) can be adjusted using the weighting factor ρ.

Q = diag(ρ/xmax²)    R = diag(1/umax²)
where xmax is the vector of acceptable state tolerances, umax is the vector of accept-
able input tolerances, and ρ is a tuning constant. Small values of ρ penalize control
effort while large values of ρ penalize state excursions. Large values would be chosen
in applications like fighter jets where performance is necessary. Spacecrafts would use
small values to conserve their limited fuel supply.
Figure 7.8 shows the response using various discrete pole placements and using LQR
with the following cost matrices.
Q = \begin{bmatrix} \frac{1}{20^2} & 0 \\ 0 & 0 \end{bmatrix} \qquad R = \begin{bmatrix} \frac{1}{12^2} \end{bmatrix}
With Bryson’s rule, this means an angular velocity tolerance of 20 rad/s, an infinite
current tolerance (in other words, we don’t care what the current does), and a voltage
tolerance of 12 V.
Notice with pole placement that as the current pole moves toward the origin, the control
effort becomes more aggressive.
Figure 7.8: Second-order CIM motor response with pole placement and LQR
7.8 Feedforward
There are two types of feedforwards: model-based feedforward and feedforward for
unmodeled dynamics. The first solves a mathematical model of the system for the
inputs required to meet desired velocities and accelerations. The second compensates
for unmodeled forces or behaviors directly so the feedback controller doesn’t have to.
Both types can facilitate simpler feedback controllers; we’ll cover examples of each.
[Figure: plant inversion feedforward block diagram with reference r(t), input u(t), and output y(t)]

Plant inversion has two caveats:
1. Don’t invert an unstable plant. If the expected plant doesn’t match the real plant
exactly, the plant inversion will still result in an unstable system. Stabilize the
plant first with feedback, then inject an inversion.
2. Don’t invert a nonminimum phase system. The advice for pole-zero cancellation
in subsection E.2.2 applies here.
Setup

Let's start with the equation for the reference dynamics

r_{k+1} = A r_k + B u_k

where u_k is the feedforward input. Note that this feedforward equation does not and
should not take into account any feedback terms. We want to find the optimal u_k such
that we minimize the tracking error between r_{k+1} and A r_k + B u_k.
To solve for u_k, we need to take the inverse of the nonsquare matrix B. This isn't
possible, but we can find the pseudoinverse given some constraints on the state tracking
error and control effort. To find the optimal solution for these sorts of trade-offs, one
can define a cost function and attempt to minimize it. To do this, we'll first rearrange
the expression so it equals zero.

0 = B u_k − (r_{k+1} − A r_k)

This expression will be the state tracking cost we use in the following cost function as
an H_2 norm.

J = (B u_k − (r_{k+1} − A r_k))^T (B u_k − (r_{k+1} − A r_k))
Minimization
Given theorem 5.12.1, find the minimum of J by taking the partial derivative with
respect to uk and setting the result to 0.
\frac{\partial J}{\partial u_k} = 2 B^T (B u_k − (r_{k+1} − A r_k))
0 = 2 B^T (B u_k − (r_{k+1} − A r_k))
0 = 2 B^T B u_k − 2 B^T (r_{k+1} − A r_k)
2 B^T B u_k = 2 B^T (r_{k+1} − A r_k)
B^T B u_k = B^T (r_{k+1} − A r_k)
u_k = (B^T B)^{-1} B^T (r_{k+1} − A r_k)

(B^T B)^{-1} B^T is the Moore-Penrose pseudoinverse of B, denoted B^+.
Theorem 7.8.1 — Linear plant inversion. Given the discrete model x_{k+1} = A x_k + B u_k, the plant inversion feedforward is

u_k = B^+ (r_{k+1} − A r_k)

where B^+ is the Moore-Penrose pseudoinverse of B.
Discussion
Linear plant inversion in theorem 7.8.1 compensates for reference dynamics that don’t
follow how the model inherently behaves. If they do follow the model, the feedforward
has nothing to do as the model already behaves in the desired manner. When this occurs,
rk+1 − Ark will return a zero vector.
For example, a constant reference requires a feedforward that opposes system dynamics
that would change the state over time. If the system has no dynamics, then A = I and
thus
uk = B+ (rk+1 − Irk )
uk = B+ (rk+1 − rk )
For a constant reference, rk+1 = rk .
uk = B+ (rk − rk )
uk = B+ (0)
uk = 0
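Here's a minimal Python sketch of theorem 7.8.1. The function name is our own, and numpy's pinv computes the Moore-Penrose pseudoinverse B⁺.

import numpy as np


def plant_inversion_ff(A, B, r_k, r_k1):
    """Plant inversion feedforward u_k = B⁺(r_{k+1} - A r_k).

    Keyword arguments:
    A -- numpy.array(states x states), discrete system matrix.
    B -- numpy.array(states x inputs), discrete input matrix.
    r_k -- numpy.array(states x 1), reference at the current timestep.
    r_k1 -- numpy.array(states x 1), reference at the next timestep.
    """
    return np.linalg.pinv(B) @ (r_k1 - A @ r_k)

For a constant reference on a system with no dynamics (A = I), this returns the zero vector, as shown above.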
The second type of feedforward compensates for dynamics the model doesn't include, like
gravity acting on an elevator; a constant voltage can be used to compensate for this. The
feedforward takes the form of a voltage constant because voltage is proportional to force
applied, and the force is acting in only one direction at all times.

u_k = V_{app} (7.17)

where V_app is a constant. Another feedforward holds a single-jointed arm steady in the
presence of gravity. It has the following form.

u_k = V_{app} \cos θ
where Vapp is the voltage required to keep the single-jointed arm level with the ground,
and θ is the angle of the arm relative to the ground. Therefore, the force applied is
greatest when the arm is parallel with the ground and zero when the arm is perpendicular
to the ground (at that point, the joint supports all the weight).
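As a sketch of these two gravity feedforwards in Python (the function names and parameter spellings are our own, not from this chapter):

import math


def elevator_gravity_ff(V_app):
    # Constant voltage opposing gravity, per equation (7.17)
    return V_app


def arm_gravity_ff(V_app, theta):
    # Voltage opposing gravity torque on a single-jointed arm; maximal when
    # the arm is level (theta = 0) and zero when it's vertical
    return V_app * math.cos(theta)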
Note that the elevator model could be augmented easily enough to include gravity and
still be linear, but this wouldn’t work for the single-jointed arm since a trigonometric
function is required to model the gravitational force in the arm’s rotating reference
frame.[7]
[7]While the applied torque of the motor is constant throughout the arm’s range of motion, the torque
caused by gravity in the opposite direction varies according to the arm’s angle.
The system is stable if the eigenvalues of A − BK are within the unit circle. Now add
the plant inversion feedforward controller B^+(r_{k+1} − A r_k) to u_k.

x_{k+1} = A x_k + B(K(r_k − x_k) + B^+(r_{k+1} − A r_k))
x_{k+1} = (A − BK) x_k + BK r_k + BB^+(r_{k+1} − A r_k)

The multiplicative term on x_k is still A − BK, so the feedforward didn't affect stability.
It still affects the system response and steady-state error though.
7.9 Numerical integration methods

The family of explicit Runge-Kutta integration methods is described by a Butcher tableau
of the following form.

\begin{array}{c|cccc}
0 & & & \\
c_2 & a_{2,1} & & \\
\vdots & \vdots & \ddots & \\
c_s & a_{s,1} & \cdots & a_{s,s-1} \\
\hline
 & b_1 & \cdots & \cdots & b_s
\end{array}

where s is the number of stages in the method, the matrix [a_{ij}] is the Runge-Kutta
matrix, b_1, …, b_s are the weights, and c_1, …, c_s are the nodes. The top-left quadrant
contains the sums of the rows in the top-right quadrant. Each column in the right half
corresponds to a k coefficient from k_1 to k_s.
The family of solutions to ẋ = f(t, x) is given by

k_1 = f(t_k, x_k)
k_2 = f(t_k + c_2 h, x_k + h(a_{2,1} k_1))
⋮
k_s = f(t_k + c_s h, x_k + h(a_{s,1} k_1 + … + a_{s,s-1} k_{s-1}))

x_{k+1} = x_k + h \sum_{i=1}^{s} b_i k_i
The simplest Runge-Kutta method is forward Euler, whose Butcher tableau and equations are

\begin{array}{c|c} 0 & \\ \hline & 1 \end{array}

k_1 = f(t + 0h, x_k)
x_{k+1} = x_k + h(1 k_1)
In FRC, our differential equations are of the form ẋ = f (x, u) where u is held constant
between timesteps. Since it’s time-invariant, we can ignore the time argument of the
integration method. This gives theorem 7.9.1.
Theorem 7.9.1 — Forward Euler integration.

x_{k+1} = x_k + h f(x_k, u_k)
\qquad
\begin{array}{c|c} 0 & \\ \hline & 1 \end{array}

The most common Runge-Kutta method is fourth-order Runge-Kutta (RK4). Its Butcher
tableau and equations are

\begin{array}{c|cccc}
0 & & & & \\
\frac{1}{2} & \frac{1}{2} & & & \\
\frac{1}{2} & 0 & \frac{1}{2} & & \\
1 & 0 & 0 & 1 & \\
\hline
 & \frac{1}{6} & \frac{1}{3} & \frac{1}{3} & \frac{1}{6}
\end{array}

k_1 = f(t + 0h, x_k)
k_2 = f(t + \frac{1}{2}h, x_k + h(\frac{1}{2} k_1))
k_3 = f(t + \frac{1}{2}h, x_k + h(0 k_1 + \frac{1}{2} k_2))
k_4 = f(t + 1h, x_k + h(0 k_1 + 0 k_2 + 1 k_3))

x_{k+1} = x_k + h(\frac{1}{6} k_1 + \frac{1}{3} k_2 + \frac{1}{3} k_3 + \frac{1}{6} k_4)
The zero coefficients are usually omitted.

k_2 = f(t + \frac{1}{2}h, x_k + \frac{1}{2} h k_1)
k_3 = f(t + \frac{1}{2}h, x_k + \frac{1}{2} h k_2)
k_4 = f(t + h, x_k + h k_3)

x_{k+1} = x_k + h(\frac{1}{6} k_1 + \frac{1}{3} k_2 + \frac{1}{3} k_3 + \frac{1}{6} k_4)
\frac{1}{6} is usually factored out of the last equation to reduce the number of floating
point operations.

x_{k+1} = x_k + \frac{h}{6}(k_1 + 2 k_2 + 2 k_3 + k_4)
In FRC, our differential equations are of the form ẋ = f (x, u) where u is held constant
between timesteps. Since it’s time-invariant, we can ignore the time argument of the
integration method. This gives theorem 7.9.2.
Theorem 7.9.2 — Fourth-order Runge-Kutta integration.

\begin{array}{c|cccc}
0 & & & & \\
\frac{1}{2} & \frac{1}{2} & & & \\
\frac{1}{2} & 0 & \frac{1}{2} & & \\
1 & 0 & 0 & 1 & \\
\hline
 & \frac{1}{6} & \frac{1}{3} & \frac{1}{3} & \frac{1}{6}
\end{array}

k_1 = f(x_k, u_k)
k_2 = f(x_k + \frac{1}{2} h k_1, u_k)
k_3 = f(x_k + \frac{1}{2} h k_2, u_k)
k_4 = f(x_k + h k_3, u_k)

x_{k+1} = x_k + \frac{h}{6}(k_1 + 2 k_2 + 2 k_3 + k_4)
/**
 * Performs 4th order Runge-Kutta integration of dx/dt = f(x, u) for dt.
 *
 * @param f  The function to integrate. It must take two arguments x and u.
 * @param x  The initial value of x.
 * @param u  The value u held constant over the integration period.
 * @param dt The time over which to integrate.
 */
template <typename F, typename T, typename U>
T RK4(F&& f, T x, U u, units::second_t dt) {
  const double h = dt.value();

  T k1 = f(x, u);
  T k2 = f(x + h * 0.5 * k1, u);
  T k3 = f(x + h * 0.5 * k2, u);
  T k4 = f(x + h * k3, u);

  return x + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
}
Other methods of Runge-Kutta integration exist with various properties [1], but the
one presented here is popular for its high accuracy relative to the amount of floating
point operations (FLOPs) it requires.
The Dormand-Prince method used by the RKDP function below has the following Butcher
tableau.

\begin{array}{c|ccccccc}
0 & & & & & & & \\
\frac{1}{5} & \frac{1}{5} & & & & & & \\
\frac{3}{10} & \frac{3}{40} & \frac{9}{40} & & & & & \\
\frac{4}{5} & \frac{44}{45} & -\frac{56}{15} & \frac{32}{9} & & & & \\
\frac{8}{9} & \frac{19372}{6561} & -\frac{25360}{2187} & \frac{64448}{6561} & -\frac{212}{729} & & & \\
1 & \frac{9017}{3168} & -\frac{355}{33} & \frac{46732}{5247} & \frac{49}{176} & -\frac{5103}{18656} & & \\
1 & \frac{35}{384} & 0 & \frac{500}{1113} & \frac{125}{192} & -\frac{2187}{6784} & \frac{11}{84} & \\
\hline
 & \frac{35}{384} & 0 & \frac{500}{1113} & \frac{125}{192} & -\frac{2187}{6784} & \frac{11}{84} & 0 \\
 & \frac{5179}{57600} & 0 & \frac{7571}{16695} & \frac{393}{640} & -\frac{92097}{339200} & \frac{187}{2100} & \frac{1}{40}
\end{array}
The first row of coefficients below the table divider gives the fifth-order accurate
solution. The second row gives an alternative solution which, when subtracted from
the first solution, gives the error estimate.
#include <algorithm>
#include <array>
#include <cmath>

#include <Eigen/Core>
#include <units/time.h>

/**
 * Performs adaptive Dormand-Prince integration of dx/dt = f(x, u) for dt.
 *
 * @param f        The function to integrate. It must take two arguments x and
 *                 u.
 * @param x        The initial value of x.
 * @param u        The value u held constant over the integration period.
 * @param dt       The time over which to integrate.
 * @param maxError The maximum acceptable truncation error. Usually a small
 *                 number like 1e-6.
 */
template <typename F, typename T, typename U>
T RKDP(F&& f, T x, U u, units::second_t dt, double maxError = 1e-6) {
  // See https://en.wikipedia.org/wiki/Dormand%E2%80%93Prince_method for the
  // Butcher tableau the following arrays came from.

  constexpr int kDim = 7;

  // clang-format off
  constexpr double A[kDim - 1][kDim - 1]{
      {      1.0 / 5.0},
      {      3.0 / 40.0,        9.0 / 40.0},
      {     44.0 / 45.0,      -56.0 / 15.0,       32.0 / 9.0},
      {19372.0 / 6561.0, -25360.0 / 2187.0, 64448.0 / 6561.0, -212.0 / 729.0},
      { 9017.0 / 3168.0,     -355.0 / 33.0, 46732.0 / 5247.0,   49.0 / 176.0, -5103.0 / 18656.0},
      {    35.0 / 384.0,               0.0,   500.0 / 1113.0,  125.0 / 192.0, -2187.0 / 6784.0, 11.0 / 84.0}};
  constexpr std::array<double, kDim> b1{
      35.0 / 384.0, 0.0, 500.0 / 1113.0, 125.0 / 192.0, -2187.0 / 6784.0,
      11.0 / 84.0, 0.0};
  constexpr std::array<double, kDim> b2{
      5179.0 / 57600.0, 0.0, 7571.0 / 16695.0, 393.0 / 640.0,
      -92097.0 / 339200.0, 187.0 / 2100.0, 1.0 / 40.0};
  // clang-format on

  T newX;
  double truncationError;

  double dtElapsed = 0.0;
  double h = dt.value();

  // Loop until we've integrated the whole timestep
  while (dtElapsed < dt.value()) {
    double hUsed;
    do {
      // Only allow us to advance up to the dt remaining
      h = std::min(h, dt.value() - dtElapsed);
      hUsed = h;

      // clang-format off
      T k1 = f(x, u);
      T k2 = f(x + h * (A[0][0] * k1), u);
      T k3 = f(x + h * (A[1][0] * k1 + A[1][1] * k2), u);
      T k4 = f(x + h * (A[2][0] * k1 + A[2][1] * k2 + A[2][2] * k3), u);
      T k5 = f(x + h * (A[3][0] * k1 + A[3][1] * k2 + A[3][2] * k3 + A[3][3]
               * k4), u);
      T k6 = f(x + h * (A[4][0] * k1 + A[4][1] * k2 + A[4][2] * k3 + A[4][3]
               * k4 + A[4][4] * k5), u);
      // clang-format on

      // Since the final row of A and the array b1 have the same coefficients
      // and k7 has no effect on newX, we can reuse the calculation.
      newX = x + h * (A[5][0] * k1 + A[5][1] * k2 + A[5][2] * k3 +
                      A[5][3] * k4 + A[5][4] * k5 + A[5][5] * k6);
      T k7 = f(newX, u);

      // The error estimate is the difference between the fifth-order solution
      // (weights b1) and the alternative solution (weights b2)
      truncationError = (h * ((b1[0] - b2[0]) * k1 + (b1[1] - b2[1]) * k2 +
                              (b1[2] - b2[2]) * k3 + (b1[3] - b2[3]) * k4 +
                              (b1[4] - b2[4]) * k5 + (b1[5] - b2[5]) * k6 +
                              (b1[6] - b2[6]) * k7))
                            .norm();

      // Shrink the step size if the error was too large (and retry), or grow
      // it for the next step if the error was small
      h *= 0.9 * std::pow(maxError / truncationError, 1.0 / 5.0);
    } while (truncationError > maxError);

    dtElapsed += hUsed;
    x = newX;
  }

  return x;
}
8. Nonlinear control
While many tools exist for designing controllers for linear systems, all systems in real-
ity are inherently nonlinear. We’ll briefly mention some considerations for nonlinear
systems.
8.1 Introduction
Recall from linear system theory that we defined systems as having the following form:
ẋ = Ax + Bu + w
y = Cx + Du + v
In this equation, A and B are constant matrices, which means they are both time-
invariant and linear (all transformations on the system state are linear ones, and those
transformations remain the same for all time). In nonlinear and time-variant systems,
the state evolution and output are defined by arbitrary functions of the current states
and inputs.
ẋ = f (x, u, w)
y = h(x, u, v)
Nonlinear functions come up regularly when attempting to control the pose of a vehicle
in the global coordinate frame instead of the vehicle’s rotating local coordinate frame.
104 Chapter 8. Nonlinear control
Converting from one to the other requires applying a rotation matrix, which consists of
sine and cosine operations. These functions are nonlinear.
8.2 Linearization
One way to control nonlinear systems is to linearize the model around a reference point.
Then, all the powerful tools that exist for linear controls can be applied. This is done by
taking the Jacobians of f and h with respect to the state and input vectors. See section
5.12 for more on Jacobians.
A = \frac{\partial f(x, u, w)}{\partial x} \qquad B = \frac{\partial f(x, u, w)}{\partial u}

C = \frac{\partial h(x, u, v)}{\partial x} \qquad D = \frac{\partial h(x, u, v)}{\partial u}
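When symbolic differentiation is inconvenient, the Jacobians can be approximated numerically. Here's a minimal Python sketch using central finite differences; the function name and the epsilon default are our own choices, not something this section prescribes.

import numpy as np


def numerical_jacobian_x(f, x, u, eps=1e-5):
    """Central-difference approximation of the Jacobian of f(x, u) with
    respect to x. f must map 1-D numpy arrays (x, u) to a 1-D numpy array.
    """
    n = len(x)
    J = np.zeros((len(f(x, u)), n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        # Perturb the i-th state in both directions and difference the results
        J[:, i] = (f(x + dx, u) - f(x - dx, u)) / (2.0 * eps)
    return J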
[1] Equilibrium points are points where ẋ = 0. At these points, the system is in steady-state.
8.3 Lyapunov stability
Lyapunov stability means that the system trajectory can be kept arbitrarily close to
the origin by starting sufficiently close to it. Lyapunov’s direct method uses a function
consisting of the energy in a system or derivatives of the system’s state to prove stability
around an equilibrium point. This is done by showing that the function, and thus its
inputs, decay to some ground state. More rigorously, the value function V (x) must be
positive definite and equal zero at the equilibrium point
V(x) > 0
V(0) = 0

\dot{V}(x) = \frac{dV}{dt} = \frac{\partial V}{\partial x} \frac{dx}{dt} ≤ 0
More than one Lyapunov function can prove stability, and if one function doesn’t prove
it, another candidate should be tried. For this reason, we refer to these functions as
Lyapunov candidate functions.
8.4 Affine systems

An affine system is a linear system with a constant offset in the dynamics. If (x_0, u_0) is
an equilibrium point, f(x_0, u_0) = 0, the resulting model is linear, and LQR works as
usual. If (x_0, u_0) is, say, the current operating point rather than an equilibrium point,
the easiest way to correctly apply LQR is
1. Find a control input u_0 that makes (x_0, u_0) an equilibrium point.
2. Obtain an LQR for the linearized system.
3. Add u_0 to the LQR's control input.
A control-affine system is of the form ẋ = f(x) + g(x)u. Since it has separable control
inputs, u_0 can be derived via plant inversion as follows.

ẋ = f(x_0) + g(x_0) u_0
0 = f(x_0) + g(x_0) u_0
g(x_0) u_0 = −f(x_0)
u_0 = −g^{-1}(x_0) f(x_0) (8.1)
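Here's a minimal Python sketch of equation (8.1), using a pseudoinverse in case g(x₀) isn't square; the function name is our own.

import numpy as np


def equilibrium_input(f, g, x0):
    # Solves g(x0) u0 = -f(x0) for u0, per equation (8.1)
    return -np.linalg.pinv(g(x0)) @ f(x0)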
8.5 Pendulum
8.5.1 State-space model
Below is the model for a pendulum.

\ddot{θ} = −\frac{g}{l} \sin θ

where θ is the angle of the pendulum and l is the length of the pendulum.
Since state-space representation requires that only single derivatives be used, they should
be broken up as separate states. We’ll reassign θ̇ to be ω so the derivatives are easier
to keep straight for state-space representation.
\dot{ω} = −\frac{g}{l} \sin θ

Now separate the states.

\dot{θ} = ω
\dot{ω} = −\frac{g}{l} \sin θ

This makes our state vector \begin{bmatrix} θ & ω \end{bmatrix}^T and our nonlinear model the following.

f(x, u) = \begin{bmatrix} ω \\ −\frac{g}{l} \sin θ \end{bmatrix}
Linearization around θ = 0
To apply our tools for linear control theory, the model must be a linear combination
of the states and inputs (addition and multiplication by constants). Since this model is
nonlinear on account of the sine function, we should linearize it.
Linearization finds a tangent line to the nonlinear dynamics at a desired point in the state-
space. The Taylor series is a way to approximate arbitrary functions with polynomials,
so we can use it for linearization.
The Taylor series expansion of sin θ around θ = 0 is θ − \frac{1}{6}θ^3 + \frac{1}{120}θ^5 − ⋯.
We'll take just the first-order term θ to obtain a linear function.
\dot{θ} = ω
\dot{ω} = −\frac{g}{l} θ

Now write the model in state-space representation. We'll write out the system of equations
with the zeroed variables included to assist with this.

\dot{θ} = 0θ + 1ω
\dot{ω} = −\frac{g}{l} θ + 0ω
Factor out θ and ω into a column vector.

\begin{bmatrix} \dot{θ} \\ \dot{ω} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ −\frac{g}{l} & 0 \end{bmatrix} \begin{bmatrix} θ \\ ω \end{bmatrix} (8.4)
If we want to linearize around an arbitrary point, we can take the Jacobian with respect
to x.
\frac{\partial f(x, u)}{\partial x} = \begin{bmatrix} 0 & 1 \\ −\frac{g}{l} \cos θ & 0 \end{bmatrix}
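As a quick Python sketch of this Jacobian evaluated at an arbitrary angle (the function name and default constants are our own; g is in m/s² and the length is in meters):

import math
import numpy as np


def pendulum_A(theta, g=9.81, length=1.0):
    # Jacobian of the pendulum dynamics with respect to the state [theta, omega]
    return np.array([[0.0, 1.0], [-(g / length) * math.cos(theta), 0.0]])

Evaluating pendulum_A(0.0) reproduces the linearization in equation (8.4).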
For full state feedback, knowledge of all states is required. If not all states are measured
directly, an estimator can be used to supplement them.
We may only be measuring θ in the pendulum example, not θ̇, so we’ll need to estimate
the latter. The C matrix the observer would use in this case is
C = \begin{bmatrix} 1 & 0 \end{bmatrix}

y = Cx
y = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} θ \\ ω \end{bmatrix}
y = 1θ + 0ω
y = θ
8.6 Holonomic drivetrains

A holonomic drivetrain's chassis-frame kinematics can be written in state-space notation as

\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{θ} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ θ \end{bmatrix} + \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_{x,chassis} \\ v_{y,chassis} \\ ω_{chassis} \end{bmatrix}

where v_{x,chassis} is the velocity ahead in the chassis frame, v_{y,chassis} is the velocity
to the left in the chassis frame, and ω_{chassis} is the angular velocity in the chassis frame.
8.6.2 Control
This control-affine model is fully actuated but nonlinear in the chassis frame. However,
we can apply linear control theory to the error dynamics in the global frame instead.
Note how equation (8.5)’s state vector contains the global pose and its input vector
contains the global linear and angular velocities.
\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{θ} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ θ \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_{x,global} \\ v_{y,global} \\ ω_{global} \end{bmatrix} (8.5)

where \begin{bmatrix} v_{x,global} \\ v_{y,global} \\ ω_{global} \end{bmatrix} = \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_{x,chassis} \\ v_{y,chassis} \\ ω_{chassis} \end{bmatrix} (8.6)
We can control the model in equation (8.5) with an LQR, which will have three inde-
pendent proportional controllers. Then, we can convert the global velocity commands
to chassis velocity commands with equation (8.6) and convert the chassis velocity com-
mands to wheel speed commands with inverse kinematics. LQRs on each wheel can
track the wheel speed commands.
Note that the full control law is nonlinear because the kinematics contain a rotation
matrix for transforming from the chassis frame to the global frame. However, the
nonlinear part has been abstracted away from the tunable linear control laws.
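Here's a minimal Python sketch of the global-to-chassis conversion step (the inverse of equation (8.6)); the function name is our own.

import math
import numpy as np


def global_to_chassis(v_global, theta):
    """Rotates a [vx, vy, omega] global velocity command into the chassis frame."""
    c, s = math.cos(theta), math.sin(theta)
    # The rotation matrix is orthogonal, so its inverse is its transpose
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]) @ v_global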
8.7 Differential drive
Factor out vl and vr into a column vector and Vl and Vr into a column vector.
\begin{bmatrix} \dot{v}_l \\ \dot{v}_r \end{bmatrix} =
\begin{bmatrix}
\left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_3 \\
\left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_3
\end{bmatrix}
\begin{bmatrix} v_l \\ v_r \end{bmatrix} +
\begin{bmatrix}
\left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_4 \\
\left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_4
\end{bmatrix}
\begin{bmatrix} V_l \\ V_r \end{bmatrix}
ẋ = Ax + Bu

x = \begin{bmatrix} v_l \\ v_r \end{bmatrix} = \begin{bmatrix} \text{left velocity} \\ \text{right velocity} \end{bmatrix} \qquad u = \begin{bmatrix} V_l \\ V_r \end{bmatrix} = \begin{bmatrix} \text{left voltage} \\ \text{right voltage} \end{bmatrix}

A = \begin{bmatrix} \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_3 \\ \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_3 \end{bmatrix} (8.7)

B = \begin{bmatrix} \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_4 \\ \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_4 \end{bmatrix} (8.8)

where C_1 = −\frac{G_l^2 K_t}{K_v R r^2}, C_2 = \frac{G_l K_t}{R r}, C_3 = −\frac{G_r^2 K_t}{K_v R r^2}, and C_4 = \frac{G_r K_t}{R r}.
Simulation
Python Control will be used to discretize the model and simulate it. One of the frccon-
trol examples[2] creates and tests a controller for it. Figure 8.3 shows the closed-loop
system response.
Given the high inertia in drivetrains, it’s better to drive the reference with a motion
profile instead of a step input for reproducibility.
[2] https://github.com/calcmogul/frccontrol/blob/main/examples/differential_drive.py
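As a sketch of that workflow with made-up constants (the numbers below are hypothetical; the real A and B come from equations (8.7) and (8.8) with the constants in section 12.6):

import control as ct
import numpy as np

# Hypothetical two-state velocity model; not a real drivetrain's values
A = np.array([[-5.0, 1.0], [1.0, -5.0]])
B = np.array([[2.0, 0.5], [0.5, 2.0]])
C = np.eye(2)
D = np.zeros((2, 2))

# Discretize with a 20 ms sample period
sys = ct.ss(A, B, C, D)
discrete_sys = ct.sample_system(sys, 0.02)

# Simulate the discrete model's response to a 1 V step on both inputs
T = np.arange(0.0, 2.0, 0.02)
response = ct.forced_response(discrete_sys, T=T, U=np.ones((2, len(T))))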
ẋ = Ax + Bu

x = \begin{bmatrix} θ \\ v_l \\ v_r \end{bmatrix} = \begin{bmatrix} \text{heading} \\ \text{left velocity} \\ \text{right velocity} \end{bmatrix} \qquad u = \begin{bmatrix} V_l \\ V_r \end{bmatrix} = \begin{bmatrix} \text{left voltage} \\ \text{right voltage} \end{bmatrix}

A = \begin{bmatrix} 0 & −\frac{1}{2r_b} & \frac{1}{2r_b} \\ 0 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_3 \\ 0 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_3 \end{bmatrix} (8.9)

B = \begin{bmatrix} 0 & 0 \\ \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_4 \\ \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_4 \end{bmatrix} (8.10)

where C_1 = −\frac{G_l^2 K_t}{K_v R r^2}, C_2 = \frac{G_l K_t}{R r}, C_3 = −\frac{G_r^2 K_t}{K_v R r^2}, and C_4 = \frac{G_r K_t}{R r}. The constants C_1 through C_4 are from the derivation in section 12.6.
\dot{v}_r = \left(\frac{1}{m} − \frac{r_b^2}{J}\right)(C_1 v_l + C_2 V_l) + \left(\frac{1}{m} + \frac{r_b^2}{J}\right)(C_3 v_r + C_4 V_r) (8.11)

Therefore,

ẋ = Ax + Bu

x = \begin{bmatrix} x \\ y \\ θ \\ v_l \\ v_r \end{bmatrix} = \begin{bmatrix} \text{x position} \\ \text{y position} \\ \text{heading} \\ \text{left velocity} \\ \text{right velocity} \end{bmatrix} \qquad u = \begin{bmatrix} V_l \\ V_r \end{bmatrix} = \begin{bmatrix} \text{left voltage} \\ \text{right voltage} \end{bmatrix}

A = \begin{bmatrix}
0 & 0 & −vs & \frac{1}{2}c & \frac{1}{2}c \\
0 & 0 & vc & \frac{1}{2}s & \frac{1}{2}s \\
0 & 0 & 0 & −\frac{1}{2r_b} & \frac{1}{2r_b} \\
0 & 0 & 0 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_3 \\
0 & 0 & 0 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_3
\end{bmatrix} (8.12)

B = \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 0 \\
\left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_4 \\
\left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_4
\end{bmatrix} (8.13)

where v = \frac{v_l + v_r}{2}, c = \cos θ, s = \sin θ, C_1 = −\frac{G_l^2 K_t}{K_v R r^2}, C_2 = \frac{G_l K_t}{R r}, C_3 = −\frac{G_r^2 K_t}{K_v R r^2}, and C_4 = \frac{G_r K_t}{R r}. The constants C_1 through C_4 are from the derivation in section 12.6.
We can also use this in an extended Kalman filter as is since the measurement model
(y = Cx + Du) is linear.
The linearized differential drive model doesn’t track well because the first-order lin-
earization of A doesn’t capture the full heading dynamics, making the model update
inaccurate. This linearization inaccuracy is evident in the Hessian matrix (second par-
tial derivative with respect to the state vector) being nonzero.
\frac{\partial^2 f(x, u)}{\partial x^2} = \begin{bmatrix}
0 & 0 & −\frac{v_l + v_r}{2} \cos θ & 0 & 0 \\
0 & 0 & −\frac{v_l + v_r}{2} \sin θ & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
Figures 8.6 and 8.7 show a simulation using RKDP instead of the first-order model.
where the superscript R represents the robot's coordinate frame and the superscript G
represents the global coordinate frame.

With this transformation, the x and y error cost in LQR penalize the error ahead of the
robot and cross-track error respectively instead of global pose error. Since the cross-
track error is always measured from the robot's coordinate frame, the model used to
compute the LQR should be linearized around θ = 0 at all times.
A = \frac{\partial f(x)}{\partial x}\bigg|_{θ=0} = \begin{bmatrix}
0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} \\
0 & 0 & v & 0 & 0 \\
0 & 0 & 0 & −\frac{1}{2r_b} & \frac{1}{2r_b} \\
0 & 0 & 0 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_3 \\
0 & 0 & 0 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_1 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_3
\end{bmatrix} (8.14)

B = \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 0 \\
\left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_4 \\
\left(\frac{1}{m} − \frac{r_b^2}{J}\right) C_2 & \left(\frac{1}{m} + \frac{r_b^2}{J}\right) C_4
\end{bmatrix} (8.15)

where v = \frac{v_l + v_r}{2}, C_1 = −\frac{G_l^2 K_t}{K_v R r^2}, C_2 = \frac{G_l K_t}{R r}, C_3 = −\frac{G_r^2 K_t}{K_v R r^2}, and C_4 = \frac{G_r K_t}{R r}.
At each timestep, the LQR controller gain K is computed for the (A, B) pair eval-
uated at the current state.
With the model in theorem 8.7.4, y is uncontrollable at v = 0 because the row corre-
sponding to y becomes the zero vector. This means the state dynamics and inputs can
no longer affect y. This is obvious given that nonholonomic drivetrains can't move side-
ways. Some DARE solvers throw errors in this case, but one can avoid it by linearizing
around a small nonzero velocity instead.
The measurement model for the complete nonlinear model is now y = \begin{bmatrix} θ & x_l & x_r \end{bmatrix}^T instead of y = \begin{bmatrix} θ & v_l & v_r \end{bmatrix}^T.
U error estimation
As per subsection 6.7.2, we will now augment the model so uerror states are added to
the control inputs.
The plant and observer augmentations should be performed before the model is dis-
cretized. After the controller gain is computed with the unaugmented discrete model,
the controller may be augmented. Therefore, the plant and observer augmentations
assume a continuous model and the controller augmentation assumes a discrete con-
troller.
The three uerror states we’ll be adding are uerror,l , uerror,r , and uerror,heading for
left voltage error, right voltage error, and heading error respectively. The left and right
wheel positions are filtered encoder positions and are not adjusted for heading error.
The turning angle computed from the left and right wheel positions is adjusted by the
gyroscope heading. The heading uerror state is the heading error between what the
wheel positions imply and the gyroscope measurement.
The u_error states augment only the linear subspace, so the nonlinear pose dynamics are
the same.

\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{θ} \end{bmatrix} = \begin{bmatrix} v \cos θ \\ v \sin θ \\ \frac{v_r}{2r_b} − \frac{v_l}{2r_b} \end{bmatrix} (8.19)
The left and right voltage error states should be mapped to the corresponding velocity
states, so the system matrix should be augmented with B.
The heading uerror is measuring counterclockwise encoder understeer relative to the
gyroscope heading, so it should add to the left position and subtract from the right po-
sition for clockwise correction of encoder positions. That corresponds to the following
input mapping vector.
B_θ = \begin{bmatrix} 0 \\ 0 \\ 1 \\ −1 \end{bmatrix}
Now we'll augment the linear system matrix horizontally to accommodate the u_error
states.

\begin{bmatrix} \dot{v}_l \\ \dot{v}_r \\ \dot{x}_l \\ \dot{x}_r \end{bmatrix} = \begin{bmatrix} A & B & B_θ \end{bmatrix} \begin{bmatrix} v_l \\ v_r \\ x_l \\ x_r \\ u_{error,l} \\ u_{error,r} \\ u_{error,heading} \end{bmatrix} + Bu (8.20)
R The process noise for the voltage error states should be how much the voltage
can be expected to drop. The heading error state should be the encoder model
uncertainty.
8.8 Ramsete unicycle controller

The tracking error in the vehicle's coordinate frame is

\begin{bmatrix} e_x \\ e_y \\ e_θ \end{bmatrix} = \begin{bmatrix} \cos θ & \sin θ & 0 \\ −\sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_d − x \\ y_d − y \\ θ_d − θ \end{bmatrix}

where e_x is the tracking error in x, e_y is the tracking error in y, and e_θ is the tracking
error in θ. The 3 × 3 matrix is a rotation matrix that transforms the error in the pose
(represented by x_d − x, y_d − y, and θ_d − θ) from the global coordinate frame into the
vehicle's coordinate frame.
We will use the following control laws u_1 and u_2 for velocity and turning rate respec-
tively.

u_1 = −k_1 e_x
u_2 = −k_3 e_θ − k_2 v_d \,\text{sinc}(e_θ) e_y (8.23)

where k_1, k_2, and k_3 are time-varying gains and sinc(e_θ) is defined as \frac{\sin e_θ}{e_θ}. This
choice of control law may seem arbitrary, and that's because it is. The only requirement
on our choice is that there exist an associated Lyapunov candidate function that proves
the control law is globally asymptotically stable. We'll provide a sketch of a proof in
theorem 8.8.1.
Our velocity and turning rate commands for the vehicle will use the following nonlinear
transformation of the inputs.

v = v_d \cos e_θ − u_1
ω = ω_d − u_2

Substituting the control laws from equation (8.23) gives

v = v_d \cos e_θ + k_1 e_x
ω = k_3 e_θ + ω_d + k_2 v_d \,\text{sinc}(e_θ) e_y
Theorem 8.8.1 Assuming that vd and ωd are bounded with bounded derivatives,
and that vd (t) → 0 or ωd (t) → 0 when t → ∞, the control laws in equation (8.23)
globally asymptotically stabilize the origin e = 0.
Proof:
To prove convergence, the paper previously mentioned uses the following Lyapunov
function.
V = \frac{k_2}{2}(e_x^2 + e_y^2) + \frac{e_θ^2}{2}

where k_2 is a tuning constant, e_x is the tracking error in x, e_y is the tracking error
in y, and e_θ is the tracking error in θ.
The time derivative along the solutions of the closed-loop system is nonincreasing
since

\dot{V} = −k_1 k_2 e_x^2 − k_3 e_θ^2 ≤ 0

Thus, ∥e(t)∥ is bounded, \dot{V}(t) is uniformly continuous, and V(t) tends to some
limit value. Using the Barbalat lemma, \dot{V}(t) tends to zero [12].
v and ω should be the references for a drivetrain reference tracker. A good choice would
be using equations (11.1) and (11.2) to convert v and ω to left and right velocities, then
applying LQR to the model in theorem 8.7.1.
x, y, and θ are obtained via a pose estimator (see chapter 10 for how to implement
one). The desired velocity, turning rate, and pose can be varied over time according to
a desired trajectory.
Figures 8.10 and 8.11 show the tracking performance of Ramsete for a given trajectory.
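Here's a minimal Python sketch of one Ramsete iteration, assuming (as the gain and unit analysis below suggests) that k₁ = k₃ = 2ζ√(ω_d² + b v_d²) and k₂ = b. The function name and default tuning values are our own.

import math


def ramsete(x, y, theta, v_d, omega_d, x_d, y_d, theta_d, b=2.0, zeta=0.7):
    """Returns the velocity and turning rate commands for one iteration."""
    # Global pose error rotated into the vehicle's coordinate frame
    e_x = math.cos(theta) * (x_d - x) + math.sin(theta) * (y_d - y)
    e_y = -math.sin(theta) * (x_d - x) + math.cos(theta) * (y_d - y)
    e_theta = theta_d - theta

    # Time-varying gain from the k equation below
    k = 2.0 * zeta * math.sqrt(omega_d**2 + b * v_d**2)

    # sinc(e_theta) = sin(e_theta) / e_theta; use its limit of 1 near zero
    sinc = math.sin(e_theta) / e_theta if abs(e_theta) > 1e-9 else 1.0

    v = v_d * math.cos(e_theta) + k * e_x
    omega = omega_d + k * e_theta + b * v_d * sinc * e_y
    return v, omega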
Units of sinc(e_θ)

First, we'll find the units of sinc(e_θ) for later use.

\text{sinc}(e_θ) = \frac{\sin(e_θ)}{e_θ}
[\text{sinc}(e_θ)] = \frac{1}{A}
[\text{sinc}(e_θ)] = A^{-1}

The k e_θ term in the angular velocity command must have units of angular velocity.

A T^{-1} = [k] A
T^{-1} = [k]
[k] = T^{-1}

This is consistent with the result from the Ramsete velocity command equation.
Next, we'll find the units of b.

The left-hand side and second term of equation (8.29) must have equivalent units.

A T^{-1} = [b] A^{-1} L^2 T^{-1}
A^2 L^{-2} = [b]
[b] = A^2 L^{-2}
Ramsete k equation
Start from equation (8.27).
k = 2ζ\sqrt{ω_d^2 + b v_d^2}
[k] = [ζ]\sqrt{[ω_d]^2 + [b][v_d]^2}
T^{-1} = [ζ]\sqrt{(A T^{-1})^2 + [b](L T^{-1})^2}
T^{-1} = [ζ]\sqrt{A^2 T^{-2} + [b] L^2 T^{-2}} (8.30)
The additive terms under the square root must have equivalent units.
(A T^{-1})^2 = [b](L T^{-1})^2
A^2 T^{-2} = [b] L^2 T^{-2}
A^2 L^{-2} = [b]
[b] = A^2 L^{-2}
This is consistent with the result from the angular velocity command equation, so we
can use it when determining the units of ζ.
Next, we’ll find the units of ζ.
Substitute [b] into equation (8.30).
T^{-1} = [ζ]\sqrt{A^2 T^{-2} + [b] L^2 T^{-2}}
T^{-1} = [ζ]\sqrt{A^2 T^{-2} + A^2 L^{-2} · L^2 T^{-2}}
T^{-1} = [ζ]\sqrt{A^2 T^{-2} + A^2 T^{-2}}
T^{-1} = [ζ]\sqrt{A^2 T^{-2}}
T^{-1} = [ζ] A T^{-1}
A^{-1} = [ζ]
[ζ] = A^{-1}
8.9 Linear time-varying unicycle controller

\dot{x} = v \cos θ
\dot{y} = v \sin θ
\dot{θ} = ω
Here's the model as a vector function where x = \begin{bmatrix} x & y & θ \end{bmatrix}^T and u = \begin{bmatrix} v & ω \end{bmatrix}^T.

f(x, u) = \begin{bmatrix} v \cos θ \\ v \sin θ \\ ω \end{bmatrix} (8.33)
We’re going to make a cross-track error controller, so we’ll apply a clockwise rotation
matrix to the global tracking error to transform it into the robot’s coordinate frame.
Since the cross-track error is always measured from the robot’s coordinate frame, the
model used to compute the LQR should be linearized around θ = 0 at all times.
A = \begin{bmatrix} 0 & 0 & −v \sin 0 \\ 0 & 0 & v \cos 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} \cos 0 & 0 \\ \sin 0 & 0 \\ 0 & 1 \end{bmatrix}

A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & v \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}
Therefore,
A = \frac{\partial f(x, u)}{\partial x}\bigg|_{θ=0} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & v \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \frac{\partial f(x, u)}{\partial u}\bigg|_{θ=0} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}
At each timestep, the LQR controller gain K is computed for the (A, B) pair eval-
uated at the current input.
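Here's a Python sketch of that recomputation. The forward Euler discretization, sample period, and function name are our choices; and as with the differential drive above, the DARE solver may fail at v = 0, so linearize around a small nonzero velocity.

import numpy as np
import scipy as sp


def ltv_unicycle_gain(v, Q, R, dt=0.02):
    """LQR gain for the unicycle model linearized at velocity v."""
    A = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, v], [0.0, 0.0, 0.0]])
    B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])

    # Forward Euler discretization with sample period dt
    Ad = np.eye(3) + A * dt
    Bd = B * dt

    P = sp.linalg.solve_discrete_are(a=Ad, b=Bd, q=Q, r=R)
    return np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)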
“Underactuated Robotics”
Russ Tedrake
https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg/videos
The books Nonlinear Dynamics and Chaos by Steven Strogatz and Applied Nonlinear
Control by Jean-Jacques Slotine are also good references.
III
Estimation and
localization
9. Stochastic control theory

Stochastic control theory is a subfield of control theory that deals with the existence
of uncertainty either in observations or in noise that drives the evolution of a system.
We assign probability distributions to this random noise and aim to achieve a desired
control task despite the presence of this uncertainty.
Stochastic optimal control is concerned with doing this with minimum cost defined
by some cost functional, like we did with LQR earlier. First, we’ll cover the basics of
probability and how we represent linear stochastic systems in state-space representation.
Then, we’ll derive an optimal estimator using this knowledge, the Kalman filter, and
demonstrate creative applications of the Kalman filter theory.
This will be a math-heavy introduction, so we recommend reading Kalman and Bayesian
Filters in Python by Roger Labbe first [9].
9.1 Terminology
First, we should provide definitions for terms that have specific meanings in this field.
A causal system is one that uses only past information. A noncausal system also uses
information from the future. A filter is a causal system that filters information through a
probabilistic model to produce an estimate of a desired quantity that can’t be measured
directly. A smoother is a noncausal system, so it uses information from before and after
the current state to produce a better estimate.
[1] An agent is a system-agnostic term for independent controlled actors like robots or aircraft.
9.2 State observers
Variables denoted with a hat are estimates of the corresponding variable. For example,
x̂ is the estimate of the true state x.
Notice that a Luenberger observer has an extra term in the state evolution equation. This
term uses the difference between the estimated outputs and measured outputs to steer
the estimated state toward the true state. L approaching C+ trusts the measurements
more while L approaching 0 trusts the model more.
R Using an estimator forfeits the performance guarantees from earlier, but the
responses are still generally very good if the process and measurement noises are
small enough. See John Doyle's paper Guaranteed Margins for LQG Regulators
for a proof.
Predict step

\hat{x}^-_{k+1} = A\hat{x}^-_k + Bu_k (9.5)

Update step

\hat{x}^+_{k+1} = \hat{x}^-_{k+1} + A^{-1}L(y_k − \hat{y}_k) (9.6)
\hat{y}_k = C\hat{x}^-_k (9.7)
uk = −Kx̂k
With the estimation error defined as ek = xk − x̂k , we get the observer dynamics
derived in equation (9.8).
ek+1 = (A − LC)ek
Also, after rearranging the error equation to be x̂k = xk − ek , the controller can be
rewritten as
uk = −K(xk − ek )
Since the closed-loop system matrix is triangular, the eigenvalues are those of A−BK
and A − LC. Therefore, the stability of the feedback controller and observer are
independent.
9.3 Introduction to probability

The probability that the value of x will be in the interval x ∈ [x_1, x_1 + dx] is p(x_1) dx.
In other words, the probability is the area under the PDF within the region [x_1, x_1 + dx]
(see figure 9.1).
A probability of zero means that the sample will not occur and a probability of one
means that the sample will always occur. Probability density functions require that no
probabilities are negative and that the sum of all probabilities is 1. If the probabilities
sum to 1, that means one of those outcomes must happen. In other words,
p(x) ≥ 0, \quad \int_{-∞}^{∞} p(x) \,dx = 1
or given that the probability of a given sample is greater than or equal to zero, the sum
of probabilities for all possible input values is equal to one.
9.3.3 Variance
Informally, variance is a measure of how far the outcome of a random variable deviates
from its mean. Later, we will use variance to quantify how confident we are in the
estimate of a random variable; we can’t know the true value of that variable without
randomness, but we can give a bound on its randomness.
\text{var}(x) = σ^2 = E[(x − \bar{x})^2] = \int_{-∞}^{∞} (x − \bar{x})^2 p(x) \,dx
Joint probability density functions also require that no probabilities are negative and
that the sum of all probabilities is 1.
p(x, y) ≥ 0, \quad \int_{-∞}^{∞}\int_{-∞}^{∞} p(x, y) \,dx\,dy = 1
The variance of a joint PDF measures how a variable correlates with itself (we’ll cover
variances with respect to other variables shortly).
\text{var}(x) = Σ_{xx} = E[(x − \bar{x})^2] = \int_{-∞}^{∞}\int_{-∞}^{∞} (x − \bar{x})^2 p(x, y) \,dx\,dy

\text{var}(y) = Σ_{yy} = E[(y − \bar{y})^2] = \int_{-∞}^{∞}\int_{-∞}^{∞} (y − \bar{y})^2 p(x, y) \,dx\,dy
9.3.5 Covariance
A covariance is a measurement of how a variable correlates with another. If they vary
in the same direction, the covariance increases. If they vary in opposite directions, the
covariance decreases.
\text{cov}(x, y) = Σ_{xy} = E[(x − \bar{x})(y − \bar{y})] = \int_{-∞}^{∞}\int_{-∞}^{∞} (x − \bar{x})(y − \bar{y}) p(x, y) \,dx\,dy
9.3.6 Correlation
Two random variables are correlated if the result of one random variable affects the
result of another. Correlation is defined as
ρ(x, y) = \frac{Σ_{xy}}{\sqrt{Σ_{xx} Σ_{yy}}}, \quad |ρ(x, y)| ≤ 1

So two variables' correlation is defined as their covariance over the geometric mean of
their variances. Uncorrelated sources have a covariance of zero.
9.3.7 Independence
Two random variables are independent if the following relation is true.
p(x, y) = p(x) p(y)
This means that the values of x do not correlate with the values of y. That is, the
outcome of one random variable does not affect another’s outcome. If we assume
independence,
E[xy] = \int_{-∞}^{∞}\int_{-∞}^{∞} xy \, p(x, y) \,dx\,dy
E[xy] = \int_{-∞}^{∞}\int_{-∞}^{∞} xy \, p(x) p(y) \,dx\,dy
E[xy] = \int_{-∞}^{∞} x \, p(x) \,dx \int_{-∞}^{∞} y \, p(y) \,dy
E[xy] = E[x] E[y]
E[xy] = \bar{x}\bar{y}
If p(x, y) is known, then we also know p(x, y = y ∗ ). However, note that the latter is
not the conditional density p(x|y ∗ ), instead
C(y^*) = \int_{-∞}^{∞} p(x, y = y^*) \,dx

p(x|y^*) = \frac{1}{C(y^*)} p(x, y = y^*)

The scale factor \frac{1}{C(y^*)} is used to scale the area under the PDF to 1.
If x and y are independent, then p(x|y) = p(x), p(y|x) = p(y), and p(x, y) =
p(x) p(y).
The elements of x are scalar variables jointly distributed with a joint density p(x1 , . . . , xn ).
The expectation is
E[x] = \bar{x} = \int_{-∞}^{∞} x \, p(x) \,dx

E[x] = \begin{bmatrix} E[x_1] \\ \vdots \\ E[x_n] \end{bmatrix}

E[x_i] = \int_{-∞}^{∞} \cdots \int_{-∞}^{∞} x_i \, p(x_1, …, x_n) \,dx_1 … dx_n

E[f(x)] = \int_{-∞}^{∞} f(x) \, p(x) \,dx
This n×n matrix is symmetric and positive semidefinite. A positive semidefinite matrix
satisfies the relation that for any v ∈ Rn for which v ̸= 0, vT Σv ≥ 0. In other words,
the eigenvalues of Σ are all greater than or equal to zero.
Σ_{xy} = \text{cov}(x, y)
Σ_{xy} = E[(x − \bar{x})(y − \bar{y})^T]
Σ_{xy} = E[xy^T] − E[x\bar{y}^T] − E[\bar{x}y^T] + E[\bar{x}\bar{y}^T]
Factor out constants from the inner integral. This includes variables which are held
constant for each inner integral evaluation.
E[xy^T] = \int_X p(x) \, x \,dx \int_Y p(y) \, y^T \,dy
Each of these integrals is just the expected value of their respective integration variable.
Σ_z = \text{cov}(z, z)
Σ_z = E[(z − \bar{z})(z − \bar{z})^T]
Σ_z = E[(Ax + By − A\bar{x} − B\bar{y})(Ax + By − A\bar{x} − B\bar{y})^T]
Σ_z = E[(A(x − \bar{x}) + B(y − \bar{y}))(A(x − \bar{x}) + B(y − \bar{y}))^T]
Σ_z = E[(A(x − \bar{x}) + B(y − \bar{y}))((x − \bar{x})^T A^T + (y − \bar{y})^T B^T)]
Σ_z = E[A(x − \bar{x})(x − \bar{x})^T A^T + A(x − \bar{x})(y − \bar{y})^T B^T + B(y − \bar{y})(x − \bar{x})^T A^T + B(y − \bar{y})(y − \bar{y})^T B^T]
Since x and y are independent, Σxy = 0 and the cross terms cancel out.
Σ_z = AΣ_x A^T + BΣ_y B^T
E[x] = \bar{x} \qquad \text{var}(x) = σ^2

p(x) = \frac{1}{\sqrt{2πσ^2}} e^{-\frac{(x − \bar{x})^2}{2σ^2}}
While we could use any random variable to represent a random process, we use the
Gaussian random variable a lot in probability theory due to the central limit theorem.
The theorem states that the sum of many independent random variables tends toward a
normal distribution. This is the case even if the original variables themselves are not
normally distributed. The theorem is a key concept in probability theory because it
implies that probabilistic and statistical methods that work for normal distributions can
be applicable to many problems involving other types of distributions.
For example, suppose that a sample is obtained containing a large number of indepen-
dent observations, and that the arithmetic mean of the observed values is computed.
The central limit theorem says that the computed values of the mean will tend toward
being distributed according to a normal distribution.
E[w_k] = 0 \qquad E[w_k w_k^T] = Q_k
E[v_k] = 0 \qquad E[v_k v_k^T] = R_k
where Qk is the process noise covariance matrix and Rk is the measurement noise
covariance matrix. We assume the noise samples are independent, so E[wk wjT ] = 0
and E[vk vjT ] = 0 where k ̸= j. Furthermore, process noise samples are independent
from measurement noise samples.
We’ll compute the expectation of these equations and their covariance matrices, which
we’ll use later for deriving the Kalman filter.
S_k = E[(C(x_k − \bar{x}_k) + v_k)(C(x_k − \bar{x}_k) + v_k)^T]
S_k = E[(C(x_k − \bar{x}_k) + v_k)((x_k − \bar{x}_k)^T C^T + v_k^T)]
S_k = E[C(x_k − \bar{x}_k)(x_k − \bar{x}_k)^T C^T] + E[C(x_k − \bar{x}_k)v_k^T] + E[v_k(x_k − \bar{x}_k)^T C^T] + E[v_k v_k^T]
S_k = C E[(x_k − \bar{x}_k)(x_k − \bar{x}_k)^T] C^T + C E[(x_k − \bar{x}_k)v_k^T] + E[v_k(x_k − \bar{x}_k)^T] C^T + E[v_k v_k^T]
S_k = C P_k C^T + C E[(x_k − \bar{x}_k)v_k^T] + E[v_k(x_k − \bar{x}_k)^T] C^T + R_k

Since the error and noise are independent, the cross terms are zero.

S_k = C P_k C^T + R_k
p(x|z_1) = \frac{1}{C} p(z_1|x) p(x) \quad \text{or} \quad p(x|z_1) ∼ p(z_1|x) p(x)
R Even in the single measurement case, the estimation of the variable x is a data/in-
formation fusion problem. We combine prior probability with the probability
resulting from the measurement.
p(x|z_1) = \frac{p(x, z_1)}{p(z_1)}
p(x|z_1) = \frac{p(z_1|x) p(x)}{p(z_1)}
p(x|z_1) = \frac{1}{C} p(z_1|x) p(x)
p(x|z_1) = \frac{1}{C} \frac{1}{σ\sqrt{2π}} e^{-\frac{1}{2}\left(\frac{z_1 − x}{σ}\right)^2} \frac{1}{σ_0\sqrt{2π}} e^{-\frac{1}{2}\left(\frac{x − x_0}{σ_0}\right)^2}
Absorb the leading coefficients of the two probability distributions into a new constant
C1 .
p(x|z_1) = \frac{1}{C_1} e^{-\frac{1}{2}\left(\frac{z_1 − x}{σ}\right)^2} e^{-\frac{1}{2}\left(\frac{x − x_0}{σ_0}\right)^2}
This means that if we’re given an initial estimate x0 and a measurement z1 with associ-
ated means and variances represented by Gaussian distributions, this information can
be combined into a third Gaussian distribution with its own mean value and variance.
The expected value of x given z1 is
E[x|z_1] = μ = \frac{σ_0^2}{σ_0^2 + σ^2} z_1 + \frac{σ^2}{σ_0^2 + σ^2} x_0 (9.11)

E[(x − μ)^2|z_1] = \frac{σ^2 σ_0^2}{σ_0^2 + σ^2} (9.12)
The expected value, which is also the maximum likelihood value, is the linear combi-
nation of the prior expected (maximum likelihood) value and the measurement. The
expected value is a reasonable estimator of x.
\hat{x} = E[x|z_1] = \frac{σ_0^2}{σ_0^2 + σ^2} z_1 + \frac{σ^2}{σ_0^2 + σ^2} x_0 (9.13)
\hat{x} = w_1 z_1 + w_2 x_0
Note that the weights w1 and w2 sum to 1. When the prior (i.e., prior knowledge of
state) is uninformative (a large variance),
w_1 = \lim_{σ_0^2 → ∞} \frac{σ_0^2}{σ_0^2 + σ^2} = 1 (9.14)

w_2 = \lim_{σ_0^2 → ∞} \frac{σ^2}{σ_0^2 + σ^2} = 0 (9.15)
and x̂ = z1 . That is, the weight is on the observations and the estimate is equal to the
measurement.
Let us assume we have a model providing an almost exact prior for x. In that case, σ02
approaches 0 and
w_1 = \lim_{σ_0^2 → 0} \frac{σ_0^2}{σ_0^2 + σ^2} = 0 (9.16)

w_2 = \lim_{σ_0^2 → 0} \frac{σ^2}{σ_0^2 + σ^2} = 1 (9.17)
The Kalman filter uses this optimal fusion as the basis for its operation.
9.6.1 Derivations
Given the a posteriori update equation \hat{x}^+_{k+1} = \hat{x}^-_{k+1} + K_{k+1}(y_{k+1} − \hat{y}_{k+1}), we want
to find the value of K_{k+1} that minimizes the a posteriori estimate covariance (the error
covariance) because this minimizes the estimation error.
A posteriori estimate covariance update equation

The following is the definition of the a posteriori estimate covariance matrix.

P^+_{k+1} = \text{cov}(x_{k+1} − \hat{x}^+_{k+1})

Substitute in the a posteriori update equation and expand the measurement equations.

P^+_{k+1} = \text{cov}(x_{k+1} − (\hat{x}^-_{k+1} + K_{k+1}(y_{k+1} − \hat{y}_{k+1})))
P^+_{k+1} = \text{cov}(x_{k+1} − \hat{x}^-_{k+1} − K_{k+1}(y_{k+1} − \hat{y}_{k+1}))
P^+_{k+1} = \text{cov}(x_{k+1} − \hat{x}^-_{k+1} − K_{k+1}((C_{k+1}x_{k+1} + D_{k+1}u_{k+1} + v_{k+1}) − (C_{k+1}\hat{x}^-_{k+1} + D_{k+1}u_{k+1})))
P^+_{k+1} = \text{cov}(x_{k+1} − \hat{x}^-_{k+1} − K_{k+1}(C_{k+1}x_{k+1} + D_{k+1}u_{k+1} + v_{k+1} − C_{k+1}\hat{x}^-_{k+1} − D_{k+1}u_{k+1}))
Reorder terms.

P^+_{k+1} = \text{cov}(x_{k+1} − \hat{x}^-_{k+1} − K_{k+1}(C_{k+1}x_{k+1} − C_{k+1}\hat{x}^-_{k+1} + D_{k+1}u_{k+1} − D_{k+1}u_{k+1} + v_{k+1}))
P^+_{k+1} = \text{cov}((I − K_{k+1}C_{k+1})(x_{k+1} − \hat{x}^-_{k+1}) − K_{k+1}v_{k+1})
This is the Joseph form of the covariance update equation, which is valid for all Kalman
gains.

P^+_{k+1} = (I − K_{k+1}C_{k+1})P^-_{k+1}(I − K_{k+1}C_{k+1})^T + K_{k+1}R_{k+1}K_{k+1}^T

Multiply in I − C_{k+1}^T K_{k+1}^T.

P^+_{k+1} = P^-_{k+1}(I − C_{k+1}^T K_{k+1}^T) − K_{k+1}C_{k+1}P^-_{k+1}(I − C_{k+1}^T K_{k+1}^T) + K_{k+1}R_{k+1}K_{k+1}^T

Expand terms.

P^+_{k+1} = P^-_{k+1} − P^-_{k+1}C_{k+1}^T K_{k+1}^T − K_{k+1}C_{k+1}P^-_{k+1} + K_{k+1}C_{k+1}P^-_{k+1}C_{k+1}^T K_{k+1}^T + K_{k+1}R_{k+1}K_{k+1}^T
[2] By definition, the characteristic polynomial of an n × n matrix P is given by p(t) = det(tI − P) = t^n − tr(P)t^{n-1} + ⋯ + (−1)^n det(P), as well as p(t) = (t − λ_1)…(t − λ_n) where λ_1, …, λ_n are the eigenvalues of P. The coefficient of t^{n-1} in the second polynomial's expansion is −(λ_1 + ⋯ + λ_n). Therefore, by matching coefficients for t^{n-1}, we get tr(P) = λ_1 + ⋯ + λ_n.
Factor out the Kalman gain.

P^+_{k+1} = P^-_{k+1} − P^-_{k+1}C_{k+1}^T K_{k+1}^T − K_{k+1}C_{k+1}P^-_{k+1} + K_{k+1}(C_{k+1}P^-_{k+1}C_{k+1}^T + R_{k+1})K_{k+1}^T

C_{k+1}P^-_{k+1}C_{k+1}^T + R_{k+1} is the innovation (measurement residual) covariance at timestep
k + 1. We'll let this expression equal S_{k+1}. We won't need it in the final theorem, but
it makes the derivations after this point more concise.

P^+_{k+1} = P^-_{k+1} − P^-_{k+1}C_{k+1}^T K_{k+1}^T − K_{k+1}C_{k+1}P^-_{k+1} + K_{k+1}S_{k+1}K_{k+1}^T (9.19)

To minimize the error variances on the diagonal of P^+_{k+1}, take the trace of equation (9.19).[2]

tr(P^+_{k+1}) = tr(P^-_{k+1}) − tr(P^-_{k+1}C_{k+1}^T K_{k+1}^T) − tr(K_{k+1}C_{k+1}P^-_{k+1}) + tr(K_{k+1}S_{k+1}K_{k+1}^T)

P^-_{k+1} is symmetric, so we can drop the transpose.

tr(P^+_{k+1}) = tr(P^-_{k+1}) − tr((K_{k+1}C_{k+1}P^-_{k+1})^T) − tr(K_{k+1}C_{k+1}P^-_{k+1}) + tr(K_{k+1}S_{k+1}K_{k+1}^T)

The trace of a matrix is equal to the trace of its transpose since the elements used in
the trace are on the diagonal.

tr(P^+_{k+1}) = tr(P^-_{k+1}) − tr(K_{k+1}C_{k+1}P^-_{k+1}) − tr(K_{k+1}C_{k+1}P^-_{k+1}) + tr(K_{k+1}S_{k+1}K_{k+1}^T)
tr(P^+_{k+1}) = tr(P^-_{k+1}) − 2 tr(K_{k+1}C_{k+1}P^-_{k+1}) + tr(K_{k+1}S_{k+1}K_{k+1}^T)
Theorem 9.6.1 \frac{\partial}{\partial A} tr(ABA^T) = 2AB where B is symmetric.

Theorem 9.6.2 \frac{\partial}{\partial A} tr(AC) = C^T
Find the minimum of the trace by taking the partial derivative with respect to K_{k+1} and
setting the result to zero.

\frac{\partial\, tr(P^+_{k+1})}{\partial K_{k+1}} = 0 − 2(C_{k+1}P^-_{k+1})^T + 2K_{k+1}S_{k+1}
\frac{\partial\, tr(P^+_{k+1})}{\partial K_{k+1}} = −2P^{-T}_{k+1}C_{k+1}^T + 2K_{k+1}S_{k+1}
\frac{\partial\, tr(P^+_{k+1})}{\partial K_{k+1}} = −2P^-_{k+1}C_{k+1}^T + 2K_{k+1}S_{k+1}

0 = −2P^-_{k+1}C_{k+1}^T + 2K_{k+1}S_{k+1}
2K_{k+1}S_{k+1} = 2P^-_{k+1}C_{k+1}^T
K_{k+1}S_{k+1} = P^-_{k+1}C_{k+1}^T
K_{k+1} = P^-_{k+1}C_{k+1}^T S_{k+1}^{-1}

This is the optimal Kalman gain. To simplify the a posteriori estimate covariance, start
from the gain equation and multiply by S_{k+1}, then by K_{k+1}^T.

K_{k+1} = P^-_{k+1}C_{k+1}^T S_{k+1}^{-1}
K_{k+1}S_{k+1} = P^-_{k+1}C_{k+1}^T
K_{k+1}S_{k+1}K_{k+1}^T = P^-_{k+1}C_{k+1}^T K_{k+1}^T

Substitute this back into equation (9.19).

P^+_{k+1} = P^-_{k+1} − P^-_{k+1}C_{k+1}^T K_{k+1}^T − K_{k+1}C_{k+1}P^-_{k+1} + (P^-_{k+1}C_{k+1}^T K_{k+1}^T)
P^+_{k+1} = P^-_{k+1} − K_{k+1}C_{k+1}P^-_{k+1}

Factor out P^-_{k+1} to the right.

P^+_{k+1} = (I − K_{k+1}C_{k+1})P^-_{k+1}
Predict step

\hat{x}^-_{k+1} = A\hat{x}^+_k + Bu_k (9.20)
P^-_{k+1} = AP^+_k A^T + Q (9.21)

Update step

K_{k+1} = P^-_{k+1}C^T(CP^-_{k+1}C^T + R)^{-1} (9.22)
\hat{x}^+_{k+1} = \hat{x}^-_{k+1} + K_{k+1}(y_{k+1} − C\hat{x}^-_{k+1} − Du_{k+1}) (9.23)
P^+_{k+1} = (I − K_{k+1}C)P^-_{k+1}(I − K_{k+1}C)^T + K_{k+1}RK_{k+1}^T (9.24)
C, D, Q, and R from the equations derived earlier are made constants here.
R To implement a discrete time Kalman filter from a continuous model, the model
and continuous time Q and R matrices can be discretized using theorem 7.3.1.
[3] Explaining why we use the Wiener process would require going much more in depth into stochastic
processes and Itô calculus, which is outside the scope of this book.
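Here's a minimal Python sketch of equations (9.20) through (9.24), assuming D = 0; the class name and structure are our own, not a library API.

import numpy as np


class KalmanFilter:
    """Minimal discrete Kalman filter following equations (9.20)-(9.24)."""

    def __init__(self, A, B, C, Q, R, x_hat0, P0):
        self.A, self.B, self.C = A, B, C
        self.Q, self.R = Q, R
        self.x_hat = x_hat0
        self.P = P0

    def predict(self, u):
        # Equations (9.20) and (9.21)
        self.x_hat = self.A @ self.x_hat + self.B @ u
        self.P = self.A @ self.P @ self.A.T + self.Q

    def update(self, y):
        # Equation (9.22): K = P⁻Cᵀ(CP⁻Cᵀ + R)⁻¹
        S = self.C @ self.P @ self.C.T + self.R
        K = np.linalg.solve(S.T, self.C @ self.P.T).T

        # Equation (9.23), with D assumed zero
        self.x_hat = self.x_hat + K @ (y - self.C @ self.x_hat)

        # Equation (9.24), the Joseph form covariance update
        I = np.eye(self.P.shape[0])
        self.P = (I - K @ self.C) @ self.P @ (I - K @ self.C).T + K @ self.R @ K.T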
9.6.3 Setup
Equations to model
The following example system will be used to describe how to define and initialize the
matrices for a Kalman filter.
A robot is between two parallel walls. It starts driving from one wall to the other at a
velocity of 0.8 cm/s and uses ultrasonic sensors to provide noisy measurements of the
distances to the walls in front of and behind it. To estimate the distance between the
walls, we will define three states: robot position, robot velocity, and distance between
the walls.
xk+1 = xk + vk ∆T (9.25)
vk+1 = vk (9.26)
x^w_{k+1} = x^w_k (9.27)
where the Gaussian random variable wk has a mean of 0 and a variance of 1. The
observation model is
y_k = \begin{bmatrix} 1 & 0 & 0 \\ −1 & 0 & 1 \end{bmatrix} x_k + θ_k (9.30)
Initial conditions
To fill in the P matrix, we calculate the covariance of each combination of state vari-
ables. The resulting value is a measure of how much those variables are correlated.
Due to how the covariance calculation works out, the covariance between two vari-
ables is the sum of the variance of matching terms which aren’t constants multiplied by
any constants the two have. If no terms match, the variables are uncorrelated and the
covariance is zero.
In P_{11}, the terms in x_1 correlate with itself. Therefore, P_{11} is x_1's variance, or
P_{11} = 10. For P_{21}, one term correlates between x_1 and x_2, so P_{21} = \frac{10}{dt}. The
constants from each are simply multiplied together. For P_{22}, both measurements are
correlated, so the variances add together. Therefore, P_{22} = \frac{20}{dt^2}. It continues in
this fashion until the matrix is filled up. Order doesn't matter for correlation, so the
matrix is symmetric.
Selection of priors
Choosing good priors is important for a well performing filter, even if little information
is known. This applies to both the measurement noise and the noise model. The act
of giving a state variable a large variance means you know something about the system.
Namely, you aren’t sure whether your initial guess is close to the true state. If you make
a guess and specify a small variance, you are telling the filter that you are very confident
in your guess. If that guess is incorrect, it will take the filter a long time to move away
from your guess to the true value.
Covariance selection

While one could assume no correlation between the state variables and set the covari-
ance matrix entries to zero, this may not reflect reality. The Kalman filter is still guar-
anteed to converge to the steady-state covariance after an infinite time, but it will take
longer than otherwise.
Noise model selection
We typically use a Gaussian distribution for the noise model because the sum of many
independent random variables produces a normal distribution by the central limit the-
orem. Kalman filters only require that the noise is zero-mean. If the true value has an
equal probability of being anywhere within a certain range, use a uniform distribution
instead. Each of these communicates information regarding what you know about a
system in addition to what you do not.
Process noise and measurement noise covariance selection
Recall that the process noise covariance is Q and the measurement noise covariance is
R. To tune the elements of these, it can be helpful to take a collection of measurements,
then run the Kalman filter on them offline to evaluate its performance.
The diagonal elements of R are the variances of each measurement, which can be
easily determined from the offline measurements. The diagonal elements of Q are the
variances of each state. They represent how much each state is expected to deviate
from the model.
Selecting Q is more difficult. If the data is trusted too much over the model, one risks
overfitting the data. One should balance estimating any hidden states sufficiently with
actually filtering out the noise.
Modeling other noise colors

The Kalman filter assumes a model with zero-mean white noise. If the model is incom-
plete in some way, whether it's missing dynamics or assumes an incorrect noise model,
the residual \tilde{y} = y − C\hat{x} over time will have probability characteristics not indicative
of white noise (e.g., it isn't zero-mean).
To handle other colors of noise in a Kalman filter, define that color of noise in terms
of white noise and augment the model with it.
9.6.4 Simulation
Figure 9.3 shows the state estimates and measurements of the Kalman filter over time.
Figure 9.4 shows the position estimate and variance over time. Figure 9.5 shows the
166 Chapter 9. Stochastic control theory
wall position estimate and variance over time. Notice how the variances decrease over
time as the filter gathers more measurements. This means that the filter becomes more
confident in its state estimates.
The final precisions in estimating the position of the robot and the wall are the square
roots of the corresponding elements in the covariance matrix. That is, 0.5188 m and
0.4491 m respectively. They are smaller than the precision of the raw measurements,
\sqrt{10} = 3.1623 m. As expected, combining the information from several measure-
ments produces a better estimate than any one measurement alone.
Figure 9.4: Robot position estimate and variance with Kalman filter
Figure 9.5: Wall position estimate and variance with Kalman filter
import numpy as np
import scipy as sp


def kalmd(A, C, Q, R):
    """Solves the discrete algebraic Riccati equation for the steady-state
    Kalman gain.

    Keyword arguments:
    A -- numpy.array(states x states), system matrix.
    C -- numpy.array(outputs x states), output matrix.
    Q -- numpy.array(states x states), process noise covariance matrix.
    R -- numpy.array(outputs x outputs), measurement noise covariance matrix.

    Returns:
    K -- numpy.array(states x outputs), Kalman gain matrix.
    """
    P = sp.linalg.solve_discrete_are(a=A.T, b=C.T, q=Q, r=R)
    return np.linalg.solve(C @ P @ C.T + R, C @ P.T).T
9.7.2 Example
We will modify the robot model so that instead of a velocity of 0.8 cm/s with random
noise, the velocity is modeled as a random walk from the current velocity.
x_k = \begin{bmatrix} x_k \\ v_k \\ x^w_k \end{bmatrix} (9.46)

x_{k+1} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} x_k + \begin{bmatrix} 0 \\ 0.1 \\ 0 \end{bmatrix} w_k (9.47)
See Roger Labbe’s book Kalman and Bayesian Filters in Python for more on smooth-
ing.[4]
[4] https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/13-Smoothing.
ipynb
9.8 Extended Kalman filter
Predict step

A = \frac{\partial f(x, u)}{\partial x}\bigg|_{\hat{x}^+_k, u_k} \quad \text{(linearize } f(x, u)\text{)} (9.48)

A_k = e^{AT} \quad \text{(discretize } A\text{)} (9.49)

\hat{x}^-_{k+1} = \text{RK4}(f, \hat{x}^+_k, u_k, T) \quad \text{(numerical integration)} (9.50)

P^-_{k+1} = A_k P^+_k A_k^T + Q_k (9.51)

Update step

C_{k+1} = \frac{\partial h(x, u)}{\partial x}\bigg|_{\hat{x}^-_{k+1}, u_{k+1}} \quad \text{(linearize } h(x, u)\text{)} (9.52)

K_{k+1} = P^-_{k+1}C_{k+1}^T(C_{k+1}P^-_{k+1}C_{k+1}^T + R_{k+1})^{-1} (9.53)

\hat{x}^+_{k+1} = \hat{x}^-_{k+1} + K_{k+1}(y_{k+1} − h(\hat{x}^-_{k+1}, u_{k+1})) (9.54)

P^+_{k+1} = (I − K_{k+1}C_{k+1})P^-_{k+1}(I − K_{k+1}C_{k+1})^T + K_{k+1}R_{k+1}K_{k+1}^T (9.55)
[5] https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/
10-Unscented-Kalman-Filter.ipynb
https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/14-Adaptive-Filtering.ipynb
Road next to Stevenson Academic building at UCSC
10. Pose estimation

Pose is defined as the position and orientation of an agent (a system with a controller).
The plant usually includes the pose in its state vector. We’ll cover several methods for
estimating an agent’s pose from local measurements such as encoders and gyroscope
heading.
xk+1 = xk + vk cos θk T
yk+1 = yk + vk sin θk T
θk+1 = θgyro,k+1
where T is the sample period. This odometry approach assumes that the robot follows
a straight path between samples (that is, ω = 0 at all but the sample times).
We use the pose exponential to take encoder measurement deltas and gyro angle deltas
(which are in the tangent space and are thus a twist) and turn them into a change in
pose. This gets added to the pose from the last update.
10.2.4 Derivation
We can obtain a more accurate approximation of the pose than Euler integration by
including first-order dynamics for the heading θ.
x = \begin{bmatrix} x \\ y \\ θ \end{bmatrix}

v_x and v_y are the x and y velocities of the robot within its local coordinate frame, and
ω is its angular velocity; these will be treated as constants.
R There are two coordinate frames used here: robot and global. A superscript
on the left side of a matrix denotes the coordinate frame in which that matrix
is represented. The robot’s coordinate frame is denoted by R and the global
coordinate frame is denoted by G.
To transform this into the global frame SE(2), we apply a counterclockwise rotation
matrix where θ changes over time.

{}^G\begin{bmatrix} dx \\ dy \\ dθ \end{bmatrix} = \begin{bmatrix} \cos θ(t) & −\sin θ(t) & 0 \\ \sin θ(t) & \cos θ(t) & 0 \\ 0 & 0 & 1 \end{bmatrix} {}^R\begin{bmatrix} v_x \\ v_y \\ ω \end{bmatrix} dt

Since the angular velocity is constant, θ(t) = ωt.

{}^G\begin{bmatrix} dx \\ dy \\ dθ \end{bmatrix} = \begin{bmatrix} \cos ωt & −\sin ωt & 0 \\ \sin ωt & \cos ωt & 0 \\ 0 & 0 & 1 \end{bmatrix} {}^R\begin{bmatrix} v_x \\ v_y \\ ω \end{bmatrix} dt
Now, integrate the matrix equation (matrices are integrated element-wise). This deriva-
tion heavily utilizes the integration method described in section 4.3.
{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \frac{\sin ωt}{ω} & \frac{\cos ωt}{ω} & 0 \\ −\frac{\cos ωt}{ω} & \frac{\sin ωt}{ω} & 0 \\ 0 & 0 & t \end{bmatrix}\Bigg|_0^t \; {}^R\begin{bmatrix} v_x \\ v_y \\ ω \end{bmatrix}

{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \frac{\sin ωt}{ω} & \frac{\cos ωt − 1}{ω} & 0 \\ \frac{1 − \cos ωt}{ω} & \frac{\sin ωt}{ω} & 0 \\ 0 & 0 & t \end{bmatrix} {}^R\begin{bmatrix} v_x \\ v_y \\ ω \end{bmatrix}
R Control system implementations will generally have a model update and a con-
troller update in a given iteration. Equation (10.1) (the model update) uses the
current velocity to advance the state to the next timestep (into the future). Since
controllers use the current state, the controller update should be run before the
model update.
If we factor out a t, we can use change in pose between updates instead of velocities.

{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{\sin ωt}{ω} & \frac{\cos ωt − 1}{ω} & 0 \\ \frac{1 − \cos ωt}{ω} & \frac{\sin ωt}{ω} & 0 \\ 0 & 0 & t \end{bmatrix} {}^R\begin{bmatrix} v_x \\ v_y \\ ω \end{bmatrix}

{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{\sin ωt}{ωt} & \frac{\cos ωt − 1}{ωt} & 0 \\ \frac{1 − \cos ωt}{ωt} & \frac{\sin ωt}{ωt} & 0 \\ 0 & 0 & 1 \end{bmatrix} {}^R\begin{bmatrix} v_x t \\ v_y t \\ ωt \end{bmatrix}

{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{\sin Δθ}{Δθ} & \frac{\cos Δθ − 1}{Δθ} & 0 \\ \frac{1 − \cos Δθ}{Δθ} & \frac{\sin Δθ}{Δθ} & 0 \\ 0 & 0 & 1 \end{bmatrix} {}^R\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} (10.2)
The vector {}^R\begin{bmatrix} Δx & Δy & Δθ \end{bmatrix}^T is a twist because it's an element of the tangent space
(the robot's local coordinate frame).
R Control system implementations will generally have a model update and a con-
troller update in a given iteration. Equation (10.2) (the model update) uses local
10.2 Pose exponential 181
distance and heading deltas between the previous and current timestep, so it ad-
vances the state to the current timestep. Since controllers use the current state,
the controller update should be run after the model update.
When the robot is traveling on a straight trajectory (∆θ = 0), some expressions within
the equation above are indeterminate. We can approximate these with Taylor series
expansions.
{}^G\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} = \begin{bmatrix} \cos θ & −\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{\sin Δθ}{Δθ} & \frac{\cos Δθ − 1}{Δθ} & 0 \\ \frac{1 − \cos Δθ}{Δθ} & \frac{\sin Δθ}{Δθ} & 0 \\ 0 & 0 & 1 \end{bmatrix} {}^R\begin{bmatrix} Δx \\ Δy \\ Δθ \end{bmatrix} (10.3)

where G denotes the global coordinate frame and R denotes the robot's coordinate frame.
For sufficiently small Δθ, the Taylor series gives

\frac{\sin Δθ}{Δθ} ≈ 1 − \frac{Δθ^2}{6} \qquad \frac{\cos Δθ − 1}{Δθ} ≈ −\frac{Δθ}{2}
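Here's a minimal Python sketch of equation (10.2) with those small-angle approximations; the function name and branch threshold are our choices.

import math


def pose_exponential(x, y, theta, dx, dy, dtheta):
    """Applies a robot-frame twist (dx, dy, dtheta) to a global pose."""
    if abs(dtheta) < 1e-9:
        # Taylor approximations near dtheta = 0
        s = 1.0 - dtheta**2 / 6.0       # sin(dθ)/dθ
        c = dtheta / 2.0                # (1 − cos(dθ))/dθ
    else:
        s = math.sin(dtheta) / dtheta
        c = (1.0 - math.cos(dtheta)) / dtheta

    # Integrate the twist in the robot frame
    dx_r = s * dx - c * dy
    dy_r = c * dx + s * dy

    # Rotate into the global frame and accumulate
    x += dx_r * math.cos(theta) - dy_r * math.sin(theta)
    y += dx_r * math.sin(theta) + dy_r * math.cos(theta)
    theta += dtheta
    return x, y, theta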
Figures 10.1 through 10.4 show the pose estimation errors of forward Euler odometry
and pose exponential odometry for a feedforward S-curve trajectory (dt = 20 ms).
The highest errors for the 8.84 m by 5.0 m trajectory are 3.415 cm in x, 0.158 cm in
y, and −0.644 deg in heading. The difference would be even more noticeable for paths
with higher curvatures and longer durations. The error returns to near zero in this case
182 Chapter 10. Pose estimation
Figure 10.1: Pose estimation comparison (y vs x)
Figure 10.2: Pose estimation comparison (x error vs time)
Figure 10.3: Pose estimation comparison (y error vs time)
Figure 10.4: Pose estimation comparison (heading error vs time)
because the curvature is symmetric, so the second half cancels the error accrued in the
first half.
Using a smaller update period somewhat mitigates the forward Euler pose estimation
error. However, there are bigger sources of error like turning scrub on skid steer robots
that should be dealt with before odometry numerical accuracy.
11. Dynamics
11.1 Linear motion

x(t) = x_0 + v_0 t + \frac{1}{2}at^2

where x(t) is an object's position at time t, x_0 is the initial position, v_0 is the initial
velocity, and a is the acceleration.

11.2 Angular motion

θ(t) = θ_0 + ω_0 t + \frac{1}{2}αt^2

where θ(t) is an object's angle at time t, θ_0 is the initial angle, ω_0 is the initial angular
velocity, and α is the angular acceleration.
11.3 Vectors
Vectors are quantities with a magnitude and a direction. Vectors in three-dimensional
space have a coordinate for each spatial direction x, y, and z. Let’s take the vector
⃗a = ⟨1, 2, 3⟩. ⃗a is a three-dimensional vector that describes a movement 1 unit in the
x direction, 2 units in the y direction, and 3 units in the z direction.
We define î, ĵ, and k̂ as vectors that represent the fundamental movements one can
make in three-dimensional space: 1 unit of movement in the x direction, 1 unit of
movement in the y direction, and 1 unit of movement in the z direction respectively.
These three vectors form a basis of three-dimensional space because copies of them
can be added together to reach any point in three-dimensional space.
î = ⟨1, 0, 0⟩
ĵ = ⟨0, 1, 0⟩
k̂ = ⟨0, 0, 1⟩
⃗a = ⟨1, 2, 3⟩
î × ĵ = k̂
ĵ × k̂ = î
k̂ × î = ĵ
They proceed in a cyclic fashion through i, j, and k. If a vector is crossed with itself, it
produces the zero vector (a scalar zero for each coordinate). The cross products of the
basis vectors in the opposite order progress backwards and include a negative sign.
î × k̂ = −ĵ
k̂ × ĵ = −î
ĵ × î = −k̂
Given vectors ⃗u = aî + bĵ + ck̂ and ⃗v = dî + eĵ + f k̂, ⃗u × ⃗v is computed using the
distributive property.
11.4 Curvilinear motion
\vec{v}_B = \vec{v}_A + \vec{ω}_A × \vec{r}_{B|A}

where \vec{v}_B is the velocity vector at point B, \vec{v}_A is the velocity vector at point A, \vec{ω}_A is
the angular velocity vector at point A, and \vec{r}_{B|A} is the distance vector from point A to
point B (also described as the "distance to B relative to A").
11.5 Differential drive kinematics
\vec{v}_l = v_c î + ω k̂ × r_b ĵ
\vec{v}_l = v_c î − ωr_b î
\vec{v}_l = (v_c − ωr_b) î

Now, project this vector onto the left wheel, which is pointed in the î direction.

v_l = (v_c − ωr_b) î · \frac{î}{\|î\|}
v_l = v_c − ωr_b (11.1)
\vec{v}_r = v_c î + ω k̂ × (−r_b ĵ)
\vec{v}_r = v_c î + ωr_b î
\vec{v}_r = (v_c + ωr_b) î

Now, project this vector onto the right wheel, which is pointed in the î direction.

v_r = (v_c + ωr_b) î · \frac{î}{\|î\|}
v_r = v_c + ωr_b (11.2)
11.6 Mecanum drive kinematics
Note that the velocity of the wheel is the same as the velocity in the diagram for the
purposes of feedback control. The rollers on the wheel redirect the velocity vector.
v_{fl} = ((v_x − ωr_{fl_y}) î + (v_y + ωr_{fl_x}) ĵ) · \frac{î − ĵ}{\sqrt{2}}
v_{fl} = ((v_x − ωr_{fl_y}) − (v_y + ωr_{fl_x})) \frac{1}{\sqrt{2}}
v_{fl} = (v_x − ωr_{fl_y} − v_y − ωr_{fl_x}) \frac{1}{\sqrt{2}}
v_{fl} = (v_x − v_y − ωr_{fl_y} − ωr_{fl_x}) \frac{1}{\sqrt{2}}
v_{fl} = (v_x − v_y − ω(r_{fl_x} + r_{fl_y})) \frac{1}{\sqrt{2}}
v_{fl} = \frac{1}{\sqrt{2}} v_x − \frac{1}{\sqrt{2}} v_y − \frac{1}{\sqrt{2}} ω(r_{fl_x} + r_{fl_y}) (11.8)
v_{rl} = ((v_x − ωr_{rl_y}) î + (v_y + ωr_{rl_x}) ĵ) · \frac{î + ĵ}{\sqrt{2}}
v_{rl} = (v_x − ωr_{rl_y} + v_y + ωr_{rl_x}) \frac{1}{\sqrt{2}}
v_{rl} = (v_x + v_y − ωr_{rl_y} + ωr_{rl_x}) \frac{1}{\sqrt{2}}
v_{rl} = (v_x + v_y + ω(r_{rl_x} − r_{rl_y})) \frac{1}{\sqrt{2}}
v_{rl} = \frac{1}{\sqrt{2}} v_x + \frac{1}{\sqrt{2}} v_y + \frac{1}{\sqrt{2}} ω(r_{rl_x} − r_{rl_y}) (11.10)
v_{rr} = ((v_x − ωr_{rr_y}) î + (v_y + ωr_{rr_x}) ĵ) · \frac{î − ĵ}{\sqrt{2}}
v_{rr} = ((v_x − ωr_{rr_y}) − (v_y + ωr_{rr_x})) \frac{1}{\sqrt{2}}
v_{rr} = (v_x − ωr_{rr_y} − v_y − ωr_{rr_x}) \frac{1}{\sqrt{2}}
v_{rr} = (v_x − v_y − ωr_{rr_y} − ωr_{rr_x}) \frac{1}{\sqrt{2}}
v_{rr} = (v_x − v_y − ω(r_{rr_x} + r_{rr_y})) \frac{1}{\sqrt{2}}
v_{rr} = \frac{1}{\sqrt{2}} v_x − \frac{1}{\sqrt{2}} v_y − \frac{1}{\sqrt{2}} ω(r_{rr_x} + r_{rr_y}) (11.11)
Now we’ll factor them out into matrices.
√1 − √1
− √12 (rf lx + rf ly )
vf l 2 2
vf r √1 √1 √1 (rf r − rf r ) vx
= 12 2 2 x y
vy
vrl √ √1 √1 (rrl − rrl )
2 2 2 x y
ω
vrr √1
2
− √1
2
− √12 (rrrx + rrry )
vf l 1 −1 −(rf lx + rf ly )
vf r vx
= √1 1 1 (rf rx − rf ry )
vy
vrl (11.12)
2 1 1 (rrlx − rrly )
ω
vrr 1 −1 −(rrrx + rrry )
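As a sanity check, equation (11.12) is straightforward to evaluate numerically. The following sketch uses hypothetical wheel positions; a real robot's measurements would replace them.

import numpy as np

# Hypothetical wheel positions (x, y) relative to the center of rotation (m)
rfl = (0.285, 0.285)
rfr = (0.285, -0.285)
rrl = (-0.285, 0.285)
rrr = (-0.285, -0.285)

# Inverse kinematics matrix from equation (11.12)
ik = (1.0 / np.sqrt(2.0)) * np.array(
    [
        [1.0, -1.0, -(rfl[0] + rfl[1])],
        [1.0, 1.0, rfr[0] - rfr[1]],
        [1.0, 1.0, rrl[0] - rrl[1]],
        [1.0, -1.0, -(rrr[0] + rrr[1])],
    ]
)

# Chassis velocities: vx (m/s), vy (m/s), omega (rad/s)
v = np.array([[1.0], [0.0], [0.5]])

wheel_speeds = ik @ v  # [vfl, vfr, vrl, vrr] transposed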
11.7 Swerve drive kinematics

11.7.1 Inverse kinematics

⃗vwheel1 = ⃗vrobot + ω⃗robot × ⃗rrobot2wheel1
⃗vwheel2 = ⃗vrobot + ω⃗robot × ⃗rrobot2wheel2
⃗vwheel3 = ⃗vrobot + ω⃗robot × ⃗rrobot2wheel3
⃗vwheel4 = ⃗vrobot + ω⃗robot × ⃗rrobot2wheel4

where ⃗vwheel is the wheel velocity vector, ⃗vrobot is the robot velocity vector, ω⃗robot is the robot angular velocity vector, and ⃗rrobot2wheel is the displacement vector from the robot's center of rotation to the wheel.
[Figure: swerve module velocity vectors v1, v2, v3, and v4, drawn with +x and +y axes]
v1x = vx − ωr1y
v2x = vx − ωr2y
v3x = vx − ωr3y
v4x = vx − ωr4y
v1y = vy + ωr1x
v2y = vy + ωr2x
v3y = vy + ωr3x
v4y = vy + ωr4x
To convert the swerve module x and y velocity components to a velocity and head-
ing, use the Pythagorean theorem and arctangent respectively. Here’s an example for
module 1.
v1 = √(v1x² + v1y²)    (11.15)

θ1 = tan⁻¹(v1y/v1x)    (11.16)
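In code, equations (11.15) and (11.16) map directly onto the standard library. Note that atan2 is preferable to a plain arctangent because it handles all four quadrants and the v1x = 0 case. The values below are hypothetical.

import math

# Hypothetical chassis velocities and module 1 position
vx, vy, omega = 1.0, 0.5, 0.2  # m/s, m/s, rad/s
r1x, r1y = 0.3, 0.3            # module position relative to the center (m)

v1x = vx - omega * r1y
v1y = vy + omega * r1x

v1 = math.hypot(v1x, v1y)      # equation (11.15)
theta1 = math.atan2(v1y, v1x)  # equation (11.16), quadrant-aware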
12. Newtonian mechanics examples

A model is a set of differential equations describing how the system behaves over time.
There are two common approaches for developing them.
1. Collecting data on the physical system’s behavior and performing system identi-
fication with it.
2. Using physics to derive the system’s model from first principles.
This chapter covers the second approach using Newtonian mechanics.
The models derived here should cover most types of motion seen on an FRC robot. Furthermore, they can be easily tweaked to describe many types of mechanisms just by pattern-matching. There are only so many ways to hook up a mass to a motor in FRC. The flywheel model can be used for spinning mechanisms, the elevator model can be used for mechanisms that convert spinning motion to linear motion, and the single-jointed arm model can be used for rotating servo mechanisms (it's just the flywheel model augmented with a position state).
These models assume all motor controllers driving DC brushed motors are set to brake
mode instead of coast mode. Brake mode behaves the same as coast mode except
where the applied voltage is zero. In brake mode, the motor leads are shorted together
to prevent movement. In coast mode, the motor leads are an open circuit.
12.1 DC brushed motor

[Figure: DC brushed motor equivalent circuit, showing applied voltage V, resistance R, current I, and a back-EMF voltage source Vemf = ω/Kv]
V is the voltage applied to the motor, I is the current through the motor in Amps, R
is the resistance across the motor in Ohms, ω is the angular velocity of the motor in
radians per second, and Kv is the angular velocity constant in radians per second per
Volt. This circuit reflects the following relation.
V = IR + ω/Kv    (12.1)
τ = Kt I (12.2)
where τ is the torque produced by the motor in Newton-meters and Kt is the torque constant in Newton-meters per Amp. Substitute I = τ/Kt from equation (12.2) into equation (12.1).

V = (τ/Kt)R + ω/Kv    (12.3)
Torque constant Kt

Recall equation (12.2).

τ = Kt I
Kt = τ/I

When the motor is stalled,

Kt = τstall/Istall    (12.4)
where τstall is the stall torque and Istall is the stall current of the motor from its
datasheet.
Resistance R
Recall equation (12.1).
V = IR + ω/Kv
When the motor is stalled, ω = 0.
V = Istall R

R = V/Istall    (12.5)
where Istall is the stall current of the motor and V is the voltage applied to the motor
at stall.
Angular velocity constant Kv
Recall equation (12.1).
V = IR + ω/Kv
V − IR = ω/Kv
Kv = ω/(V − IR)
When the motor is spinning under no load,
Kv = ωfree/(V − Ifree R)    (12.6)
where ωfree is the angular velocity of the motor under no load (also known as the free speed), V is the voltage applied to the motor when it's spinning at ωfree, and Ifree is the current drawn by the motor under no load.
If several identical motors are being used in one gearbox for a mechanism, multiply the
stall torque by the number of motors.
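Equations (12.4) through (12.6) are easy to evaluate from a datasheet. Here's a sketch with hypothetical CIM-like values; real numbers come from the motor's datasheet.

V_nominal = 12.0       # V
stall_torque = 2.4     # N·m
stall_current = 130.0  # A
free_current = 2.7     # A
free_speed = 556.0     # rad/s

Kt = stall_torque / stall_current                 # equation (12.4)
R = V_nominal / stall_current                     # equation (12.5)
Kv = free_speed / (V_nominal - free_current * R)  # equation (12.6)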
Predictive current limiting uses a model of the motor to estimate the current, so the voltage is reduced before the current threshold is exceeded. Reactive current limiting uses an actual current measurement, so the voltage is reduced after the current threshold is exceeded.
The following pseudocode demonstrates each type of current limiting.
# Normal feedback control
V = K @ (r - x)
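The original listing is truncated here, so below is a minimal standalone sketch of both strategies for a single-state velocity system. The gains, thresholds, and the measured current are all hypothetical values.

import math

# Hypothetical values for illustration
K = 1.0        # feedback gain
r = 100.0      # reference angular velocity (rad/s)
omega = 80.0   # measured angular velocity (rad/s)
R = 0.09       # motor resistance (Ohms)
Kv = 47.0      # angular velocity constant (rad/s/V)
I_max = 40.0   # current limit (A)

# Normal feedback control
V = K * (r - omega)

# Predictive current limiting: clamp the voltage using the modeled current
# I = (V - omega/Kv)/R so the threshold is never exceeded
I_model = (V - omega / Kv) / R
if abs(I_model) > I_max:
    V = math.copysign(I_max, I_model) * R + omega / Kv

# Reactive current limiting: scale the voltage down after a measured current
# exceeds the threshold
I_measured = 55.0  # would come from a current sensor (A)
if abs(I_measured) > I_max:
    V *= I_max / abs(I_measured)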
12.2 Elevator
This elevator consists of a DC brushed motor attached to a pulley that drives a mass up
or down.
[Figure 12.3: Elevator system, in which a motor circuit (voltage V, resistance R, current I, back-EMF Vemf, angular velocity ωm) drives a pulley of radius r and angular velocity ωp through a gearbox with ratio G, moving a carriage of mass m at velocity v]
Gear ratios are written as output over input, so G is greater than one in figure 12.3.
The motor's torque is related to the pulley's torque by the gear ratio.

τm = τp/G    (12.7)

where G is the gear ratio between the motor and the pulley and τp is the torque produced by the pulley.
rFm = τp (12.8)
where r is the radius of the pulley. Substitute equation (12.7) into τm in the DC brushed
motor equation (12.3).
V = ((τp/G)/Kt)R + ωm/Kv
V = (τp R)/(GKt) + ωm/Kv
Substitute in equation (12.8) for τp .
V = (rFm R)/(GKt) + ωm/Kv    (12.9)
The angular velocity of the motor is related to that of the pulley by the gear ratio.

ωm = Gωp    (12.10)

where ωp is the angular velocity of the pulley. The velocity of the mass (the elevator carriage) is

v = rωp
ωp = v/r    (12.11)
Substitute equation (12.11) into equation (12.10).
ωm = G(v/r)    (12.12)
Substitute equation (12.12) into equation (12.9).

V = (RrFm)/(GKt) + (G/(rKv))v

Solve for Fm.

(RrFm)/(GKt) = V − (G/(rKv))v
Fm = (GKt/(Rr))V − (GKt/(Rr))(G/(rKv))v
Fm = (GKt/(Rr))V − (G²Kt/(Rr²Kv))v    (12.13)
∑F = ma

where ∑F is the sum of forces applied to the elevator carriage, m is the mass of the elevator carriage in kilograms, and a is the acceleration of the elevator carriage.

ma = Fm
ma = (GKt/(Rr))V − (G²Kt/(Rr²Kv))v
a = (GKt/(Rrm))V − (G²Kt/(Rr²mKv))v
a = −(G²Kt/(Rr²mKv))v + (GKt/(Rrm))V    (12.15)
R Gravity is not part of the modeled dynamics because it complicates the state-
space model and the controller will behave well enough without it.
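Equation (12.15) can be simulated directly to sanity-check the model. The constants below are hypothetical; real values come from the motor datasheet and mechanism design.

# Hypothetical elevator constants
G = 10.0    # gear ratio
Kt = 0.018  # torque constant (N·m/A)
Kv = 47.0   # angular velocity constant (rad/s/V)
R = 0.09    # motor resistance (Ohms)
r = 0.02    # pulley radius (m)
m = 5.0     # carriage mass (kg)

# Forward Euler simulation of equation (12.15) at a constant 12 V
dt = 0.005
v = 0.0
V = 12.0
for _ in range(200):  # one second
    a = -(G**2 * Kt) / (R * r**2 * m * Kv) * v + (G * Kt) / (R * r * m) * V
    v += a * dt

print(v)  # approaches the steady-state velocity r*Kv*V/G ≈ 1.13 m/s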
12.3 Flywheel
This flywheel consists of a DC brushed motor attached to a spinning mass of non-
negligible moment of inertia.
Gear ratios are written as output over input, so G is greater than one in figure 12.4.
[Figure 12.4: Flywheel system, in which a motor circuit drives a flywheel with moment of inertia J and angular velocity ωf through a gearbox with ratio G]
Since τm G = τf and ωm = Gωf,

τf/G = (Kt/R)V − (GKt/(Kv R))ωf
τf = (GKt/R)V − (G²Kt/(Kv R))ωf    (12.16)
The torque applied to the flywheel is defined as
τf = J ω̇f (12.17)
where J is the moment of inertia of the flywheel and ω̇f is the angular acceleration.
Substitute equation (12.17) into equation (12.16).
Jω̇f = (GKt/R)V − (G²Kt/(Kv R))ωf
ω̇f = (GKt/(RJ))V − (G²Kt/(Kv RJ))ωf
ω̇f = −(G²Kt/(Kv RJ))ωf + (GKt/(RJ))V
We’ll relabel ωf as ω for convenience.
ω̇ = −(G²Kt/(Kv RJ))ω + (GKt/(RJ))V    (12.18)
To put this in a form suitable for measuring J, multiply equation (12.18) by JKvR/(G²Kt) on both sides and solve for ω.

(JKvR/(G²Kt))ω̇ = −ω + (Kv/G)V
ω = −(JKvR/(G²Kt))ω̇ + (Kv/G)V    (12.19)
12.4 Single-jointed arm

[Figure 12.5: Single-jointed arm, in which a motor circuit drives an arm of mass m and length l with angular velocity ωarm through a gearbox with ratio G]
Gear ratios are written as output over input, so G is greater than one in figure 12.5.
We will start with the equation derived earlier for a DC brushed motor, equation (12.3).
V = (τm/Kt)R + ωm/Kv
Solve for the angular acceleration. First, we’ll rearrange the terms because from inspec-
tion, V is the model input, ωm is the state, and τm contains the angular acceleration.
V = (R/Kt)τm + (1/Kv)ωm
Solve for τm .
V = (R/Kt)τm + (1/Kv)ωm
(R/Kt)τm = V − (1/Kv)ωm
τm = (Kt/R)V − (Kt/(Kv R))ωm    (12.20)
Since τm G = τarm and ωm = Gωarm ,
τarm/G = (Kt/R)V − (Kt/(Kv R))(Gωarm)
τarm/G = (Kt/R)V − (GKt/(Kv R))ωarm
τarm = (GKt/R)V − (G²Kt/(Kv R))ωarm    (12.21)
The torque applied to the arm is defined as
τarm = J ω̇arm (12.22)
where J is the moment of inertia of the arm and ω̇arm is the angular acceleration.
Substitute equation (12.22) into equation (12.21).
Jω̇arm = (GKt/R)V − (G²Kt/(Kv R))ωarm
ω̇arm = −(G²Kt/(Kv RJ))ωarm + (GKt/(RJ))V
We’ll relabel ωarm as ω for convenience.
ω̇ = −(G²Kt/(Kv RJ))ω + (GKt/(RJ))V    (12.23)
If the arm is approximated as a uniform rod rotating about one end, its moment of inertia is

J = (1/3)ml²

where m is the mass of the arm and l is the length of the arm. Otherwise, a procedure for measuring it experimentally is presented below.
First, rearrange equation (12.23) into the form y = mx + b such that J is in the
numerator of m.
ω̇ = −(G²Kt/(Kv RJ))ω + (GKt/(RJ))V
Jω̇ = −(G²Kt/(Kv R))ω + (GKt/R)V

Multiply by KvR/(G²Kt) on both sides.

(JKvR/(G²Kt))ω̇ = −ω + (GKtKvR/(RG²Kt))V
(JKvR/(G²Kt))ω̇ = −ω + (Kv/G)V
ω = −(JKvR/(G²Kt))ω̇ + (Kv/G)V    (12.25)
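Concretely, the regression can be done with ordinary least squares. This sketch synthesizes data from a known J so the recovery can be verified; on a real arm, ω, ω̇, and V would come from logged sensor data, and all constants here are hypothetical.

import numpy as np

# Hypothetical constants
G = 10.0    # gear ratio
Kt = 0.018  # torque constant (N·m/A)
Kv = 47.0   # angular velocity constant (rad/s/V)
R = 0.09    # motor resistance (Ohms)
J_true = 0.25

rng = np.random.default_rng(0)
V = rng.uniform(2.0, 12.0, 200)
omega = rng.uniform(0.0, 40.0, 200)
omega_dot = (G * Kt / (R * J_true)) * V - (G**2 * Kt / (Kv * R * J_true)) * omega

# Fit omega = m*omega_dot + b*V per equation (12.25);
# the slope is m = -J*Kv*R/(G²*Kt)
X = np.column_stack((omega_dot, V))
m, b = np.linalg.lstsq(X, omega, rcond=None)[0]
J_est = -m * G**2 * Kt / (Kv * R)  # recovers J_true = 0.25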
12.5 Pendulum
Kinematics and dynamics are rather large topics, so for now, we'll just focus on the
basics required for working with the models in this book. We’ll derive the same model,
a pendulum, using three approaches: sum of forces, sum of torques, and conservation
of energy.
[Figure 12.6: (a) force diagram of a pendulum, with the gravitational force mg resolved into a tangential component mg sin θ and a radial component mg cos θ; (b) trigonometry of a pendulum, showing initial angle θ0, heights y0 and y1, and drop height h]

Force derivation

Newton's second law states
F = ma
where F is the sum of forces on the object, m is mass, and a is the acceleration. Because
we are only concerned with changes in speed, and because the bob is forced to stay in a
circular path, we apply Newton’s equation to the tangential axis only. The short violet
arrow represents the component of the gravitational force in the tangential axis, and
trigonometry can be used to determine its magnitude. Therefore
−mg sin θ = ma
a = −g sin θ
where g is the acceleration due to gravity near the surface of the earth. The negative
sign on the right hand side implies that θ and a always point in opposite directions. This
makes sense because when a pendulum swings further to the left, we would expect it
to accelerate back toward the right.
This linear acceleration a along the red axis can be related to the change in angle θ by
the arc length formulas; s is arc length and l is the length of the pendulum.
s = lθ (12.26)
v = ds/dt = l(dθ/dt)
a = d²s/dt² = l(d²θ/dt²)
Therefore,
l(d²θ/dt²) = −g sin θ
d²θ/dt² = −(g/l) sin θ
θ̈ = −(g/l) sin θ
Torque derivation

The same model can be derived with torque.

τ = r × F
First start by defining the torque on the pendulum bob using the force due to gravity.
τ = l × Fg
where l is the length vector of the pendulum and Fg is the force due to gravity.
For now just consider the magnitude of the torque on the pendulum.
|τ | = −mgl sin θ
where m is the mass of the pendulum, g is the acceleration due to gravity, l is the
length of the pendulum and θ is the angle between the length vector and the force due
to gravity.
Next, consider the angular momentum of the pendulum.

L = r × p = mr × (ω × r)
|L| = mr²ω
|L| = ml²(dθ/dt)
d|L|/dt = ml²(d²θ/dt²)
According to τ = dL/dt, we can just compare the magnitudes.

−mgl sin θ = ml²(d²θ/dt²)
−(g/l) sin θ = d²θ/dt²
θ̈ = −(g/l) sin θ
Energy derivation

The model can also be derived with conservation of energy. The change in potential energy as the pendulum bob falls a height h is

ΔU = mgh

This potential energy is converted into kinetic energy, so (1/2)mv² = mgh and the speed of the bob is

v = l(dθ/dt) = √(2gh)
dθ/dt = √(2gh)/l    (12.27)
where h is the vertical distance the pendulum fell. Look at figure 12.6b, which presents
the trigonometry of a pendulum. If the pendulum starts its swing from some initial angle
θ0 , then y0 , the vertical distance from the pivot point, is given by
y0 = l cos θ0
Similarly for y1 , we have
y1 = l cos θ
Then h is the difference of the two
h = l(cos θ − cos θ0 )
Substituting this into equation (12.27) gives
dθ/dt = √((2g/l)(cos θ − cos θ0))
This equation is known as the first integral of motion. It gives the velocity in terms of
the location and includes an integration constant related to the initial displacement (θ0 ).
We can differentiate by applying the chain rule with respect to time. Doing so gives the
acceleration.
(d/dt)(dθ/dt) = (d/dt)√((2g/l)(cos θ − cos θ0))
d²θ/dt² = (1/(2√((2g/l)(cos θ − cos θ0)))) (−(2g/l) sin θ) (dθ/dt)
d²θ/dt² = (1/(2√((2g/l)(cos θ − cos θ0)))) (−(2g/l) sin θ) √((2g/l)(cos θ − cos θ0))
d²θ/dt² = −(g/l) sin θ
θ̈ = −(g/l) sin θ
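Since θ̈ = −(g/l) sin θ has no closed-form solution in elementary functions, it's typically integrated numerically. Here's a minimal semi-implicit Euler sketch; the length and initial angle are assumed example values.

import math

g = 9.806  # m/s²
l = 1.0    # pendulum length (m), assumed
theta = math.radians(45.0)  # initial angle, assumed
omega = 0.0
dt = 0.001

for _ in range(2000):  # simulate two seconds
    omega += -(g / l) * math.sin(theta) * dt
    theta += omega * dt

print(math.degrees(theta))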
12.6 Differential drive

[Figure: differential drive model, showing left wheel velocity vl, heading θ, and +x and +y axes]

Each drivetrain side can be modeled like the flywheel in equation (12.16).

τ = (GKt/R)V − (G²Kt/(Kv R))ω
where τ is the torque applied by one wheel of the differential drive, G is the gear ratio
of the differential drive, Kt is the torque constant of the motor, R is the resistance of
the motor, and Kv is the angular velocity constant. Since τ = rF and ω = v/r, where v is the velocity of a given drive side along the ground and r is the drive wheel radius,

(rF) = (GKt/R)V − (G²Kt/(Kv R))(v/r)
rF = (GKt/R)V − (G²Kt/(Kv Rr))v
F = (GKt/(Rr))V − (G²Kt/(Kv Rr²))v
F = −(G²Kt/(Kv Rr²))v + (GKt/(Rr))V
where the l and r subscripts denote the side of the robot to which each variable corre-
sponds.
Let C1 = −(Gl²Kt)/(Kv Rr²), C2 = (Gl Kt)/(Rr), C3 = −(Gr²Kt)/(Kv Rr²), and C4 = (Gr Kt)/(Rr).
Fl = C 1 v l + C 2 V l (12.29)
Fr = C 3 v r + C 4 V r (12.30)
(2rb²/J)(−Fl + Fr) = v̇r − v̇l
v̇r = v̇l + (2rb²/J)(−Fl + Fr)    (12.32)
Substitute equation (12.32) back into equation (12.31) to obtain an expression for v̇l .
v̇l = (2/m)(Fl + Fr) − ((1/m)(Fl + Fr) + (rb²/J)(−Fl + Fr))
v̇l = (1/m)(Fl + Fr) − (rb²/J)(−Fl + Fr)
v̇l = (1/m)(Fl + Fr) + (rb²/J)(Fl − Fr)
v̇l = (1/m)Fl + (1/m)Fr + (rb²/J)Fl − (rb²/J)Fr
v̇l = (1/m + rb²/J)Fl + (1/m − rb²/J)Fr    (12.34)
τ1 = r × F
τ1 = rma
where τ1 is the torque applied by a drive motor during only linear acceleration, r is the
wheel radius, m is the robot mass, and a is the linear acceleration.
τ2 = Iα
where τ2 is the torque applied by a drive motor during only angular acceleration, I is the
moment of inertia (same as J), and α is the angular acceleration. If a constant voltage
is applied during both the linear and angular acceleration tests, τ1 = τ2 . Therefore,
rma = Iα
Integrate both sides with respect to time.

rmv + C1 = Iω + C2
rmv = Iω + C3
v = (I/(rm))ω + C3    (12.37)
1. Drive the robot forward in a straight line by applying a constant voltage to the motors. Record the linear velocity over time.
2. Rotate the drivetrain around its center by applying the same voltage as the linear acceleration test with the motors driving in opposite directions. Record the angular velocity over time using a gyroscope.
3. Perform a linear regression of linear velocity versus angular velocity. The slope of this line has the form I/(rm) as per equation (12.37).
4. Multiply the slope by rm to obtain a least squares estimate of I.
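Steps 3 and 4 amount to a one-line regression. The data arrays below are hypothetical stand-ins for the recorded test logs.

import numpy as np

v = np.array([0.0, 0.5, 1.0, 1.5, 2.0])      # linear velocity (m/s)
omega = np.array([0.0, 1.9, 4.1, 6.0, 7.9])  # angular velocity (rad/s)

r = 0.0762  # wheel radius (m), assumed
m = 50.0    # robot mass (kg), assumed

slope = np.polyfit(omega, v, 1)[0]  # slope is I/(rm) per equation (12.37)
I = slope * r * m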
13. Lagrangian mechanics examples

A model is a set of differential equations describing how the system behaves over time.
There are two common approaches for developing them.
1. Collecting data on the physical system’s behavior and performing system identi-
fication with it.
2. Using physics to derive the system’s model from first principles.
This chapter covers the second approach using Lagrangian mechanics.
We suggest reading An introduction to Lagrangian and Hamiltonian Mechanics by Si-
mon J.A. Malham for the basics [13].
13.3 Cart-pole
See https://underactuated.mit.edu/acrobot.html#cart_pole for a derivation of the kine-
matics and dynamics of a cart-pole via Lagrangian mechanics.
14. System identification
First, we’ll cover a system identification procedure for generating a velocity system
feedforward model from performance data. Then, we’ll use that feedforward model to
create state-space models for several common mechanisms.
14.1 Ordinary least squares

Let X be an n × p matrix of regressors and y a vector of n observations, where each row corresponds to a datapoint, and n > p (more datapoints than unknowns).
OLS is a type of linear least squares method that estimates the unknown parameters β
in a linear regression model y = Xβ + ϵ where ϵ is the error in the observations y.
β is chosen by the method of least squares: minimizing the sum of the squares of the
difference between y (observations of the dependent variable) and Xβ (predictions of
y using a linear function of the independent variable β).
Geometrically, this is the sum of the squared distances, parallel to the y-axis, between
each value in y and the corresponding value in Xβ. Smaller differences mean a better
model fit.
To find the β that fits best, we can solve the following quadratic minimization problem
β̂ = arg min_β (y − Xβ)ᵀ(y − Xβ)
arg minβ means “find the value of β that minimizes the following function of β”.
Given corollary 5.12.3, take the partial derivative of the cost function with respect to
β and set it equal to zero, then solve for β̂.
0 = −2XT (y − Xβ̂)
0 = XT (y − Xβ̂)
0 = XT y − XT Xβ̂
XT Xβ̂ = XT y
β̂ = (XT X)−1 XT y (14.1)
14.1.1 Examples
While this is a linear regression, we can fit nonlinear functions by making the contents
of X nonlinear functions of the independent variables. For example, we can find the
quadratic equation y = ax² + bx + c that best fits a set of x-y tuples. Let's assume we
have a set of observations as follows.
y1 = ax1² + bx1 + c
⋮
yn = axn² + bxn + c
We want to find a, b, and c, so let’s factor those out.
[y1]   [x1²  x1  1] [a]
[ ⋮] = [ ⋮    ⋮  ⋮] [b]
[yn]   [xn²  xn  1] [c]

Plug these matrices into equation (14.1) to obtain the coefficients a, b, and c.

    [y1]        [x1²  x1  1]        [a]
y = [ ⋮]    X = [ ⋮    ⋮  ⋮]    β = [b]
    [yn]        [xn²  xn  1]        [c]
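Equation (14.1) translates directly into a few lines of numpy. The data here is hypothetical.

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.1, 5.2, 9.8, 17.1])

# Columns of X are x², x, and 1
X = np.column_stack((x**2, x, np.ones_like(x)))

# Solve XᵀXβ = Xᵀy, i.e., β̂ = (XᵀX)⁻¹Xᵀy from equation (14.1)
a, b, c = np.linalg.solve(X.T @ X, X.T @ y)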
14.2 1-DOF mechanism feedforward model

A 1-DOF mechanism's feedforward model is

u = Ks sgn(v) + Kv v + Ka a    (14.2)

where u is the input voltage, v is the velocity, a is the acceleration, and sgn is the sign function.
14.2.2 Overview
First, we’ll put equation (14.2) into the form dx
dt = Ax + Bu + c. Then, we’ll discretize
it to obtain an equation of the form xk+1 = αxk + βuk + γ sgn(xk ). Then, we’ll
perform OLS on xk+1 = αxk + βuk + γ sgn(xk ) using 4-tuples of current velocity,
next velocity, input voltage, and velocity sign to obtain α, β, and γ. Since we solved
for those in the discretization step earlier, we have a system of three equations we can
solve for Ks , Kv , and Ka .
14.2.3 Derivation
First, solve u = Ks sgn(v) + Kv v + Ka a for a.
Ka a = u − Ks sgn(v) − Kv v
a = (1/Ka)u − (Ks/Ka) sgn(v) − (Kv/Ka)v
a = −(Kv/Ka)v + (1/Ka)u − (Ks/Ka) sgn(v)
Let x = v, dx/dt = a, A = −Kv/Ka, B = 1/Ka, and c = −(Ks/Ka) sgn(x).

dx/dt = Ax + Bu + c
Since Bu + c is a constant over the time interval [0, T ) where T is the sample period,
this equation can be integrated according to appendix D.1, which gives
xk+1 = eAT xk + A−1 (eAT − 1)(Buk + c)
This equation has the form xk+1 = αxk + βuk + γ sgn(xk ). Running OLS with
(xk+1 , xk , uk , sgn(xk )) 4-tuples will give α, β, and γ. To obtain Ks , Kv , and Ka ,
solve the following system.
α = e^{AT}
β = A⁻¹(e^{AT} − 1)B    (14.3)
γ = −(Ks/Ka)A⁻¹(e^{AT} − 1)
Divide the second equation by the third equation and solve for Ks .
β/γ = (A⁻¹(α − 1)B)/(−(Ks/Ka)A⁻¹(α − 1))
β/γ = −(Ka/Ks)B
β/γ = −(Ka/Ks)(1/Ka)
β/γ = −1/Ks
Ks = −γ/β    (14.4)
Solve the second equation in (14.3) for Kv .
β = A−1 (α − 1)B
β = (−Kv/Ka)⁻¹(α − 1)(1/Ka)
β = −(Ka/Kv)(α − 1)(1/Ka)
β = (1/Kv)(1 − α)
Kv β = 1 − α
Kv = (1 − α)/β    (14.5)
β
Solve the first equation in (14.3) for Ka .
α = e^{−(Kv/Ka)T}
ln α = −(Kv/Ka)T
Ka ln α = −Kv T
Ka = −Kv T/ln α
Substitute in Kv from equation (14.5).
Ka = −((1 − α)/β)T/ln α
Ka = −(1 − α)T/(β ln α)
Ka = (α − 1)T/(β ln α)    (14.6)
14.2.4 Results
The new system identification process is as follows.
1. Gather input voltage and velocity from a dynamic acceleration event
2. Given an equation of the form xk+1 = αxk + βuk + γ sgn(xk ), perform OLS
on (xk+1 , xk , uk , sgn(xk )) 4-tuples to obtain α, β, and γ
The feedforward gains are

Ks = −γ/β    (14.7)
Kv = (1 − α)/β    (14.8)
Ka = (α − 1)T/(β ln α)    (14.9)

where α > 0 and β ≠ 0.
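Here's a sketch of the whole procedure. It synthesizes a velocity/voltage log from known gains so the recovery can be verified; with real data, x and u would come from the robot's logs, and every specific value below is an assumption.

import numpy as np

T = 0.02  # sample period (s)
Ks, Kv, Ka = 0.5, 2.0, 0.3

# Discretize the known model to generate synthetic data
A = -Kv / Ka
alpha = np.exp(A * T)
beta = (alpha - 1) / A * (1 / Ka)
gamma = -(Ks / Ka) * (alpha - 1) / A

rng = np.random.default_rng(0)
u = rng.uniform(-12.0, 12.0, 500)
x = np.zeros(501)
for k in range(500):
    x[k + 1] = alpha * x[k] + beta * u[k] + gamma * np.sign(x[k])

# OLS over (x_k+1, x_k, u_k, sgn(x_k)) 4-tuples
X = np.column_stack((x[:-1], u, np.sign(x[:-1])))
a, b, c = np.linalg.lstsq(X, x[1:], rcond=None)[0]

Ks_est = -c / b                         # equation (14.7)
Kv_est = (1 - a) / b                    # equation (14.8)
Ka_est = (a - 1) * T / (b * np.log(a))  # equation (14.9)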
14.3 1-DOF mechanism state-space model

We want to derive what A and B are from the following feedforward model
u = Kv x + Ka ẋ
Since this equation is a linear multiple regression, the constants Kv and Ka can be de-
termined by applying ordinary least squares (OLS) to vectors of recorded input voltage,
velocity, and acceleration data from quasistatic velocity tests and acceleration tests.
Kv is a proportional constant that describes how much voltage is required to maintain
a given constant velocity by offsetting the electromagnetic resistance of the motor and
any friction that increases linearly with speed (viscous drag). The relationship between
speed and voltage (for a given initial acceleration) is linear for permanent-magnet DC
motors in the FRC operating regime.
Ka is a proportional constant that describes how much voltage is required to induce
a given acceleration in the motor shaft. As with Kv , the relationship between voltage
and acceleration (for a given initial velocity) is linear.
To convert u = Kv x + Ka ẋ to state-space form, simply solve for ẋ.
u = Kv x + Ka ẋ
Ka ẋ = u − Kv x
Ka ẋ = −Kv x + u
ẋ = −(Kv/Ka)x + (1/Ka)u
By inspection, A = −Kv/Ka and B = 1/Ka. A model with position and velocity states would be
ẋ = Ax + Bu

x = [position]    u = [voltage]
    [velocity]

ẋ = [0    1    ] x + [0   ] u    (14.10)
    [0  −Kv/Ka ]     [1/Ka]
The model in theorem 14.3.1 is undefined when Ka = 0. If one wants to design an LQR
for such a system, use the model in theorem 14.3.2. As Ka tends to zero, acceleration
requires no effort and velocity becomes an input for position control.
ẋ = Ax + Bu
x = position u = velocity
ẋ = 0 x + 1 u (14.11)
14.4 Drivetrain left/right velocity state-space model

We want to derive what A and B are from linear and angular feedforward models. Since the left and right dynamics are symmetric, we'll guess the model has the form

A = [A1  A2]    B = [B1  B2]
    [A2  A1]        [B2  B1]
Let Kv,lin be the linear velocity gain, Ka,lin be the linear acceleration gain, Kv,ang be the angular velocity gain, and Ka,ang be the angular acceleration gain. Let u = [Kv,lin v, Kv,lin v]ᵀ be the input that makes both sides of the drivetrain move at a constant velocity v. Therefore, x = [v, v]ᵀ and ẋ = [0, 0]ᵀ. Substitute these into the state-space model.

[0]   [A1  A2][v]   [B1  B2][Kv,lin v]
[0] = [A2  A1][v] + [B2  B1][Kv,lin v]

Since the rows are linearly dependent, we can use just one of them.

0 = vA1 + vA2 + Kv,lin vB1 + Kv,lin vB2

0 = [1  1  Kv,lin  Kv,lin] [A1; A2; B1; B2]
Let u = [Kv,lin v + Ka,lin a, Kv,lin v + Ka,lin a]ᵀ be the input that accelerates both sides of the drivetrain by a from an initial velocity of v. Therefore, x = [v, v]ᵀ and ẋ = [a, a]ᵀ. Substitute these into the state-space model.

[a]   [A1  A2][v]   [B1  B2][Kv,lin v + Ka,lin a]
[a] = [A2  A1][v] + [B2  B1][Kv,lin v + Ka,lin a]

Since the column vectors contain the same element, the elements in the second row can be rearranged.

[a]   [A1  A2][v]   [B1  B2][Kv,lin v + Ka,lin a]
[a] = [A1  A2][v] + [B1  B2][Kv,lin v + Ka,lin a]

Since the rows are linearly dependent, we can use just one of them.

a = vA1 + vA2 + (Kv,lin v + Ka,lin a)B1 + (Kv,lin v + Ka,lin a)B2

a = [v  v  (Kv,lin v + Ka,lin a)  (Kv,lin v + Ka,lin a)] [A1; A2; B1; B2]
Let u = [−Kv,ang v, Kv,ang v]ᵀ be the input that rotates the drivetrain in place where each wheel has a constant velocity v. Therefore, x = [−v, v]ᵀ and ẋ = [0, 0]ᵀ.

[0]   [A1  A2][−v]   [B1  B2][−Kv,ang v]
[0] = [A2  A1][ v] + [B2  B1][ Kv,ang v]

[0]   [−A1  A2][v]   [−B1  B2][Kv,ang v]
[0] = [−A2  A1][v] + [−B2  B1][Kv,ang v]

[0]   [−A1   A2][v]   [−B1   B2][Kv,ang v]
[0] = [ A1  −A2][v] + [ B1  −B2][Kv,ang v]

Since the column vectors contain the same element, the elements in the second row can be rearranged.

[0]   [−A1  A2][v]   [−B1  B2][Kv,ang v]
[0] = [−A1  A2][v] + [−B1  B2][Kv,ang v]

Since the rows are linearly dependent, we can use just one of them.

0 = −vA1 + vA2 − Kv,ang vB1 + Kv,ang vB2

0 = [−v  v  −Kv,ang v  Kv,ang v] [A1; A2; B1; B2]
0 = [−1  1  −Kv,ang  Kv,ang] [A1; A2; B1; B2]
Let u = [−Kv,ang v − Ka,ang a, Kv,ang v + Ka,ang a]ᵀ be the input that rotates the drivetrain in place where each wheel has an initial speed of v and accelerates by a. Therefore, x = [−v, v]ᵀ and ẋ = [−a, a]ᵀ.

[−a]   [A1  A2][−v]   [B1  B2][−Kv,ang v − Ka,ang a]
[ a] = [A2  A1][ v] + [B2  B1][ Kv,ang v + Ka,ang a]

[−a]   [−A1  A2][v]   [−B1  B2][Kv,ang v + Ka,ang a]
[ a] = [−A2  A1][v] + [−B2  B1][Kv,ang v + Ka,ang a]

[−a]   [−A1   A2][v]   [−B1   B2][Kv,ang v + Ka,ang a]
[ a] = [ A1  −A2][v] + [ B1  −B2][Kv,ang v + Ka,ang a]

Since the column vectors contain the same element, the elements in the second row can be rearranged.

[−a]   [−A1  A2][v]   [−B1  B2][Kv,ang v + Ka,ang a]
[−a] = [−A1  A2][v] + [−B1  B2][Kv,ang v + Ka,ang a]

Since the rows are linearly dependent, we can use just one of them.

−a = −vA1 + vA2 − (Kv,ang v + Ka,ang a)B1 + (Kv,ang v + Ka,ang a)B2

−a = [−v  v  −(Kv,ang v + Ka,ang a)  (Kv,ang v + Ka,ang a)] [A1; A2; B1; B2]
a = [v  −v  (Kv,ang v + Ka,ang a)  −(Kv,ang v + Ka,ang a)] [A1; A2; B1; B2]
Solve for matrix elements with Wolfram Alpha. Let b = Kv,lin , c = Ka,lin , d =
Kv,ang , and f = Ka,ang .
inverse of {{1, 1, b, b},
{v, v, b v + c a, b v + c a},
{-1, 1, -d, d},
{v, -v, d v + f a, -(d v + f a)}}
* {{0}, {a}, {0}, {a}}
[A1]             [−cd − bf]
[A2] = 1/(2cf)   [ cd − bf]
[B1]             [ c + f  ]
[B2]             [ f − c  ]
[A1]                           [−Ka,lin Kv,ang − Kv,lin Ka,ang]
[A2] = 1/(2Ka,lin Ka,ang)      [ Ka,lin Kv,ang − Kv,lin Ka,ang]
[B1]                           [ Ka,lin + Ka,ang              ]
[B2]                           [ Ka,ang − Ka,lin              ]

Dividing each term through by Ka,lin Ka,ang gives

[A1]         [−Kv,ang/Ka,ang − Kv,lin/Ka,lin]
[A2] = (1/2) [ Kv,ang/Ka,ang − Kv,lin/Ka,lin]
[B1]         [ 1/Ka,lin + 1/Ka,ang          ]
[B2]         [ 1/Ka,lin − 1/Ka,ang          ]

[A1]         [−(Kv,lin/Ka,lin + Kv,ang/Ka,ang)]
[A2] = (1/2) [ −Kv,lin/Ka,lin + Kv,ang/Ka,ang ]
[B1]         [ 1/Ka,lin + 1/Ka,ang            ]
[B2]         [ 1/Ka,lin − 1/Ka,ang            ]
To summarize,
ẋ = Ax + Bu

x = [left velocity ]    u = [left voltage ]
    [right velocity]        [right voltage]

ẋ = [A1  A2] x + [B1  B2] u
    [A2  A1]     [B2  B1]

[A1]         [−(Kv,lin/Ka,lin + Kv,ang/Ka,ang)]
[A2] = (1/2) [ −Kv,lin/Ka,lin + Kv,ang/Ka,ang ]    (14.12)
[B1]         [ 1/Ka,lin + 1/Ka,ang            ]
[B2]         [ 1/Ka,lin − 1/Ka,ang            ]

Kv,ang and Ka,ang have units of V/(L/T) and V/(L/T²) respectively, where V means voltage units, L means length units, and T means time units.
If Kv and Ka are the same for both the linear and angular cases, it devolves to the
one-dimensional case. This means the left and right sides are decoupled.
[A1]         [−(Kv/Ka + Kv/Ka)]
[A2] = (1/2) [ −Kv/Ka + Kv/Ka ]
[B1]         [ 1/Ka + 1/Ka    ]
[B2]         [ 1/Ka − 1/Ka    ]

[A1]         [−2Kv/Ka]
[A2] = (1/2) [ 0      ]
[B1]         [ 2/Ka   ]
[B2]         [ 0      ]

[A1]   [−Kv/Ka]
[A2] = [ 0    ]
[B1]   [ 1/Ka ]
[B2]   [ 0    ]
14.5 Drivetrain linear/angular velocity state-space model

We want to write this model in terms of left and right voltages. The linear and angular voltages have the following relations.

uleft = ulin − uang
uright = ulin + uang

Factor them into a matrix equation, then solve for the linear and angular voltages.

[uleft ]   [1  −1][ulin]
[uright] = [1   1][uang]

[ulin]   [ 0.5  0.5][uleft ]
[uang] = [−0.5  0.5][uright]
ẋ = Ax + Bu

x = [linear velocity ]    u = [left voltage ]
    [angular velocity]        [right voltage]

ẋ = [−Kv,lin/Ka,lin        0       ] x + [ 0.5/Ka,lin   0.5/Ka,lin] u
    [      0        −Kv,ang/Ka,ang ]     [−0.5/Ka,ang   0.5/Ka,ang]

Kv,ang and Ka,ang have units of V/(A/T) and V/(A/T²) respectively, where V means voltage units, A means angle units, and T means time units.
15. Motion profiles

If smooth, predictable motion of a system over time is desired, it's best to only change a
system’s reference as fast as the system is able to physically move. Motion profiles, also
known as trajectories, are used for this purpose. For multi-state systems, each state is
given its own trajectory. Since these states are usually position and velocity, they share
different derivatives of the same profile.
An S-curve profile has seven phases. In phases I through III, the profile accelerates: the acceleration ramps up, holds constant, then ramps back down until it reaches zero. In phase IV, the velocity is constant until deceleration begins, at which point the profile decelerates in a manner symmetric to phases I, II, and III.
A trapezoid profile, on the other hand, has three phases. It is a subset of an S-curve pro-
file, having only the phases corresponding to phase II of the S-curve profile (constant
acceleration), phase IV (constant velocity), and phase VI (constant deceleration). This
reduced number of phases underscores the difference between these two profiles: the
S-curve profile has extra motion phases which transition between periods of accelera-
tion and periods of nonacceleration; the trapezoid profile has instantaneous transitions
between these phases. This can be seen in the acceleration graphs of the corresponding
velocity profiles for these two profile types.
15.1.1 Jerk
The motion characteristic that defines the change in acceleration, or transitional period,
is known as “jerk”. Jerk is defined as the rate of change of acceleration with time. In
a trapezoid profile, the jerk (change in acceleration) is infinite at the phase transitions,
while in the S-curve profile the jerk is a constant value, spreading the change in accel-
eration over a period of time.
From figures 15.1 and 15.2, we can see S-curve profiles are smoother than trapezoid
profiles. Why, however, does the S-curve profile result in less load oscillation? For a
given load, the higher the jerk, the greater the amount of unwanted vibration energy
will be generated, and the broader the frequency spectrum of the vibration’s energy will
be.
This means that more rapid changes in acceleration induce more powerful vibrations,
and more vibrational modes will be excited. Because vibrational energy is absorbed in
the system mechanics, it may cause an increase in settling time or reduced accuracy if
the vibration frequency matches resonances in the mechanical system.
R S-curve profiles are unnecessary for FRC mechanisms. Motors in FRC effec-
tively have first-order velocity dynamics because the inductance effects are on
the order of microseconds; FRC dynamics operate on the order of milliseconds.
First-order motor models can achieve the instantaneous acceleration changes of
trapezoid profiles because voltage is electromotive force, which is analogous to
acceleration. That is, we can instantaneously achieve any desired acceleration
with a finite voltage, and we can follow any trapezoid profile perfectly with only
feedforward control.
Trapezoid profile

The trapezoid profile equations are

x(t) = x0 + v0t + (1/2)at²
v(t) = v0 + at

where x(t) is the position at time t, x0 is the initial position, v0 is the initial velocity, and a is the acceleration at time t.
Snippet 15.1 shows a Python implementation.
"""Function for generating a trapezoid profile."""
import math
Returns:
t_rec -- list of timestamps
x_rec -- list of positions at each timestep
v_rec -- list of velocities at each timestep
a_rec -- list of accelerations at each timestep
Keyword arguments:
max_v -- maximum velocity of profile
time_to_max_v -- time from rest to maximum velocity
dt -- timestep
goal -- final position when the profile is at rest
"""
t_rec = [0.0]
x_rec = [0.0]
v_rec = [0.0]
a_rec = [0.0]
a = max_v / time_to_max_v
time_at_max_v = goal / max_v - time_to_max_v
# If profile is short
if max_v * time_to_max_v > goal:
time_to_max_v = math.sqrt(goal / a)
time_from_max_v = time_to_max_v
time_total = 2.0 * time_to_max_v
profile_max_v = a * time_to_max_v
else:
time_from_max_v = time_to_max_v + time_at_max_v
time_total = time_from_max_v + time_to_max_v
profile_max_v = max_v
if t < time_to_max_v:
# Accelerate up
a_rec.append(a)
v_rec.append(a * t)
elif t < time_from_max_v:
# Maintain max velocity
a_rec.append(0.0)
v_rec.append(profile_max_v)
elif t < time_total:
# Accelerate down
decel_time = t - time_from_max_v
a_rec.append(-a)
v_rec.append(profile_max_v - a * decel_time)
else:
a_rec.append(0.0)
v_rec.append(0.0)
x_rec.append(x_rec[-1] + v_rec[-1] * dt)
return t_rec, x_rec, v_rec, a_rec
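Assuming the signature restored above, a profile is generated like so; the constraint values are arbitrary.

t, x, v, a = generate_trapezoid_profile(
    max_v=2.0, time_to_max_v=0.5, dt=0.005, goal=3.0
)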
S-curve profile
The S-curve profile equations also include jerk.
x(t) = x0 + v0t + (1/2)at² + (1/6)jt³
v(t) = v0 + at + (1/2)jt²
a(t) = a0 + jt
where j is the jerk at time t, a(t) is the acceleration at time t, and a0 is the initial
acceleration.
Snippet 15.2 shows a Python implementation.
"""Function for generating an S-curve profile."""
import math
Returns:
t_rec -- list of timestamps
x_rec -- list of positions at each timestep
v_rec -- list of velocities at each timestep
a_rec -- list of accelerations at each timestep
Keyword arguments:
max_v -- maximum velocity of profile
max_a -- maximum acceleration of profile
time_to_max_a -- time from rest to maximum acceleration
246 Chapter 15. Motion profiles
dt -- timestep
goal -- final position when the profile is at rest
"""
t_rec = [0.0]
x_rec = [0.0]
v_rec = [0.0]
a_rec = [0.0]
j = max_a / time_to_max_a
short_profile = max_v * (time_to_max_a + max_v / max_a) > goal
if short_profile:
profile_max_v = max_a * (
math.sqrt(goal / max_a - 0.75 * time_to_max_a**2) - 0.5 *
time_to_max_a
)
else:
profile_max_v = max_v
# Ramp up acceleration
a_rec.append(-max_a + j * (t - t6))
v_rec.append(
max_a * (t2 + t5 - t - 0.5 * time_to_max_a) + 0.5 * j * (t -
t6) ** 2
)
else:
a_rec.append(0.0)
v_rec.append(0.0)
x_rec.append(x_rec[-1] + v_rec[-1] * dt)
return t_rec, x_rec, v_rec, a_rec
16. Configuration spaces

16.1 Introduction
When a robot can interfere with itself, we need a motion profiling solution that en-
sures the robot doesn’t destroy itself. Now that we’ve demonstrated various methods
of motion profiling, how can we incorporate safe zones into the planning process or
implement mechanism collision avoidance?
1-DOF systems are straightforward. We define the minimum and maximum position
constraints for the arm, elevator, etc. and ensure we stay between them. To do so, we
might use some combination of:
1. Rejecting, coercing, or clamping setpoints so that they are always within the valid
range
2. Soft limits on the speed controller
3. Limit switches, and/or physical hard stops
Multi-DOF systems are more complex. In the 2-DOF case (e.g., an arm with a shoulder
and a wrist, or an elevator and an arm), each DOF has some limits that are indepen-
dent of the rest of the system (e.g., min and max angles), but there are also dependent
constraints (e.g., if the shoulder is angled down, the wrist must be angled up to avoid
hitting the ground). Examples of such constraints could include:
• Minimum/maximum angles
• Maximum extension rules
• Robot chassis
• Angles that require an infeasible amount of holding torque
One intuitive way to think about this is to envision an N-D space (2D for 2 DOFs, etc.)
where the x and y axes are the positions of the degrees of freedom. A given point (x, y)
can either be a valid configuration or an invalid configuration (meaning it violates one
or more constraints). So, our 2D plot may have regions (half-planes, rectangles, etc.)
representing constraints.
Figure 16.3: Elevator-arm configuration space with initial state x and goal state r
Figure 16.4: Elevator-arm configuration space with path between initial state x and
final state r
17. Trajectory optimization

A constrained optimization problem has the form

min_x f(x)
subject to c_e(x) = 0
           c_i(x) ≥ 0
where f (x) is the scalar cost function, x is the vector of decision variables (variables
the solver can tweak to minimize the cost function), ce (x) is the vector-valued function
whose rows are equality constraints, and ci (x) is the vector-valued function whose
rows are inequality constraints. Constraints are equations or inequalities of the decision
variables that constrain what values the solver is allowed to use when searching for an
optimal solution.
The nice thing about Sleipnir is users don’t have to put their system in the form shown
above manually; they can write it in natural mathematical form and it’ll be converted
for them. We’ll cover some examples next.
r = 2.0  # goal position in meters (assumed)

dt = 0.005  # timestep in seconds (assumed)
N = 700     # number of timesteps (assumed)

problem = OptimizationProblem()

# 2x(N+1) state vector: position and velocity at each timestep
X = problem.decision_variable(2, N + 1)

# 1x1 input vector with N timesteps (input at last state doesn't matter)
U = problem.decision_variable(1, N)
Applying constraints
Now, we need to apply dynamics constraints between timesteps.
# Kinematics constraint assuming constant acceleration between timesteps
for k in range(N):
    p_k1 = X[0, k + 1]
    v_k1 = X[1, k + 1]
    p_k = X[0, k]
    v_k = X[1, k]
    a_k = U[0, k]

    problem.subject_to(p_k1 == p_k + v_k * dt + 0.5 * a_k * dt**2)
    problem.subject_to(v_k1 == v_k + a_k * dt)
# Limit velocity
problem.subject_to(-1 <= X[1, :])
problem.subject_to(X[1, :] <= 1)
# Limit acceleration
problem.subject_to(-1 <= U)
problem.subject_to(U <= 1)
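A cost function and solver invocation complete the problem. This is a minimal sketch that penalizes squared position error, assuming the library's minimize() and solve() methods.

# Minimize the sum of squared position errors relative to the goal r
J = 0.0
for k in range(N + 1):
    J += (r - X[0, k]) ** 2
problem.minimize(J)

problem.solve()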
The solver will find the decision variable values that minimize the cost function while
satisfying the constraints.
Figure 17.1: Double integrator position
Figure 17.2: Double integrator velocity
Other applications
In retrospect, the solution here seems obvious: if you want to reach the desired position
in the minimum time, you just apply positive max input to accelerate to the max speed,
coast for a while, then apply negative max input to decelerate to a stop at the desired
position. Optimization problems can get more complex than this though. In fact, we can
use this same framework to design optimal trajectories for a drivetrain while satisfying
dynamics constraints, avoiding obstacles, and driving through points of interest.
[Table: block diagram simplification rules. Two blocks P1 and P2 in series combine to P1P2; a feedback loop of P1 with feedback P2 combines to P1/(1 ∓ P1P2), where the top sign corresponds to positive feedback and the bottom sign to negative feedback.]
B. Linear-quadratic regulator
This appendix will go into more detail on the linear-quadratic regulator’s derivation and
interesting applications.
B.1 Derivation
Let there be a discrete time linear system defined as

x_{k+1} = Ax_k + Bu_k

with the cost functional

J = Σ_{k=0}^∞ [x_k; u_k]ᵀ [Q  N; Nᵀ  R] [x_k; u_k]
where J represents a trade-off between state excursion and control effort with the
weighting factors Q, R, and N. Q is the weight matrix for error, R is the weight
matrix for control effort, and N is a cross weight matrix between error and control
effort. N is commonly utilized when penalizing the output in addition to the state and
input.
J = Σ_{k=0}^∞ [x_k; u_k]ᵀ [Qx_k + Nu_k; Nᵀx_k + Ru_k]
J = Σ_{k=0}^∞ (x_kᵀ(Qx_k + Nu_k) + u_kᵀ(Nᵀx_k + Ru_k))
J = Σ_{k=0}^∞ (x_kᵀQx_k + x_kᵀNu_k + u_kᵀNᵀx_k + u_kᵀRu_k)
J = Σ_{k=0}^∞ (x_kᵀQx_k + x_kᵀNu_k + x_kᵀNu_k + u_kᵀRu_k)
J = Σ_{k=0}^∞ (x_kᵀQx_k + 2x_kᵀNu_k + u_kᵀRu_k)
J = Σ_{k=0}^∞ (x_kᵀQx_k + u_kᵀRu_k + 2x_kᵀNu_k)
k=0
The feedback control law which minimizes J subject to the constraint xk+1 = Axk +
Buk is
uk = −Kxk
where K is given by
K = (R + BT SB)−1 (BT SA + NT )
and S is found by solving the discrete time algebraic Riccati equation defined as

AᵀSA − S − (AᵀSB + N)(R + BᵀSB)⁻¹(BᵀSA + Nᵀ) + Q = 0

or alternatively

𝒜ᵀS𝒜 − S − 𝒜ᵀSB(R + BᵀSB)⁻¹BᵀS𝒜 + 𝒬 = 0

with

𝒜 = A − BR⁻¹Nᵀ
𝒬 = Q − NR⁻¹Nᵀ
If there is no cross-correlation between error and control effort, N is a zero matrix and
the cost functional simplifies to
J = Σ_{k=0}^∞ (x_kᵀQx_k + u_kᵀRu_k)
The feedback control law which minimizes this J subject to xk+1 = Axk + Buk is
uk = −Kxk
where K is given by
K = (R + BT SB)−1 BT SA
and S is found by solving the discrete time algebraic Riccati equation defined as
AT SA − S − AT SB(R + BT SB)−1 BT SA + Q = 0
import numpy as np
import scipy as sp


def lqr(A, B, Q, R, N):
    """Solves for the discrete LQR controller gain.

    Keyword arguments:
    A -- numpy.array(states x states), system matrix.
    B -- numpy.array(states x inputs), input matrix.
    Q -- numpy.array(states x states), state cost matrix.
    R -- numpy.array(inputs x inputs), control effort cost matrix.
    N -- numpy.array(states x inputs), cross weight matrix.

    Returns:
    K -- numpy.array(inputs x states), controller gain matrix.
    """
    P = sp.linalg.solve_discrete_are(a=A, b=B, q=Q, r=R, s=N)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)
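For example, the gain for a hypothetical discretized double integrator can be computed with the function above.

import numpy as np

dt = 0.02
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([1.0, 1.0])
R = np.array([[1.0]])
N = np.zeros((2, 1))

K = lqr(A, B, Q, R, N)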
Other formulations of LQR for finite horizon and discrete time can be seen on Wikipedia
[2].
MIT OpenCourseWare has a rigorous proof of the results shown above [18].
However, we may not know how to select costs for some of the states, or we don’t
care what certain internal states are doing. We can address this by writing the cost
functional in terms of the output vector instead of the state vector. Not only can we
make our output contain a subset of states, but we can use any other cost metric we can
think of as long as it’s representable as a linear combination of the states and inputs.[1]
For state feedback with an output cost, we want to minimize the following cost func-
tional.
J = Σ_{k=0}^∞ (y_kᵀQy_k + u_kᵀRu_k)
Substitute y_k = Cx_k + Du_k into the cost functional and factor.

J = Σ_{k=0}^∞ ( [x_k; u_k]ᵀ [Cᵀ; Dᵀ] Q [C  D] [x_k; u_k] + [x_k; u_k]ᵀ [0  0; 0  R] [x_k; u_k] )

Multiply in Q.

J = Σ_{k=0}^∞ ( [x_k; u_k]ᵀ [CᵀQ; DᵀQ] [C  D] [x_k; u_k] + [x_k; u_k]ᵀ [0  0; 0  R] [x_k; u_k] )
Thus, state feedback with an output cost can be defined as the following optimization
problem.
The optimal control policy u*_k is K(r_k − x_k) where r_k is the desired state. Note that the Q in CᵀQC is outputs × outputs instead of states × states. K can be computed via the typical LQR equations based on the algebraic Riccati equation.
If the output is just the state vector, then C = I, D = 0, and the cost functional
simplifies to that of LQR with a state cost.
J = Σ_{k=0}^∞ [x_k; u_k]ᵀ [Q  0; 0  R] [x_k; u_k]
yk+1 = Cxk+1
yk+1 = C(Axk + Buk )
yk+1 = CAxk + CBuk
zk+1 = Aref yk
zk+1 = Aref Cxk
Therefore,

y_{k+1} − z_{k+1} = (CA − A_ref C)x_k + CBu_k

We want to minimize J = Σ_{k=0}^∞ ((y_{k+1} − z_{k+1})ᵀQ(y_{k+1} − z_{k+1}) + u_kᵀRu_k). Factor out [x_k; u_k]ᵀ from the left side and [x_k; u_k] from the right side of each term.

J = Σ_{k=0}^∞ ( [x_k; u_k]ᵀ [(CA − A_ref C)ᵀ; (CB)ᵀ] Q [CA − A_ref C   CB] [x_k; u_k] + [x_k; u_k]ᵀ [0  0; 0  R] [x_k; u_k] )

Multiply in Q.

J = Σ_{k=0}^∞ ( [x_k; u_k]ᵀ [(CA − A_ref C)ᵀQ; (CB)ᵀQ] [CA − A_ref C   CB] [x_k; u_k] + [x_k; u_k]ᵀ [0  0; 0  R] [x_k; u_k] )    (B.4)
Thus, implicit model following can be defined as an LQR problem with the cost functional (B.4), whose state cost weight is (CA − A_ref C)ᵀQ(CA − A_ref C), cross weight is (CA − A_ref C)ᵀQCB, and input cost weight is (CB)ᵀQCB + R.

The optimal control policy u*_k is −Kx_k. K can be computed via the typical LQR equations based on the algebraic Riccati equation.

The control law u_imf,k = −Kx_k makes x_{k+1} = Ax_k + Bu_imf,k match the open-loop response of z_{k+1} = A_ref z_k.
If the original and desired system have the same states, then C = I and the cost functional simplifies to

J = Σ_{k=0}^∞ [x_k; u_k]ᵀ [ (A − A_ref)ᵀQ(A − A_ref)   (A − A_ref)ᵀQB ; BᵀQ(A − A_ref)   BᵀQB + R ] [x_k; u_k]    (B.6)

where the block partitions are the LQR weights Q, N, Nᵀ, and R respectively.
Let x = z.
xk+1 = zk+1
Axk + Buimf,k = Aref xk + Bref uk
Buimf,k = Aref xk − Axk + Bref uk
Buimf,k = (Aref − A)xk + Bref uk
uimf,k = B+ ((Aref − A)xk + Bref uk )
uimf,k = −B+ (A − Aref )xk + B+ Bref uk
The first term makes the open-loop poles match that of the reference model, and the
second term makes the input behave like that of the reference model.
B.4 Time delay compensation

To compensate for a time delay in the feedback loop, we can predict the system's state forward by the time delay using our model and apply the aforementioned control law to that predicted state. Figure B.2 shows improved control with the predicted state.[2]
For steady-state controller gains, this method of delay compensation seems to work
better for second-order systems than first-order systems. Figures B.3 and B.5 show time
delay for a drivetrain velocity system and flywheel system respectively. Figures B.4 and
B.6 show that compensating the controller gain significantly reduces the feedback gain.
For systems with fast dynamics and a long time delay, the delay-compensated controller
has an almost open-loop response because only feedforward has a significant effect; this
has poor disturbance rejection. Fixing the source of the time delay is always preferred
for these systems.
Since we are computing the control based on future states and the state exponentially
converges to zero over time, the control action we apply at the current timestep also
converges to zero for longer time delays. During startup, the inputs we use to predict
the future state are zero because there’s initially no input history. This means the initial
inputs are larger to give the system a kick in the right direction. As the input delay
buffer fills up, the controller gain converges to a smaller steady-state value. If one uses
the steady-state controller gain during startup, the transient response may be slow.
All figures shown here use the steady-state control law (the second case in equation
(B.12)).
[2] Input delay and output delay have the same effect on the system, so the time delay can be simulated with either one.
We’ll show how to derive this controller gain compensation for continuous and discrete
systems.
We need to find x(t + L) given x(t). Since we know u(t) = −Kx(t) will be applied
over the time interval [t, t + L), substitute it into the continuous model.
ẋ = Ax(t) + Bu(t)
ẋ = Ax(t) + B(−Kx(t))
ẋ = Ax(t) − BKx(t)
ẋ = (A − BK)x(t)
We now have a differential equation for the closed-loop system dynamics. Take the
matrix exponential from the current time t to L in the future to obtain x(t + L).
x(t + L) = e(A−BK)L x(t) (B.7)
This works for t ≥ L, but if t < L, we have no input history for the time interval [t, L).
If we assume the inputs for [t, L) are zero, the state prediction for that interval is
x(L) = eA(L−t) x(t)
The time interval [0, t) has nonzero inputs since it’s in the past and was using the normal
control law.
x(t + L) = e(A−BK)t x(L)
x(t + L) = e(A−BK)t eA(L−t) x(t) (B.8)
Therefore, equations (B.7) and (B.8) give the latency-compensated control law for all
t ≥ 0.
u(t) = −Kx(t + L)
u(t) = { −Ke^{(A−BK)t} e^{A(L−t)} x(t)   if 0 ≤ t < L
       { −Ke^{(A−BK)L} x(t)              if t ≥ L      (B.9)
u_k = −Kx_{k+L/T}

Let T be the duration between timesteps in seconds and L be the amount of time delay in seconds. L/T gives the number of timesteps represented by L. We need to find x_{k+L/T} given x_k. Since we know u_k = −Kx_k will be applied for the timesteps k through k + L/T, substitute it into the discrete model.

x_{k+L/T} = (A − BK)^{L/T} x_k    (B.10)
This works for kT ≥ L where kT is the current time, but if kT < L, we have no input
history for the time interval [kT, L). If we assume the inputs for [kT, L) are zero, the
state prediction for that interval is
x_{L/T} = A^{(L−kT)/T} x_k
x_{L/T} = A^{L/T − k} x_k

The time interval [0, kT) has nonzero inputs since it's in the past and was using the normal control law, so

x_{k+L/T} = (A − BK)^k A^{L/T − k} x_k    (B.11)
Therefore, equations (B.10) and (B.11) give the latency-compensated control law for
all t ≥ 0.
u_k = −Kx_{k+L/T}

u_k = { −K(A − BK)^k A^{L/T − k} x_k   if 0 ≤ k < L/T
      { −K(A − BK)^{L/T} x_k           if k ≥ L/T     (B.12)
If the delay L isn’t a multiple of the sample period T in equation (B.12), we have to
evaluate fractional matrix powers, which can be tricky. Eigen (a C++ library) supports
fractional powers with the pow() member function provided by
<unsupported/Eigen/MatrixFunctions>. SciPy (a Python library) supports fractional
powers with the free function
scipy.linalg.fractional_matrix_power(). If the language you’re using doesn’t provide
such a function, you can try the following approach instead.
Let there be a matrix M raised to a fractional power n. If M is diagonalizable, we can
obtain an exact answer for Mn by decomposing M into PDP−1 where D is a diagonal
matrix, computing Dn as each diagonal element raised to n, then recomposing Mn as
PDn P−1 .
If a matrix raised to a fractional power in equation (B.12) isn’t diagonalizable, we have
to approximate by rounding TL to the nearest integer. This approximation gets worse
as L mod T approaches T2 .
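Here's a sketch of that diagonalization approach. It assumes M is diagonalizable and real; for real matrices with complex conjugate eigenvalue pairs, the imaginary parts of the result cancel to numerical noise.

import numpy as np


def fractional_matrix_power(M, n):
    """Computes Mⁿ via the eigendecomposition M = PDP⁻¹, so Mⁿ = PDⁿP⁻¹."""
    D, P = np.linalg.eig(M)
    # Raise each eigenvalue to the power n (complex-safe), then recompose
    return np.real(P @ np.diag(D.astype(complex) ** n) @ np.linalg.inv(P))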
Figure B.3: Drivetrain velocity response at 1 ms sample period with 40 ms of output lag
Figure B.4: Drivetrain velocity response at 1 ms sample period with 40 ms of output lag (delay-compensated)
C. Feedforwards
This appendix will show the derivations for alternate feedforward formulations.
C.1 Plant inversion

Plant inversion solves the following equation for the feedforward input u_k that drives the current reference r_k to the next reference r_{k+1}.

r_{k+1} = Ar_k + Bu_k

where u_k is the feedforward input. Note that this feedforward equation does not and should not take into account any feedback terms. We want to find the optimal u_k such that we minimize the tracking error between r_{k+1} and r_k.
To solve for uk , we need to take the inverse of the nonsquare matrix B. This isn’t
possible, but we can find the pseudoinverse given some constraints on the state tracking
error and control effort. To find the optimal solution for these sorts of trade-offs, one
can define a cost function and attempt to minimize it. To do this, we’ll first solve the
expression for 0.
0 = Buk − (rk+1 − Ark )
This expression will be the state tracking cost we use in our cost function.
Our cost function will use an H₂ norm with Q as the state cost matrix with dimensionality states × states and R as the control input cost matrix with dimensionality inputs × inputs.

J = (Bu_k − (r_{k+1} − Ar_k))ᵀQ(Bu_k − (r_{k+1} − Ar_k)) + u_kᵀRu_k
C.1.2 Minimization
Given theorem 5.12.1 and corollary 5.12.3, find the minimum of J by taking the partial
derivative with respect to uk and setting the result to 0.
∂J/∂u_k = 2BᵀQ(Bu_k − (r_{k+1} − Ar_k)) + 2Ru_k
0 = 2BT Q(Buk − (rk+1 − Ark )) + 2Ruk
0 = BT Q(Buk − (rk+1 − Ark )) + Ruk
0 = BT QBuk − BT Q(rk+1 − Ark ) + Ruk
BT QBuk + Ruk = BT Q(rk+1 − Ark )
(BT QB + R)uk = BT Q(rk+1 − Ark )
uk = (BT QB + R)−1 BT Q(rk+1 − Ark )
uk = Kf f (rk+1 − Ark )
Figure C.1 shows plant inversion applied to a second-order CIM motor model.
Plant inversion isn’t as effective with both Q and R cost because the R matrix penalized
control effort. The reference tracking with no cost matrices is much better.
C.2 Steady-state feedforward

The first steady-state feedforward converts desired outputs to desired states.

x_c = N_x y_c

N_x converts desired outputs y_c to desired states x_c (also known as r). For steady-state, that is

x_ss = N_x y_ss    (C.1)
The second steady-state feedforward converts the desired outputs y to the control input
required at steady-state.
u_c = N_u y_c
Nu converts the desired outputs y to the control input u required at steady-state. For
steady-state, that is
uss = Nu yss (C.2)
ẋ = Ax + Bu
y = Cx + Du

At steady-state, ẋ = 0 for a continuous system.

0 = Ax_ss + Bu_ss
y_ss = Cx_ss + Du_ss

For a discrete system, x_{k+1} = x_k = x_ss instead, which gives

0 = (A − I)x_ss + Bu_ss
y_ss = Cx_ss + Du_ss
x_ss = N_x y_ss
N_x⁺ x_ss = y_ss

u_ss = N_u y_ss
u_ss = N_u (N_x⁺ x_ss)
u_ss = N_u N_x⁺ x_ss
u_ff = N_u N_x⁺ r
Continuous:

[N_x]   [A  B]⁺ [0]
[N_u] = [C  D]  [1]    (C.3)

u_ff = N_u N_x⁺ r    (C.4)

Discrete:

[N_x]   [A − I  B]⁺ [0]
[N_u] = [C      D]  [1]    (C.5)

u_ff,k = N_u N_x⁺ r_k    (C.6)
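Equations (C.3) and (C.4) are a few lines of numpy. The model below is a hypothetical two-state example.

import numpy as np

# Hypothetical continuous model (states: position, velocity; output: position)
A = np.array([[0.0, 1.0], [0.0, -5.0]])
B = np.array([[0.0], [2.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# [Nx; Nu] = pinv([A B; C D]) [0; 1] per equation (C.3)
M = np.block([[A, B], [C, D]])
NxNu = np.linalg.pinv(M) @ np.array([[0.0], [0.0], [1.0]])
Nx = NxNu[:2]
Nu = NxNu[2:]

r = np.array([[1.5], [0.0]])        # desired position, zero velocity
u_ff = Nu @ np.linalg.pinv(Nx) @ r  # equation (C.4)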
Figure C.2: Second-order CIM motor response with various feedforwards
Figure C.3: Second-order CIM motor response with plant inversions
D. Derivations
D.1 Zero-order hold for state-space

The solution to the continuous differential equation ẋ = Ax + Bu is

x(t) = e^{At}x(0) + ∫₀ᵗ e^{A(t−τ)}Bu(τ) dτ

which is an analytical solution to the continuous model. Now we want to discretize it.
x_k ≜ x(kT)

x_k = e^{AkT}x(0) + ∫₀^{kT} e^{A(kT−τ)}Bu(τ) dτ
x_{k+1} = e^{A(k+1)T}x(0) + ∫₀^{(k+1)T} e^{A((k+1)T−τ)}Bu(τ) dτ
x_{k+1} = e^{A(k+1)T}x(0) + ∫₀^{kT} e^{A((k+1)T−τ)}Bu(τ) dτ + ∫_{kT}^{(k+1)T} e^{A((k+1)T−τ)}Bu(τ) dτ
x_{k+1} = e^{A(k+1)T}x(0) + ∫₀^{kT} e^{A((k+1)T−τ)}Bu(τ) dτ + ∫_{kT}^{(k+1)T} e^{A(kT+T−τ)}Bu(τ) dτ
x_{k+1} = e^{AT}(e^{AkT}x(0) + ∫₀^{kT} e^{A(kT−τ)}Bu(τ) dτ) + ∫_{kT}^{(k+1)T} e^{A(kT+T−τ)}Bu(τ) dτ

The expression in parentheses is x_k, so

x_{k+1} = e^{AT}x_k + ∫_{kT}^{(k+1)T} e^{A(kT+T−τ)}Bu(τ) dτ
We assume that u is constant during each timestep, so it can be pulled out of the integral.
x_{k+1} = e^{AT}x_k + (∫_{kT}^{(k+1)T} e^{A(kT+T−τ)} dτ) Bu_k

The second term can be simplified by substituting it with the function v(τ) = kT + T − τ. Note that dτ = −dv.

x_{k+1} = e^{AT}x_k − (∫_{v(kT)}^{v((k+1)T)} e^{Av} dv) Bu_k
x_{k+1} = e^{AT}x_k − (∫_T^0 e^{Av} dv) Bu_k
x_{k+1} = e^{AT}x_k + (∫₀^T e^{Av} dv) Bu_k

D.2 Kalman filter as Luenberger observer
where a superscript of minus denotes a priori and plus denotes a posteriori estimate.
Combining equation (D.1) and equation (D.2) gives
x̂⁻_{k+1} = Ax̂⁻_k + Bu_k + L(y_k − Cx̂⁻_k)    (D.3)
The following is a Kalman filter that considers the current update step and the next
predict step together rather than the current predict step and current update step.
Update step
K_k = P⁻_k Cᵀ(CP⁻_k Cᵀ + R)⁻¹    (D.4)
x̂⁺_k = x̂⁻_k + K_k(y_k − Cx̂⁻_k)    (D.5)
P⁺_k = (I − K_k C)P⁻_k    (D.6)

Predict step
x̂⁻_{k+1} = Ax̂⁺_k + Bu_k    (D.7)
P⁻_{k+1} = AP⁺_k Aᵀ + Q    (D.8)
Let L = AKk .
x̂⁻_{k+1} = Ax̂⁻_k + Bu_k + L(y_k − Cx̂⁻_k)    (D.9)
which matches equation (D.3). Therefore, the eigenvalues of the Kalman filter observer
can be obtained by
eig(A − LC)
eig(A − (AK_k)C)
eig(A(I − K_k C))    (D.10)
The predict step is the same as the Kalman filter’s. Therefore, a Luenberger observer
run with prediction and update steps is written as follows.
Predict step
x̂⁻_{k+1} = Ax̂⁺_k + Bu_k    (D.11)

Update step
x̂⁺_{k+1} = x̂⁻_{k+1} + A⁻¹L(y_{k+1} − ŷ_{k+1})    (D.12)
ŷ_{k+1} = Cx̂⁻_{k+1}    (D.13)
E. Classical control theory

This appendix is provided for those who are curious about a lower-level interpretation
of control systems. It describes what a transfer function is and shows how they can be
used to analyze dynamical systems. Emphasis is placed on the geometric intuition of
this analysis rather than the frequency domain math. Many tools exclusive to classical
control theory (root locus, Bode plots, Nyquist plots, etc.) aren’t useful for or relevant
to the examples presented in the main chapters, so they would only complicate the
learning process.
With classical control theory’s geometric interpretation, one can perform stability and
robustness analyses and design reasonable controllers for systems on the back of a nap-
kin. However, computing power is much more plentiful nowadays; we should take
advantage of the modern tools this enables to better express the controls designer’s
intent (for example, via optimal control criteria rather than frequency response charac-
teristics).
Classical control theory should only be used when the controls designer cares about
directly shaping the frequency response of a system. Electrical engineers do this a lot,
but all the other university controls students have been forced to learn it too regardless
of its utility in other disciplines.
E.2 Transfer functions
“What does the Laplace Transform really tell us?” (21 minutes)
Zach Star
https://youtu.be/n2y7n6jw5d0
A transfer function is found by taking the Laplace transform of a system's differential equation and rearranging the terms to obtain a ratio of the output variable to the input variable. Equation (E.1) is an example of a transfer function.
H(s) = ((s − 9 + 9i)(s − 9 − 9i))/(s(s + 10))    (E.1)

where the numerator contains the zeroes and the denominator contains the poles.
R Imaginary poles and zeroes always come in complex conjugate pairs (e.g., −2 +
3i, −2 − 3i).
The locations of the closed-loop poles in the complex plane determine the stability of
the system. Each pole represents a frequency mode of the system, and their location
determines how much of each response is induced for a given input frequency. Figure E.2 shows the impulse responses in the time domain for transfer functions with various pole locations. They all have an initial condition of 1.

[2] We are handwaving Laplace transform derivations because they are complicated and neither relevant nor useful.
[Figure E.2: Impulse responses corresponding to pole locations in the complex plane. Poles in the left half-plane (LHP) decay, poles on the imaginary axis oscillate at constant amplitude, and poles in the right half-plane (RHP) grow without bound.]
Poles in the left half-plane (LHP) are stable; the system’s output may oscillate but
it converges to steady-state. Poles on the imaginary axis are marginally stable; the
system’s output oscillates at a constant amplitude forever. Poles in the right half-plane
(RHP) are unstable; the system’s output grows without bound.
Nonminimum phase zeroes
While poles in the RHP are unstable, the same is not true for zeroes. They can be
characterized by the system initially moving in the wrong direction before heading to-
ward the reference. Since the poles always move toward the zeroes, zeroes impose a
“speed limit” on the system response because it takes a finite amount of time to move
the wrong direction, then change directions.
One example of this type of system is bicycle steering. Try riding a bicycle without
holding the handle bars, then poke the right handle; the bicycle turns right. Further-
more, if one is holding the handlebars and wants to turn left, rotating the handlebars
counterclockwise will make the bicycle fall toward the right. The rider has to lean into
the turn and overpower the nonminimum phase dynamics to go the desired direction.
Another example is a Segway. To move forward by some distance, the Segway must
first roll backward to rotate the Segway forward. Once the Segway starts falling in
that direction, it begins rolling forward to avoid falling over until it reaches the target
distance. At that point, the Segway increases its forward speed to pitch backward and
slow itself down. To come to a stop, the Segway rolls backward again to level itself out.
Pole-zero cancellation
Pole-zero cancellation occurs when a pole and zero are located at the same place in the
s-plane. This effectively eliminates the contribution of each to the system dynamics.
By placing poles and zeroes at various locations (this is done by placing transfer func-
tions in series), we can eliminate undesired system dynamics. While this may appear
to be a useful design tool at first, there are major caveats. Most of these are due to
model uncertainty resulting in poles which aren’t in the locations the controls designer
expected.
Notch filters are typically used to dampen a specific range of frequencies in the system
response. If its band is made too narrow, it can still leave the undesirable dynamics,
but now you can no longer measure them in the response. They are still happening, but
they are what’s called unobservable.
Never pole-zero cancel unstable or nonminimum phase dynamics. If the model doesn’t
quite reflect reality, an attempted pole cancellation by placing a nonminimum phase
zero results in the pole still moving to the zero placed next to it. You have the same
dynamics as before, but the pole is also stuck where it is no matter how much feedback
gain is applied. For an attempted nonminimum phase zero cancellation, you have ef-
fectively placed an unstable pole that’s unobservable. This means the system will be
going unstable and blowing up, but you won’t be able to detect this and react to it.
Keep in mind when making design decisions that the model likely isn’t perfect. The
whole point of feedback control is to be robust to this kind of uncertainty.
[Figure E.3: Feedback loop with input X(s), sum output Z(s), open-loop gain G(s), feedback gain H(s), and output Y(s)]

Y(s)/X(s) = Z(s)G(s)/(Z(s) + Z(s)G(s)H(s))

Y(s)/X(s) = G(s)/(1 ∓ G(s)H(s))    (E.2)
where positive feedback uses the top sign and negative feedback uses the bottom sign.
Control system with feedback
[Figure E.4: Control system with reference X(s), controller K, plant G, measurement H, and output Y(s)]
Following equation (E.3), the transfer function of figure E.4, a control system diagram
with negative feedback, from input to output is
G_cl(s) = Y(s)/X(s) = KG/(1 + KGH)    (E.4)
The numerator is the open-loop gain and the denominator is one plus the gain around
the feedback loop, which may include parts of the open-loop gain. As another example,
the transfer function from the input to the error is
G_cl(s) = E(s)/X(s) = 1/(1 + KGH)    (E.5)
The roots of the denominator of Gcl (s) are different from those of the open-loop trans-
fer function KG(s). These are called the closed-loop poles.
G(s) = K/((Js + b)(Ls + R) + K²)

The transfer function from voltage to position[3] is

G(s) = K/(s((Js + b)(Ls + R) + K²))
Compare the step response of this system (figure E.5) with the step response of this
system with L set to zero (figure E.6). For small values of K, both systems are stable
and have nearly indistinguishable step responses due to the exceedingly small contribu-
tion from the fast pole. The high frequency dynamics only cause instability for large
values of K that induce fast system responses. In other words, the system responses of
the second-order model and its first-order approximation are similar for low frequency
operating regimes.
[3] This also has the effect of augmenting the system with a position state, making it a third-order system.
Figure E.5: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH)
Figure E.6: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH)
Why can’t unstable poles close to the origin be ignored in the same way? The response
of high frequency stable poles decays rapidly. Unstable poles, on the other hand, repre-
sent unstable dynamics which cause the system output to grow to infinity. Regardless
of how slow these unstable dynamics are, they will eventually dominate the response.
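A sketch of this gain comparison: the snippet below computes the closed-loop poles of the position-augmented plant for a small and a large proportional gain. The motor constants are illustrative placeholders, not values from the text.

import numpy as np

# Illustrative (made-up) motor constants
K, J, b, R = 0.02, 1e-4, 1e-4, 0.09

for Kp in (1.0, 50.0):
    for L in (230e-6, 0.0):
        # Plant: G(s) = K/(s((Js + b)(Ls + R) + K^2))
        den = np.polymul(np.polyadd(np.polymul([J, b], [L, R]), [K**2]), [1, 0])
        # Unity feedback with gain Kp: closed-loop poles solve den(s) + Kp K = 0
        char_poly = np.polyadd(den, [Kp * K])
        poles = np.roots(np.trim_zeros(char_poly, "f"))
        print(f"Kp = {Kp}, L = {L}: max Re(pole) = {poles.real.max():.1f}")

With these numbers, both models are stable at the small gain, but at the large gain the third-order (L ≠ 0) model's complex pole pair crosses into the right half-plane while the first-order approximation stays stable.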
E.3.1 Projections
Consider a two-dimensional Euclidean space R2 shown in figure E.7 (each R is a di-
mension whose domain is the set of real numbers, so R2 is the standard x-y plane).
Ordinarily, we notate points in this plane by their components in the set of basis vectors
{î, ĵ}, where î (pronounced i-hat) is the unit vector in the positive x direction and ĵ
is the unit vector in the positive y direction. Figure E.8 shows an example vector v in
this basis.
[Figure E.8: the vector v = 2î + 1.5ĵ plotted in the x-y plane with respect to the basis {î, ĵ}]
How do we find the coordinates of v in this basis mathematically? As long as the basis
is orthogonal (i.e., the basis vectors are at right angles to each other), we simply take
the orthogonal projection of v onto î and ĵ. Intuitively, this means finding “the amount
of v that points in the direction of î or ĵ”. Note that orthogonal vectors have a dot product of zero with each other.
More formally, we can calculate projections with the dot product: the projection of v onto any other vector w is as follows.

$$\text{proj}_w v = \frac{v \cdot w}{|w|}$$
Since î and ĵ are unit vectors, their magnitudes are 1 so the coordinates of v are v · î
and v · ĵ.
We can use this same process to find the coordinates of v in any orthogonal basis. For example, imagine the basis {î + ĵ, î − ĵ}: the coordinates in this basis are given by $\frac{v \cdot (\hat{i} + \hat{j})}{\sqrt{2}}$ and $\frac{v \cdot (\hat{i} - \hat{j})}{\sqrt{2}}$. Let's “unwrap” the formula for the dot product and look a bit more closely.
$$\frac{v \cdot (\hat{i} + \hat{j})}{\sqrt{2}} = \frac{1}{\sqrt{2}}\sum_{i=0}^{n} v_i(\hat{i} + \hat{j})_i$$

where the subscript i denotes which component of each vector and n is the total number of dimensions.
Applying this to our vector v:

$$\begin{aligned}
\frac{v \cdot (\hat{i} + \hat{j})}{\sqrt{2}} &= \frac{1}{\sqrt{2}}(2\hat{i} \cdot \hat{i} + 1.5\hat{j} \cdot \hat{j}) \\
&= \frac{1}{\sqrt{2}}(2 + 1.5) = \frac{3.5}{\sqrt{2}} = \frac{3.5\sqrt{2}}{2} = 1.75\sqrt{2}
\end{aligned}$$

$$\begin{aligned}
\frac{v \cdot (\hat{i} - \hat{j})}{\sqrt{2}} &= \frac{1}{\sqrt{2}}(2\hat{i} \cdot \hat{i} - 1.5\hat{j} \cdot \hat{j}) \\
&= \frac{1}{\sqrt{2}}(2 - 1.5) = \frac{0.5}{\sqrt{2}} = \frac{0.5\sqrt{2}}{2} = 0.25\sqrt{2}
\end{aligned}$$
Figure E.9 shows this result geometrically with respect to the basis {î + ĵ, î − ĵ}.
The previous example was only a change of coordinates in a finite-dimensional vector
space. However, as we will see, the core idea does not change much when we move to
more complicated structures. Observe the formula for the Fourier transform.
$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi ix\xi}\,dx \quad \text{where } \xi \in \mathbb{R}$$
[Figure E.9: the vector v expressed in the basis {î + ĵ, î − ĵ}, with coordinate 1.75√2 along î + ĵ and 0.25√2 along î − ĵ]
This is fundamentally the same formula we had before. f(x) has taken the place of $v_i$, $e^{-2\pi ix\xi}$ has taken the place of $(\hat{i} + \hat{j})_i$, and the sum over i has turned into an integral over dx, but the underlying concept is the same. To change coordinates in a function space, we simply take the orthogonal projection onto our new basis functions. In the case of the Fourier transform, the function basis is the family of functions of the form $f(x) = e^{-2\pi ix\xi}$ for $\xi \in \mathbb{R}$. Since these functions are oscillatory at a frequency determined by ξ, we can think of this as a “frequency basis”.
Watch the “Abstract vector spaces” video from 3Blue1Brown’s Essence of linear algebra series for a more geometric introduction to using functions as a basis.
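Here's a small numerical sketch of that projection idea; the test signal, sampling grid, and the two ξ values are arbitrary choices.

import numpy as np

# A toy signal with a strong 5 Hz component
x = np.linspace(-10.0, 10.0, 20001)
f = np.cos(2 * np.pi * 5 * x) * np.exp(-x**2)

# Project f onto the basis function e^(-2*pi*i*x*xi). The continuous
# "dot product" is an integral, approximated here with the trapezoidal rule.
for xi in (5.0, 2.0):
    coeff = np.trapz(f * np.exp(-2j * np.pi * x * xi), x)
    print(xi, abs(coeff))  # large at xi = 5, nearly zero at xi = 2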
Now, the Laplace transform is somewhat more complicated. As it turns out, the Fourier basis is orthogonal, so the analogy to the simpler vector space holds almost precisely. The Laplace basis is not orthogonal, so we can't interpret the Laplace transform strictly as a change of coordinates in the traditional sense. However, the intuition is the same: we are taking the orthogonal projection of our original function onto the functions of our new basis set.

$$F(s) = \int_0^{\infty} f(t)e^{-st}\,dt, \text{ where } s \in \mathbb{C}$$
Here, it becomes obvious that the Laplace transform is a generalization of the Fourier
transform in that the basis family is strictly larger (we have allowed the “frequency”
parameter to take complex values, as opposed to merely real values). As a result, the
Laplace basis contains functions that grow and decay, while the Fourier basis does not.
[4] The Dirac delta function is zero everywhere except at the origin. The nonzero region has an infinitesimal
width and has a height such that the area within that region is 1.
[Figure: the complex s-plane, with real axis Re(σ) and imaginary axis Im(jω)]
oscillation decays and the system converges to zero (i.e., a decaying exponential). Fig-
ure E.2 shows this for various points.
If we move the component frequencies in the Fmajor4 chord example parallel to the
real axis to σ = −25, the resulting time domain response attenuates according to the
decaying exponential $e^{-25t}$ (see figure E.13).
Note that this explanation as a basis isn’t exact because the Laplace basis isn’t orthogonal
(that is, the x and y coordinates affect each other and have cross-talk). In the frequency
domain, we had a basis of sine waves that we represented as delta functions in the
frequency domain. Each frequency contribution was independent of the others. In the
Laplace domain, this is not the case; a pure exponential is $\frac{1}{s-a}$ (a rational function where a is a real number) instead of a delta function. This function is nonzero at points
that aren’t actually frequencies present in the time domain. Figure E.14 demonstrates
this, which shows the Laplace transform of the Fmajor4 chord plotted in 3D.
Notice how the values of the function around each component frequency decrease according to $\frac{1}{\sqrt{x^2 + y^2}}$ in the x and y directions (in just the x direction, it would be $\frac{1}{x}$).
We won’t be computing any Laplace transforms by hand using this formula (everyone in
the real world looks these up in a table anyway). Common Laplace transforms (assum-
ing zero initial conditions) are shown in table E.1. Of particular note are the Laplace
transforms for the derivative, unit step,[5] and exponential decay. We can see that a derivative is equivalent to multiplying by s, and an integral is equivalent to multiplying by $\frac{1}{s}$.
Table E.1: Common Laplace transforms and Laplace transform properties with zero
initial conditions
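The entries this appendix leans on are standard results. Stated with zero initial conditions:

$$\begin{array}{ll}
\text{Time domain} & \text{Laplace domain} \\
\hline
\delta(t) \text{ (unit impulse)} & 1 \\
u(t) \text{ (unit step)} & \frac{1}{s} \\
e^{-at}u(t) \text{ (exponential decay)} & \frac{1}{s + a} \\
\frac{df(t)}{dt} & sF(s) \\
\int_0^t f(\tau)\,d\tau & \frac{F(s)}{s}
\end{array}$$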
When these are in unity feedback, the transfer function from the input voltage to the
error is
$$\frac{E(s)}{V(s)} = \frac{1}{1 + K(s)G(s)}$$

$$E(s) = \frac{1}{1 + K(s)G(s)} V(s)$$

$$E(s) = \frac{1}{1 + K_p \frac{K}{(Js + b)(Ls + R) + K^2}} V(s)$$
Notice that the steady-state error is nonzero. To fix this, an integrator must be included
in the controller.
$$K(s) = K_p + \frac{K_i}{s}$$
The same steady-state calculations are performed as before with the new controller.
$$\frac{E(s)}{V(s)} = \frac{1}{1 + K(s)G(s)}$$

$$E(s) = \frac{1}{1 + K(s)G(s)} V(s)$$

For a unit step input V(s) = 1/s,

$$E(s) = \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\frac{K}{(Js + b)(Ls + R) + K^2}} \cdot \frac{1}{s}$$

Apply the final value theorem.

$$\begin{aligned}
e_{ss} &= \lim_{s \to 0} s \cdot \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\frac{K}{(Js + b)(Ls + R) + K^2}} \cdot \frac{1}{s} \\
&= \lim_{s \to 0} \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\frac{K}{(Js + b)(Ls + R) + K^2}} \\
&= \lim_{s \to 0} \frac{s}{s + (K_p s + K_i)\frac{K}{(Js + b)(Ls + R) + K^2}} \\
&= \frac{0}{0 + (K_p(0) + K_i)\frac{K}{(J(0) + b)(L(0) + R) + K^2}} = \frac{0}{K_i \frac{K}{bR + K^2}} \\
e_{ss} &= 0
\end{aligned}$$
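Both steady-state results can be checked symbolically. Here's a sketch using sympy, with symbols mirroring the derivation above.

import sympy as sp

s, Kp, Ki, K, J, b, L, R = sp.symbols("s K_p K_i K J b L R", positive=True)
G = K / ((J*s + b)*(L*s + R) + K**2)

for Ks in (Kp, Kp + Ki/s):      # P controller, then PI controller
    E = 1 / (1 + Ks*G) / s      # error for a unit step input
    ess = sp.limit(s*E, s, 0)   # final value theorem
    print(sp.simplify(ess))
# P:  (K**2 + R*b)/(K**2 + K*K_p + R*b)  -- nonzero
# PI: 0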
When PID control is applied to a system with more states than the controller's three tunable gains can handle (e.g., a seven-state system), there aren't enough degrees of freedom to place the system's poles in desired locations. This will result in poor control.
The math for PID doesn't assume voltage, a motor, etc. It defines an output based on derivatives and integrals of its input. We happen to use it for motors because it works pretty well for them, since motors are second-order systems.
The following math will be in continuous time, but the same ideas apply to discrete
time. This is all assuming a velocity controller.
Our simple motor model hooked up to a mass is

$$V = IR + \frac{\omega}{K_v} \tag{E.10}$$

$$\tau = IK_t \tag{E.11}$$

$$\tau = J\frac{d\omega}{dt} \tag{E.12}$$
For an explanation of where these equations come from, read section 12.1.
First, we'll solve for $\frac{d\omega}{dt}$ in terms of V.
Substitute equation (E.11) into equation (E.10).

$$V = IR + \frac{\omega}{K_v}$$

$$V = \frac{\tau}{K_t}R + \frac{\omega}{K_v}$$

Substitute in equation (E.12).

$$V = \frac{J\frac{d\omega}{dt}}{K_t}R + \frac{\omega}{K_v}$$

Solve for $\frac{d\omega}{dt}$.

$$\begin{aligned}
V - \frac{\omega}{K_v} &= \frac{J\frac{d\omega}{dt}}{K_t}R \\
\frac{d\omega}{dt} &= \frac{K_t}{JR}\left(V - \frac{\omega}{K_v}\right) \\
\frac{d\omega}{dt} &= -\frac{K_t}{JRK_v}\omega + \frac{K_t}{JR}V
\end{aligned}$$
Now take the Laplace transform. Because the Laplace transform is a linear operator, we can take the Laplace transform of each term individually. Based on table E.1, $\frac{d\omega}{dt}$ becomes $s\omega$, and ω(t) and V(t) become ω(s) and V(s) respectively (the parenthetical notation has been dropped for clarity).
$$s\omega = -\frac{K_t}{JRK_v}\omega + \frac{K_t}{JR}V \tag{E.13}$$
Solve for the transfer function $H(s) = \frac{\omega}{V}$.

$$\begin{aligned}
s\omega &= -\frac{K_t}{JRK_v}\omega + \frac{K_t}{JR}V \\
\left(s + \frac{K_t}{JRK_v}\right)\omega &= \frac{K_t}{JR}V \\
\frac{\omega}{V} &= \frac{\frac{K_t}{JR}}{s + \frac{K_t}{JRK_v}}
\end{aligned}$$
Next, we'll try a P controller and substitute it into equation (E.13).

$$V = K_p(\omega_{goal} - \omega)$$

$$s\omega = -\frac{K_t}{JRK_v}\omega + \frac{K_t}{JR}K_p(\omega_{goal} - \omega)$$
Solve for the transfer function $H(s) = \frac{\omega}{\omega_{goal}}$.

$$\begin{aligned}
s\omega &= -\frac{K_t}{JRK_v}\omega + \frac{K_tK_p}{JR}\omega_{goal} - \frac{K_tK_p}{JR}\omega \\
\left(s + \frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}\right)\omega &= \frac{K_tK_p}{JR}\omega_{goal} \\
\frac{\omega}{\omega_{goal}} &= \frac{\frac{K_tK_p}{JR}}{s + \frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}}
\end{aligned}$$
This has a pole at $-\left(\frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}\right)$. Assuming that pole is negative (i.e., we are stable), it corresponds to a time constant of $\frac{1}{\frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}}$.
As can be seen above, a flywheel has a single pole. It therefore only needs a single pole
controller to place that pole anywhere on the real axis.
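For example, to get a desired closed-loop time constant τ, set the pole magnitude equal to 1/τ and solve for the gain (a rearrangement of the result above; τ is a symbol introduced here):

$$\frac{K_t}{JRK_v} + \frac{K_tK_p}{JR} = \frac{1}{\tau} \implies K_p = \frac{JR}{K_t\tau} - \frac{1}{K_v}$$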
This analysis assumes that the motor is well coupled to the mass and that the time
constant of the inductor is small enough that it doesn’t factor into the motor equations.
The latter is a pretty good assumption for a CIM motor (see figures E.15 and E.16). If
more mass is added to the motor armature, the response timescales increase and the
inductance matters even less.
Figure E.15: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH)

Figure E.16: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH)
Next, we’ll try a PD loop. (This will use a perfect derivative, but anyone following along
closely already knows that we can’t really take a derivative here, so the math will need
to be updated at some point. We could switch to discrete time and pick a differentiation
method, or pick some other way of modeling the derivative.)
$$V = K_p(\omega_{goal} - \omega) + K_ds(\omega_{goal} - \omega)$$

$$\begin{aligned}
s\omega &= -\frac{K_t}{JRK_v}\omega + \frac{K_tK_p}{JR}(\omega_{goal} - \omega) + \frac{K_tK_ds}{JR}(\omega_{goal} - \omega) \\
s\omega &= -\frac{K_t}{JRK_v}\omega + \frac{K_tK_p}{JR}\omega_{goal} - \frac{K_tK_p}{JR}\omega + \frac{K_tK_ds}{JR}\omega_{goal} - \frac{K_tK_ds}{JR}\omega
\end{aligned}$$
Solve for $\frac{\omega}{\omega_{goal}}$.

$$\frac{\omega}{\omega_{goal}} = \frac{\frac{K_t}{JR}(K_p + K_ds)}{s\left(1 + \frac{K_tK_d}{JR}\right) + \frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}}$$
So, we added a zero at $-\frac{K_p}{K_d}$ and moved our pole to $-\frac{\frac{K_t}{JRK_v} + \frac{K_tK_p}{JR}}{1 + \frac{K_tK_d}{JR}}$. This isn't progress.
We’ve added more complexity to our system and, practically speaking, gotten nothing
good in return. Zeroes should be avoided if at all possible because they amplify un-
wanted high frequency modes of the system and are noisier the faster the system is
sampled. At least this is a stable zero, but it’s still undesirable.
In summary, derivative doesn't help on an ideal flywheel. Kd may compensate for unmodeled dynamics such as accelerating projectiles slowing the flywheel down, but that effect may also increase recovery time; Kd drives the acceleration to zero in the undesired case of negative acceleration as well as the actually desired case of positive acceleration.
Subsection 6.7.2 covers a superior compensation method that avoids zeroes in the con-
troller, doesn’t act against the desired control action, and facilitates better tracking.
Gain margin and phase margin are two metrics for measuring a system’s relative stabil-
ity. Gain and phase margin are the amounts by which the closed-loop gain and phase
can be varied respectively before the system becomes unstable. In a sense, they are
safety margins for when unmodeled dynamics affect the system response.
For a more thorough explanation of gain and phase margin, watch Brian Douglas’s
video on them.
We also recommend his videos on other classical control methods like Bode and Nyquist plots.
s-plane → z-plane
(0, 0) → (1, 0)
imaginary axis → edge of unit circle
(−∞, 0) → (0, 0)

Figure E.17: Mapping of complex plane from s-plane (left) to z-plane (right)
If the magnitude of the complex number z is less than one, $x_{k+1}$ will converge to zero. Values with a magnitude of 1 oscillate forever because $x_{k+1}$ never decays.
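These correspondences follow from the standard pole mapping z = e^{sT}, where T is the sample period. A small numeric sketch (the T and pole values are arbitrary):

import numpy as np

T = 0.02  # sample period in seconds (arbitrary for this sketch)
for s in (0 + 0j, -50 + 0j, 0 + 20j, 5 + 0j):
    z = np.exp(s * T)  # map an s-plane pole to the z-plane
    print(f"s = {s}: |z| = {abs(z):.3f}")
# s = 0 maps to z = 1; imaginary-axis poles land on the unit circle;
# stable poles land inside it, and unstable poles land outside it.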
Figure E.18: Single poles in various locations in z-plane

Figure E.19: Complex conjugate poles in various locations in z-plane
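The integration rules being compared are the standard forward Euler, backward Euler, and trapezoidal (bilinear) updates, restated here for reference:

$$\begin{aligned}
\text{Forward Euler:} \quad & y_{n+1} = y_n + Tf(t_n, y_n) \\
\text{Backward Euler:} \quad & y_{n+1} = y_n + Tf(t_{n+1}, y_{n+1}) \\
\text{Bilinear:} \quad & y_{n+1} = y_n + \frac{T}{2}(f(t_n, y_n) + f(t_{n+1}, y_{n+1}))
\end{aligned}$$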
where the function $f(t_n, y_n)$ is the slope of y at step n and T is the sample period for the
discrete system. Each of these methods is essentially finding the area underneath a
curve. The forward and backward Euler methods use rectangles to approximate that
area while the bilinear transform uses trapezoids (see figures E.21 and E.22). Since
these are approximations, there is distortion between the real discrete system’s poles and
the approximate poles. This is in addition to the phase loss introduced by discretizing
at a given sample rate in the first place. For fast-changing systems, this distortion can
quickly lead to instability.
Figures E.23, E.24, and E.25 show simulations of the same controller for different
sampling methods and sample rates, which have varying levels of fidelity to the real
system.
Forward Euler is numerically unstable for low sample rates. The bilinear transform is a
significant improvement due to it being a second-order approximation, but zero-order
hold performs best due to the matrix exponential including much higher orders.
Figure E.20: Zero-order hold of a system response

Figure E.21: Discretization methods applied to velocity data
Table E.3 compares the Taylor series expansions of several common discretization
methods (these are found using polynomial division). The bilinear transform does best
with accuracy trailing off after the third-order term. Forward Euler has no second-
order or higher terms, so it undershoots. Backward Euler has twice the second-order
term and overshoots the remaining higher order terms as well.
Figure E.23: Sampling methods for system simulation with T = 0.1 s

Figure E.24: Sampling methods for system simulation with T = 0.05 s
Table E.3: Taylor series expansions of discretization methods (scalar case). The zero-
order hold discretization method is exact.
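The comparison can be reproduced for the scalar system ẋ = ax by expanding each method's approximation of e^{aT} in powers of aT (a sketch of the table's content, not a copy of it):

$$\begin{aligned}
\text{Zero-order hold (exact):} \quad & e^{aT} = 1 + aT + \tfrac{(aT)^2}{2} + \tfrac{(aT)^3}{6} + \cdots \\
\text{Forward Euler:} \quad & 1 + aT \\
\text{Backward Euler:} \quad & \tfrac{1}{1 - aT} = 1 + aT + (aT)^2 + (aT)^3 + \cdots \\
\text{Bilinear:} \quad & \tfrac{1 + aT/2}{1 - aT/2} = 1 + aT + \tfrac{(aT)^2}{2} + \tfrac{(aT)^3}{4} + \cdots
\end{aligned}$$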
Glossary
control effort A term describing how much force, pressure, etc. an actuator is exerting.
control input The input of a plant used for the purpose of controlling it.
control law Also known as control policy: a mathematical formula used by the controller to determine the input u that is sent to the plant. This control law is designed to drive the system from its current state to some other desired state.
control system Monitors and controls the behavior of a system.
controller Applies an input to a plant to bring about a desired system state by driving
the difference between a reference signal and the output to zero.
feedback controller Used in positive or negative feedback with a plant to bring about a desired system state.
gain A proportional value that shows the relationship between the magnitude of an input signal and the magnitude of an output signal at steady-state.
gain margin See section E.3.7 on gain and phase margin.
model A set of mathematical equations that reflects some aspect of a physical system’s
behavior.
noise immunity The quality of a system to have its performance or stability unaffected
by noise in the outputs (see also: robustness).
observer In control theory, a system that estimates the internal state of a given real
system from measurements of the input and output of the real system.
open-loop gain The gain directly from the input to the output, ignoring loops.
output Measurements from sensors.
output-based control Controls the system’s state via the outputs.
overshoot The amount by which a system’s state surpasses the reference after rising
toward it.
time-invariant The system’s fundamental response does not change over time.
tracking In control theory, the process of making the output of a control system follow
the reference.
unity feedback A feedback network in a control system diagram with a feedback gain
of 1.
I wanted to fix the information disparity so more people could appreciate the beauty
and elegance I saw in control theory. This book streamlines the learning process to
make that possible.
I wrote the initial draft of this book as a final project for an undergraduate technical
writing class I took at UCSC in Spring 2017 (CMPE 185). It was a 13-page IEEE-
formatted paper intended as a reference manual and guide to state-space control
that summarized the three graduate controls classes I had taken that year. I kept
working on it the following year to flesh it out, and it eventually became long enough
to make into a proper book. I’ve been adding to it ever since as I learn new things.
I contextualized the material within FRC because it’s always been a significant part
of my life, and it’s a useful application sandbox. I maintain implementations of many
of this book’s tools in the FRC standard library (WPILib).
– Tyler Veness