
Controls Engineering in the

FIRST Robotics Competition


Graduate-level control theory for high schoolers

Tyler Veness
Copyright © 2017-2024 Tyler Veness
https://github.com/calcmogul/controls-engineering-in-frc
Licensed under the Creative Commons Attribution-ShareAlike 4.0 International Li-
cense (the “License”). You may not use this file except in compliance with the License.
You may obtain a copy of the License at https://creativecommons.org/licenses/by-sa/
4.0/. Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDI-
TIONS OF ANY KIND, either express or implied. See the License for the specific language
governing permissions and limitations under the License.
Generated from commit d8141f0 made on February 23, 2024. The latest version can
be downloaded from https://file.tavsys.net/control/controls-engineering-in-frc.pdf.
Tree next to Oakes Circle at UCSC

Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

0 Notes to the reader . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


0.1 Prerequisites 1
0.2 Structure of this book 1
0.3 Ethos of this book 2
0.4 Mindset of an egoless engineer 3
0.5 Request for feedback 3

I Fundamentals of control theory

1 Control system basics . . . . . . . . . . . . . . . . . . . . . . . . . . 7


1.1 What is gain? 8
1.2 Block diagrams 8
1.3 Open-loop and closed-loop systems 9
1.4 Feedforward 10

1.5 Why feedback control? 11

2 PID controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Proportional term 14
2.2 Derivative term 14
2.3 Integral term 16
2.4 PID controller definition 19
2.5 Response types 19
2.6 Manual tuning 20
2.7 Limitations 21

3 Application advice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1 Mechanical vs software solutions 23
3.2 Actuator saturation 24

4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.1 Derivatives 27
4.1.1 Power rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.2 Product rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.3 Chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.4 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Integrals 29
4.3 Change of variables 30
4.4 Tables 31
4.4.1 Common derivatives and integrals . . . . . . . . . . . . . . . . . . . . . . . . . 31

II Modern control theory

5 Linear algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1 Vectors 35
5.2 Linear combinations, span, and basis vectors 36
5.3 Linear transformations and matrices 36
5.4 Matrix multiplication 36

5.5 The determinant 37


5.6 Inverse matrices, column space, and null space 37
5.7 Nonsquare matrices 37
5.8 Eigenvectors and eigenvalues 38
5.9 Miscellaneous notation and operators 38
5.9.1 Special constant matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.9.2 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.10 Matrix definiteness 39
5.11 Common control theory matrix equations 40
5.11.1 Continuous algebraic Riccati equation (CARE) . . . . . . . . . . . . . . . . . 40
5.11.2 Discrete algebraic Riccati equation (DARE) . . . . . . . . . . . . . . . . . . . 41
5.11.3 Continuous Lyapunov equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.11.4 Discrete Lyapunov equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.12 Matrix calculus 42
5.12.1 Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.12.2 Hessian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.12.3 Useful identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

6 Continuous state-space control . . . . . . . . . . . . . . . . 45


6.1 From PID control to model-based control 45
6.2 What is a dynamical system? 46
6.3 Continuous state-space notation 47
6.3.1 What is state-space? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.3.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.4 Eigenvalues and stability 48
6.5 Closed-loop controller 50
6.6 Model augmentation 51
6.6.1 Plant augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.6.2 Controller augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.6.3 Observer augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.6.4 Output augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.6.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.7 Integral control 53
6.7.1 Plant augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.7.2 Input error estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

6.8 Double integrator 56


6.9 Elevator 57
6.9.1 Continuous state-space model . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.9.2 Model augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.9.3 Gravity feedforward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.9.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.9.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.10 Flywheel 60
6.10.1 Continuous state-space model . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.10.2 Model augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.10.3 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.10.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.10.5 Flywheel model without encoder . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.10.6 Voltage compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.10.7 Do flywheels need PD control? . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.11 Single-jointed arm 67
6.11.1 Continuous state-space model . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.11.2 Model augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.11.3 Gravity feedforward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.11.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.11.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.12 Controllability and observability 72
6.12.1 Controllability matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.12.2 Controllability Gramian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.12.3 Controllability of specific states . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.12.4 Stabilizability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.12.5 Observability matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.12.6 Observability Gramian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.12.7 Observability of specific states . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.12.8 Detectability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

7 Discrete state-space control . . . . . . . . . . . . . . . . . . . . 75


7.1 Continuous to discrete pole mapping 76
7.1.1 Discrete system stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.1.2 Discrete system behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.1.3 Nyquist frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.2 Effects of discretization on controller performance 78
7.2.1 Sample delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

7.2.2 Sample rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79


7.3 Linear system discretization 79
7.3.1 Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.3.2 Matrix exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
7.3.3 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
7.4 Discrete state-space notation 85
7.5 Closed-loop controller 85
7.6 Pole placement 86
7.7 Linear-quadratic regulator 87
7.7.1 The intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.7.2 The mathematical definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7.7.3 Bryson’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.7.4 Pole placement vs LQR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.8 Feedforward 92
7.8.1 Plant inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.8.2 Unmodeled dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.8.3 Do feedforwards affect stability? . . . . . . . . . . . . . . . . . . . . . . . . . . 96
7.9 Numerical integration methods 96
7.9.1 Butcher tableaus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.9.2 Forward Euler method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.9.3 Runge-Kutta fourth-order method . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.9.4 Dormand-Prince method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8 Nonlinear control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103


8.1 Introduction 103
8.2 Linearization 104
8.3 Lyapunov stability 104
8.3.1 Lyapunov stability for linear systems . . . . . . . . . . . . . . . . . . . . . . . 105
8.4 Affine systems 106
8.4.1 Feedback linearization for reference tracking . . . . . . . . . . . . . . . . . . 106
8.5 Pendulum 107
8.5.1 State-space model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.6 Holonomic drivetrains 109
8.6.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.6.2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.6.3 Holonomic vs nonholonomic control . . . . . . . . . . . . . . . . . . . . . . . . 110

8.7 Differential drive 110


8.7.1 Velocity subspace state-space model . . . . . . . . . . . . . . . . . . . . . . . 110
8.7.2 Heading state-space model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
8.7.3 Linear time-varying model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
8.7.4 Improving model accuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.7.5 Cross track error controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.7.6 Nonlinear observer design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.8 Ramsete unicycle controller 123
8.8.1 Velocity and turning rate command derivation . . . . . . . . . . . . . . . . . . 124
8.8.2 Nonlinear controller equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.8.3 Dimensional analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.9 Linear time-varying unicycle controller 129
8.10 Further reading 132

III Estimation and localization


9 Stochastic control theory . . . . . . . . . . . . . . . . . . . . . 135
9.1 Terminology 135
9.2 State observers 136
9.2.1 Luenberger observer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.2.2 Separation principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.3 Introduction to probability 140
9.3.1 Random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.3.2 Expected value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
9.3.3 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
9.3.4 Joint probability density functions . . . . . . . . . . . . . . . . . . . . . . . . . 142
9.3.5 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
9.3.6 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
9.3.7 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
9.3.8 Marginal probability density functions . . . . . . . . . . . . . . . . . . . . . . . 145
9.3.9 Conditional probability density functions . . . . . . . . . . . . . . . . . . . . . 145
9.3.10 Bayes’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.3.11 Conditional expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.3.12 Conditional variances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.3.13 Random vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.3.14 Covariance matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

9.3.15 Relations for independent random vectors . . . . . . . . . . . . . . . . . . . . 147


9.3.16 Gaussian random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.4 Linear stochastic systems 150
9.4.1 State vector expectation evolution . . . . . . . . . . . . . . . . . . . . . . . . . 150
9.4.2 Error covariance matrix evolution . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.4.3 Measurement vector expectation . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.4.4 Measurement covariance matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.5 Two-sensor problem 152
9.5.1 Two noisy (independent) observations . . . . . . . . . . . . . . . . . . . . . . 152
9.5.2 Single noisy observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
9.5.3 Single noisy observations: a Gaussian case . . . . . . . . . . . . . . . . . . . 153
9.6 Kalman filter 156
9.6.1 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
9.6.2 Predict and update equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.6.3 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
9.6.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
9.6.5 Kalman filter as Luenberger observer . . . . . . . . . . . . . . . . . . . . . . . 166
9.7 Kalman smoother 168
9.7.1 Predict and update equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
9.7.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
9.8 Extended Kalman filter 173
9.9 Unscented Kalman filter 175
9.9.1 Square-root UKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
9.10 Multiple model adaptive estimation 176

10 Pose estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177


10.1 Forward Euler integration 177
10.2 Pose exponential 177
10.2.1 What is a group? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.2.2 What is a pose? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.2.3 What is a twist? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.2.4 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
10.2.5 Lie groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
10.3 Pose correction 183

IV System modeling

11 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
11.1 Linear motion 187
11.2 Angular motion 187
11.3 Vectors 188
11.3.1 Basic vector operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
11.3.2 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
11.4 Curvilinear motion 189
11.5 Differential drive kinematics 190
11.5.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
11.5.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.6 Mecanum drive kinematics 193
11.6.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
11.6.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.7 Swerve drive kinematics 196
11.7.1 Inverse kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.7.2 Forward kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

12 Newtonian mechanics examples . . . . . . . . . . . . . . . 201


12.1 DC brushed motor 202
12.1.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
12.1.2 Calculating constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
12.1.3 Current limiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
12.2 Elevator 205
12.2.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
12.3 Flywheel 207
12.3.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
12.3.2 Calculating constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
12.4 Single-jointed arm 210
12.4.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
12.4.2 Calculating constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
12.5 Pendulum 213
12.5.1 Force derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
12.5.2 Torque derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

12.5.3 Energy derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215


12.6 Differential drive 217
12.6.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
12.6.2 Calculating constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

13 Lagrangian mechanics examples . . . . . . . . . . . . . . 223


13.1 Single-jointed arm 223
13.2 Double-jointed arm 223
13.3 Cart-pole 224

14 System identification . . . . . . . . . . . . . . . . . . . . . . . . . 225


14.1 Ordinary least squares 225
14.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
14.2 1-DOF mechanism feedforward model 227
14.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
14.2.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
14.2.3 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
14.2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
14.3 1-DOF mechanism state-space model 230
14.4 Drivetrain left/right velocity state-space model 231
14.5 Drivetrain linear/angular velocity state-space model 237

V Motion planning

15 Motion profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241


15.1 1-DOF motion profiles 241
15.1.1 Jerk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
15.1.2 Profile selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
15.1.3 Profile equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
15.1.4 Other profile types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
15.1.5 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
15.2 2-DOF motion profiles 248

16 Configuration spaces . . . . . . . . . . . . . . . . . . . . . . . . . 249


16.1 Introduction 249
16.2 Waypoint planning 250
16.3 Waypoint traversal 253
16.4 State machine validation with safe sets 253

17 Trajectory optimization . . . . . . . . . . . . . . . . . . . . . . . 255


17.1 Solving optimization problems with Sleipnir 255
17.1.1 Double integrator minimum time . . . . . . . . . . . . . . . . . . . . . . . . . . 256
17.1.2 Optimizing the problem formulation . . . . . . . . . . . . . . . . . . . . . . . . 259
17.2 Further reading 259

VI Appendices

A Simplifying block diagrams . . . . . . . . . . . . . . . . . . . . 263


A.1 Cascaded blocks 263
A.2 Combining blocks in parallel 263
A.3 Removing a block from a feedforward loop 264
A.4 Eliminating a feedback loop 265
A.5 Removing a block from a feedback loop 265

B Linear-quadratic regulator . . . . . . . . . . . . . . . . . . . . 267


B.1 Derivation 267
B.2 State feedback with output cost 270
B.3 Implicit model following 272
B.3.1 Following reference system matrix . . . . . . . . . . . . . . . . . . . . . . . . . 272
B.3.2 Following reference input matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 275
B.4 Time delay compensation 275
B.4.1 Continuous case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
B.4.2 Discrete case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278

C Feedforwards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
C.1 QR-weighted linear plant inversion 281
C.1.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
C.1.2 Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
C.2 Steady-state feedforward 283
C.2.1 Continuous case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
C.2.2 Discrete case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
C.2.3 Deriving steady-state input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

D Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
D.1 Linear system zero-order hold 289
D.2 Kalman filter as Luenberger observer 291
D.2.1 Luenberger observer with separate prediction and update . . . . . . . . . . 292

E Classical control theory . . . . . . . . . . . . . . . . . . . . . . . 293


E.1 Classical vs modern control theory 294
E.2 Transfer functions 294
E.2.1 Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
E.2.2 Parts of a transfer function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
E.2.3 Transfer functions in feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
E.3 Laplace domain analysis 302
E.3.1 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
E.3.2 Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
E.3.3 Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
E.3.4 Laplace transform definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
E.3.5 Steady-state error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
E.3.6 Do flywheels need PD control? . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
E.3.7 Gain margin and phase margin . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
E.4 s-plane to z-plane 317
E.4.1 Discrete system stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
E.4.2 Discrete system behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
E.5 Discretization methods 319
E.6 Phase loss 322
E.7 System identification with Bode plots 322

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Online 327

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Flower bush near Spencer’s Market in Santa Maria, CA

Preface

When I was a high school student on FIRST Robotics Competition (FRC) team 3512,
I had to learn control theory from scattered internet sources that either weren’t rigorous
enough or assumed too much prior knowledge. After I took graduate-level control
theory courses from the University of California, Santa Cruz for my bachelor’s degree, I
realized that the concepts weren’t difficult when presented well, but the information
wasn’t broadly accessible outside academia.
I wanted to fix the information disparity so more people could appreciate the beauty
and elegance I saw in control theory. This book streamlines the learning process to
make that possible.
I wrote the initial draft of this book as a final project for an undergraduate technical
writing class I took at UCSC in Spring 2017 (CMPE 185). It was a 13-page IEEE-
formatted paper intended as a reference manual and guide to state-space control that
summarized the three graduate controls classes I had taken that year. I kept working
on it the following year to flesh it out, and it eventually became long enough to make
into a proper book. I’ve been adding to it ever since as I learn new things.
I contextualized the material within FRC because it’s always been a significant part of
my life, and it’s a useful application sandbox. I’ve since contributed implementations
of many of this book’s tools to the FRC standard library (WPILib) and maintain them.

Acknowledgements
Thanks to my controls engineering instructors Dejan Milutinović and Gabriel Elkaim
for introducing me to the field of control theory. They taught me what it means to be
a controls engineer, and I appreciated their classes’ pragmatic perspective focused on
intuition and application.
Trees by Baskin Engineering building at UCSC

0. Notes to the reader

0.1 Prerequisites
Knowledge of basic algebra and complex numbers is assumed. Some introductory
physics and calculus will be taught as necessary.

0.2 Structure of this book


This book consists of five parts and a collection of appendices that address the four
tasks a controls engineer carries out: derive a model of the system (kinematics), design
a controller for the model (control theory), design an observer to estimate the current
state of the model (localization), and plan how the controller is going to drive the model
to a desired state (motion planning).
Part I “Fundamentals of control theory” introduces the basics of control theory and
teaches the fundamentals of PID controller design.
Part II “Modern control theory” first provides a crash course in the geometric intuition
behind linear algebra and covers enough of the mechanics of evaluating matrix algebra
for the reader to follow along in later chapters. It covers state-space representation, con-
trollability, and observability. The intuition gained in part I and the notation of linear
algebra are used to model and control linear multiple-input, multiple-output (MIMO)
systems and covers discretization, LQR controller design, LQE observer design, and
feedforwards. Then, these concepts are applied to design and implement controllers for
real systems. The examples from part IV are converted to state-space representation,
implemented, and tested with a discrete controller.
Part II also introduces the basics of nonlinear control system analysis with Lyapunov
functions. It presents an example of a nonlinear controller for a unicycle-like vehicle as
well as how to apply it to a two-wheeled vehicle. Since nonlinear control isn’t the focus
of this book, we mention other books and resources for further reading.
Part III “Estimation and localization” introduces the field of stochastic control theory.
The Luenberger observer and the probability theory behind the Kalman filter is taught
with several examples of creative applications of Kalman filter theory.
Part IV “System modeling” introduces the basic calculus and physics concepts required
to derive the models used in the previous chapters. It walks through the derivations for
several common FRC subsystems. Then, methods for system identification are dis-
cussed for empirically measuring model parameters.
Part V “Motion planning” covers planning how the robot will get from its current state
to some desired state in a manner achievable by its dynamics. It introduces motion
profiles with one degree of freedom for simple maneuvers. Trajectory optimization
methods are presented for generating profiles with higher degrees of freedom.
The appendices provide further enrichment that isn’t required for a passing understand-
ing of the material. This includes derivations for many of the results presented and
used in the mainmatter of the book.
The Python scripts used to generate the plots in the case studies double as reference
implementations of the techniques discussed in their respective chapters. They are
available in this book’s Git repository. Its location is listed on the copyright page.

0.3 Ethos of this book


This book is intended as both a tutorial for new students and as a reference manual for
more experienced readers who need to review a thing or two. While it isn’t comprehen-
sive, the reader will hopefully learn enough to either implement the concepts presented
themselves or know where to look for more information.
Some parts are mathematically rigorous, but I believe in giving students a solid theo-
retical foundation with emphasis on intuition so they can apply it to new problems. To
achieve deep understanding of the topics in this book, math is unavoidable. With that
said, I try to provide practical and intuitive explanations whenever possible.
Most teaching resources separate linear and nonlinear control with the latter being re-
served for a different course. Here, they are introduced together because the concepts
of nonlinear control apply often, and it isn’t that much of a leap (if Lyapunov stability
isn’t included). The control and estimation chapters cover relevant tools for dealing
with nonlinearities like linearization when appropriate.

0.4 Mindset of an egoless engineer


Engineering has a mindset, not just a skillset. Engineers have a unique way of approach-
ing problems, and the following maxim summarizes what I hope to teach my robotics
students (with examples drawn from controls engineering).
“Engineer based on requirements, not an ideology.”
Engineering is filled with trade-offs. The tools should fit the job, and not every problem
is a nail waiting to be struck by a hammer. Instead, assess the minimum requirements
(min specs) for a solution to the task at hand and do only enough work to satisfy them;
exceeding your specifications is a waste of time and money. If you require performance
or maintainability above the min specs, your min specs were chosen incorrectly by
definition.
Controls engineering is pragmatic in a similar respect: solve. the. problem. For con-
trol of nonlinear systems, plant inversion is elegant on paper but doesn’t work with
an inaccurate model, yet using a theoretically incorrect solution like linear approxima-
tions of the nonlinear system works well enough to be used industry-wide. There are
more sophisticated controllers than PID, but we use PID anyway for its versatility and
simplicity. Sometimes the inferior solutions are more effective or have a more desir-
able cost-benefit ratio than what the control system designer considers ideal or clean.
Choose the tool that is most effective.
Solutions need to be good enough, but do not need to be perfect. We want to avoid
integrators as they introduce instability, but we use them anyway because they work
well for meeting tracking specifications. One should not blindly defend a design or fol-
low an ideology, because there is always a case where its antithesis is a better option.
The engineer should be able to determine when this is the case, set aside their ego, and
do what will meet the specifications of their client (e.g., system response characteris-
tics, maintainability, usability). Preferring one solution over another for pragmatic or
technical reasons is fine, but the engineer should not care on a personal level which
sufficient solution is chosen.

0.5 Request for feedback


While we have tried to write a book that makes the topics of control theory approach-
able, it still may be dense or fast-paced for some readers (it covers three classes of
feedback control, two of which are for graduate students, in one short book). Please
send us feedback, corrections, or suggestions through the GitHub link listed on the
copyright page. New examples that demonstrate key concepts and make them more
accessible are also appreciated.
I
Fundamentals of
control theory

1 Control system basics . . . . . . . . . . . . . . . . 7

2 PID controllers . . . . . . . . . . . . . . . . . . . . . . 13

3 Application advice . . . . . . . . . . . . . . . . . . 23

4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Road near walking trail off of Rice Ranch Road in Santa Maria, CA

1. Control system basics

Control systems are all around us and we interact with them daily. A small list of ones
you may have seen includes heaters and air conditioners with thermostats, cruise control
and the anti-lock braking system (ABS) on automobiles, and fan speed modulation
on modern laptops. Control systems monitor or control the behavior of systems like
these and may consist of humans controlling them directly (manual control), or of only
machines (automatic control).
How can we prove closed-loop controllers on an autonomous car, for example, will
behave safely and meet the desired performance specifications in the presence of un-
certainty? Control theory is an application of algebra and geometry used to analyze
and predict the behavior of systems, make them respond how we want them to, and
make them robust to disturbances and uncertainty.
Controls engineering is, put simply, the engineering process applied to control theory.
As such, it’s more than just applied math. While control theory has some beautiful math
behind it, controls engineering is an engineering discipline like any other that is filled
with trade-offs. The solutions control theory gives should always be sanity checked and
informed by our performance specifications. We don’t need to be perfect; we just need
to be good enough to meet our specifications.[1]

[1] See section 0.4 for more on engineering.

R Most resources for advanced engineering topics assume a level of knowledge
well above that which is necessary. Part of the problem is the use of jargon.
While it efficiently communicates ideas to those within the field, new people
who aren’t familiar with it are lost. Therefore, it’s important to define terms
before using them. See the glossary for a list of words and phrases commonly
used in control theory, their origins, and their meaning. Links to the glossary
are provided for certain words throughout the book and will use this color.

1.1 What is gain?


Gain is a proportional value that shows the relationship between the magnitude of an
input signal and the magnitude of an output signal at steady-state. Many systems contain a
method by which the gain can be altered, providing more or less “power” to the system.
Figure 1.1 shows a system with a hypothetical input and output. Since the output is
twice the amplitude of the input, the system has a gain of 2.

Figure 1.1: Demonstration of system with a gain of K = 2

1.2 Block diagrams


When designing or analyzing a control system, it is useful to model it graphically. Block
diagrams are used for this purpose. They can be manipulated and simplified systemat-
ically (see appendix A). Figure 1.2 is an example of one.

Figure 1.2: Block diagram with nomenclature


The open-loop gain is the total gain from the sum node at the input (the circle) to the
output branch. This would be the system’s gain if the feedback loop was disconnected.
The feedback gain is the total gain from the output back to the input sum node. A sum
node’s output is the sum of its inputs.
Figure 1.3 is a block diagram with more formal notation in a feedback configuration.


Figure 1.3: Feedback block diagram

∓ means “minus or plus” where a minus represents negative feedback.

1.3 Open-loop and closed-loop systems


The system or collection of actuators being controlled by a control system is called the
plant. A controller is used to drive the plant from its current state to some desired state
(the reference). We’ll be using the following notation for relevant quantities in block
diagrams.

r(t) reference u(t) control input


e(t) error y(t) output
Controllers which don’t include information measured from the plant’s output are called
open-loop or feedforward controllers. Figure 1.4 shows a plant with a feedforward
controller.


Figure 1.4: Open-loop control system

Controllers which incorporate information fed back from the plant’s output are called
closed-loop or feedback controllers. Figure 1.5 shows a plant with a feedback controller.

Figure 1.5: Closed-loop control system

Note that the input and output of a system are defined from the plant’s point of view.
The negative feedback controller shown is driving the difference between the reference
and output, also known as the error, to zero.
Figure 1.6 shows a plant with feedforward and feedback controllers.


Figure 1.6: Control system with feedforward and feedback

1.4 Feedforward
Feedback control can be effective for reference tracking (making a system’s output
follow a desired reference signal), but it’s a reactionary measure; the system won’t start
applying control effort until the system is already behind. If we could tell the con-
troller about the desired movement and required input beforehand, the system could
react quicker and the feedback controller could do less work. A controller that feeds
information forward into the plant like this is called a feedforward controller.
A feedforward controller injects information about the system’s dynamics (like a model
does) or the desired movement. The feedforward handles parts of the control actions
we already know must be applied to make a system track a reference, then feedback
compensates for what we do not or cannot know about the system’s behavior at runtime.
There are two types of feedforwards: model-based feedforward and feedforward for
unmodeled dynamics. The first solves a mathematical model of the system for the
inputs required to meet desired velocities and accelerations. The second compensates
for unmodeled forces or behaviors directly so the feedback controller doesn’t have to.
Both types can facilitate simpler feedback controllers; we’ll cover examples of each in
later chapters.
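As a rough sketch of how the two combine in software (the gains, values, and function below are hypothetical placeholders, not from any particular library), the feedforward and feedback contributions are simply summed to form the control input:

# Sketch of combining feedforward and feedback. kV, kA, and Kp are
# hypothetical gains; a real implementation would get them from a model
# or from system identification.
kV = 2.0  # volts per unit of velocity
kA = 0.2  # volts per unit of acceleration
Kp = 3.0  # feedback gain

def control_input(velocity_setpoint, accel_setpoint, measured_velocity):
    # Feedforward: the input we already know the desired motion requires
    u_ff = kV * velocity_setpoint + kA * accel_setpoint
    # Feedback: compensates for what the model didn't capture
    u_fb = Kp * (velocity_setpoint - measured_velocity)
    return u_ff + u_fb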

1.5 Why feedback control?


Let’s say we are controlling a DC brushed motor. With just a mathematical model and
knowledge of all current states of the system (i.e., angular velocity), we can predict all
future states given the future voltage inputs. Why then do we need feedback control?
If the system is disturbed in any way that isn’t modeled by our equations, like a load was
applied to the armature, or voltage sag in the rest of the circuit caused the commanded
voltage to not match the actual applied voltage, the angular velocity of the motor will
deviate from the model over time.
To combat this, we can take measurements of the system and the environment to detect
this deviation and account for it. For example, we could measure the current position
and estimate an angular velocity from it. We can then give the motor corrective com-
mands as well as steer our model back to reality. This feedback allows us to account
for uncertainty and be robust to it.
On trail between McHenry Library and Media Theater at UCSC

2. PID controllers

The PID controller is a commonly used feedback controller consisting of proportional,


integral, and derivative terms, hence the name. This chapter will build up the definition
of a PID controller term by term while trying to provide intuition for how each of them
behaves.
First, we’ll get some nomenclature for PID controllers out of the way. The reference
is called the setpoint (the desired position) and the output is called the process variable
(the measured position). Below are some common variable naming conventions for
relevant quantities.

r(t) setpoint u(t) control input


e(t) error y(t) output
The error e(t) is r(t) − y(t).
For those already familiar with PID control, this book’s interpretation won’t be con-
sistent with the classical intuition of “past”, “present”, and “future” error. We will be
approaching it from the viewpoint of modern control theory with proportional con-
trollers applied to different physical quantities we care about. This will provide a more
complete explanation of the derivative term’s behavior for constant and moving set-
points, and this intuition will carry over to the modern control methods covered later
in this book.
The proportional term drives the position error to zero, the derivative term drives the
velocity error to zero, and the integral term accumulates the area between the setpoint
and output plots over time (the integral of position error) and adds the current total to
the control input. We’ll go into more detail on each of these.

2.1 Proportional term


The Proportional term drives the position error to zero.

Definition 2.1.1 — Proportional controller.

u(t) = Kp e(t) (2.1)

where Kp is the proportional gain and e(t) is the error at the current time t.

Figure 2.1 shows a block diagram for a system controlled by a P controller.


Figure 2.1: P controller block diagram

Proportional gains act like “software-defined springs” that pull the system toward the
desired position. Recall from physics that we model springs as F = −kx where F
is the force applied, k is a proportional constant, and x is the displacement from the
equilibrium point. This can be written another way as F = k(0 − x) where 0 is the
equilibrium point. If we let the equilibrium point be our feedback controller’s setpoint,
the equations have a one-to-one correspondence.

F = k(r − x)
u(t) = Kp e(t) = Kp (r(t) − y(t))

so the “force” with which the proportional controller pulls the system’s output toward
the setpoint is proportional to the error, just like a spring.
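In code, a proportional controller is a single multiplication inside the control loop. The sketch below is a minimal illustration; the gain value is a placeholder rather than a tuned value for a real mechanism.

Kp = 2.0  # proportional gain (placeholder value)

def p_controller(setpoint, measurement):
    error = setpoint - measurement
    return Kp * error  # u(t) = Kp e(t)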

2.2 Derivative term


The Derivative term drives the velocity error to zero.
Definition 2.2.1 — PD controller.

u(t) = Kp e(t) + Kd de/dt    (2.2)

where Kp is the proportional gain, Kd is the derivative gain, and e(t) is the error
at the current time t.

Figure 2.2 shows a block diagram for a system controlled by a PD controller.


Figure 2.2: PD controller block diagram

A PD controller has a proportional controller for position (Kp ) and a proportional con-
troller for velocity (Kd ). The velocity setpoint is implicitly provided by how the posi-
tion setpoint changes over time. Figure 2.3 shows an example for an elevator.
To prove a PD controller is just two proportional controllers, we will rearrange the
equation for a PD controller.
uk = Kp ek + Kd (ek − ek−1)/dt
where uk is the control input at timestep k and ek is the error at timestep k. ek is defined
as ek = rk − yk where rk is the setpoint and yk is the current output at timestep k.

uk = Kp (rk − yk) + Kd ((rk − yk) − (rk−1 − yk−1))/dt
uk = Kp (rk − yk) + Kd (rk − yk − rk−1 + yk−1)/dt
uk = Kp (rk − yk) + Kd (rk − rk−1 − yk + yk−1)/dt
uk = Kp (rk − yk) + Kd ((rk − rk−1) − (yk − yk−1))/dt

Figure 2.3: PD controller on an elevator

 
uk = Kp (rk − yk) + Kd ((rk − rk−1)/dt − (yk − yk−1)/dt)

Notice how (rk − rk−1)/dt is the velocity of the setpoint. By the same reasoning,
(yk − yk−1)/dt is the system’s velocity at a given timestep. That means the Kd term of the PD controller
is driving the estimated velocity to the setpoint velocity.
If the setpoint is constant, the implicit velocity setpoint is zero, so the Kd term slows
the system down if it’s moving. This acts like a “software-defined damper”. These
are commonly seen on door closers, and their damping force increases linearly with
velocity.
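The rearranged form above maps directly to code. The sketch below (the gains and period are placeholders) computes the implicit velocity setpoint and the measured velocity explicitly to show that Kd acts as a proportional controller on velocity; the caller keeps the previous setpoint and measurement between iterations.

Kp = 2.0   # position gain (placeholder)
Kd = 0.5   # velocity gain (placeholder)
dt = 0.02  # controller period in seconds

def pd_controller(setpoint, measurement, prev_setpoint, prev_measurement):
    # Proportional controller on position error
    u = Kp * (setpoint - measurement)

    # Proportional controller on velocity error; the velocity setpoint is
    # implied by how the position setpoint changes between timesteps
    setpoint_velocity = (setpoint - prev_setpoint) / dt
    measured_velocity = (measurement - prev_measurement) / dt
    u += Kd * (setpoint_velocity - measured_velocity)

    return u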

2.3 Integral term


The Integral term accumulates the area between the setpoint and output plots over time
(i.e., the integral of position error) and adds the current total to the control input. Ac-
cumulating the area between two curves is called integration.
Definition 2.3.1 — PI controller.


u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ    (2.3)

where Kp is the proportional gain, Ki is the integral gain, e(t) is the error at the
current time t, and τ is the integration variable.
The integral integrates from time 0 to the current time t. We use τ for the integration
because we need a variable to take on multiple values throughout the integral, but
we can’t use t because we already defined that as the current time.

Figure 2.4 shows a block diagram for a system controlled by a PI controller.


Figure 2.4: PI controller block diagram

When the system is close to the setpoint in steady-state, the proportional term may be
too small to pull the output all the way to the setpoint, and the derivative term is zero.
This can result in steady-state error, as shown in figure 2.5.
A common way of eliminating steady-state error is to integrate the error and add it to
the control input. This increases the control effort until the system converges. Figure
2.5 shows an example of steady-state error for a flywheel, and figure 2.6 shows how
an integrator added to the flywheel controller eliminates it. However, too high of an
integral gain can lead to overshoot, as shown in figure 2.7.

R There are better approaches to fix steady-state error like using feedforwards or
constraining when the integral control acts using other knowledge of the system.
We will discuss these in more detail when we get to modern control theory.
Figure 2.5: P controller on a flywheel with steady-state error

Figure 2.6: PI controller on a flywheel without steady-state error
Figure 2.7: PI controller on a flywheel with overshoot from large Ki gain
2.4 PID controller definition


When these three terms are combined, one gets the typical definition for a PID con-
troller.

Definition 2.4.1 — PID controller.


u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ + Kd de/dt    (2.4)

where Kp is the proportional gain, Ki is the integral gain, Kd is the derivative gain,
e(t) is the error at the current time t, and τ is the integration variable.

Figure 2.8 shows a block diagram for a system controlled by a PID controller.


Figure 2.8: PID controller block diagram
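Definition 2.4.1 discretizes naturally by replacing the integral with a running sum and the derivative with a finite difference. The class below is a minimal sketch of that idea (it is not WPILib’s PIDController, and the gains and update period would be placeholders to tune for a real mechanism).

class PIDController:
    """Minimal discrete PID controller sketch."""

    def __init__(self, Kp, Ki, Kd, dt):
        self.Kp = Kp
        self.Ki = Ki
        self.Kd = Kd
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def calculate(self, setpoint, measurement):
        error = setpoint - measurement

        # Running sum approximates the integral of the error
        self.integral += error * self.dt

        # Finite difference approximates the derivative of the error
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error

        return (self.Kp * error
                + self.Ki * self.integral
                + self.Kd * derivative)

A controller constructed as PIDController(Kp=2.0, Ki=0.5, Kd=0.1, dt=0.02) would then have calculate() called once per period with the latest setpoint and measurement.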

2.5 Response types


A system driven by a PID controller generally has three types of responses: under-
damped, overdamped, and critically damped. These are shown in figure 2.9.
For the step responses in figure 2.9, rise time is the time the system takes to initially
reach the reference after applying the step input. Settling time is the time the system
takes to settle at the reference after the step input is applied.
An underdamped response oscillates around the reference before settling. An over-
damped response is slow to rise and does not overshoot the reference. A critically
damped response has the shortest rise time without oscillating around the reference
(i.e., overshooting then undershooting).
Figure 2.9: PID controller response types

2.6 Manual tuning


These steps apply to position PID controllers. Velocity PID controllers typically don’t
need Kd .
1. Set Kp , Ki , and Kd to zero.
2. Increase Kp until the output starts to oscillate around the setpoint.
3. Increase Kd as much as possible without introducing jittering in the system re-
sponse.
If the setpoint follows a trapezoid profile (see section 15.1), tuning becomes a lot easier.
Plot the position setpoint, velocity setpoint, measured position, and measured velocity.
The velocity setpoint can be obtained via numerical differentiation of the position set-
point (i.e., vdesired,k = (rk − rk−1)/∆t). Increase Kp until the position tracks well, then
increase Kd until the velocity tracks well.
If the controller settles at an output above or below the setpoint, one can increase Ki
such that the controller reaches the setpoint in a reasonable amount of time. However,
a steady-state feedforward is strongly preferred over integral control (especially for ve-
locity PID control).

R Note: Adding an integral gain to the controller is an incorrect way to eliminate
steady-state error. A better approach would be to tune it with an integrator
added to the plant, but this requires a model. Since we are doing output-based
rather than model-based control, our only option is to add an integrator to the
controller.

Beware that if Ki is too large, integral windup can occur. Following a large change in
setpoint, the integral term can accumulate an error larger than the maximal control in-
put. As a result, the system overshoots and continues to increase until this accumulated
error is unwound.
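One common mitigation (a sketch of the idea, not the only or best anti-windup scheme) is to clamp the accumulated error so the integral term alone can never demand more than the actuator’s maximum output. The gains and limit below are placeholders, and the caller keeps the integral between iterations.

Ki = 0.8      # integral gain (placeholder)
dt = 0.02     # controller period in seconds
u_max = 12.0  # actuator limit, e.g., battery voltage

def clamped_integral(integral, error):
    # Accumulate the error, then clamp so Ki * integral can't exceed u_max
    integral += error * dt
    limit = u_max / Ki
    return max(-limit, min(limit, integral))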

2.7 Limitations
PID’s heuristic method of tuning is a reasonable choice when there is no a priori knowl-
edge of the system dynamics. However, controllers with much better response can be
developed if a dynamical model of the system is known. Furthermore, PID only applies
to single-input, single-output (SISO) systems; we’ll cover methods for multiple-input,
multiple-output (MIMO) control in part II of this book.
Treeline by Crown/Merril bus stop at UCSC

3. Application advice

3.1 Mechanical vs software solutions


While this book focuses on controls engineering applications in FRC, controls isn’t
necessary to have a successful robot; FRC is primarily a mechanical competition, not
a software competition. We spend over six weeks designing and building a robot, but
we often allocate just two days for software testing. Despite this, teams still field com-
petitive robots, which is a testament to how simple and easy to use the FRC software
ecosystem is nowadays. Focus on a reliable mechanical design first because a good
mechanical design can succeed with simple software, but the robot can’t succeed with
unreliable hardware.
The solution to a design problem may be a tradeoff between mechanical and software
complexity. For example, for a mechanism that only needs two positions, a solenoid
connected to a pneumatic actuator may take less effort and be more reliable than a
motor, rotary encoder, and software feedback control. If one can get the software
solution working though, the robot may not need the added space and weight of a
compressor and air tanks.
My rule of thumb for evaluating designs is to prefer elegant mechanical solutions over
comprehensive software solutions because it’s easier to make mechanical solutions re-
liable in competition. Well-placed sensors can also drastically improve robot perfor-
mance and reduce driver cognitive load. An example would be a limit switch and
match timer for automatically deploying minibots in the 2011 FRC game as soon as
the endgame starts. In many cases, manual processes can be automated later and given
a manual fallback if the associated software or sensor fails. Be cautious with designs
that require closed-loop control to function.
When should problems be solved in hardware instead of software with clever controls?
Controls can handle disturbances like battery voltage drop or measurement noise, but
there are limits. For example, there’s nothing software can do to work around a drive-
train gearbox seizing up or throwing a chain. Sometimes, you’re better off just fixing
the root cause in hardware.
Design robot mechanisms for controllability. FRC team 971’s “Mechanical Design for
Controllability” seminar goes into more detail. Two of the important takeaways from
it are:
• Reduce gearbox backlash
• Choose motors and gear reductions that provide adequate control authority

“Spartan Series / Mechanical Design for Controllability” (1 hour, 31 minutes)


Travis Schuh
https://youtu.be/VNfFn-gcfFI

Remember, “fix it in software” isn’t always the answer. The remaining chapters of this
book assume you’ve done the engineering analysis and concluded that your chosen de-
sign would benefit from more sophisticated controls, and you have the time or expertise
to make it work.

3.2 Actuator saturation


A feedback controller calculates its output based on the error between the reference
and the current state. Plants in the real world don’t have unlimited control authority
available for the feedback controller to apply. When the actuator limits are reached,
the feedback controller acts as if the gain has been temporarily reduced (i.e., it will
have reduced control authority).
We’ll try to explain this through a bit of math. Let’s say we have a controller u =
k(r − x) where u is the control effort, k is the gain, r is the reference, and x is the
current state. Let umax be the limit of the actuator’s output which is less than the
uncapped value of u and kmax be the associated maximum gain. We will now compare
the capped and uncapped controllers for the same reference and current state.

umax < u
kmax (r − x) < k(r − x)
kmax < k

For the inequality to hold, kmax must be less than the original value for k. This reduced
gain is evident in a system response when there is a linear change in state instead of
an exponential one as it approaches the reference. This is due to the control effort no
longer following a decaying exponential plot. Once the system is closer to the reference,
the controller will stop saturating and produce realistic controller values again.
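The following sketch makes the gain reduction concrete with made-up numbers: once the computed control effort is clamped at the actuator limit, the gain effectively applied to the error is much smaller than k.

k = 10.0      # controller gain
u_max = 12.0  # actuator limit, e.g., 12 V from the battery

r = 5.0  # reference
x = 0.0  # current state

u = k * (r - x)                         # uncapped control effort: 50
u_applied = max(-u_max, min(u_max, u))  # saturates at 12

k_effective = u_applied / (r - x)       # 2.4, much less than k
print(u, u_applied, k_effective)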
Hills by northbound freeway between Santa Maria and Ventura

4. Calculus

This book uses derivatives and integrals occasionally to represent small changes in val-
ues over small changes in time and the infinitesimal sum of values over time respec-
tively. The formulas and tables presented here are all you’ll need to carry through with
the math in later chapters.
If you are interested in more information after reading this chapter, 3Blue1Brown does
a fantastic job of introducing them in his Essence of calculus series. We recommend
reading lesson chapters 1 through 3, 5, and 10 through 13 for a solid foundation.

“Essence of calculus”
3Blue1Brown
https://www.3blue1brown.com/topics/calculus

4.1 Derivatives
Derivatives are expressions for the slope of a curve at arbitrary points. Common nota-
tions for this operation on a function like f (x) include
Leibniz notation      Lagrange notation
d/dx f(x)             f′(x)
d²/dx² f(x)           f″(x)
d³/dx³ f(x)           f‴(x)
d⁴/dx⁴ f(x)           f^(4)(x)
dⁿ/dxⁿ f(x)           f^(n)(x)

Table 4.1: Notation for derivatives of f(x)

Lagrange notation is usually voiced as “f prime of x”, “f double-prime of x”, etc.

4.1.1 Power rule

f(x) = x^n
f′(x) = nx^(n−1)

4.1.2 Product rule


This is for taking the derivative of the product of two expressions.

h(x) = f (x)g(x)
h′ (x) = f ′ (x)g(x) + f (x)g ′ (x)

4.1.3 Chain rule


This is for taking the derivative of nested expressions.

h(x) = f (g(x))
h′ (x) = f ′ (g(x)) · g ′ (x)

For example,

h(x) = (3x + 2)⁵
h′(x) = 5(3x + 2)⁴ · (3)
h′(x) = 15(3x + 2)⁴
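If a computer algebra system is handy, the example above can be checked mechanically. This is just a sanity-check sketch using SymPy, which isn't otherwise required by this book.

import sympy as sp

x = sp.symbols("x")
h = (3 * x + 2) ** 5

# SymPy applies the chain rule for us.
print(sp.diff(h, x))  # 15*(3*x + 2)**4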
4.1.4 Partial derivatives


A partial derivative of a function of several variables is its derivative with respect to one
of those variables, with the others held constant (as opposed to the total derivative, in
which all variables are allowed to vary). Partial derivatives use ∂ instead of d in Leibniz
notation.
For example, let h(x, y) = 3xy + 2x. For the partial derivative with respect to x, y is
treated as a constant.
∂h(x, y)/∂x = 3y + 2
For the partial derivative with respect to y, x is treated as a constant, so the second
term becomes zero.
∂h(x, y)/∂y = 3x

4.2 Integrals
The integral is the inverse operation of the derivative and calculates the area under a
curve. Here is an example of one based on table 4.2.
∫ e^(at) dt = (1/a)e^(at) + C

The arbitrary constant C is needed because when you take a derivative, constants are
discarded because vertical offsets don’t affect the slope. When performing the inverse
operation, we don’t have enough information to determine the constant.
However, we can provide bounds for the integration.
∫₀ᵗ e^(at) dt

= [(1/a)e^(at) + C]₀ᵗ
= ((1/a)e^(at) + C) − ((1/a)e^(a·0) + C)
= ((1/a)e^(at) + C) − ((1/a) + C)
= (1/a)e^(at) + C − (1/a) − C
= (1/a)e^(at) − (1/a)

When we do this, the constant cancels out.
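A quick numerical check of the result above, assuming a = 2 and t = 1 and using SciPy's quadrature routine (an assumption about the reader's environment):

import numpy as np
from scipy.integrate import quad

a, t = 2.0, 1.0

# Closed form from above: (1/a)e^(at) − 1/a.
closed_form = np.exp(a * t) / a - 1.0 / a

# Numerical quadrature of the same integral.
numeric, _ = quad(lambda tau: np.exp(a * tau), 0.0, t)

print(closed_form, numeric)  # both ≈ 3.1945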

4.3 Change of variables


Change of variables is a technique for simplifying problems in which expressions are
replaced with new variables to make the problem more tractable. This can mean either
the problem is more straightforward or it matches a common form for which tools for
finding solutions are readily available. Here’s an example of integration which utilizes
it.

∫ cos ωt dt

Let u = ωt.

du = ω dt
dt = (1/ω) du

Now substitute the expressions for u and dt in.


∫ cos u (1/ω) du
= (1/ω) ∫ cos u du
= (1/ω) sin u + C
= (1/ω) sin ωt + C

Another example, which will be relevant when we actually cover state-space notation
(ẋ = Ax + Bu), is a closed-loop state-space system.

ẋ = (A − BK)x + BKr
ẋ = Acl x + Bcl ucl

where Acl = A − BK, Bcl = BK, and ucl = r. Since it matches the form of the
open-loop system, all the same analysis tools will work with it.
4.4 Tables
4.4.1 Common derivatives and integrals
∫ f(x) dx          f(x)        f′(x)
ax                 a           0
(1/2)ax²           ax          a
x^(a+1)/(a+1)      x^a         ax^(a−1)
(1/a)e^(ax)        e^(ax)      ae^(ax)
−cos(x)            sin(x)      cos(x)
sin(x)             cos(x)      −sin(x)
cos(x)             −sin(x)     −cos(x)
−sin(x)            −cos(x)     sin(x)

Table 4.2: Common derivatives and integrals


II  Modern control theory

5 Linear algebra . . . . . . . . . . . . . . . . . . . . . . 35

6 Continuous state-space control . . . . . . 45

7 Discrete state-space control . . . . . . . . . 75

8 Nonlinear control . . . . . . . . . . . . . . . . . . . 103


Grass clearing by Interdisciplinary Sciences building and Thimann La

5. Linear algebra

Modern control theory borrows concepts from linear algebra. At first, linear algebra
may appear very abstract, but there are simple geometric intuitions underlying it. We’ll
be linking to 3Blue1Brown’s Essence of linear algebra video series because it’s better
at conveying that intuition than static text.

“Essence of linear algebra preview” (5 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/eola-preview

5.1 Vectors

“Vectors, what even are they?” (5 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/vectors

5.2 Linear combinations, span, and basis vectors

“Linear combination, span, and basis vectors” (10 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/span

5.3 Linear transformations and matrices

“Linear transformations and matrices” (11 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/linear-transformations

5.4 Matrix multiplication



“Matrix multiplication as composition” (10 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/matrix-multiplication

5.5 The determinant

“The determinant” (10 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/determinant

5.6 Inverse matrices, column space, and null space

“Inverse matrices, column space, and null space” (12 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/inverse-matrices

5.7 Nonsquare matrices



“Nonsquare matrices as transformations between dimensions” (4 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/nonsquare-matrices

5.8 Eigenvectors and eigenvalues

“Eigenvectors and eigenvalues” (17 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/eigenvalues

5.9 Miscellaneous notation and operators


This book works with two-dimensional matrices in the sense that they only have rows
and columns. The dimensionality of these matrices is specified by row first, then col-
umn. For example, a matrix with two rows and three columns would be a two-by-three
matrix. A square matrix has the same number of rows as columns. Matrices commonly
use capital letters while vectors use lowercase letters.

5.9.1 Special constant matrices


I is the identity matrix, a typically square matrix with ones along its diagonal and zeroes
elsewhere. 0 is a matrix filled with zeroes and 1 is a matrix filled with ones. An optional
subscript m × n denotes the matrix having m rows and n columns.

I3×3 = [1 0 0; 0 1 0; 0 0 1]      03×2 = [0 0; 0 0; 0 0]      13×2 = [1 1; 1 1; 1 1]

5.9.2 Operators
Transpose
The T in AT denotes transpose, which flips the matrix across its diagonal such that the
rows become columns and vice versa.

A symmetric matrix is equal to its transpose.


Pseudoinverse
The + in B⁺ denotes the Moore-Penrose pseudoinverse given by B⁺ = (BᵀB)⁻¹Bᵀ.
The pseudoinverse is used when the matrix is nonsquare and thus not invertible to
produce a close approximation of an inverse in the least squares sense.
Diagonal
A diagonal matrix has elements along its diagonal and zeroes elsewhere (e.g., the identity
matrix). Let x = [x₁ … xₙ]ᵀ.

    diag(x) = diag(x₁, …, xₙ) =
        [ x₁  0   ⋯   0  ]
        [ 0   x₂      ⋮  ]
        [ ⋮        ⋱  0  ]
        [ 0   ⋯   0   xₙ ]

A block diagonal matrix has matrices along its diagonal. diag() works similarly for
constructing one. Let A = [1 2; 3 4] and B = [1 2 3; 4 5 6; 7 8 9].

    diag(A, B) = [A 0; 0 B] =
        [ 1 2 0 0 0 ]
        [ 3 4 0 0 0 ]
        [ 0 0 1 2 3 ]
        [ 0 0 4 5 6 ]
        [ 0 0 7 8 9 ]

Operations on the diag() argument are applied element-wise.

    diag(1/x²) =
        [ 1/x₁²  0      ⋯   0     ]
        [ 0      1/x₂²      ⋮     ]
        [ ⋮            ⋱    0     ]
        [ 0      ⋯      0   1/xₙ² ]
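For what it's worth, NumPy and SciPy can build these matrices directly; a minimal sketch (scipy.linalg.block_diag is assumed to be available):

import numpy as np
from scipy.linalg import block_diag

x = np.array([1.0, 2.0, 3.0])
A = np.array([[1, 2], [3, 4]])
B = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(np.diag(x))        # diagonal matrix with x along its diagonal
print(block_diag(A, B))  # 5×5 block diagonal matrix diag(A, B)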

Trace
tr(A) denotes the trace of the square matrix A, which is defined as the sum of the
elements on the main diagonal (top-left to bottom-right).

5.10 Matrix definiteness



Type Eigenvalues Notation


Negative definite All λ < 0 M<0
Negative semidefinite All λ ≤ 0 M≤0
Positive semidefinite All λ ≥ 0 M≥0
Positive definite All λ > 0 M>0
Indefinite Positive and negative N/A

Table 5.1: Eigenvalue distribution and notation for each type of matrix definiteness.
Let M be a matrix.

Type                    Definition
Negative definite       xᵀMx < 0 for all x
Negative semidefinite   xᵀMx ≤ 0 for all x
Positive semidefinite   xᵀMx ≥ 0 for all x
Positive definite       xᵀMx > 0 for all x
Indefinite              ∃ x, y such that xᵀMx < 0 < yᵀMy

Table 5.2: Rigorous definition for each type of matrix definiteness. Let M be a matrix
and let x and y be nonzero vectors.
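Since the eigenvalue test in table 5.1 is easy to automate for symmetric matrices, here's a minimal NumPy sketch (the helper name and test matrices are made up for illustration):

import numpy as np

def definiteness(M):
    """Classifies a symmetric matrix by the signs of its eigenvalues (table 5.1)."""
    eigs = np.linalg.eigvalsh(M)  # eigenvalues of a symmetric matrix
    if np.all(eigs > 0):
        return "positive definite"
    if np.all(eigs >= 0):
        return "positive semidefinite"
    if np.all(eigs < 0):
        return "negative definite"
    if np.all(eigs <= 0):
        return "negative semidefinite"
    return "indefinite"

print(definiteness(np.array([[2.0, 0.0], [0.0, 3.0]])))   # positive definite
print(definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite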

5.11 Common control theory matrix equations


Here’s some common matrix equations from control theory we’ll use later on. Solvers
for them exist in Drake (C++) and SciPy (Python).

5.11.1 Continuous algebraic Riccati equation (CARE)


The continuous algebraic Riccati equation (CARE) appears in the solution to the con-
tinuous time LQ problem.

AᵀX + XA − XBR⁻¹BᵀX + Q = 0        (5.1)
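For example, SciPy's CARE solver can be called directly; the matrices below are placeholder values for a double integrator, not anything specific to this book.

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

X = solve_continuous_are(A, B, Q, R)

# The CARE residual should be numerically zero.
residual = A.T @ X + X @ A - X @ B @ np.linalg.inv(R) @ B.T @ X + Q
print(np.allclose(residual, 0.0))  # True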
5.11.2 Discrete algebraic Riccati equation (DARE)


The discrete algebraic Riccati equation (DARE) appears in the solution to the discrete
time LQ problem.
X = AᵀXA − (AᵀXB)(R + BᵀXB)⁻¹(BᵀXA) + Q        (5.2)

Snippet 5.1 computes the unique stabilizing solution to the discrete algebraic Riccati
equation. A robust implementation should also enforce the following preconditions:
1. Q = Qᵀ ≥ 0 and R = Rᵀ > 0.
2. (A, B) is a stabilizable pair (see subsection 6.12.4).
3. (A, C) is a detectable pair where Q = CᵀC (see subsection 6.12.8).
#include <Eigen/Cholesky>
#include <Eigen/Core>
#include <Eigen/LU>

template <int States, int Inputs>
Eigen::Matrix<double, States, States> DARE(
    const Eigen::Matrix<double, States, States>& A,
    const Eigen::Matrix<double, States, Inputs>& B,
    const Eigen::Matrix<double, States, States>& Q,
    const Eigen::Matrix<double, Inputs, Inputs>& R) {
  // [1] E. K.-W. Chu, H.-Y. Fan, W.-W. Lin & C.-S. Wang
  //     "Structure-Preserving Algorithms for Periodic Discrete-Time
  //     Algebraic Riccati Equations",
  //     International Journal of Control, 77:8, 767-788, 2004.
  //     DOI: 10.1080/00207170410001714988
  //
  // Implements SDA algorithm on p. 5 of [1] (initial A, G, H are from (4)).
  using StateMatrix = Eigen::Matrix<double, States, States>;

  StateMatrix A_k = A;
  StateMatrix G_k = B * R.llt().solve(B.transpose());
  StateMatrix H_k;
  StateMatrix H_k1 = Q;

  do {
    H_k = H_k1;

    StateMatrix W = StateMatrix::Identity() + G_k * H_k;

    auto W_solver = W.lu();

    StateMatrix V_1 = W_solver.solve(A_k);
    StateMatrix V_2 = W_solver.solve(G_k.transpose()).transpose();

    G_k += A_k * V_2 * A_k.transpose();
    H_k1 = H_k + V_1.transpose() * H_k * A_k;
    A_k *= V_1;
  } while ((H_k1 - H_k).norm() > 1e-10 * H_k1.norm());

  return H_k1;
}

Snippet 5.1. Discrete algebraic Riccati equation solver in C++



5.11.3 Continuous Lyapunov equation


The continuous Lyapunov equation appears in controllability/observability analysis of
continuous time systems.
AX + XAᵀ + Q = 0        (5.3)

5.11.4 Discrete Lyapunov equation


The discrete Lyapunov equation appears in controllability/observability analysis of dis-
crete time systems.
AXAᵀ − X + Q = 0        (5.4)
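SciPy has solvers for both Lyapunov equations as well; note that its continuous solver uses the convention AX + XAᵀ = Q, so the sign of Q must be flipped to match equation (5.3). A minimal sketch with placeholder matrices:

import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_discrete_lyapunov

Q = np.eye(2)

# Continuous: SciPy solves AX + XAᵀ = Q, so pass -Q to solve AX + XAᵀ + Q = 0.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
X_c = solve_continuous_lyapunov(A, -Q)
print(np.allclose(A @ X_c + X_c @ A.T + Q, 0.0))  # True

# Discrete: SciPy solves AXAᵀ − X + Q = 0 directly.
A_d = np.array([[0.9, 0.1], [0.0, 0.8]])
X_d = solve_discrete_lyapunov(A_d, Q)
print(np.allclose(A_d @ X_d @ A_d.T - X_d + Q, 0.0))  # True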

5.12 Matrix calculus


Matrix calculus uses partial derivatives. See subsection 4.1.4 for how they work.
We’ll need a vector-valued function to demonstrate some common operations in matrix
calculus. Let f(x, u) be a vector-valued function defined as

f(x, u) = [f₁(x, u)  …  fₘ(x, u)]ᵀ   where   x = [x₁  …  xₘ]ᵀ   and   u = [u₁  …  uₙ]ᵀ

5.12.1 Jacobian
The Jacobian is the first-order partial derivative of a vector-valued function with respect
to one of its vector arguments. The columns of the Jacobian of f are filled with partial
derivatives of f ’s rows with respect to each of the argument’s elements. For example,
the Jacobian of f with respect to x is
∂f(x, u)/∂x = [∂f(x, u)/∂x₁  ⋯  ∂f(x, u)/∂xₘ] =
    [ ∂f₁/∂x₁  ⋯  ∂f₁/∂xₘ ]
    [    ⋮     ⋱     ⋮    ]
    [ ∂fₘ/∂x₁  ⋯  ∂fₘ/∂xₘ ]

∂f₁/∂x₁ is the partial derivative of the first row of f with respect to the first row of x, and
so on for all rows of f and x. This has m² permutations and thus produces a square
matrix.

The Jacobian of f with respect to u is

∂f(x, u)/∂u = [∂f(x, u)/∂u₁  ⋯  ∂f(x, u)/∂uₙ] =
    [ ∂f₁/∂u₁  ⋯  ∂f₁/∂uₙ ]
    [    ⋮     ⋱     ⋮    ]
    [ ∂fₘ/∂u₁  ⋯  ∂fₘ/∂uₙ ]

∂f₁/∂u₁ is the partial derivative of the first row of f with respect to the first row of u, and so
on for all rows of f and u. This has m × n permutations and can produce a nonsquare
matrix if m ≠ n.

5.12.2 Hessian
The Hessian is the second-order partial derivative of a vector-valued function with re-
spect to one of its vector arguments. For example, the Hessian of f with respect to x
is

∂²f(x, u)/∂x² = [∂²f(x, u)/∂x₁²  ⋯  ∂²f(x, u)/∂xₘ²] =
    [ ∂²f₁/∂x₁²  ⋯  ∂²f₁/∂xₘ² ]
    [     ⋮      ⋱      ⋮     ]
    [ ∂²fₘ/∂x₁²  ⋯  ∂²fₘ/∂xₘ² ]

and the Hessian of f with respect to u is

∂²f(x, u)/∂u² = [∂²f(x, u)/∂u₁²  ⋯  ∂²f(x, u)/∂uₙ²] =
    [ ∂²f₁/∂u₁²  ⋯  ∂²f₁/∂uₙ² ]
    [     ⋮      ⋱      ⋮     ]
    [ ∂²fₘ/∂u₁²  ⋯  ∂²fₘ/∂uₙ² ]

5.12.3 Useful identities


Here’s some useful matrix calculus identities pulled from Wikipedia’s table [3].

∂xT Ax
Theorem 5.12.1 ∂x = 2Ax where A is symmetric.

∂(Ax+b)T C(Dx+e)
Theorem 5.12.2 ∂x = AT C(Dx + e) + DT CT (Ax + b)
44 Chapter 5. Linear algebra

∂(Ax+b)T C(Ax+b)
Corollary 5.12.3 ∂x = 2AT C(Ax + b) where C is symmetric.
Proof:
∂(Ax + b)T C(Ax + b)
= AT C(Ax + b) + AT CT (Ax + b)
∂x
∂(Ax + b)T C(Ax + b)
= (AT C + AT CT )(Ax + b)
∂x
C is symmetric, so

∂(Ax + b)T C(Ax + b)


= (AT C + AT C)(Ax + b)
∂x
∂(Ax + b)T C(Ax + b)
= 2AT C(Ax + b)
∂x
Night sky above Dufour Street in Santa Cruz, CA

6. Continuous state-space control

When we want to command a system to a set of states, we design a controller with


certain control laws to do it. PID controllers use the system outputs with proportional,
integral, and derivative control laws. In state-space, we also have knowledge of the
system states so we can do better.
Modern control theory uses state-space representation to model and control systems.
State-space representation models systems as a set of state, input, and output variables
related by first-order differential equations that describe how the system’s state changes
over time given the current states and inputs.

6.1 From PID control to model-based control


As mentioned before, controls engineers have a more general framework to describe
control theory than just PID control. PID controller designers are focused on fiddling
with controller parameters relating to the current, past, and future error rather than the
underlying system states. Integral control is a commonly used tool, and some people
use integral action as the majority of the control action. While this approach works in
a lot of situations, it is an incomplete view of the world.
Model-based control has a completely different mindset. Controls designers using
model-based control care about developing an accurate model of the system, then driv-
ing the states they care about to zero (or to a reference). Integral control is added with
uerror estimation if needed to handle model uncertainty, but we prefer not to use it be-

cause its response is hard to tune and some of its destabilizing dynamics aren’t visible
during simulation.
Why use model-based control in FRC? Poor build season schedule management often
leads to the software team:
1. Not getting enough time to verify basic functionality and test/tune feedback con-
trollers.
2. Spending dedicated software testing time troubleshooting mechanical/electrical
issues within recently integrated subsystems instead.
Model-based control (one of the focuses of this book) avoids both problems because
it lets software teams test basic functionality in simulation much earlier in the build
season and tune their feedback controllers automatically.

6.2 What is a dynamical system?


A dynamical system is a system whose motion varies according to a set of differen-
tial equations. A dynamical system is considered linear if the differential equations
describing its dynamics consist only of linear operators. Linear operators are things
like constant gain multiplications, derivatives, and integrals. You can define reasonably
accurate linear models for pretty much everything you’ll see in FRC with just those
relations.
But let’s say you have a DC brushed motor hooked up to a power supply and you applied
a constant voltage to it from rest. The motor approaches a steady-state angular velocity,
but the shape of the angular velocity curve over time isn’t a line. In fact, it’s a decaying
exponential curve akin to

ω = ωmax(1 − e⁻ᵗ)

where ω is the angular velocity and ωmax is the maximum angular velocity. If DC
brushed motors are said to behave linearly, then why is this?
Linearity refers to a system’s equations of motion, not its time domain response. The
equation defining the motor’s change in angular velocity over time looks like

ω̇ = −aω + bV

where ω̇ is the derivative of ω with respect to time, V is the input voltage, and a and
b are constants specific to the motor. This equation, unlike the one shown before, is
actually linear because it only consists of multiplications and additions relating the input
V and current state ω.

Also of note is that the relation between the input voltage and the angular velocity
of the output shaft is a linear regression. You’ll see why if you model a DC brushed
motor as a voltage source and generator producing back-EMF (in the equation above,
bV corresponds to the voltage source and −aω corresponds to the back-EMF). As you
increase the input voltage, the back-EMF increases linearly with the motor’s angular
velocity. If there was a friction term that varied with the angular velocity squared (air
resistance is one example), the relation from input to output would be a curve. Friction
that scales with just the angular velocity would result in a lower maximum angular
velocity, but because that term can be lumped into the back-EMF term, the response
is still linear.

6.3 Continuous state-space notation


6.3.1 What is state-space?
Recall from last chapter that 2D space has two axes: x and y. We represent locations
within this space as a pair of numbers packaged in a vector, and each coordinate is a
measure of how far to move along the corresponding axis. State-space is a Cartesian
coordinate system with an axis for each state variable, and we represent locations within
it the same way we do for 2D space: with a list of numbers in a vector. Each element
in the vector corresponds to a state of the system.
In state-space notation, there are also input and output vectors with corresponding input
and output spaces. Inputs drive the system to other points in the state-space, and outputs
are sensor measurements that have a linear relationship to the state and input. Since the
mapping from state and input to change in state is a system of equations, it’s natural to
write it in matrix form.

6.3.2 Definition
Below is the continuous version of state-space notation.

Definition 6.3.1 — Continuous state-space notation.

ẋ = Ax + Bu (6.1)
y = Cx + Du (6.2)

A  system matrix          x  state vector
B  input matrix           u  input vector
C  output matrix          y  output vector
D  feedthrough matrix

Matrix   Rows × Columns       Matrix   Rows × Columns
A        states × states      x        states × 1
B        states × inputs      u        inputs × 1
C        outputs × states     y        outputs × 1
D        outputs × inputs

Table 6.1: State-space matrix dimensions

The change in state and the output are linear combinations of the state vector and the
input vector. The A and B matrices are used to map the state vector x and the input
vector u to a change in the state vector ẋ. The C and D matrices are used to map the
state vector x and the input vector u to an output vector y.

6.4 Eigenvalues and stability


If a system is stable, its output will tend toward equilibrium (steady-state) over time.
For a general system ẋ = f (x, u), equilibrium points are points where ẋ = 0. If we
let x = 0 and u = 0 in ẋ = Ax + Bu, we can see that ẋ = 0, so x = 0 is an
equilibrium point.
We’d like to know whether all possible unforced system trajectories (u = 0) move
toward or away from the equilibrium point. If we solve the system of linear differential
equations ẋ = Ax, we get x(t) = e^(At)x₀.[1] e^(At) is the superposition of e^(λⱼt) terms
where {λⱼ} is the set of A's eigenvalues.[2]

[1] Section 7.3 will explain why the matrix exponential e^(At) shows up here.


For now, let’s consider when all the eigenvalues are real numbers.

λⱼ < 0:  e^(λⱼt) decays to zero (stable)
λⱼ = 0:  e^(λⱼt) = 1 (marginally stable)
λⱼ > 0:  e^(λⱼt) grows to infinity (unstable)

So the system tends toward the equilibrium point (i.e., it’s stable) if λj < 0 for all j.
Now let’s consider when the eigenvalues are complex numbers. What does that mean
for the system response? Let λⱼ = aⱼ + bⱼi. Each of the exponential terms in the
solution can be written as

e^(λⱼt) = e^((aⱼ + bⱼi)t) = e^(aⱼt) e^(ibⱼt)

The complex exponential can be rewritten using Euler's formula.[3]

e^(ibⱼt) = cos(bⱼt) + i sin(bⱼt)

Therefore,

e^(λⱼt) = e^(aⱼt)(cos(bⱼt) + i sin(bⱼt))

When the eigenvalue's imaginary part bⱼ ≠ 0, it contributes oscillation to the real part's
response.
The eigenvalues of A are called poles.[4] Figure 6.1 shows the impulse responses in
the time domain for systems with various pole locations in the complex plane (real
numbers on the x-axis and imaginary numbers on the y-axis). Each response has an
initial condition of 1.
[2]We’re handwaving why this is the case, but it’s a consequence of eAt being diagonalizable.
[3] Euler’sformula may seem surprising at first, but it’s rooted in the fact that complex exponentials are
rotations in the complex plane around the origin. If you can imagine walking around the unit circle traced
by that rotation, you’ll notice the real part of your position oscillates between −1 and 1 over time. That is,
complex exponentials manifest as oscillations in real life.

“What is Euler’s formula actually saying? | Ep. 4 Lockdown live math” (51 minutes)
3Blue1Brown
https://youtu.be/ZxYOEwM6Wbk

[4] This name comes from classical control theory. See subsection E.2.2 for more.

Figure 6.1: Impulse response vs pole location

Poles in the left half-plane (LHP) are stable; the system’s output may oscillate but
it converges to steady-state. Poles on the imaginary axis are marginally stable; the
system’s output oscillates at a constant amplitude forever. Poles in the right half-plane
(RHP) are unstable; the system’s output grows without bound.

R  Imaginary poles always come in complex conjugate pairs (e.g., −2 + 3i, −2 − 3i).
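To make the pole test concrete, here's a minimal NumPy sketch with a placeholder A matrix whose poles are −1 and −2:

import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])

poles = np.linalg.eigvals(A)
print(poles)                   # [-1. -2.]
print(np.all(poles.real < 0))  # True -> every pole is in the LHP, so the system is stable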

6.5 Closed-loop controller


With the control law u = K(r−x), we can derive the closed-loop state-space equations.
We’ll discuss where this control law comes from in subsection 7.7.
First is the state update equation. Substitute the control law into equation (6.1).

ẋ = Ax + BK(r − x)
ẋ = Ax + BKr − BKx
ẋ = (A − BK)x + BKr (6.3)

Now for the output equation. Substitute the control law into equation (6.2).

y = Cx + D(K(r − x))

y = Cx + DKr − DKx
y = (C − DK)x + DKr (6.4)

Instead of commanding the system to a state using the vector u directly, we can now
specify a vector of desired states through r and the controller will choose values of u
for us over time that drive the system toward the reference.
The eigenvalues of A − BK are the poles of the closed-loop system. Therefore, the
rate of convergence and stability of the closed-loop system can be changed by moving
the poles via the eigenvalues of A − BK. A and B are inherent to the system, but
K can be chosen arbitrarily by the controller designer. For equation (6.3) to reach
steady-state, the eigenvalues of A − BK must be in the left-half plane. There will be
steady-state error if Ar ≠ 0.

Symbol Name Rows × Columns


A system matrix states × states
B input matrix states × inputs
C output matrix outputs × states
D feedthrough matrix outputs × inputs
K controller gain matrix inputs × states
r reference vector states × 1
x state vector states × 1
u input vector inputs × 1
y output vector outputs × 1

Table 6.2: Controller matrix dimensions

6.6 Model augmentation


This section will teach various tricks for manipulating state-space models with the goal
of demystifying the matrix algebra at play. We will use the augmentation techniques
discussed here in the section on integral control.
Matrix augmentation is the process of appending rows or columns to a matrix. In
state-space, there are several common types of augmentation used: plant augmentation,

controller augmentation, and observer augmentation.

6.6.1 Plant augmentation


Plant augmentation is the process of adding a state to a model’s state vector and adding
a corresponding row to the A and B matrices.

6.6.2 Controller augmentation


Controller augmentation is the process of adding a column to a controller’s K matrix.
This is often done in combination with plant augmentation to add controller dynamics
relating to a newly added state.

6.6.3 Observer augmentation


Observer augmentation is closely related to plant augmentation. In addition to adding
entries to the observer matrix K,[5] the observer is using this augmented plant for esti-
mation purposes. This is better explained with an example.
By augmenting the plant with a bias term with no dynamics (represented by zeroes in its
rows in A and B), the observer will attempt to estimate a value for this bias term that
makes the model best reflect the measurements taken of the real system. Note that we’re
not collecting any data on this bias term directly; it’s what’s known as a hidden state.
Rather than our inputs and other states affecting it directly, the observer determines a
value for it based on what is most likely given the model and current measurements. We
just tell the plant what kind of dynamics the term has and the observer will estimate it
for us.

6.6.4 Output augmentation


Output augmentation is the process of adding rows to the C matrix. This is done to
help the controls designer visualize the behavior of internal states or other aspects of
the system in MATLAB or Python Control. C matrix augmentation doesn’t affect state
feedback, so the designer has a lot of freedom here. Noting that the output is defined
as y = Cx + Du, the following row augmentations of C may prove useful. Of course,
D needs to be augmented with zeroes as well in these cases to maintain the correct
matrix dimensionality.
Since u = −Kx, augmenting C with −K makes the observer estimate the control
input u applied.

y = Cx + Du
[5] Observers use a matrix K to steer the state estimate toward the true state in the same way controllers

use a matrix K to steer the current state toward a desired state. We’ll cover this in more detail in part III.
[y; u] = [C; −K]x + [D; 0]u

This works because K has the same number of columns as states.


Various states can also be produced in the output with I matrix augmentation.

6.6.5 Examples
Snippet 6.1 shows how one packs together the following augmented matrix in Python
using np.block().

[A B; C D]

import numpy as np

A = np.array([[1, 2], [3, 4]])


B = np.array([[5], [6]])
C = np.array([[7, 8]])
D = np.array([[9]])

tmp = np.block([[A, B], [C, D]])

Snippet 6.1. Matrix augmentation example: block

Snippet 6.2 shows how one packs together the same augmented matrix in Python using
array slices.
import numpy as np

A = np.array([[1, 2], [3, 4]])


B = np.array([[5], [6]])
C = np.array([[7, 8]])
D = np.array([[9]])

tmp = np.empty((3, 3))


tmp[:2, :2] = A # tmp[0:2, 0:2] = A
tmp[:2, 2:] = B # tmp[0:2, 2:3] = B
tmp[2:, :2] = C # tmp[2:3, 0:2] = C
tmp[2:, 2:] = D # tmp[2:3, 2:3] = D

Snippet 6.2. Matrix augmentation example: array slices

Section 6.7 demonstrates model augmentation for different types of integral control.

6.7 Integral control


A common way of implementing integral control is to add an additional state that is the
integral of the error of the variable intended to have zero steady-state error.

We’ll present two methods:


1. Augment the plant. For an arm, one would add an “integral of position” state.
2. Estimate the “error” in the control input (the difference between what was applied
versus what was observed to happen) via the observer and compensate for it.
We’ll call this “input error estimation”.
In FRC, avoid integral control unless you have a very good reason to use it. Integral
control adds significant complexity, and steady-state error can often be avoided with a
motion profile, a well-tuned feedforward, and proportional feedback (i.e., more deter-
ministic options you should be using anyway).

6.7.1 Plant augmentation


Caveats
First, unconstrained integral control exhibits integral windup on a unit step input. When
the error reaches zero, the integrator may still have a large positive accumulated error
from the initial ramp-up. The integrator makes the system overshoot until the accu-
mulated error is unwound by negative errors. Poor tuning can lead to instability.[6]
Integrating only when close to the reference somewhat mitigates integral windup.
Second, unconstrained integral control is a poor choice for modeled dynamics com-
pensation. Feedforwards provide more precise compensation since we already know
beforehand how to counteract the undesirable dynamics.
Third, unconstrained integral control is a poor choice for unmodeled dynamics compen-
sation. To choose proper gains, the integrator must be tuned online when the unmod-
eled dynamics are present, which may be inconvenient or unsafe in some circumstances.
Furthermore, accumulation even when the system is following the model means it still
compensates for modeled dynamics despite our intent otherwise. Prefer the approach
in subsection 6.7.2.
Implementation
We want to augment the system with an integral term that integrates the error e =
r − y = r − Cx.
xI = ∫ e dt
ẋI = e = r − Cx

[6] With proper tuning, proportional control should generally be doing much more work than integral

control.

The plant is augmented as

[ẋ; ẋI] = [A 0; −C 0][x; xI] + [B; 0]u + [0; I]r
[ẋ; ẋI] = [A 0; −C 0][x; xI] + [B 0; 0 I][u; r]

The controller is augmented as

u = K(r − x) − KI xI
u = [K  KI]([r; 0] − [x; xI])

6.7.2 Input error estimation


Models can predict system behavior, but unmodeled disturbances make the observed
system behavior deviate from the model. We want to react to these disturbances quickly
and improve the model’s predictive power in the face of these disturbances.
Input error estimation uses a state observer (covered in chapter 9) to estimate the differ-
ence between the provided model input and a hypothetical input that makes the model
match the observed behavior. Subtracting this value from the provided input compen-
sates for unmodeled disturbances, and adding it to the state observer’s input makes the
model better predict the system’s future behavior.
First, we’ll consider the one-dimensional case. Let uerror be the difference between
the hypothetical input with disturbances and the provided input. The uerror term is
then added to the system as follows.

ẋ = Ax + B (u + uerror )

u + uerror is the hypothetical input actually applied to the system.

ẋ = Ax + Bu + Buerror

The following equation generalizes this to a multiple-input system.

ẋ = Ax + Bu + Berror uerror

where Berror is a column vector that maps uerror to changes in the rest of the state
the same way B does for u. Berror is only a column of B if uerror corresponds to an
existing input within u.

Given the above equation, we’ll augment the plant as


 ˙      
x A Berror x B
= + u
uerror 0 0 uerror 0
 
  x
y= C 0 + Du
uerror

Notice how the state is augmented with uerror . With this model, the observer will
estimate both the state and the uerror term. The controller is augmented similarly. r
is augmented with a zero for the goal uerror term.
u = K (r − x) − kerror uerror
   
  r x
u = K kerror −
0 uerror

where kerror is a column vector with a 1 in a given row if uerror should be applied to
that input or a 0 otherwise.
This process can be repeated for an arbitrary error which can be corrected via some
linear combination of the inputs.

6.8 Double integrator


The double integrator has two states (position and velocity) and one input (acceleration).
Their relationship can be expressed by the following system of differential equations,
where x is position, v is velocity, and a is acceleration.
ẋ = v
v̇ = a
We want to put these into the form ẋ = Ax + Bu. Let x = [x v]ᵀ and u = [a].
First, add missing terms so that all equations have the same states and inputs. Then,
sort them by states followed by inputs.

ẋ = 0x + 1v + 0a
v̇ = 0x + 0v + 1a

Now, factor out the constants into matrices.

[ẋ; v̇] = [0 1; 0 0][x; v] + [0; 1]a
ẋ = Ax + Bu
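As a sketch, the matrices above can be written down and integrated numerically; the acceleration command, initial state, and sample period below are arbitrary illustrative values.

import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

x = np.array([[0.0], [0.0]])  # start at rest at the origin
u = np.array([[2.0]])         # constant 2 m/s² acceleration command
dt = 0.02                     # 20 ms update period

# Forward Euler integration of ẋ = Ax + Bu for one second.
for _ in range(50):
    x = x + (A @ x + B @ u) * dt

print(x.flatten())  # ≈ [0.98, 2.0]; the exact continuous answer is [½at², at] = [1.0, 2.0]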

6.9 Elevator
This elevator consists of a DC brushed motor attached to a pulley that drives a mass up
or down.

Figure 6.2: Elevator system diagram

6.9.1 Continuous state-space model


Using equation (12.15), the position and velocity derivatives of the elevator can be
written as

ẋ = v        (6.5)
v̇ = −G²Kt/(Rr²mKv) v + GKt/(Rrm) V        (6.6)

Factor out v and V into column vectors.

[v̇] = [−G²Kt/(Rr²mKv)][v] + [GKt/(Rrm)][V]

Augment the matrix equation with the position state x, which has the model equation
ẋ = v. The matrix elements corresponding to v will be 1, and the others will be 0 since
they don't appear, so ẋ = 0x + 1v + 0V. The existing rows will have zeroes inserted
where x is multiplied in.

[ẋ; v̇] = [0 1; 0 −G²Kt/(Rr²mKv)][x; v] + [0; GKt/(Rrm)]V

Theorem 6.9.1 — Elevator state-space model.

ẋ = Ax + Bu
y = Cx + Du

x = [x; v] = [position; velocity]     y = x = position     u = V = voltage

A = [0 1; 0 −G²Kt/(Rr²mKv)]        (6.7)
B = [0; GKt/(Rrm)]                 (6.8)
C = [1 0]                          (6.9)
D = 0                              (6.10)

6.9.2 Model augmentation


As per subsection 6.7.2, we will now augment the model so a uerror state is added to
the control input.
The plant and observer augmentations should be performed before the model is dis-
cretized. After the controller gain is computed with the unaugmented discrete model,
the controller may be augmented. Therefore, the plant and observer augmentations
assume a continuous model and the controller augmentation assumes a discrete con-
troller.

xaug = [x; v; uerror]     y = x     u = V

Aaug = [A B; 01×2 0]    Baug = [B; 0]    Caug = [C 0]    Daug = D        (6.11)

Kaug = [K 1]    raug = [r; 0]        (6.12)

This will compensate for unmodeled dynamics such as gravity. However, using a con-
stant voltage feedforward to counteract gravity is preferred over uerror estimation in
this case because it results in a simpler controller with similar performance.
6.9 Elevator 59

6.9.3 Gravity feedforward


Input voltage is proportional to force and gravity is a constant force, so a constant volt-
age feedforward can compensate for gravity. We’ll model gravity as an acceleration
disturbance −g. To compensate for it, we want to find a voltage that is equal and oppo-
site to it. The bottom row of the continuous elevator model contains the acceleration
terms.

B uff = −(unmodeled dynamics)

where B is the motor acceleration term from B and uff is the voltage feedforward.

B uff = −(−g)
B uff = g
(GKt/(Rrm)) uff = g
uff = Rrmg/(GKt)
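As a numeric sanity check, here's the same feedforward computed for made-up elevator constants (none of these values correspond to a real mechanism):

# Placeholder elevator constants.
G = 10.0     # gear ratio
Kt = 0.0181  # motor torque constant, N·m/A
R = 0.0902   # motor resistance, ohms
r = 0.02     # pulley radius, m
m = 5.0      # carriage mass, kg
g = 9.81     # gravitational acceleration, m/s²

# Constant voltage that cancels gravity: u_ff = Rrmg/(GKt).
u_ff = R * r * m * g / (G * Kt)
print(f"{u_ff:.3f} V")  # ≈ 0.489 V for these numbers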

6.9.4 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccon-
trol examples[7] creates and tests a controller for it. Figure 6.3 shows the closed-loop
system response.

6.9.5 Implementation
C++ and Java implementations of this elevator controller are available online.[8][9]

[7] https://github.com/calcmogul/frccontrol/blob/main/examples/elevator.py
[8] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/

StateSpaceElevator/cpp/Robot.cpp
[9] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/

wpilibj/examples/statespaceelevator/Robot.java

Figure 6.3: Elevator response

6.10 Flywheel
This flywheel consists of a DC brushed motor attached to a spinning mass of non-
negligible moment of inertia.

6.10.1 Continuous state-space model


By equation (12.18),

ω̇ = −G²Kt/(KvRJ) ω + GKt/(RJ) V

Factor out ω and V into column vectors.

[ω̇] = [−G²Kt/(KvRJ)][ω] + [GKt/(RJ)][V]

Figure 6.4: Flywheel system diagram

Theorem 6.10.1 — Flywheel state-space model.

ẋ = Ax + Bu
y = Cx + Du

x = ω = angular velocity y = ω = angular velocity u = V = voltage

A = −G²Kt/(KvRJ)        (6.13)
B = GKt/(RJ)            (6.14)
C = 1                   (6.15)
D = 0                   (6.16)

6.10.2 Model augmentation


As per subsection 6.7.2, we will now augment the model so a uerror state is added to
the control input.
The plant and observer augmentations should be performed before the model is dis-
cretized. After the controller gain is computed with the unaugmented discrete model,
the controller may be augmented. Therefore, the plant and observer augmentations
assume a continuous model and the controller augmentation assumes a discrete controller.

Figure 6.5: Flywheel response

x = [ω; uerror]     y = ω     u = V

Aaug = [A B; 0 0]    Baug = [B; 0]    Caug = [C 0]    Daug = D        (6.17)

Kaug = [K 1]    raug = [r; 0]        (6.18)

This will compensate for unmodeled dynamics such as projectiles slowing down the
flywheel.

6.10.3 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccon-
trol examples[10] creates and tests a controller for it. Figure 6.5 shows the closed-loop
system response.
Notice how the control effort when the reference is reached is nonzero. This is a plant
inversion feedforward compensating for the system dynamics attempting to slow the
flywheel down when no voltage is applied.
[10] https://github.com/calcmogul/frccontrol/blob/main/examples/flywheel.py

6.10.4 Implementation
C++ and Java implementations of this flywheel controller are available online.[11][12]

6.10.5 Flywheel model without encoder


In the FIRST Robotics Competition, we can get the current drawn for specific channels
on the power distribution panel. We can theoretically use this to estimate the angular
velocity of a DC motor without an encoder. We’ll start with the flywheel model derived
earlier as equation (12.18).

ω̇ = GKt/(RJ) V − G²Kt/(KvRJ) ω
ω̇ = −G²Kt/(KvRJ) ω + GKt/(RJ) V
Next, we’ll derive the current I as an output.
V = IR + ω/Kv
IR = V − ω/Kv
I = −ω/(KvR) + (1/R) V

Therefore,

[11] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/

StateSpaceFlywheel/cpp/Robot.cpp
[12] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/

wpilibj/examples/statespaceflywheel/Robot.java

Theorem 6.10.2 — Flywheel state-space model without encoder.

ẋ = Ax + Bu
y = Cx + Du

x = ω = angular velocity y = I = current u = V = voltage

A = −G²Kt/(KvRJ)        (6.19)
B = GKt/(RJ)            (6.20)
C = −1/(KvR)            (6.21)
D = 1/R                 (6.22)

Notice that in this model, the output doesn’t provide any direct measurements of the
state. To estimate the full state (also known as full observability), we only need the
outputs to collectively include linear combinations of every state[13] . We’ll revisit this
in chapter 9 with an example that uses range measurements to estimate an object’s
orientation.
The effectiveness of this model’s observer is heavily dependent on the quality of the
current sensor used. If the sensor’s noise isn’t zero-mean, the observer won’t converge
to the true state.

6.10.6 Voltage compensation


To improve controller tracking, one may want to use the voltage renormalized to the
power rail voltage to compensate for voltage drop when current spikes occur. This can
be done as follows.
V = (Vnominal/Vrail) Vcmd        (6.23)

where V is the controller’s new input voltage, Vcmd is the old input voltage, Vnominal
is the rail voltage when effects like voltage drop due to current draw are ignored, and
Vrail is the real rail voltage.

[13]While the flywheel model’s outputs are a linear combination of both the states and inputs, inputs don’t

provide new information about the states. Therefore, they don’t affect whether the system is observable.

To drive the model with a more accurate voltage that includes voltage drop, the recip-
rocal can be used.
V = (Vrail/Vnominal) Vcmd        (6.24)

where V is the model’s new input voltage. Note that if both the controller compensation
and model compensation equations are applied, the original voltage is obtained. The
model input only drops from ideal if the compensated controller voltage saturates.

6.10.7 Do flywheels need PD control?


PID controllers typically control voltage to a motor in FRC independent of the equa-
tions of motion of that motor. For position PID control, large values of Kp can lead
to overshoot and Kd is commonly used to reduce overshoots. Let’s consider a flywheel
controlled with a standard PID controller. Why wouldn’t Kd provide damping for ve-
locity overshoots in this case?
PID control is designed to control second-order and first-order systems well. It can be
used to control a lot of things, but struggles when given higher order systems. It has
three degrees of freedom. Two are used to place the two poles of the system, and the
third is used to remove steady-state error. With higher order systems like a one input,
seven state system, there aren’t enough degrees of freedom to place the system’s poles
in desired locations. This will result in poor control.
The math for PID doesn’t assume voltage, a motor, etc. It defines an output based on
derivatives and integrals of its input. We happen to use it for motors because it works
well for them, since motors are second-order systems.
The following math will be in continuous time, but the same ideas apply to discrete
time. This is all assuming a velocity controller.
Our simple motor model hooked up to a mass is

V = IR + ω/Kv        (6.25)
τ = IKt              (6.26)
τ = J dω/dt          (6.27)

For an explanation of where these equations come from, read section 12.1.
First, we'll solve for dω/dt in terms of V.

Substitute equation (6.26) into equation (6.25).

V = IR + ω/Kv
V = (τ/Kt)R + ω/Kv

Substitute in equation (6.27).

V = (J (dω/dt)/Kt)R + ω/Kv

Solve for dω/dt.

V = (J (dω/dt)/Kt)R + ω/Kv
V − ω/Kv = (J (dω/dt)/Kt)R
dω/dt = (Kt/(JR))(V − ω/Kv)
dω/dt = −Kt/(JRKv) ω + Kt/(JR) V        (6.28)

Matching this to ẋ = Ax + Bu: ẋ = dω/dt, A = −Kt/(JRKv), x = ω, B = Kt/(JR), and u = V.

There's one stable open-loop pole at −Kt/(JRKv). Let's try a simple P controller.

u = K(r − x)
V = Kp (ωgoal − ω)

Closed-loop models have the form ẋ = (A − BK)x + BKr. Therefore, the closed-
loop poles are the eigenvalues of A − BK.
ẋ = (A − BK)x + BKr
ω̇ = (−Kt/(JRKv) − Kt/(JR) Kp) ω + Kt/(JR) Kp ωgoal
ω̇ = −(Kt/(JRKv) + KtKp/(JR)) ω + KtKp/(JR) ωgoal

This closed-loop flywheel model has one pole at −(Kt/(JRKv) + KtKp/(JR)). It therefore only
needs one P controller to place that pole anywhere on the real axis. A derivative term

is unnecessary on an ideal flywheel. It may compensate for unmodeled dynamics such


as accelerating projectiles slowing the flywheel down, but that effect may also increase
recovery time; Kd drives the acceleration to zero in the undesired case of negative
acceleration as well as the actually desired case of positive acceleration.
This analysis assumes that the motor is well coupled to the mass and that the time con-
stant of the inductor is small enough that it doesn’t factor into the motor equations. The
latter is a pretty good assumption for a CIM motor with the following constants: J =
3.2284 × 10⁻⁶ kg·m², b = 3.5077 × 10⁻⁶ N·m·s, Ke = Kt = 0.0181 V/(rad/s),
R = 0.0902 Ω, and L = 230 μH. Notice the slight wiggle in figure 6.6 compared
to figure 6.7. If more mass is added to the motor armature, the response timescales
increase and the inductance matters even less.

Figure 6.6: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH)
Figure 6.7: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH)

Subsection 6.7.2 covers a superior compensation method that avoids zeroes in the con-
troller, doesn’t act against the desired control action, and facilitates better tracking.

6.11 Single-jointed arm


This single-jointed arm consists of a DC brushed motor attached to a pulley that spins
a straight bar in pitch.
Figure 6.8: Single-jointed arm system diagram

6.11.1 Continuous state-space model


Using equation (12.23), the angle and angular rate derivatives of the arm can be written
as

θ̇arm = ωarm        (6.29)
ω̇arm = −G²Kt/(KvRJ) ωarm + GKt/(RJ) V        (6.30)

Factor out ωarm and V into column vectors.

[ω̇arm] = [−G²Kt/(KvRJ)][ωarm] + [GKt/(RJ)][V]

Augment the matrix equation with the angle state θarm, which has the model equation
θ̇arm = ωarm. The matrix elements corresponding to ωarm will be 1, and the others
will be 0 since they don't appear, so θ̇arm = 0θarm + 1ωarm + 0V. The existing rows
will have zeroes inserted where θarm is multiplied in.

[θ̇arm; ω̇arm] = [0 1; 0 −G²Kt/(KvRJ)][θarm; ωarm] + [0; GKt/(RJ)]V

Theorem 6.11.1 — Single-jointed arm state-space model.

ẋ = Ax + Bu
y = Cx + Du

x = [θarm; ωarm] = [angle; angular velocity]     y = θarm = angle     u = V = voltage

A = [0 1; 0 −G²Kt/(KvRJ)]        (6.31)
B = [0; GKt/(RJ)]                (6.32)
C = [1 0]                        (6.33)
D = 0                            (6.34)

6.11.2 Model augmentation


As per subsection 6.7.2, we will now augment the model so a uerror state is added to
the control input.
The plant and observer augmentations should be performed before the model is dis-
cretized. After the controller gain is computed with the unaugmented discrete model,
the controller may be augmented. Therefore, the plant and observer augmentations
assume a continuous model and the controller augmentation assumes a discrete con-
troller.

xaug = [x; uerror]     y = θarm     u = V

Aaug = [A B; 01×2 0]    Baug = [B; 0]    Caug = [C 0]    Daug = D        (6.35)

Kaug = [K 1]    raug = [r; 0]        (6.36)

This will compensate for unmodeled dynamics such as gravity or other external loading
from lifted objects. However, if only gravity compensation is desired, a feedforward
of the form uff = Vgravity cos θ is preferred where Vgravity is the voltage required
to hold the arm level with the ground and θ is the angle of the arm with the ground.

6.11.3 Gravity feedforward


Input voltage is proportional to torque and gravity is a constant force, but the torque
applied against the motor varies according to the arm’s angle. We’ll use sum of torques
to find a compensating torque.
We’ll model gravity as an acceleration disturbance −g. To compensate for it, we want
to find a torque that is equal and opposite to the torque applied to the arm by gravity.
The bottom row of the continuous arm model contains the angular acceleration
terms, so Buff is the angular acceleration caused by the motor; JBuff is the torque.

JBuff = −(r × F)
JBuff = −(rF cos θ)

Torque is usually written as rF sin θ where θ is the angle between the r and F vectors,
but θ in this case is being measured from the horizontal axis rather than the vertical
one, so the force vector is π/2 radians out of phase. Thus, an angle of 0 results in the
maximum torque from gravity being applied rather than the minimum.
The force of gravity mg is applied at the center of the arm’s mass. For a uniform beam,
this is halfway down its length, or L/2 where L is the length of the arm.

JBuff = −(L/2)(−mg) cos θ
JBuff = (L/2)mg cos θ

B = GKt/(RJ), so

J (GKt/(RJ)) uff = (L/2)mg cos θ
uff = (RJ/(JGKt)) (L/2)mg cos θ
uff = (L/2)(Rmg/(GKt)) cos θ

L/2 can be adjusted according to the location of the arm's center of mass.

6.11.4 Simulation
Python Control will be used to discretize the model and simulate it. One of the frccon-
trol examples[14] creates and tests a controller for it. Figure 6.9 shows the closed-loop
system response.
[14] https://github.com/calcmogul/frccontrol/blob/main/examples/single_jointed_arm.py

Figure 6.9: Single-jointed arm response

6.11.5 Implementation
C++ and Java implementations of this single-jointed arm controller are available on-
line.[15][16]

[15] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibcExamples/src/main/cpp/examples/

StateSpaceArm/cpp/Robot.cpp
[16] https://github.com/wpilibsuite/allwpilib/blob/main/wpilibjExamples/src/main/java/edu/wpi/first/

wpilibj/examples/statespacearm/Robot.java

6.12 Controllability and observability


6.12.1 Controllability matrix
A system is controllable if it can be steered from any state to any state by a finite
sequence of admissible inputs.
The controllability matrix can be used to determine if a system is controllable.

Theorem 6.12.1 — Controllability. A continuous time-invariant linear state-


space model is controllable if and only if

rank([B  AB  ⋯  Aⁿ⁻¹B]) = n        (6.37)

where rank is the number of linearly independent rows in a matrix and n is the
number of states.

The controllability matrix in equation (6.37) being rank-deficient means the inputs can-
not apply transforms along all axes in the state-space; the transformation the matrix
represents is collapsed into a lower dimension.
The condition number of the controllability matrix C is defined as σmax(C)/σmin(C), where σmax
is the maximum singular value[17] and σmin is the minimum singular value. As this
number approaches infinity, one or more of the states becomes uncontrollable. This
number can also be used to tell us which actuators are better than others for the given
system; a lower condition number means that the actuators have more control authority.
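Here's a minimal NumPy sketch that builds the controllability matrix for the double integrator from section 6.8 and checks equation (6.37) along with the condition number:

import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
n = A.shape[0]

# Controllability matrix [B, AB, ..., A^(n-1) B].
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

print(np.linalg.matrix_rank(ctrb) == n)  # True -> the system is controllable
print(np.linalg.cond(ctrb))              # condition number (1.0 here; larger is worse)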

6.12.2 Controllability Gramian


While the rank of the controllability matrix can tell us whether the system is control-
lable, it won’t tell us which specific states are controllable or how controllable. The
controllability Gramian can be used to determine these things.
If A is stable, the controllability Gramian Wc is the unique solution to the following
continuous Lyapunov equation.

AWc + WcAᵀ + BBᵀ = 0        (6.38)

Alternatively,

Wc = ∫₀^∞ e^(Aτ) BBᵀ e^(Aᵀτ) dτ        (6.39)

[17] Singular values are a generalization of eigenvalues for nonsquare matrices.



If the solution is positive definite, the system is controllable. The eigenvalues of Wc


represent how controllable their respective states are (larger means more controllable).

6.12.3 Controllability of specific states


If you want to know if a specific state is controllable, first find its corresponding eigen-
value λ in A. Then, that state is controllable if

rank([λI − A   B]) = n        (6.40)

where n is the number of states.

6.12.4 Stabilizability
Stabilizability is a weaker form of controllability. A system is considered stabilizable
if one of the following conditions is true:
1. All uncontrollable states can be stabilized
2. All unstable states are controllable

6.12.5 Observability matrix


A system is observable if the state, whatever it may be, can be inferred from a finite
sequence of outputs.
Observability and controllability are mathematical duals; controllability proves that a
sequence of inputs exists that drives the system to any state, and observability proves
that a sequence of outputs exists that drives the state estimate to any true state.
The observability matrix can be used to determine if a system is observable.

Theorem 6.12.2 — Observability. A continuous time-invariant linear state-space


model is observable if and only if

rank([C; CA; …; CAⁿ⁻¹]) = n        (6.41)

where rank is the number of linearly independent rows in a matrix and n is the
number of states.

The observability matrix in equation (6.41) being rank-deficient means the outputs do
not contain contributions from every state. That is, not all states are mapped to a linear

combination in the output. Therefore, the outputs alone are insufficient to estimate all
the states.
The condition number of the observability matrix O is defined as σmax(O)/σmin(O), where σmax
is the maximum singular value[17] and σmin is the minimum singular value. As this
number approaches infinity, one or more of the states becomes unobservable. This
number can also be used to tell us which sensors are better than others for the given
system; a lower condition number means the outputs produced by the sensors are better
indicators of the system state.

6.12.6 Observability Gramian


While the rank of the observability matrix can tell us whether the system is observable,
it won’t tell us which specific states are observable or how observable. The observability
Gramian can be used to determine these things.
If A is stable, the observability Gramian Wo is the unique solution to the following
continuous Lyapunov equation.
AᵀWo + WoA + CᵀC = 0        (6.42)

Alternatively,

Wo = ∫₀^∞ e^(Aᵀτ) CᵀC e^(Aτ) dτ        (6.43)

If the solution is positive definite, the system is observable. The eigenvalues of Wo


represent how observable their respective states are (larger means more observable).

6.12.7 Observability of specific states


If you want to know if a specific state is observable, first find its corresponding eigen-
value λ in A. Then, that state is observable if

rank([λI − A; C]) = n        (6.44)

where n is the number of states.

6.12.8 Detectability
Detectability is a weaker form of observability. A system is considered detectable if
one of the following conditions is true:
1. All unobservable states are stable
2. All unstable states are observable
Chaparral by Merril Apartments at UCSC

7. Discrete state-space control

The complex plane discussed so far deals with continuous systems. In decades past,
plants and controllers were implemented using analog electronics, which are continuous
in nature. Nowadays, microprocessors can be used to achieve cheaper, less complex
controller designs. Discretization converts the continuous model we’ve worked with so
far from a differential equation like

ẋ = x − 3 (7.1)

to a difference equation like


(xk+1 − xk)/∆T = xk − 3
xk+1 − xk = (xk − 3)∆T
xk+1 = xk + (xk − 3)∆T        (7.2)

where xk refers to the value of x at the kth timestep. The difference equation is run
with some update period denoted by T , by ∆T , or sometimes sloppily by dt.[1]
While higher order terms of a differential equation are derivatives of the state variable
(e.g., ẍ in relation to equation (7.1)), higher order terms of a difference equation are
delayed copies of the state variable (e.g., xk−1 with respect to xk in equation (7.2)).
[1] The discretization of equation (7.1) to equation (7.2) uses the forward Euler discretization method.
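As a sketch, iterating equation (7.2) is just a loop; the period and initial condition below are placeholders.

# Forward Euler iteration of x_{k+1} = x_k + (x_k - 3)ΔT (equation (7.2)).
dT = 0.1  # placeholder update period in seconds
x = 5.0   # placeholder initial condition

for k in range(5):
    x = x + (x - 3.0) * dT
    print(f"x_{k + 1} = {x:.4f}")

# Since ẋ = x − 3 is unstable about its equilibrium at x = 3, the iterates grow away
# from 3, matching the continuous solution x(t) = 3 + (x(0) − 3)e^t.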

7.1 Continuous to discrete pole mapping


When a continuous system is discretized, its poles in the LHP map to the inside of a
unit circle. Table 7.1 contains a few common points and figure 7.1 shows the mapping
visually.

Continuous Discrete
(0, 0) (1, 0)
imaginary axis edge of unit circle
(−∞, 0) (0, 0)

Table 7.1: Mapping from continuous to discrete

Figure 7.1: Mapping of complex plane from continuous (left) to discrete (right)

7.1.1 Discrete system stability


Eigenvalues of a system that are within the unit circle are stable, but why is that? Let’s
consider a scalar equation xk+1 = axk. |a| < 1 makes xk+1 converge to zero. The
same applies to a complex number like z = x + yi for xk+1 = zxk . If the magnitude
of the complex number z is less than one, xk+1 will converge to zero. Values with a
magnitude of 1 oscillate forever because xk+1 never decays.

7.1.2 Discrete system behavior


As ω increases in s = jω, a pole in the discrete plane moves around the perimeter of
the unit circle. Once it hits ωs/2 (half the sampling frequency) at (−1, 0), the pole wraps
around. This is due to poles faster than the sample frequency folding down to below
the sample frequency (that is, higher frequency signals alias to lower frequency ones).
You may notice that poles can be placed at (0, 0) in the discrete plane. This is known
as a deadbeat controller. An Nth -order deadbeat controller decays to the reference in
N timesteps. While this sounds great, there are other considerations like control effort,
robustness, and noise immunity.
If poles from (1, 0) to (0, 0) on the x-axis approach infinity, then what do poles from
(−1, 0) to (0, 0) represent? Them being faster than infinity doesn’t make sense. Poles
in this location exhibit oscillatory behavior similar to complex conjugate pairs. See
figures 7.2 and 7.3. The jaggedness of these signals is due to the frequency of the
system dynamics being above the Nyquist frequency (half the sample frequency). The
discretized signal doesn’t have enough samples to reconstruct the continuous system’s
dynamics.

Figure 7.2: Single poles in various locations in discrete plane
Figure 7.3: Complex conjugate poles in various locations in discrete plane

7.1.3 Nyquist frequency


To completely reconstruct a signal, the Nyquist-Shannon sampling theorem states that
it must be sampled at a frequency at least twice the maximum frequency it contains.
The highest frequency a given sample rate can capture is called the Nyquist frequency,
which is half the sample frequency. This is why recorded audio is sampled at 44.1 kHz.
The maximum frequency a typical human can hear is about 20 kHz, so the Nyquist
frequency is 20 kHz and the minimum sampling frequency is 40 kHz. (44.1 kHz in
particular was chosen for unrelated historical reasons.)

Figure 7.4: The original signal is a 1.5 Hz sine wave, which means its Nyquist frequency
is 1.5 Hz. The signal is being sampled at 2 Hz, so the aliased signal is a 0.5 Hz sine
wave.
Frequencies above the Nyquist frequency are folded down across it. The higher fre-
quency and the folded down lower frequency are said to alias each other.[2] Figure 7.4
demonstrates aliasing.
The effect of these high-frequency aliases can be reduced with a low-pass filter (called
an anti-aliasing filter in this application).

7.2 Effects of discretization on controller performance


7.2.1 Sample delay
Implementing a discrete control system is easier than implementing a continuous one,
but discretization has drawbacks. A microcontroller updates the system input in dis-
crete intervals of duration T ; it’s held constant between updates. This introduces an
average sample delay of T/2. Large delays can make a stable controller in the continuous

[2] The aliases of a frequency f can be expressed as f_alias(N) = |f − N fs|. For example, if a 200
Hz sine wave is sampled at 150 Hz, the observer will see a 50 Hz signal instead of a 200 Hz one.

domain become unstable in the discrete domain since it can’t react as quickly to output
changes. Here are a few ways to combat this.
• Run the controller with a high sample rate.
• Design the controller in the analog domain with enough phase margin to com-
pensate for any phase loss that occurs as part of discretization.
• Convert the plant to the digital domain and design the controller completely in
the digital domain.

7.2.2 Sample rate


Running a feedback controller at a faster update rate doesn’t always mean better control.
In fact, you may be using more computational resources than you need. However, here
are some reasons for running at a faster update rate.
Firstly, if you have a discrete model of the system, that model can more accurately
approximate the underlying continuous system. Not all controllers use a model though.
Secondly, the controller can better handle fast system dynamics. If the system can move
from its initial state to the desired one in under 250 ms, you obviously want to run the
controller with a period less than 250 ms. When you reduce the sample period, you’re
making the discrete controller more accurately reflect what the equivalent continuous
controller would do (controllers built from analog circuit components like op-amps are
continuous).
Running at a lower sample rate only causes problems if you don’t take into account
the response time of your system. Some systems like heaters have outputs that change
on the order of minutes. Running a control loop at 1 kHz doesn’t make sense for this
because the plant input the controller computes won’t change much, if at all, in 1 ms.

7.3 Linear system discretization


We’re going to discretize the following continuous time state-space model

ẋ = Ac x + Bc u + w
y = Cc x + Dc u + v

where w is the process noise, v is the measurement noise, and both are zero-mean
white noise sources with covariances of Qc and Rc respectively. w and v are defined
as normally distributed random variables.

w ∼ N (0, Qc )
v ∼ N (0, Rc )

Discretization is done using a zero-order hold. That is, the input is only updated at
discrete intervals and it’s held constant between samples.

7.3.1 Taylor series

R Watch the “Taylor series” video from 3Blue1Brown’s Essence of calculus series
for an explanation of how the Taylor series expansion works.

“Taylor series” (22 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/taylor-series

The definition for the matrix exponential and the approximations below all use the
Taylor series expansion. The Taylor series is a method of approximating a function
like et via the summation of weighted polynomial terms like tk . et has the following
Taylor series around t = 0.
e^t = Σ_{n=0}^{∞} t^n / n!

where a finite upper bound on the number of terms produces an approximation of et .


As n increases, the polynomial terms increase in power and the weights by which they
are multiplied decrease. For et and some other functions, the Taylor series expansion
equals the original function for all values of t as the number of terms approaches infin-
ity.[3] Figure 7.5 shows the Taylor series expansion of et around t = 0 for a varying
number of terms.
We’ll expand the first few terms of the Taylor series expansion in equation (7.3) for
X = AT so we can compare it with other methods.

Σ_{k=0}^{3} (1/k!)(AT)^k = I + AT + (1/2)A²T² + (1/6)A³T³

[3] Functions whose Taylor series expansion converges to and equals the original function are called
analytic functions.

Figure 7.5: Taylor series expansions of et around t = 0 for n terms

Table 7.2 compares discretization methods for the matrix case, and table 7.3 compares
their Taylor series expansions. These use a more complex formula which we won’t
present here.

Method             Conversion to Ad
Zero-order hold    e^(Ac T)
Bilinear           (I + (1/2)Ac T)(I − (1/2)Ac T)⁻¹
Backward Euler     (I − Ac T)⁻¹
Forward Euler      I + Ac T

Table 7.2: Discretization methods (matrix case). The zero-order hold discretization
method is exact.

Each of them has different stability properties. The bilinear transform preserves the
(in)stability of the continuous time system.

Method             Taylor series expansion
Zero-order hold    I + Ac T + (1/2)Ac²T² + (1/6)Ac³T³ + . . .
Bilinear           I + Ac T + (1/2)Ac²T² + (1/4)Ac³T³ + . . .
Backward Euler     I + Ac T + Ac²T² + Ac³T³ + . . .
Forward Euler      I + Ac T

Table 7.3: Taylor series expansions of discretization methods (matrix case). The zero-
order hold discretization method is exact.

7.3.2 Matrix exponential

R Watch the “How (and why) to raise e to the power of a matrix” video from
3Blue1Brown’s Essence of linear algebra series for a visual introduction to the
matrix exponential.

“How (and why) to raise e to the power of a matrix” (27 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/matrix-exponents

Definition 7.3.1 — Matrix exponential. Let X be an n × n matrix. The exponential
of X denoted by e^X is the n × n matrix given by the following power series.

e^X = Σ_{k=0}^{∞} (1/k!) X^k                                   (7.3)

where X^0 is defined to be the identity matrix I with the same dimensions as X.

To understand why the matrix exponential is used in the discretization process, consider
the scalar differential equation ẋ = ax. The solution to this type of differential equation
uses an exponential.
ẋ = ax
dx/dt = ax(t)
dx = ax(t) dt
(1/x(t)) dx = a dt
∫₀ᵗ (1/x(t)) dx = ∫₀ᵗ a dt
ln(x(t))|₀ᵗ = at|₀ᵗ
ln(x(t)) − ln(x(0)) = at − a·0
ln(x(t)) − ln(x₀) = at
ln(x(t)/x₀) = at
x(t)/x₀ = e^(at)
x(t) = e^(at) x₀

This solution generalizes via the matrix exponential to the set of differential equations
ẋ = Ax + Bu we use to describe systems.[4]
x(t) = eAt x0 + A−1 (eAt − I)Bu

where x0 contains the initial conditions and u is the constant input from time 0 to t. If
the initial state is the current system state, then we can describe the system’s state over
time as
xk+1 = eAT xk + A−1 (eAT − I)Buk
or more compactly,
xk+1 = Ad xk + Bd uk

where T is the time between samples xk and xk+1 . Theorem 7.3.1 has more efficient
ways to compute Ad and Bd .

7.3.3 Definition
The model can be discretized as follows
xk+1 = Ad xk + Bd uk + wk
[4] See section D.1 for a complete derivation of the linear system zero-order hold.

yk = Cd xk + Dd uk + vk

with covariances

wk ∼ N (0, Qd )
vk ∼ N (0, Rd )

Theorem 7.3.1 — Linear system zero-order hold.

Ad = e^(Ac T)                                                  (7.4)
Bd = ∫₀ᵀ e^(Ac τ) dτ Bc = Ac⁻¹(Ad − I)Bc                       (7.5)
Cd = Cc                                                        (7.6)
Dd = Dc                                                        (7.7)
Qd = ∫₀ᵀ e^(Ac τ) Qc e^(Acᵀ τ) dτ                              (7.8)
Rd = (1/T) Rc                                                  (7.9)

where subscripts c and d denote matrices for the continuous or discrete systems
respectively, T is the sample period of the discrete system, and e^(Ac T) is the matrix
exponential of Ac T.

Ad and Bd can be computed in one step as

exp([Ac  Bc; 0  0] T) = [Ad  Bd; 0  I]

and Qd can be computed as

Φ = exp([−Ac  Qc; 0  Acᵀ] T) = [Ad⁻¹  Ad⁻¹Qd; 0  Adᵀ]

where Qd = Φ₂,₂ᵀ Φ₁,₂ [10].

See appendix D.1 for derivations.


To see why Rc is being divided by T , consider the discrete white noise sequence vk
and the (non-physically realizable) continuous white noise process v. Whereas Rd,k =
E[vk vkT ] is a covariance matrix, Rc (t) defined by E[v(t)vT (τ )] = Rc (t)δ(t − τ ) is a
spectral density matrix (the Dirac function δ(t−τ ) has units of 1/sec). The covariance
matrix Rc (t)δ(t − τ ) has infinite-valued elements. The discrete white noise sequence
can be made to approximate the continuous white noise process by shrinking the pulse
lengths (T ) and increasing their amplitude, such that Rd → (1/T) Rc.
That is, in the limit as T → 0, the discrete noise sequence tends to one of infinite-
valued pulses of zero duration such that the area under the “impulse” autocorrelation
function is Rd T . This is equal to the area Rc under the continuous white noise impulse
autocorrelation function.
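
As a concrete illustration of theorem 7.3.1, here's a minimal sketch (an illustrative addition, not the book's toolchain) that computes Ad and Bd with the one-step augmented matrix exponential. It assumes Eigen's unsupported MatrixFunctions module provides the matrix exponential.

#include <Eigen/Core>
#include <unsupported/Eigen/MatrixFunctions>

/**
 * Discretizes the continuous model (Ac, Bc) with a zero-order hold of period T.
 */
void DiscretizeZOH(const Eigen::MatrixXd& Ac, const Eigen::MatrixXd& Bc,
                   double T, Eigen::MatrixXd& Ad, Eigen::MatrixXd& Bd) {
  int states = Ac.rows();
  int inputs = Bc.cols();

  // Build M = [Ac Bc; 0 0] * T, then exp(M) = [Ad Bd; 0 I]
  Eigen::MatrixXd M = Eigen::MatrixXd::Zero(states + inputs, states + inputs);
  M.topLeftCorner(states, states) = Ac;
  M.topRightCorner(states, inputs) = Bc;
  M *= T;

  Eigen::MatrixXd phi = M.exp();

  Ad = phi.topLeftCorner(states, states);
  Bd = phi.topRightCorner(states, inputs);
}
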

7.4 Discrete state-space notation


Below is the discrete version of state-space notation.

Definition 7.4.1 — Discrete state-space notation.

xk+1 = Axk + Buk (7.10)


yk = Cxk + Duk (7.11)

A  system matrix         x  state vector
B  input matrix          u  input vector
C  output matrix         y  output vector
D  feedthrough matrix

7.5 Closed-loop controller


With the control law uk = K(rk − xk ), we can derive the closed-loop state-space
equations. We’ll discuss where this control law comes from in subsection 7.7.
First is the state update equation. Substitute the control law into equation (7.10).

xk+1 = Axk + BK(rk − xk )


xk+1 = Axk + BKrk − BKxk
xk+1 = (A − BK)xk + BKrk (7.12)

Now for the output equation. Substitute the control law into equation (7.11).
yk = Cxk + D(K(rk − xk ))
yk = Cxk + DKrk − DKxk
yk = (C − DK)xk + DKrk (7.13)

Instead of commanding the system to a state using the vector uk directly, we can now
specify a vector of desired states through rk and the controller will choose values of
uk for us over time that drive the system toward the reference.
The eigenvalues of A − BK are the poles of the closed-loop system. Therefore, the
rate of convergence and stability of the closed-loop system can be changed by moving
the poles via the eigenvalues of A − BK. A and B are inherent to the system, but
K can be chosen arbitrarily by the controller designer. For equation (7.12) to reach
steady-state, the eigenvalues of A − BK must be within the unit circle since this is a
discrete system. There will be steady-state error if Ark ̸= rk .

Symbol   Name                     Rows × Columns
A        system matrix            states × states
B        input matrix             states × inputs
C        output matrix            outputs × states
D        feedthrough matrix       outputs × inputs
K        controller gain matrix   inputs × states
r        reference vector         states × 1
x        state vector             states × 1
u        input vector             inputs × 1
y        output vector            outputs × 1

Table 7.4: Controller matrix dimensions
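
Here's a minimal sketch (an illustrative addition) of checking discrete closed-loop stability numerically: compute the eigenvalues of A − BK and verify they all lie inside the unit circle.

#include <cmath>
#include <complex>

#include <Eigen/Core>
#include <Eigen/Eigenvalues>

/**
 * Returns true if every eigenvalue of A - BK lies inside the unit circle,
 * i.e., the discrete closed-loop system x_{k+1} = (A - BK)x_k + BK r_k is stable.
 */
bool IsStable(const Eigen::MatrixXd& A, const Eigen::MatrixXd& B,
              const Eigen::MatrixXd& K) {
  Eigen::MatrixXd closedLoop = A - B * K;

  Eigen::EigenSolver<Eigen::MatrixXd> solver{closedLoop};
  for (int i = 0; i < closedLoop.rows(); ++i) {
    if (std::abs(solver.eigenvalues()[i]) >= 1.0) {
      return false;
    }
  }
  return true;
}
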

7.6 Pole placement


This is the practice of placing the poles of a closed-loop system directly to produce a de-
sired response. Python Control offers several pole placement algorithms for generating
controller or observer gains from a set of poles.

Since all our applications will be discrete systems, we’ll place poles in the discrete
domain (the z-plane). The s-plane’s LHP maps to the inside of a unit circle (see figure
7.6).

Figure 7.6: Mapping of complex plane from continuous (left) to discrete (right)

Pole placement should only be used if you know what you’re doing. It’s much easier to
let LQR place the poles for you, which we’ll discuss next.

7.7 Linear-quadratic regulator


7.7.1 The intuition
We can demonstrate the basic idea behind the linear-quadratic regulator with the fol-
lowing flywheel model.
ẋ = ax + bu

where a is a negative constant representing the back-EMF of the motor, x is the angular
velocity, b is a positive constant that maps the input voltage to some change in angular
velocity (angular acceleration), u is the voltage applied to the motor, and ẋ is the angular
acceleration. Discretized, this equation would look like

xk+1 = ad xk + bd uk

If the angular velocity starts from zero and we apply a positive voltage, we’d see the
motor spin up to some constant speed following an exponential decay, then stay at that

speed. If we throw in the control law uk = kp (rk − xk ), we can make the system
converge to a desired state rk through proportional feedback. In what manner can we
pick the constant kp that balances getting to the target angular velocity quickly with
getting there efficiently (minimal oscillations or excessive voltage)?
We can solve this problem with something called the linear-quadratic regulator. We’ll
define the following cost function that includes the states and inputs:

J = Σ_{k=0}^{∞} (Q(rk − xk)² + R uk²)

We want to minimize this while obeying the constraint that the system follow our fly-
wheel dynamics xk+1 = ad xk + bd uk .
The cost is the sum of the squares of the error and the input for all time. If the controller
gain kp we pick in the control law uk = kp (rk − xk ) is stable, the error rk − xk and
the input uk will both go to zero and give us a finite cost. Q and R let us decide
how much the error and input contribute to the cost (we will require that Q ≥ 0 and
R > 0 for reasons that will be clear shortly[5] ). Penalizing error more will make the
controller more aggressive, while penalizing the input more will make the controller
less aggressive. We want to pick a kp that minimizes the cost.
There’s a common trick for finding the value of a variable that minimizes a function of
that variable. We’ll take the derivative (the slope) of the cost function with respect to
the input uk , set the derivative to zero, then solve for uk . When the slope is zero, the
function is at a minimum or maximum. Now, the cost function we picked is quadratic.
All the terms are strictly positive on account of the squared variables and nonnegative
weights, so our cost is strictly positive and the quadratic function is concave up. The
uk we found is therefore a minimum.
The actual process of solving for uk is mathematically intensive and outside the scope
of this explanation (appendix B references a derivation for those curious). The rest of
this section will describe the more general form of the linear-quadratic regulator and
how to use it.

7.7.2 The mathematical definition


Instead of placing the poles of a closed-loop system manually, the linear-quadratic
regulator (LQR) places the poles for us based on acceptable relative error and control
effort costs. This method of controller design uses a quadratic function for the cost-
to-go defined as the sum of the error and control effort over time for the linear system
xk+1 = Axk + Buk .

[5] Let's consider the boundary conditions on the weights Q and R. If we set Q to zero, error doesn't
contribute to the cost, so the optimal solution is to not move. This minimizes the sum of the inputs over time.
If we let R be zero, the input doesn't contribute to the cost, so infinite inputs are allowed as they minimize
the sum of the errors over time. This isn't useful, so we require that the input be penalized with a nonzero
R.

Figure 7.7: Cost-to-go for elevator model

J = Σ_{k=0}^{∞} (xkᵀ Qxk + ukᵀ Ruk)

where J represents a trade-off between state excursion and control effort with the
weighting factors Q and R. LQR is a control law u that minimizes the cost func-
tional. Figure 7.7 shows the optimal cost-to-go for an elevator model. Pole placement,
on the other hand, will have a cost-to-go above this for an arbitrary state vector (in this
case, an arbitrary position-velocity pair).
The cost-to-go looks effectively constant along the velocity axis because the velocity
is contributing much less to the cost-to-go than position.[6] In other words, it’s much
more expensive to correct for a position error than a velocity error. This difference
in cost is reflected by LQR's selected position feedback gain of Kp = 234.041 and
selected velocity feedback gain of Kd = 5.603.
[6] While it may not look like it at this scale, the elevator's cost-to-go along both state axes is strictly
increasing away from the origin (convex upward). This means there's a global minimum at the origin, and
the system is globally stable; no matter what state you start from, you're guaranteed to reach the origin
with this controller.
The minimum of LQR’s cost functional is found by setting the derivative of the cost
functional to zero and solving for the control law uk . However, matrix calculus is used
instead of normal calculus to take the derivative.
The feedback control law that minimizes J is shown in theorem 7.7.1.

Theorem 7.7.1 — Linear-quadratic regulator.

u*k = arg min over uk of  Σ_{k=0}^{∞} (xkᵀ Qxk + ukᵀ Ruk)
      subject to  xk+1 = Axk + Buk                             (7.14)

If the system is controllable, the optimal control policy u∗k that drives all the states
to zero is −Kxk . To converge to nonzero states, a reference vector rk can be added
to the state xk .
uk = K(rk − xk ) (7.15)

This means that optimal control can be achieved with simply a set of proportional gains
on all the states. To use the control law, we need knowledge of the full state of the
system. That means we either have to measure all our states directly or estimate those
we do not measure.
See appendix B for how K is calculated. If the result is finite, the controller is guaran-
teed to be stable and robust with a phase margin of 60 degrees [20].

R LQR design’s Q and R matrices don’t need discretization, but the K calculated
for continuous time and discrete time systems will be different. The discrete
time gains approach the continuous time gains as the sample period tends to
zero.
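
As a rough illustration of how K could be computed numerically (an illustrative sketch, not the derivation referenced in appendix B), one can iterate the discrete algebraic Riccati equation until the cost-to-go matrix P converges, then back out the gain.

#include <Eigen/Core>
#include <Eigen/LU>

/**
 * Computes the infinite-horizon discrete LQR gain K by iterating the discrete
 * algebraic Riccati equation (DARE) to convergence. This is a simple
 * fixed-point iteration; production solvers use more robust methods.
 */
Eigen::MatrixXd LQRGain(const Eigen::MatrixXd& A, const Eigen::MatrixXd& B,
                        const Eigen::MatrixXd& Q, const Eigen::MatrixXd& R) {
  Eigen::MatrixXd P = Q;

  for (int i = 0; i < 10000; ++i) {
    // P' = AᵀPA − AᵀPB(R + BᵀPB)⁻¹BᵀPA + Q
    Eigen::MatrixXd nextP =
        A.transpose() * P * A -
        A.transpose() * P * B * (R + B.transpose() * P * B).inverse() *
            B.transpose() * P * A +
        Q;

    if ((nextP - P).norm() < 1e-10) {
      P = nextP;
      break;
    }
    P = nextP;
  }

  // K = (R + BᵀPB)⁻¹BᵀPA
  return (R + B.transpose() * P * B).inverse() * B.transpose() * P * A;
}
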

7.7.3 Bryson’s rule


The next obvious question is what values to choose for Q and R. While this can
be more of an art than a science, Bryson’s rule provides a good starting point. With


Bryson’s rule, the diagonals of the Q and R matrices are chosen based on the maxi-
mum acceptable value for each state and actuator. The nondiagonal elements are zero.
The balance between state excursion and control effort penalty (Q and R respectively)
can be adjusted using the weighting factor ρ.
   
Q = diag(ρ / x²max)        R = diag(1 / u²max)

where xmax is the vector of acceptable state tolerances, umax is the vector of accept-
able input tolerances, and ρ is a tuning constant. Small values of ρ penalize control
effort while large values of ρ penalize state excursions. Large values would be chosen
in applications like fighter jets where performance is necessary. Spacecrafts would use
small values to conserve their limited fuel supply.
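
Here's a minimal sketch (an illustrative addition with made-up tolerances) of building Q and R with Bryson's rule for a two-state, one-input system; the huge second-state tolerance approximates a zero cost on that state, matching the example in subsection 7.7.4.

#include <Eigen/Core>

int main() {
  // Maximum acceptable state excursions: 20 rad/s on the first state and a
  // huge tolerance on the second state to approximate "don't care".
  Eigen::Vector2d xMax(20.0, 1e9);

  // Maximum acceptable input: 12 V
  double uMax = 12.0;

  // Trade-off between state excursion and control effort
  double rho = 1.0;

  Eigen::Matrix2d Q = Eigen::Matrix2d::Zero();
  Q(0, 0) = rho / (xMax(0) * xMax(0));
  Q(1, 1) = rho / (xMax(1) * xMax(1));

  Eigen::Matrix<double, 1, 1> R;
  R(0, 0) = 1.0 / (uMax * uMax);
}
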

7.7.4 Pole placement vs LQR


This example uses the following continuous second-order model for a CIM motor (a
DC brushed motor).
A = [−b/J    Kt/J]      B = [ 0 ]      C = [1  0]      D = [0]
    [−Ke/L   −R/L]          [1/L]

Figure 7.8 shows the response using various discrete pole placements and using LQR
with the following cost matrices.
Q = [1/20²   0]      R = [1/12²]
    [0       0]

With Bryson’s rule, this means an angular velocity tolerance of 20 rad/s, an infinite
current tolerance (in other words, we don’t care what the current does), and a voltage
tolerance of 12 V.
Notice with pole placement that as the current pole moves toward the origin, the control
effort becomes more aggressive.

Figure 7.8: Second-order CIM motor response with pole placement and LQR

7.8 Feedforward
There are two types of feedforwards: model-based feedforward and feedforward for
unmodeled dynamics. The first solves a mathematical model of the system for the
inputs required to meet desired velocities and accelerations. The second compensates
for unmodeled forces or behaviors directly so the feedback controller doesn’t have to.
Both types can facilitate simpler feedback controllers; we’ll cover examples of each.

7.8.1 Plant inversion


Plant inversion is a method of model-based feedforward that solves the plant for the
input that will make the plant track a desired state. This is called inversion because in a
block diagram, the inverted plant feedforward and plant cancel out to produce a unity
system from input to output.


Figure 7.9: Open-loop control system with plant inversion feedforward

While it can be an effective tool, the following should be kept in mind.



1. Don’t invert an unstable plant. If the expected plant doesn’t match the real plant
exactly, the plant inversion will still result in an unstable system. Stabilize the
plant first with feedback, then inject an inversion.
2. Don’t invert a nonminimum phase system. The advice for pole-zero cancellation
in subsection E.2.2 applies here.
Setup
Let’s start with the equation for the reference dynamics

rk+1 = Ark + Buk

where uk is the feedforward input. Note that this feedforward equation does not and
should not take into account any feedback terms. We want to find the optimal uk such
that we minimize the error between rk+1 and the state the model would reach from rk .

rk+1 − Ark = Buk

To solve for uk , we need to take the inverse of the nonsquare matrix B. This isn’t
possible, but we can find the pseudoinverse given some constraints on the state tracking
error and control effort. To find the optimal solution for these sorts of trade-offs, one
can define a cost function and attempt to minimize it. To do this, we'll first rearrange
the equation so that one side equals zero.
0 = Buk − (rk+1 − Ark )

This expression will be the state tracking cost we use in the following cost function as
an H2 norm.

J = (Buk − (rk+1 − Ark ))T (Buk − (rk+1 − Ark ))

Minimization
Given theorem 5.12.1, find the minimum of J by taking the partial derivative with
respect to uk and setting the result to 0.

∂J
= 2BT (Buk − (rk+1 − Ark ))
∂uk
0 = 2BT (Buk − (rk+1 − Ark ))
0 = 2BT Buk − 2BT (rk+1 − Ark )
2BT Buk = 2BT (rk+1 − Ark )
BT Buk = BT (rk+1 − Ark )

uk = (BT B)−1 BT (rk+1 − Ark )

(BT B)−1 BT is the Moore-Penrose pseudoinverse of B denoted by B+ .

Theorem 7.8.1 — Linear plant inversion. Given the discrete model xk+1 =
Axk + Buk , the plant inversion feedforward is

uk = B+ (rk+1 − Ark ) (7.16)

where B+ is the Moore-Penrose pseudoinverse of B, rk+1 is the reference at the


next timestep, and rk is the reference at the current timestep.
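
Here's a minimal sketch of equation (7.16) (an illustrative addition); it computes the pseudoinverse solution with Eigen's complete orthogonal decomposition.

#include <Eigen/Core>
#include <Eigen/QR>

/**
 * Computes the plant inversion feedforward u_k = B⁺(r_{k+1} − A r_k) for the
 * discrete model x_{k+1} = A x_k + B u_k.
 */
Eigen::VectorXd PlantInversionFF(const Eigen::MatrixXd& A,
                                 const Eigen::MatrixXd& B,
                                 const Eigen::VectorXd& r,
                                 const Eigen::VectorXd& nextR) {
  // Solving B u = (r_{k+1} − A r_k) in a least-squares sense applies the
  // Moore-Penrose pseudoinverse of B.
  return B.completeOrthogonalDecomposition().solve(nextR - A * r);
}
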

Discussion
Linear plant inversion in theorem 7.8.1 compensates for reference dynamics that don’t
follow how the model inherently behaves. If they do follow the model, the feedforward
has nothing to do as the model already behaves in the desired manner. When this occurs,
rk+1 − Ark will return a zero vector.
For example, a constant reference requires a feedforward that opposes system dynamics
that would change the state over time. If the system has no dynamics, then A = I and
thus
uk = B+ (rk+1 − Irk )
uk = B+ (rk+1 − rk )
For a constant reference, rk+1 = rk .
uk = B+ (rk − rk )
uk = B+ (0)
uk = 0

so no feedforward is required to hold a system with no dynamics at a constant reference,


as expected.
Figure 7.10 shows plant inversion applied to a second-order CIM motor model. Plant
inversion accounts for the motor back-EMF and eliminates steady-state error.

Figure 7.10: Second-order CIM motor response with plant inversion

7.8.2 Unmodeled dynamics

In addition to plant inversion, one can include feedforwards for unmodeled dynamics.
Consider an elevator model which doesn't include gravity. A constant voltage offset
can be used to compensate for this. The feedforward takes the form of a voltage constant
because voltage is proportional to force applied, and the force is acting in only one
direction at all times.

uk = Vapp (7.17)

where Vapp is a constant. Another feedforward holds a single-jointed arm steady in the
presence of gravity. It has the following form.

uk = Vapp cos θ (7.18)

where Vapp is the voltage required to keep the single-jointed arm level with the ground,
and θ is the angle of the arm relative to the ground. Therefore, the force applied is
greatest when the arm is parallel with the ground and zero when the arm is perpendicular
to the ground (at that point, the joint supports all the weight).
Note that the elevator model could be augmented easily enough to include gravity and
still be linear, but this wouldn’t work for the single-jointed arm since a trigonometric
function is required to model the gravitational force in the arm’s rotating reference
frame.[7]
[7]While the applied torque of the motor is constant throughout the arm’s range of motion, the torque

caused by gravity in the opposite direction varies according to the arm’s angle.
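
Here's a minimal sketch of equations (7.17) and (7.18) (an illustrative addition); kG is a hypothetical constant equal to the voltage needed to hold the mechanism against gravity.

#include <cmath>

// Elevator gravity feedforward: a constant voltage offset.
double ElevatorGravityFF(double kG) { return kG; }

// Single-jointed arm gravity feedforward: scaled by the cosine of the arm
// angle measured from horizontal.
double ArmGravityFF(double kG, double theta) { return kG * std::cos(theta); }
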

7.8.3 Do feedforwards affect stability?


Feedforwards have no effect on stability because they don’t use the system state. We’ll
demonstrate this for a plant inversion feedforward.
Let uk = K(rk − xk ) be a feedback controller for the discrete model xk+1 = Axk +
Buk .

xk+1 = Axk + Buk


xk+1 = Axk + B(K(rk − xk ))
xk+1 = Axk + BK(rk − xk )
xk+1 = Axk + BKrk − BKxk
xk+1 = (A − BK)xk + BKrk

The system is stable if the eigenvalues of A − BK are within the unit circle. Now add
the plant inversion feedforward controller B+ (rk+1 − Ark ) to uk .

xk+1 = Axk + Buk


xk+1 = Axk + B(K(rk − xk ) + B+ (rk+1 − Ark ))
xk+1 = Axk + BK(rk − xk ) + rk+1 − Ark
xk+1 = Axk + BKrk − BKxk + rk+1 − Ark
xk+1 = (A − BK)xk + rk+1 + (BK − A)rk
xk+1 = (A − BK)xk + rk+1 − (A − BK)rk

The multiplicative term on xk is still A−BK, so the feedforward didn’t affect stability.
It still affects the system response and steady-state error though.

7.9 Numerical integration methods


Most systems don’t have linear dynamics and their differential equations can’t be solved
analytically. Instead, we’ll have to approximate their solutions with numerical integra-
tion.

7.9.1 Butcher tableaus


Butcher tableaus are a more succinct representation for explicit and implicit Runge-
Kutta numerical integration methods. Here’s the general structure for explicit methods.

0   |
c2  | a2,1
⋮   | ⋮          ⋱
cs  | as,1   ⋯   as,s−1
----+-----------------------
    | b1     ⋯   ⋯   bs

where s is the number of stages in the method, the matrix [aij ] is the Runge-Kutta
matrix, b1 , . . . , bs are the weights, and c1 , . . . , cs are the nodes. The top-left quadrant
contains the sums of the rows in the top-right quadrant. Each column in the right half
corresponds to a k coefficient from k1 to ks .
The family of solutions to ẋ = f (t, x) is given by

k1 = f (tk , xk )
k2 = f (tk + c2 h, xk + h(a2,1 k1 ))
⋮
ks = f (tk + cs h, xk + h(as,1 k1 + . . . + as,s−1 ks−1 ))
xk+1 = xk + h Σ_{i=1}^{s} bi ki

where h is the timestep duration.

7.9.2 Forward Euler method


The simplest explicit Runge-Kutta integration method is forward Euler integration. We
don’t recommend using it because it suffers from numerical stability issues. We’ll
demonstrate how to translate its Butcher tableau into equations that integrate ẋ =
f (t, x) from 0 to h.

0 |
--+--
  | 1

k1 = f (t + 0h, xk)
xk+1 = xk + h(1k1)

Remove zeroed out terms.


k1 = f (t, xk )
xk+1 = xk + hk1
Simplify.
xk+1 = xk + hf (t, xk )

In FRC, our differential equations are of the form ẋ = f (x, u) where u is held constant
between timesteps. Since it’s time-invariant, we can ignore the time argument of the
integration method. This gives theorem 7.9.1.

Theorem 7.9.1 — Forward Euler integration. Given the differential equation


ẋ = f (xk , uk ), this method solves for xk+1 at h seconds in the future. u is assumed
to be held constant between timesteps.

xk+1 = xk + hf (xk, uk)

0 |
--+--
  | 1

7.9.3 Runge-Kutta fourth-order method


The most common method we’ll cover is Runge-Kutta fourth-order (RK4). It’s simple
and accurate for most systems we’ll see in FRC. We’ll demonstrate how to translate its
Butcher tableau into equations that integrate ẋ = f (t, x) from 0 to h.

0   |
1/2 | 1/2
1/2 | 0    1/2
1   | 0    0    1
----+--------------------
    | 1/6  1/3  1/3  1/6

k1 = f (t + 0h, xk)
k2 = f (t + (1/2)h, xk + h((1/2)k1))
k3 = f (t + (1/2)h, xk + h(0k1 + (1/2)k2))
k4 = f (t + 1h, xk + h(0k1 + 0k2 + 1k3))
xk+1 = xk + h((1/6)k1 + (1/3)k2 + (1/3)k3 + (1/6)k4)

Remove zeroed out terms.


k1 = f (t, xk )

k2 = f (t + (1/2)h, xk + (1/2)h k1)
k3 = f (t + (1/2)h, xk + (1/2)h k2)
k4 = f (t + h, xk + h k3)
xk+1 = xk + h((1/6)k1 + (1/3)k2 + (1/3)k3 + (1/6)k4)
1/6 is usually factored out of the last equation to reduce the number of floating point
operations.

xk+1 = xk + (h/6)(k1 + 2k2 + 2k3 + k4)

In FRC, our differential equations are of the form ẋ = f (x, u) where u is held constant
between timesteps. Since it’s time-invariant, we can ignore the time argument of the
integration method. This gives theorem 7.9.2.

Theorem 7.9.2 — Runge-Kutta fourth-order integration. Given the differential


equation ẋ = f (xk , uk ), this method solves for xk+1 at h seconds in the future. u
is assumed to be held constant between timesteps.

k1 = f (xk, uk)
k2 = f (xk + (1/2)h k1, uk)
k3 = f (xk + (1/2)h k2, uk)
k4 = f (xk + h k3, uk)
xk+1 = xk + (h/6)(k1 + 2k2 + 2k3 + k4)

0   |
1/2 | 1/2
1/2 | 0    1/2
1   | 0    0    1
----+--------------------
    | 1/6  1/3  1/3  1/6

Here’s a reference implementation.


#include <Eigen/Core>
#include <units/time.h>

/**
* Performs 4th order Runge-Kutta integration of dx/dt = f(x, u) for dt.
*
* @param f The function to integrate. It must take two arguments x and u.
* @param x The initial value of x.
* @param u The value u held constant over the integration period.
* @param dt The time over which to integrate.
*/
template <typename F, typename T, typename U>
T RK4(F&& f, T x, U u, units::second_t dt) {
const auto h = dt.value();

T k1 = f(x, u);
T k2 = f(x + h * 0.5 * k1, u);
T k3 = f(x + h * 0.5 * k2, u);
T k4 = f(x + h * k3, u);

return x + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4);


}

Snippet 7.1. RK4 implementation in C++

Other methods of Runge-Kutta integration exist with various properties [1], but the
one presented here is popular for its high accuracy relative to the amount of floating
point operations (FLOPs) it requires.

7.9.4 Dormand-Prince method


Dormand-Prince (RKDP) is a fourth-order method with fifth-order error checking. It
uses an adaptive stepsize to enforce an upper bound on the integration error.

Theorem 7.9.3 — Dormand-Prince integration. Given the differential equation


ẋ = f (xk , uk ), this method solves for xk+1 at h seconds in the future. u is assumed
to be held constant between timesteps. It has the following Butcher tableau.

0     |
1/5   | 1/5
3/10  | 3/40        9/40
4/5   | 44/45       −56/15        32/9
8/9   | 19372/6561  −25360/2187   64448/6561   −212/729
1     | 9017/3168   −355/33       46732/5247   49/176    −5103/18656
1     | 35/384      0             500/1113     125/192   −2187/6784     11/84
------+----------------------------------------------------------------------------
      | 35/384      0             500/1113     125/192   −2187/6784     11/84     0
      | 5179/57600  0             7571/16695   393/640   −92097/339200  187/2100  1/40
The first row of coefficients below the table divider gives the fifth-order accurate
solution. The second row gives an alternative solution which, when subtracted from
the first solution, gives the error estimate.

Here’s a reference implementation.


#include <algorithm>
#include <array>
#include <cmath>

#include <Eigen/Core>
#include <units/time.h>

/**
* Performs adaptive Dormand-Prince integration of dx/dt = f(x, u) for dt.
*
* @param f        The function to integrate. It must take two arguments x and u.
* @param x The initial value of x.
* @param u The value u held constant over the integration period.
* @param dt The time over which to integrate.
* @param maxError The maximum acceptable truncation error. Usually a small
* number like 1e-6.
*/
template <typename F, typename T, typename U>
T RKDP(F&& f, T x, U u, units::second_t dt, double maxError = 1e-6) {
// See https://en.wikipedia.org/wiki/Dormand%E2%80%93Prince_method for the
// Butcher tableau the following arrays came from.

constexpr int kDim = 7;

// clang-format off
constexpr double A[kDim - 1][kDim - 1]{
{ 1.0 / 5.0},
{ 3.0 / 40.0, 9.0 / 40.0},
{ 44.0 / 45.0, -56.0 / 15.0, 32.0 / 9.0},
{19372.0 / 6561.0, -25360.0 / 2187.0, 64448.0 / 6561.0, -212.0 /
729.0},
{ 9017.0 / 3168.0, -355.0 / 33.0, 46732.0 / 5247.0, 49.0 / 176.0,
-5103.0 / 18656.0},
{ 35.0 / 384.0, 0.0, 500.0 / 1113.0, 125.0 / 192.0,
-2187.0 / 6784.0, 11.0 / 84.0}};
// clang-format on

constexpr std::array<double, kDim> b1{


35.0 / 384.0, 0.0, 500.0 / 1113.0, 125.0 / 192.0, -2187.0 / 6784.0,
11.0 / 84.0, 0.0};
constexpr std::array<double, kDim> b2{5179.0 / 57600.0, 0.0,
7571.0 / 16695.0, 393.0 / 640.0,
-92097.0 / 339200.0, 187.0 / 2100.0,
1.0 / 40.0};

T newX;
double truncationError;

double dtElapsed = 0.0;


double h = dt.value();

// Loop until we've gotten to our desired dt


while (dtElapsed < dt.value()) {
do {
// Only allow us to advance up to the dt remaining
h = std::min(h, dt.value() - dtElapsed);

// clang-format off
T k1 = f(x, u);
T k2 = f(x + h * (A[0][0] * k1), u);
T k3 = f(x + h * (A[1][0] * k1 + A[1][1] * k2), u);
T k4 = f(x + h * (A[2][0] * k1 + A[2][1] * k2 + A[2][2] * k3), u);
T k5 = f(x + h * (A[3][0] * k1 + A[3][1] * k2 + A[3][2] * k3 + A[3][3]
* k4), u);
T k6 = f(x + h * (A[4][0] * k1 + A[4][1] * k2 + A[4][2] * k3 + A[4][3]
* k4 + A[4][4] * k5), u);
// clang-format on

// Since the final row of A and the array b1 have the same coefficients
// and k7 has no effect on newX, we can reuse the calculation.
newX = x + h * (A[5][0] * k1 + A[5][1] * k2 + A[5][2] * k3 +
A[5][3] * k4 + A[5][4] * k5 + A[5][5] * k6);
T k7 = f(newX, u);

truncationError = (h * ((b1[0] - b2[0]) * k1 + (b1[1] - b2[1]) * k2 +


(b1[2] - b2[2]) * k3 + (b1[3] - b2[3]) * k4 +
(b1[4] - b2[4]) * k5 + (b1[5] - b2[5]) * k6 +
(b1[6] - b2[6]) * k7))
.norm();

h *= 0.9 * std::pow(maxError / truncationError, 1.0 / 5.0);


} while (truncationError > maxError);

dtElapsed += h;
x = newX;
}

return x;
}

Snippet 7.2. RKDP implementation in C++


Trees by Interdisciplinary Sciences building at UCSC

8. Nonlinear control

While many tools exist for designing controllers for linear systems, all systems in real-
ity are inherently nonlinear. We’ll briefly mention some considerations for nonlinear
systems.

8.1 Introduction
Recall from linear system theory that we defined systems as having the following form:

ẋ = Ax + Bu + w
y = Cx + Du + v

In this equation, A and B are constant matrices, which means they are both time-
invariant and linear (all transformations on the system state are linear ones, and those
transformations remain the same for all time). In nonlinear and time-variant systems,
the state evolution and output are defined by arbitrary functions of the current states
and inputs.

ẋ = f (x, u, w)
y = h(x, u, v)

Nonlinear functions come up regularly when attempting to control the pose of a vehicle
in the global coordinate frame instead of the vehicle’s rotating local coordinate frame.

Converting from one to the other requires applying a rotation matrix, which consists of
sine and cosine operations. These functions are nonlinear.

8.2 Linearization
One way to control nonlinear systems is to linearize the model around a reference point.
Then, all the powerful tools that exist for linear controls can be applied. This is done by
taking the Jacobians of f and h with respect to the state and input vectors. See section
5.12 for more on Jacobians.
∂f (x, u, w) ∂f (x, u, w)
A= B=
∂x ∂u
∂h(x, u, v) ∂h(x, u, v)
C= D=
∂x ∂u

Linearization of a nonlinear equation is a Taylor series expansion to only the first-order
terms (that is, terms whose variables have an exponent of one, like x¹). This is where
the small angle approximations for sin θ and cos θ (θ and 1 respectively) come from.
Higher order partial derivatives can be added to better approximate the nonlinear dy-
namics. We typically only linearize around equilibrium points[1] because we are inter-
ested in how the system behaves when perturbed from equilibrium. An FAQ on this
goes into more detail [5]. To be clear though, linearizing the system around the current
state as the system evolves does give a closer approximation over time.
Note that linearization with static matrices (that is, with a time-invariant linear system)
only works if the original system in question is feedback linearizable.
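
As a rough sketch of how the Jacobians above might be obtained numerically (an illustrative addition, not the book's tooling), one can approximate ∂f/∂x with central finite differences.

#include <functional>

#include <Eigen/Core>

/**
 * Numerically approximates the Jacobian of f with respect to x at (x, u)
 * using central finite differences.
 */
Eigen::MatrixXd NumericalJacobianX(
    const std::function<Eigen::VectorXd(const Eigen::VectorXd&,
                                        const Eigen::VectorXd&)>& f,
    const Eigen::VectorXd& x, const Eigen::VectorXd& u) {
  constexpr double kEpsilon = 1e-5;

  Eigen::MatrixXd J(f(x, u).rows(), x.rows());

  for (int col = 0; col < x.rows(); ++col) {
    Eigen::VectorXd xPlus = x;
    Eigen::VectorXd xMinus = x;
    xPlus(col) += kEpsilon;
    xMinus(col) -= kEpsilon;

    // Each column of the Jacobian is the sensitivity of f to one state
    J.col(col) = (f(xPlus, u) - f(xMinus, u)) / (2.0 * kEpsilon);
  }

  return J;
}
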

8.3 Lyapunov stability


Lyapunov stability is a fundamental concept in nonlinear control, so we’re going to give
a brief overview of what it is so students can research it further.
Since the state evolution in nonlinear systems is defined by a function rather than a con-
stant matrix, the system’s poles as determined by linearization move around. Nonlinear
control uses Lyapunov stability to determine if nonlinear systems are stable. From a
linear control theory point of view, Lyapunov stability says the system is stable if, for
a given initial condition, all possible eigenvalues of A from that point on remain in the
left-half plane. However, nonlinear control uses a different definition.

[1] Equilibrium points are points where ẋ = 0. At these points, the system is in steady-state.

Lyapunov stability means that the system trajectory can be kept arbitrarily close to
the origin by starting sufficiently close to it. Lyapunov’s direct method uses a function
consisting of the energy in a system or derivatives of the system’s state to prove stability
around an equilibrium point. This is done by showing that the function, and thus its
inputs, decay to some ground state. More rigorously, the value function V (x) must be
positive definite and equal zero at the equilibrium point

V (x) > 0
V (0) = 0

and its derivative V̇ (x) must be negative definite.

V̇(x) = dV/dt = (∂V/∂x)(dx/dt) ≤ 0

More than one Lyapunov function can prove stability, and if one function doesn’t prove
it, another candidate should be tried. For this reason, we refer to these functions as
Lyapunov candidate functions.

8.3.1 Lyapunov stability for linear systems


We’re going to find stability criteria for the linear system ẋ = Ax using Lyapunov
theory. Let’s use the following Lyapunov candidate function.

V (x) = xT Px where P = PT > 0

This function is positive definite by definition. Its derivative is

V̇ (x) = ẋT Px + xT Ṗx + xT Pẋ


V̇ (x) = ẋT Px + xT 0x + xT Pẋ
V̇ (x) = ẋT Px + xT Pẋ
V̇ (x) = (Ax)T Px + xT P(Ax)
V̇ (x) = xT AT Px + xT PAx
V̇ (x) = xT (AT P + PA)x

For this function to be negative definite, AᵀP + PA must be negative definite. A
positive definite P satisfying that condition exists if and only if all eigenvalues of A
are in the open left-half plane (i.e., A is stable).

8.4 Affine systems


Let x = x0 + δx and u = u0 + δu where δx and δu are perturbations from (x0 , u0 ).
A first-order linearization of ẋ = f (x, u) around (x0 , u0 ) gives
ẋ ≈ f(x0, u0) + (∂f(x, u)/∂x)|_{x0,u0} δx + (∂f(x, u)/∂u)|_{x0,u0} δu
ẋ = f(x0, u0) + (∂f(x, u)/∂x)|_{x0,u0} δx + (∂f(x, u)/∂u)|_{x0,u0} δu

An affine system is a linear system with a constant offset in the dynamics. If (x0 , u0 ) is
an equilibrium point, f (x0 , u0 ) = 0, the resulting model is linear, and LQR works as
usual. If (x0 , u0 ) is, say, the current operating point rather than an equilibrium point,
the easiest way to correctly apply LQR is
1. Find a control input u0 that makes (x0 , u0 ) an equilibrium point.
2. Obtain an LQR for the linearized system.
3. Add u0 to the LQR’s control input.
A control-affine system is of the form ẋ = f (x)+g(x)u. Since it has separable control
inputs, u0 can be derived via plant inversion as follows.
ẋ = f (x0 ) + g(x0 )u0
0 = f (x0 ) + g(x0 )u0
g(x0 )u0 = −f (x0 )
u0 = −g −1 (x0 )f (x0 ) (8.1)

For the control-affine system ẋ = f (x) + Bu, u0 would be


u0 = −B+ f (x0 ) (8.2)

8.4.1 Feedback linearization for reference tracking


Feedback linearization lets us erase the nonlinear dynamics of a system so we can apply
our own (usually linear) dynamics for reference tracking. To do this, we will perform a
similar procedure as in subsection 7.8.1 and solve for u given the reference dynamics
in ṙ.
ṙ = f (x) + Bu
Bu = ṙ − f (x)
u = B+ (ṙ − f (x)) (8.3)

R To use equation (8.3) in a discrete controller, one can approximate ṙ with (rk+1 − rk)/T,
where T is the time period between the two references.
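
Here's a minimal sketch of that discrete approximation (an illustrative addition); f is the drift dynamics and B is the control mapping from the control-affine model ẋ = f(x) + Bu.

#include <functional>

#include <Eigen/Core>
#include <Eigen/QR>

/**
 * Feedback linearization for reference tracking: solves B u = ṙ − f(x) in a
 * least-squares sense, approximating ṙ with (r_{k+1} − r_k) / T.
 */
Eigen::VectorXd FeedbackLinearize(
    const std::function<Eigen::VectorXd(const Eigen::VectorXd&)>& f,
    const Eigen::MatrixXd& B, const Eigen::VectorXd& x,
    const Eigen::VectorXd& r, const Eigen::VectorXd& nextR, double T) {
  Eigen::VectorXd rDot = (nextR - r) / T;
  return B.completeOrthogonalDecomposition().solve(rDot - f(x));
}
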

8.5 Pendulum
8.5.1 State-space model
Below is the model for a pendulum
θ̈ = −(g/l) sin θ

where θ is the angle of the pendulum and l is the length of the pendulum.
Since state-space representation requires that only single derivatives be used, they should
be broken up as separate states. We’ll reassign θ̇ to be ω so the derivatives are easier
to keep straight for state-space representation.
ω̇ = −(g/l) sin θ
Now separate the states.

θ̇ = ω
ω̇ = −(g/l) sin θ
This makes our state vector x = [θ  ω]ᵀ and our nonlinear model the following.

f(x, u) = [ω,  −(g/l) sin θ]ᵀ

Linearization around θ = 0
To apply our tools for linear control theory, the model must be a linear combination
of the states and inputs (addition and multiplication by constants). Since this model is
nonlinear on account of the sine function, we should linearize it.
Linearization finds a tangent line to the nonlinear dynamics at a desired point in the state-
space. The Taylor series is a way to approximate arbitrary functions with polynomials,
so we can use it for linearization.
The Taylor series expansion for sin θ around θ = 0 is θ − (1/6)θ³ + (1/120)θ⁵ − . . .. We'll
take just the first-order term θ to obtain a linear function.
take just the first-order term θ to obtain a linear function.

θ̇ = ω
ω̇ = −(g/l) θ
Now write the model in state-space representation. We’ll write out the system of equa-
tions with the zeroed variables included to assist with this.

θ̇ = 0θ + 1ω
ω̇ = −(g/l) θ + 0ω
Factor out θ and ω into a column vector.

[θ̇]   [ 0     1] [θ]
[ω̇] = [−g/l   0] [ω]                                          (8.4)

Linearization with the Jacobian


Here’s the original nonlinear model in state-space representation.
 
f(x, u) = [ω,  −(g/l) sin θ]ᵀ

If we want to linearize around an arbitrary point, we can take the Jacobian with respect
to x.
 
∂f(x, u)/∂x = [    0          1]
              [−(g/l) cos θ   0]

For full state feedback, knowledge of all states is required. If not all states are measured
directly, an estimator can be used to supplement them.
We may only be measuring θ in the pendulum example, not θ̇, so we’ll need to estimate
the latter. The C matrix the observer would use in this case is
C = [1  0]

Therefore, the output vector is

y = Cx
 
y = [1  0] [θ  ω]ᵀ
y = 1θ + 0ω
y=θ
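
Here's a minimal sketch (an illustrative addition) that simulates the nonlinear pendulum model with the RK4 function from snippet 7.1; the gravitational acceleration and pendulum length are assumed constants.

#include <cmath>

#include <Eigen/Core>
#include <units/time.h>

// Pendulum dynamics: x = [θ, ω]ᵀ with no input. Assumes RK4() from snippet
// 7.1 is available in this translation unit.
Eigen::Vector2d PendulumDynamics(const Eigen::Vector2d& x,
                                 const Eigen::Vector2d& u) {
  constexpr double g = 9.806;  // m/s²
  constexpr double l = 1.0;    // m

  Eigen::Vector2d xdot;
  xdot(0) = x(1);
  xdot(1) = -g / l * std::sin(x(0));
  return xdot;
}

int main() {
  Eigen::Vector2d x(0.5, 0.0);  // Start at 0.5 rad, at rest
  Eigen::Vector2d u = Eigen::Vector2d::Zero();

  // Simulate 1 second in 5 ms steps
  for (int i = 0; i < 200; ++i) {
    x = RK4(PendulumDynamics, x, u, units::second_t{0.005});
  }
}
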

8.6 Holonomic drivetrains


8.6.1 Model
Holonomic drivetrains have three degrees of freedom: x, y, and heading. They are
described by the following kinematics.

[ẋ]   [cos θ   −sin θ   0] [vx,chassis]
[ẏ] = [sin θ    cos θ   0] [vy,chassis]
[θ̇]   [  0        0     1] [ωchassis  ]

where vx,chassis is the velocity ahead in the chassis frame, vy,chassis is the velocity to
the left in the chassis frame, and ωchassis is the angular velocity in the chassis frame.
This can be written in state-space notation as

[ẋ]   [0 0 0] [x]   [cos θ   −sin θ   0] [vx,chassis]
[ẏ] = [0 0 0] [y] + [sin θ    cos θ   0] [vy,chassis]
[θ̇]   [0 0 0] [θ]   [  0        0     1] [ωchassis  ]

8.6.2 Control
This control-affine model is fully actuated but nonlinear in the chassis frame. However,
we can apply linear control theory to the error dynamics in the global frame instead.
Note how equation (8.5)’s state vector contains the global pose and its input vector
contains the global linear and angular velocities.

[ẋ]   [0 0 0] [x]   [1 0 0] [vx,global]
[ẏ] = [0 0 0] [y] + [0 1 0] [vy,global]                        (8.5)
[θ̇]   [0 0 0] [θ]   [0 0 1] [ωglobal  ]

      [vx,global]   [cos θ   −sin θ   0] [vx,chassis]
where [vy,global] = [sin θ    cos θ   0] [vy,chassis]          (8.6)
      [ωglobal  ]   [  0        0     1] [ωchassis  ]

We can control the model in equation (8.5) with an LQR, which will have three inde-
pendent proportional controllers. Then, we can convert the global velocity commands
to chassis velocity commands with equation (8.6) and convert the chassis velocity com-
mands to wheel speed commands with inverse kinematics. LQRs on each wheel can
track the wheel speed commands.
Note that the full control law is nonlinear because the kinematics contain a rotation
matrix for transforming from the chassis frame to the global frame. However, the
nonlinear part has been abstracted away from the tunable linear control laws.
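
Here's a minimal sketch (an illustrative addition) of converting the LQR's global velocity commands to chassis velocity commands by inverting the rotation in equation (8.6).

#include <cmath>

#include <Eigen/Core>

/**
 * Converts global-frame velocity commands [vx, vy, ω] to chassis-frame
 * commands by rotating the translational components by −θ.
 */
Eigen::Vector3d GlobalToChassisSpeeds(const Eigen::Vector3d& globalSpeeds,
                                      double theta) {
  Eigen::Matrix3d rotation;
  rotation << std::cos(theta), std::sin(theta), 0.0,
             -std::sin(theta), std::cos(theta), 0.0,
              0.0,             0.0,             1.0;
  return rotation * globalSpeeds;
}
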

8.6.3 Holonomic vs nonholonomic control


Drivetrains that are unable to exercise all possible degrees of freedom (e.g., moving
sideways with respect to the chassis) are nonholonomic. An LQR on each degree of
freedom is ideal for holonomic drivetrains, but not for nonholonomic. Section 8.7 will
use the differential drive as a motivating example for various nonholonomic controllers.

R Nonholonomic controllers should not be used for holonomic drivetrains. They


make different assumptions about the drivetrain dynamics that yield suboptimal
results compared to holonomic controllers.

8.7 Differential drive


This drivetrain consists of two DC brushed motors per side which are chained together
on their respective sides and drive wheels which are assumed to be massless.

Figure 8.1: Differential drive dimensions
Figure 8.2: Differential drive coordinate frame

8.7.1 Velocity subspace state-space model


By equations (12.35) and (12.36),

v̇l = (1/m + rb²/J)(C1 vl + C2 Vl) + (1/m − rb²/J)(C3 vr + C4 Vr)
v̇r = (1/m − rb²/J)(C1 vl + C2 Vl) + (1/m + rb²/J)(C3 vr + C4 Vr)

Regroup the terms into states vl and vr and inputs Vl and Vr.

v̇l = (1/m + rb²/J)C1 vl + (1/m + rb²/J)C2 Vl + (1/m − rb²/J)C3 vr + (1/m − rb²/J)C4 Vr
v̇r = (1/m − rb²/J)C1 vl + (1/m − rb²/J)C2 Vl + (1/m + rb²/J)C3 vr + (1/m + rb²/J)C4 Vr

v̇l = (1/m + rb²/J)C1 vl + (1/m − rb²/J)C3 vr + (1/m + rb²/J)C2 Vl + (1/m − rb²/J)C4 Vr
v̇r = (1/m − rb²/J)C1 vl + (1/m + rb²/J)C3 vr + (1/m − rb²/J)C2 Vl + (1/m + rb²/J)C4 Vr

Factor out vl and vr into a column vector and Vl and Vr into a column vector.

[v̇l]   [(1/m + rb²/J)C1   (1/m − rb²/J)C3] [vl]   [(1/m + rb²/J)C2   (1/m − rb²/J)C4] [Vl]
[v̇r] = [(1/m − rb²/J)C1   (1/m + rb²/J)C3] [vr] + [(1/m − rb²/J)C2   (1/m + rb²/J)C4] [Vr]

Theorem 8.7.1 — Differential drive velocity state-space model.

ẋ = Ax + Bu

x = [vl] = [left velocity ]        u = [Vl] = [left voltage ]
    [vr]   [right velocity]            [Vr]   [right voltage]

A = [(1/m + rb²/J)C1   (1/m − rb²/J)C3]                        (8.7)
    [(1/m − rb²/J)C1   (1/m + rb²/J)C3]

B = [(1/m + rb²/J)C2   (1/m − rb²/J)C4]                        (8.8)
    [(1/m − rb²/J)C2   (1/m + rb²/J)C4]

where C1 = −Gl²Kt/(Kv R r²), C2 = Gl Kt/(R r), C3 = −Gr²Kt/(Kv R r²), and
C4 = Gr Kt/(R r).

Simulation
Python Control will be used to discretize the model and simulate it. One of the frccon-
trol examples[2] creates and tests a controller for it. Figure 8.3 shows the closed-loop
system response.
Given the high inertia in drivetrains, it’s better to drive the reference with a motion
profile instead of a step input for reproducibility.
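
Here's a minimal sketch (an illustrative addition with made-up robot constants) of constructing the velocity-subspace A and B matrices from theorem 8.7.1; the result could then be discretized with the zero-order hold from section 7.3.

#include <Eigen/Core>

int main() {
  // Hypothetical robot and motor constants
  double m = 50.0;    // Robot mass, kg
  double J = 6.0;     // Robot moment of inertia, kg-m²
  double rb = 0.3;    // Robot radius (half the trackwidth), m
  double r = 0.0762;  // Wheel radius, m
  double G = 10.0;    // Gearbox reduction (same on both sides)
  double Kt = 0.02;   // Motor torque constant, N-m/A
  double Kv = 560.0;  // Motor velocity constant, rad/s/V (placeholder)
  double R = 0.1;     // Motor resistance, ohms

  double C1 = -G * G * Kt / (Kv * R * r * r);
  double C2 = G * Kt / (R * r);
  double C3 = C1;  // Same gearing assumed on the right side
  double C4 = C2;

  Eigen::Matrix2d A;
  A << (1.0 / m + rb * rb / J) * C1, (1.0 / m - rb * rb / J) * C3,
       (1.0 / m - rb * rb / J) * C1, (1.0 / m + rb * rb / J) * C3;

  Eigen::Matrix2d B;
  B << (1.0 / m + rb * rb / J) * C2, (1.0 / m - rb * rb / J) * C4,
       (1.0 / m - rb * rb / J) * C2, (1.0 / m + rb * rb / J) * C4;
}
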

8.7.2 Heading state-space model


We can control heading by augmenting the state with that. The change in heading is
defined as

θ̇ = (vr − vl)/(2rb) = vr/(2rb) − vl/(2rb)

This gives the following linear model.

[2] https://github.com/calcmogul/frccontrol/blob/main/examples/differential_drive.py

Figure 8.3: Drivetrain response

Theorem 8.7.2 — Differential drive heading state-space model.

ẋ = Ax + Bu

x = [θ ]   [heading       ]        u = [Vl] = [left voltage ]
    [vl] = [left velocity ]            [Vr]   [right voltage]
    [vr]   [right velocity]

A = [0   −1/(2rb)           1/(2rb)         ]
    [0   (1/m + rb²/J)C1    (1/m − rb²/J)C3 ]                  (8.9)
    [0   (1/m − rb²/J)C1    (1/m + rb²/J)C3 ]

B = [0                  0               ]
    [(1/m + rb²/J)C2    (1/m − rb²/J)C4 ]                      (8.10)
    [(1/m − rb²/J)C2    (1/m + rb²/J)C4 ]

where C1 = −Gl²Kt/(Kv R r²), C2 = Gl Kt/(R r), C3 = −Gr²Kt/(Kv R r²), and
C4 = Gr Kt/(R r). The constants C1 through C4 are from the derivation in section
12.6.

The velocity states are required to make the heading controllable.

8.7.3 Linear time-varying model


We can control the drivetrain’s global pose (x, y, θ) by augmenting the state with x and
y. The change in global pose is defined by these three equations.
ẋ = ((vl + vr)/2) cos θ = (vr/2) cos θ + (vl/2) cos θ
ẏ = ((vl + vr)/2) sin θ = (vr/2) sin θ + (vl/2) sin θ
θ̇ = (vr − vl)/(2rb) = vr/(2rb) − vl/(2rb)

This augmented model is a nonlinear vector function where x = [x  y  θ  vl  vr]ᵀ
and u = [Vl  Vr]ᵀ.
f(x, u) =
[(vr/2) cos θ + (vl/2) cos θ                                                        ]
[(vr/2) sin θ + (vl/2) sin θ                                                        ]
[vr/(2rb) − vl/(2rb)                                                                ]   (8.11)
[(1/m + rb²/J)C1 vl + (1/m − rb²/J)C3 vr + (1/m + rb²/J)C2 Vl + (1/m − rb²/J)C4 Vr  ]
[(1/m − rb²/J)C1 vl + (1/m + rb²/J)C3 vr + (1/m − rb²/J)C2 Vl + (1/m + rb²/J)C4 Vr  ]

As mentioned in section 8.2, one can approximate a nonlinear system via linearizations


around points of interest in the state-space and design controllers for those linearized
subspaces. If we sample linearization points progressively closer together, we converge
on a control policy for the original nonlinear system. Since the linear plant being con-
trolled varies with time, its controller is called a linear time-varying (LTV) controller.
If we use LQRs for the linearized subspaces, the nonlinear control policy will also be
locally optimal. We’ll be taking this approach with a differential drive. To create an
LQR, we need to linearize equation (8.11).
 
              [0  0  −((vl + vr)/2) sin θ   (1/2) cos θ        (1/2) cos θ      ]
              [0  0   ((vl + vr)/2) cos θ   (1/2) sin θ        (1/2) sin θ      ]
∂f(x, u)/∂x = [0  0   0                     −1/(2rb)           1/(2rb)          ]
              [0  0   0                     (1/m + rb²/J)C1    (1/m − rb²/J)C3  ]
              [0  0   0                     (1/m − rb²/J)C1    (1/m + rb²/J)C3  ]
 
              [0                  0               ]
              [0                  0               ]
∂f(x, u)/∂u = [0                  0               ]
              [(1/m + rb²/J)C2    (1/m − rb²/J)C4 ]
              [(1/m − rb²/J)C2    (1/m + rb²/J)C4 ]

Therefore,

Theorem 8.7.3 — Linear time-varying differential drive state-space model.

ẋ = Ax + Bu

x = [x ]   [x position    ]        u = [Vl] = [left voltage ]
    [y ]   [y position    ]            [Vr]   [right voltage]
    [θ ] = [heading       ]
    [vl]   [left velocity ]
    [vr]   [right velocity]

A = [0  0  −vs   (1/2)c             (1/2)c          ]
    [0  0   vc   (1/2)s             (1/2)s          ]
    [0  0   0    −1/(2rb)           1/(2rb)         ]          (8.12)
    [0  0   0    (1/m + rb²/J)C1    (1/m − rb²/J)C3 ]
    [0  0   0    (1/m − rb²/J)C1    (1/m + rb²/J)C3 ]

B = [0                  0               ]
    [0                  0               ]
    [0                  0               ]                      (8.13)
    [(1/m + rb²/J)C2    (1/m − rb²/J)C4 ]
    [(1/m − rb²/J)C2    (1/m + rb²/J)C4 ]

where v = (vl + vr)/2, c = cos θ, s = sin θ, C1 = −Gl²Kt/(Kv R r²), C2 = Gl Kt/(R r),
C3 = −Gr²Kt/(Kv R r²), and C4 = Gr Kt/(R r). The constants C1 through C4 are from
the derivation in section 12.6.

We can also use this in an extended Kalman filter as is since the measurement model
(y = Cx + Du) is linear.

8.7.4 Improving model accuracy


Figures 8.4 and 8.5 demonstrate the tracking behavior of the linearized differential
drive controller.

Figure 8.4: Linear time-varying differential drive controller x-y plot (first-order)
Figure 8.5: Linear time-varying differential drive controller response (first-order)

The linearized differential drive model doesn’t track well because the first-order lin-
earization of A doesn’t capture the full heading dynamics, making the model update
inaccurate. This linearization inaccuracy is evident in the Hessian matrix (second par-
tial derivative with respect to the state vector) being nonzero.
 
                [0  0  −((vl + vr)/2) cos θ   0  0]
                [0  0  −((vl + vr)/2) sin θ   0  0]
∂²f(x, u)/∂x² = [0  0   0                     0  0]
                [0  0   0                     0  0]
                [0  0   0                     0  0]

The second-order Taylor series expansion of the model around x0 would be


f(x, u0) ≈ f(x0, u0) + (∂f(x, u)/∂x)(x − x0) + (1/2)(∂²f(x, u)/∂x²)(x − x0)²

To include higher-order dynamics in the linearized differential drive model integration,


we’ll apply the Dormand-Prince integration method (RKDP) from theorem 7.9.3 to
equation (8.11).

Figures 8.6 and 8.7 show a simulation using RKDP instead of the first-order model.

Figure 8.6: Linear time-varying differential drive controller (global reference frame formulation) x-y plot
Figure 8.7: Linear time-varying differential drive controller (global reference frame formulation) response

8.7.5 Cross track error controller


Figures 8.6 and 8.7 show the tracking performance of the linearized differential drive
controller for a given trajectory. The performance-effort trade-off can be tuned rather
intuitively via the Q and R gains. However, if the x and y error cost are too high, the
x and y components of the controller will fight each other, and it will take longer to
converge to the path. This can be fixed by applying a clockwise rotation matrix to the
global tracking error to transform it into the robot’s coordinate frame.
[ex]R   [ cos θ   sin θ   0] [ex]G
[ey]  = [−sin θ   cos θ   0] [ey]
[eθ]    [   0       0     1] [eθ]

where the superscript R represents the robot's coordinate frame and the superscript
G represents the global coordinate frame.
With this transformation, the x and y error cost in LQR penalize the error ahead of the
robot and cross-track error respectively instead of global pose error. Since the cross-
track error is always measured from the robot’s coordinate frame, the model used to

compute the LQR should be linearized around θ = 0 at all times.


 
0 0 − vl +v 2
r
sin 0 1
2 cos 0
1
2 cos 0
0 0 vl +vr cos 0 1 1 
 2 2 sin 0 2 sin 0 
0 0 0 − 1 1 
A=   2r   2r  

b b
rb2 rb2
0 0 0 1
+ J  1 C 1
− J  3C 
 m 2
 m
2
r r
m − J
1 1
0 0 0 b
C1 m + J
b
C3
 1 1

0 0 0 2 2
0 0 vl +vr 0 0 
 2 
0 0 0 − 1 1 
A=   2r b  2r b  
rb2 rb2 
0 0 0 1
+ J  1 C 1
− J  3C
 m  m 
rb2 rb2
0 0 0 1
m − J C 1
1
m + J C 3

Theorem 8.7.4 — Linear time-varying differential drive controller. Let the
differential drive dynamics be of the form ẋ = f(x) + Bu where

x = [x ]   [x position    ]        u = [Vl] = [left voltage ]
    [y ]   [y position    ]            [Vr]   [right voltage]
    [θ ] = [heading       ]
    [vl]   [left velocity ]
    [vr]   [right velocity]

A = ∂f(x)/∂x evaluated at θ = 0

  = [0  0  0   1/2                1/2              ]
    [0  0  v   0                  0                ]
    [0  0  0   −1/(2rb)           1/(2rb)          ]           (8.14)
    [0  0  0   (1/m + rb²/J)C1    (1/m − rb²/J)C3  ]
    [0  0  0   (1/m − rb²/J)C1    (1/m + rb²/J)C3  ]

B = [0                  0               ]
    [0                  0               ]
    [0                  0               ]                      (8.15)
    [(1/m + rb²/J)C2    (1/m − rb²/J)C4 ]
    [(1/m − rb²/J)C2    (1/m + rb²/J)C4 ]

where v = (vl + vr)/2, C1 = −Gl²Kt/(Kv R r²), C2 = Gl Kt/(R r), C3 = −Gr²Kt/(Kv R r²),
and C4 = Gr Kt/(R r). The constants C1 through C4 are from the derivation in section
12.6.

The linear time-varying differential drive controller is

u = K [ [cos θ   sin θ ]   02×3 ] (r − x)                      (8.16)
      [ [−sin θ  cos θ ]        ]
      [  03×2              I3×3 ]

At each timestep, the LQR controller gain K is computed for the (A, B) pair
evaluated at the current state.

With the model in theorem 8.7.4, y is uncontrollable at v = 0 because the row corre-
sponding to y becomes the zero vector. This means the state dynamics and inputs can
no longer affect y. This is obvious given that nonholonomic drivetrains can’t move side-
ways. Some DARE solvers throw errors in this case, but one can avoid it by linearizing

the model around a slightly nonzero velocity instead.
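
Here's a minimal sketch of applying the control law in equation (8.16) (an illustrative addition); K is assumed to come from an LQR computed for the (A, B) pair linearized at the current state.

#include <cmath>

#include <Eigen/Core>

/**
 * Applies the LTV differential drive control law u = K T(θ) (r − x), where
 * T(θ) rotates the global x-y error into the robot's coordinate frame.
 */
Eigen::Vector2d LTVDiffDriveControl(const Eigen::Matrix<double, 2, 5>& K,
                                    const Eigen::Matrix<double, 5, 1>& r,
                                    const Eigen::Matrix<double, 5, 1>& x) {
  double theta = x(2);

  Eigen::Matrix<double, 5, 5> inRobotFrame =
      Eigen::Matrix<double, 5, 5>::Identity();
  inRobotFrame(0, 0) = std::cos(theta);
  inRobotFrame(0, 1) = std::sin(theta);
  inRobotFrame(1, 0) = -std::sin(theta);
  inRobotFrame(1, 1) = std::cos(theta);

  return K * inRobotFrame * (r - x);
}
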


The controller in theorem 8.7.4 results in figures 8.8 and 8.9, which show slightly better
tracking performance than the previous formulation.

Figure 8.8: Linear time-varying differential drive controller x-y plot
Figure 8.9: Linear time-varying differential drive controller response

8.7.6 Nonlinear observer design


Encoder position augmentation
Estimation of the global pose can be significantly improved if encoder position mea-
surements are used instead of velocity measurements. By augmenting the plant with
the line integral of each wheel’s velocity over time, we can provide a mapping from
model states to position measurements. We can augment the linear subspace of the
model as follows.
Augment the matrix equation with position states xl and xr , which have the model
equations ẋl = vl and ẋr = vr . The matrix elements corresponding to vl in the first
equation and vr in the second equation will be 1, and the others will be 0 since they
don’t appear, so ẋl = 1vl + 0vr + 0xl + 0xr + 0Vl + 0Vr and ẋr = 0vl + 1vr +
0xl + 0xr + 0Vl + 0Vr . The existing rows will have zeroes inserted where xl and xr
are multiplied in.

[ẋl]   [1  0] [vl]   [0  0] [Vl]
[ẋr] = [0  1] [vr] + [0  0] [Vr]
This produces the following linear subspace over x = [vl  vr  xl  xr]ᵀ.

A = [(1/m + rb²/J)C1   (1/m − rb²/J)C3   0  0]
    [(1/m − rb²/J)C1   (1/m + rb²/J)C3   0  0]                 (8.17)
    [1                 0                 0  0]
    [0                 1                 0  0]

B = [(1/m + rb²/J)C2   (1/m − rb²/J)C4]
    [(1/m − rb²/J)C2   (1/m + rb²/J)C4]                        (8.18)
    [0                 0              ]
    [0                 0              ]

 T
The measurement model for the complete nonlinear model is now y = θ xl xr
 T
instead of y = θ vl vr .
U error estimation
As per subsection 6.7.2, we will now augment the model so uerror states are added to
the control inputs.
The plant and observer augmentations should be performed before the model is dis-
cretized. After the controller gain is computed with the unaugmented discrete model,
the controller may be augmented. Therefore, the plant and observer augmentations
assume a continuous model and the controller augmentation assumes a discrete con-
troller.
The three uerror states we’ll be adding are uerror,l , uerror,r , and uerror,heading for
left voltage error, right voltage error, and heading error respectively. The left and right
wheel positions are filtered encoder positions and are not adjusted for heading error.
The turning angle computed from the left and right wheel positions is adjusted by the
gyroscope heading. The heading uerror state is the heading error between what the
wheel positions imply and the gyroscope measurement.

The full state is thus

$$x = \begin{bmatrix} x \\ y \\ \theta \\ v_l \\ v_r \\ x_l \\ x_r \\ u_{error,l} \\ u_{error,r} \\ u_{error,heading} \end{bmatrix}$$

The complete nonlinear model is as follows. Let $v = \frac{v_l + v_r}{2}$. The three $u_{error}$ states augment the linear subspace, so the nonlinear pose dynamics are the same.

$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} v\cos\theta \\ v\sin\theta \\ \frac{v_r}{2r_b} - \frac{v_l}{2r_b} \end{bmatrix} \tag{8.19}$$

The left and right voltage error states should be mapped to the corresponding velocity
states, so the system matrix should be augmented with B.
The heading uerror is measuring counterclockwise encoder understeer relative to the
gyroscope heading, so it should add to the left position and subtract from the right po-
sition for clockwise correction of encoder positions. That corresponds to the following
input mapping vector.

$$B_\theta = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}$$

Now we'll augment the linear system matrix horizontally to accommodate the uerror
states.

$$\begin{bmatrix} \dot{v}_l \\ \dot{v}_r \\ \dot{x}_l \\ \dot{x}_r \end{bmatrix} = \begin{bmatrix} A & B & B_\theta \end{bmatrix} \begin{bmatrix} v_l \\ v_r \\ x_l \\ x_r \\ u_{error,l} \\ u_{error,r} \\ u_{error,heading} \end{bmatrix} + Bu \tag{8.20}$$

A and B are the linear subspace from equations (8.17) and (8.18).


The uerror states have no dynamics. The observer selects them to minimize the difference between the expected and actual measurements.

$$\begin{bmatrix} \dot{u}_{error,l} \\ \dot{u}_{error,r} \\ \dot{u}_{error,heading} \end{bmatrix} = 0_{3\times1} \tag{8.21}$$

The controller is augmented as follows.

$$K_{error} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \qquad K_{aug} = \begin{bmatrix} K & K_{error} \end{bmatrix} \qquad r_{aug} = \begin{bmatrix} r \\ 0 \\ 0 \\ 0 \end{bmatrix} \tag{8.22}$$

This controller augmentation compensates for unmodeled dynamics like:


1. Understeer caused by wheel friction inherent in skid-steer robots
2. Battery voltage drop under load, which reduces the available control authority

R The process noise for the voltage error states should be how much the voltage
can be expected to drop. The heading error state should be the encoder model
uncertainty.

8.8 Ramsete unicycle controller


Ramsete is a nonlinear time-varying feedback controller for unicycle models that drives
the model to a desired pose along a two-dimensional trajectory. Why would we need
a nonlinear control law in addition to the linear ones we have used so far? If we use
the original approach with an LQR controller for left and right position and velocity
states, the controller only deals with the local pose. If the robot deviates from the path,
there is no way for the controller to correct and the robot may not reach the desired
global pose. This is due to multiple endpoints existing for the robot which have the
same encoder path arc lengths.
Instead of using wheel path arc lengths (which are in the robot’s local coordinate frame),
nonlinear controllers like pure pursuit and Ramsete use global pose. The controller uses
this extra information to guide a linear reference tracker like an LQR controller back
in by adjusting the references of the LQR controller.

The paper Control of Wheeled Mobile Robots: An Experimental Overview describes a


nonlinear controller for a wheeled vehicle with unicycle-like kinematics; a global pose
consisting of x, y, and θ; and a desired pose consisting of xd , yd , and θd [12]. We’ll
call it Ramsete because that’s the acronym for the title of the book it came from in
Italian (“Robotica Articolata e Mobile per i SErvizi e le TEcnologie”).

8.8.1 Velocity and turning rate command derivation


The state tracking error e in the vehicle's coordinate frame is defined as

$$\begin{bmatrix} e_x \\ e_y \\ e_\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_d - x \\ y_d - y \\ \theta_d - \theta \end{bmatrix}$$

where ex is the tracking error in x, ey is the tracking error in y, and eθ is the tracking
error in θ. The 3 × 3 matrix is a rotation matrix that transforms the error in the pose
(represented by xd − x, yd − y, and θd − θ) from the global coordinate frame into the
vehicle’s coordinate frame.
We will use the following control laws u1 and u2 for velocity and turning rate respec-
tively.
$$\begin{aligned} u_1 &= -k_1 e_x \\ u_2 &= -k_3 e_\theta - k_2 v_d \operatorname{sinc}(e_\theta) e_y \end{aligned} \tag{8.23}$$

where $k_1$, $k_2$, and $k_3$ are time-varying gains and $\operatorname{sinc}(e_\theta)$ is defined as $\frac{\sin e_\theta}{e_\theta}$. This
choice of control law may seem arbitrary, and that’s because it is. The only requirement
on our choice is that there exist an associated Lyapunov candidate function that proves
the control law is globally asymptotically stable. We’ll provide a sketch of a proof in
theorem 8.8.1.
Our velocity and turning rate commands for the vehicle will use the following nonlinear
transformation of the inputs.

v = vd cos eθ − u1
ω = ωd − u2

Substituting the control laws u1 and u2 into these equations gives

v = vd cos eθ + k1 ex
ω = k3 eθ + ωd + k2 vd sinc(eθ )ey

Theorem 8.8.1 Assuming that vd and ωd are bounded with bounded derivatives,
and that vd (t) → 0 or ωd (t) → 0 when t → ∞, the control laws in equation (8.23)
globally asymptotically stabilize the origin e = 0.
Proof:
To prove convergence, the paper previously mentioned uses the following Lyapunov
function.
$$V = \frac{k_2}{2}(e_x^2 + e_y^2) + \frac{e_\theta^2}{2}$$
where k2 is a tuning constant, ex is the tracking error in x, ey is the tracking error
in y, and eθ is the tracking error in θ.
The time derivative along the solutions of the closed-loop system is nonincreasing
since
$$\dot{V} = -k_1 k_2 e_x^2 - k_3 e_\theta^2 \le 0$$

Thus, ∥e(t)∥ is bounded, V̇ (t) is uniformly continuous, and V (t) tends to some
limit value. Using the Barbalat lemma, V̇ (t) tends to zero [12].

8.8.2 Nonlinear controller equations


Let $k_2 = b$ and $k = k_1(v_d, \omega_d) = k_3(v_d, \omega_d) = 2\zeta\sqrt{\omega_d^2 + bv_d^2}$.

Theorem 8.8.2 — Ramsete unicycle controller.


    
$$\begin{bmatrix} e_x \\ e_y \\ e_\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_d - x \\ y_d - y \\ \theta_d - \theta \end{bmatrix} \tag{8.24}$$
$$v = v_d \cos e_\theta + k e_x \tag{8.25}$$
$$\omega = \omega_d + k e_\theta + b v_d \operatorname{sinc}(e_\theta) e_y \tag{8.26}$$
$$k = 2\zeta\sqrt{\omega_d^2 + b v_d^2} \tag{8.27}$$
$$\operatorname{sinc}(e_\theta) = \frac{\sin e_\theta}{e_\theta} \tag{8.28}$$

v velocity command vd desired velocity


ω turning rate command ωd desired turning rate
x actual x position in global coordinate frame xd desired x position
y actual y position in global coordinate frame yd desired y position
θ actual angle in global coordinate frame θd desired angle
b and ζ are tuning parameters where $b > 0\,\mathrm{rad}^2/\mathrm{m}^2$ and $\zeta \in (0, 1)\,\mathrm{rad}^{-1}$. Larger values
of b make convergence more aggressive (like a proportional term), and larger values
of ζ provide more damping.

v and ω should be the references for a drivetrain reference tracker. A good choice would
be using equations (11.1) and (11.2) to convert v and ω to left and right velocities, then
applying LQR to the model in theorem 8.7.1.
x, y, and θ are obtained via a pose estimator (see chapter 10 for how to implement
one). The desired velocity, turning rate, and pose can be varied over time according to
a desired trajectory.
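A minimal sketch of the Ramsete control law from theorem 8.8.2 is shown below. The values b = 2.0 and ζ = 0.7 are example tuning parameters, not prescribed ones, and the handling of sinc() near zero is an implementation detail assumed here.

import math


def ramsete(pose, desired_pose, vd, omegad, b=2.0, zeta=0.7):
    """Returns (v, omega) commands per theorem 8.8.2.

    pose and desired_pose are (x, y, theta) tuples in the global frame.
    """
    x, y, theta = pose
    xd, yd, thetad = desired_pose

    # Equation (8.24): rotate the global pose error into the vehicle frame
    e_x = math.cos(theta) * (xd - x) + math.sin(theta) * (yd - y)
    e_y = -math.sin(theta) * (xd - x) + math.cos(theta) * (yd - y)
    e_theta = thetad - theta

    # Equations (8.27) and (8.28)
    k = 2 * zeta * math.sqrt(omegad**2 + b * vd**2)
    sinc = 1.0 if abs(e_theta) < 1e-9 else math.sin(e_theta) / e_theta

    # Equations (8.25) and (8.26)
    v = vd * math.cos(e_theta) + k * e_x
    omega = omegad + k * e_theta + b * vd * sinc * e_y
    return v, omega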
Figures 8.10 and 8.11 show the tracking performance of Ramsete for a given trajectory.

8.8.3 Dimensional analysis


[v] denotes the dimension of v. A means angle units, L means length units, and T
means time units.

Figure 8.10: Ramsete nonlinear controller x-y plot
Figure 8.11: Ramsete nonlinear controller response

Units of sinc(eθ )
First, we’ll find the units of sinc(eθ ) for later use.
$$\operatorname{sinc}(e_\theta) = \frac{\sin(e_\theta)}{e_\theta}$$
$$[\operatorname{sinc}(e_\theta)] = \frac{1}{A}$$
$$[\operatorname{sinc}(e_\theta)] = A^{-1}$$

Ramsete velocity command equation


Start from equation (8.25).
v = vd cos eθ + kex
[v] = [vd ][cos eθ ] + [k][ex ]
LT −1 = LT −1 · 1 + [k]L
LT −1 = LT −1 + [k]L
LT −1 = [k]L
T −1 = [k]
[k] = T −1

Therefore, the units of k are T −1 .



Ramsete angular velocity command equation


Start from equation (8.26).

$$\begin{aligned} \omega &= \omega_d + k e_\theta + b v_d \operatorname{sinc}(e_\theta) e_y \\ [\omega] &= [\omega_d] + [k][e_\theta] + [b][v_d][\operatorname{sinc}(e_\theta)][e_y] \\ AT^{-1} &= AT^{-1} + [k]A + [b] \cdot LT^{-1} \cdot A^{-1} \cdot L \\ AT^{-1} &= AT^{-1} + [k]A + [b]A^{-1}L^2T^{-1} \\ AT^{-1} &= [k]A + [b]A^{-1}L^2T^{-1} \end{aligned} \tag{8.29}$$

First, we’ll find the units of k.


The left-hand side and first term of equation (8.29) must have equivalent units.

AT −1 = [k]A
T −1 = [k]
[k] = T −1

This is consistent with the result from the Ramsete velocity command equation.
Next, we’ll find the units of b.
The left-hand side and second term of equation (8.29) must have equivalent units.

AT −1 = [b]A−1 L2 T −1
A2 L−2 = [b]
[b] = A2 L−2

Ramsete k equation
Start from equation (8.27).
$$\begin{aligned} k &= 2\zeta\sqrt{\omega_d^2 + bv_d^2} \\ [k] &= [\zeta]\sqrt{[\omega_d]^2 + [b][v_d]^2} \\ T^{-1} &= [\zeta]\sqrt{(AT^{-1})^2 + [b](LT^{-1})^2} \\ T^{-1} &= [\zeta]\sqrt{A^2T^{-2} + [b]L^2T^{-2}} \end{aligned} \tag{8.30}$$

First, we’ll find the units of b.



The additive terms under the square root must have equivalent units.

(AT −1 )2 = [b](LT −1 )2
A2 T −2 = [b]L2 T −2
A2 L−2 = [b]
[b] = A2 L−2

This is consistent with the result from the angular velocity command equation, so we
can use it when determining the units of ζ.
Next, we’ll find the units of ζ.
Substitute [b] into equation (8.30).
$$\begin{aligned} T^{-1} &= [\zeta]\sqrt{A^2T^{-2} + [b]L^2T^{-2}} \\ T^{-1} &= [\zeta]\sqrt{A^2T^{-2} + A^2L^{-2} \cdot L^2T^{-2}} \\ T^{-1} &= [\zeta]\sqrt{A^2T^{-2} + A^2T^{-2}} \\ T^{-1} &= [\zeta]\sqrt{A^2T^{-2}} \\ T^{-1} &= [\zeta]AT^{-1} \\ A^{-1} &= [\zeta] \\ [\zeta] &= A^{-1} \end{aligned}$$

Units of tunable parameters b and ζ

$$[b] = A^2L^{-2} \tag{8.31}$$
$$[\zeta] = A^{-1} \tag{8.32}$$

8.9 Linear time-varying unicycle controller


One can also create a linear time-varying controller with a cascaded control architecture
like Ramsete. This section will derive a locally optimal replacement for Ramsete.
The change in global pose for a unicycle is defined by the following three equations.

ẋ = v cos θ
ẏ = v sin θ
θ̇ = ω
Here's the model as a vector function where $x = \begin{bmatrix} x & y & \theta \end{bmatrix}^T$ and $u = \begin{bmatrix} v & \omega \end{bmatrix}^T$.

$$f(x, u) = \begin{bmatrix} v\cos\theta \\ v\sin\theta \\ \omega \end{bmatrix} \tag{8.33}$$

To create an LQR, we need to linearize this.

$$\frac{\partial f(x, u)}{\partial x} = \begin{bmatrix} 0 & 0 & -v\sin\theta \\ 0 & 0 & v\cos\theta \\ 0 & 0 & 0 \end{bmatrix} \qquad \frac{\partial f(x, u)}{\partial u} = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix}$$

We’re going to make a cross-track error controller, so we’ll apply a clockwise rotation
matrix to the global tracking error to transform it into the robot’s coordinate frame.
Since the cross-track error is always measured from the robot’s coordinate frame, the
model used to compute the LQR should be linearized around θ = 0 at all times.
   
$$A = \begin{bmatrix} 0 & 0 & -v\sin 0 \\ 0 & 0 & v\cos 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} \cos 0 & 0 \\ \sin 0 & 0 \\ 0 & 1 \end{bmatrix}$$
$$A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & v \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}$$

Therefore,

Theorem 8.9.1 — Linear time-varying unicycle controller. Let the unicycle dynamics be $\dot{x} = f(x, u) = \begin{bmatrix} v\cos\theta & v\sin\theta & \omega \end{bmatrix}^T$ where

$$x = \begin{bmatrix} x \\ y \\ \theta \end{bmatrix} = \begin{bmatrix} \text{x position} \\ \text{y position} \\ \text{heading} \end{bmatrix} \qquad u = \begin{bmatrix} v \\ \omega \end{bmatrix} = \begin{bmatrix} \text{linear velocity} \\ \text{angular velocity} \end{bmatrix}$$

$$A = \left.\frac{\partial f(x, u)}{\partial x}\right|_{\theta=0} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & v \\ 0 & 0 & 0 \end{bmatrix} \qquad B = \left.\frac{\partial f(x, u)}{\partial u}\right|_{\theta=0} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}$$

The linear time-varying unicycle controller is

$$u = K \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} (r - x) \tag{8.34}$$

At each timestep, the LQR controller gain K is computed for the (A, B) pair eval-
uated at the current input.

With the model in theorem 8.9.1, y is uncontrollable at v = 0 because nonholonomic


drivetrains are unable to move sideways. Some DARE solvers throw errors in this case,
but one can avoid it by linearizing the model around a slightly nonzero velocity instead.
The controller in theorem 8.9.1 results in figures 8.12 and 8.13.

Figure 8.12: Linear time-varying unicycle controller x-y plot
Figure 8.13: Linear time-varying unicycle controller response

8.10 Further reading


To learn more about nonlinear control, we recommend watching the underactuated
robotics lectures by MIT and reading their lecture notes [17].

“Underactuated Robotics”
Russ Tedrake
https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg/videos

The books Nonlinear Dynamics and Chaos by Steven Strogatz and Applied Nonlinear
Control by Jean-Jacques Slotine are also good references.
III
Estimation and
localization

9 Stochastic control theory . . . . . . . . . . . . 135

10 Pose estimation . . . . . . . . . . . . . . . . . . . . . 177


A sample of the Santa Cruz banana distribution [11]

9. Stochastic control theory

Stochastic control theory is a subfield of control theory that deals with the existence
of uncertainty either in observations or in noise that drives the evolution of a system.
We assign probability distributions to this random noise and aim to achieve a desired
control task despite the presence of this uncertainty.
Stochastic optimal control is concerned with doing this with minimum cost defined
by some cost functional, like we did with LQR earlier. First, we’ll cover the basics of
probability and how we represent linear stochastic systems in state-space representation.
Then, we’ll derive an optimal estimator using this knowledge, the Kalman filter, and
demonstrate creative applications of the Kalman filter theory.
This will be a math-heavy introduction, so we recommend reading Kalman and Bayesian
Filters in Python by Roger Labbe first [9].

9.1 Terminology
First, we should provide definitions for terms that have specific meanings in this field.
A causal system is one that uses only past information. A noncausal system also uses
information from the future. A filter is a causal system that filters information through a
probabilistic model to produce an estimate of a desired quantity that can’t be measured
directly. A smoother is a noncausal system, so it uses information from before and after
the current state to produce a better estimate.

9.2 State observers


State observers are used to estimate states which cannot be measured directly. This
can be due to noisy measurements or the state not being measurable (a hidden state).
This information can be used for localization, which is the process of using external
measurements to determine an agent’s pose,[1] or orientation in the world.
One type of state estimator is LQE. “LQE” stands for “Linear-Quadratic Estimator”.
Similar to LQR, it places the estimator poles such that it minimizes the sum of squares
of the estimation error. The Luenberger observer and Kalman filter are examples of
these, where the latter chooses the pole locations optimally based on the model and
measurement uncertainties.
Computer vision can also be used for localization. By extracting features from an image
taken by the agent’s camera, like a retroreflective target in FRC, and comparing them
to known dimensions, one can determine where the agent’s camera would have to be to
see that image. This can be used to correct our state estimate in the same way we do
with an encoder or gyroscope.

9.2.1 Luenberger observer


We’ll introduce the Luenberger observer first to demonstrate the general form of a state
estimator and some of their properties.

[1] An agent is a system-agnostic term for independent controlled actors like robots or aircraft.

Theorem 9.2.1 — Luenberger observer.

x̂˙ = Ax̂ + Bu + L(y − ŷ) (9.1)


ŷ = Cx̂ + Du (9.2)

x̂k+1 = Ax̂k + Buk + L(yk − ŷk ) (9.3)


ŷk = Cx̂k + Duk (9.4)

A system matrix x̂ state estimate vector


B input matrix u input vector
C output matrix y output vector
D feedthrough matrix ŷ output estimate vector
L estimator gain matrix

Matrix Rows × Columns Matrix Rows × Columns


A states × states x̂ states × 1
B states × inputs u inputs × 1
C outputs × states y outputs × 1
D outputs × inputs ŷ outputs × 1
L states × outputs

Table 9.1: Luenberger observer matrix dimensions

Variables denoted with a hat are estimates of the corresponding variable. For example,
x̂ is the estimate of the true state x.
Notice that a Luenberger observer has an extra term in the state evolution equation. This
term uses the difference between the estimated outputs and measured outputs to steer
the estimated state toward the true state. L approaching C+ trusts the measurements
more while L approaching 0 trusts the model more.
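As a sketch, one iteration of the discrete observer in equations (9.3) and (9.4) might look like the following, where the matrices are NumPy arrays with the dimensions from table 9.1.

import numpy as np


def luenberger_step(A, B, C, D, L, xhat, u, y):
    """Advances the state estimate by one timestep per equations (9.3) and (9.4)."""
    yhat = C @ xhat + D @ u
    return A @ xhat + B @ u + L @ (y - yhat)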

R Using an estimator forfeits the performance guarantees from earlier, but the

responses are still generally very good if the process and measurement noises are
small enough. See John Doyle’s paper Guaranteed Margins for LQG Regulators
for a proof.

A Luenberger observer combines the prediction and update steps of an estimator. To


run them separately, use the equations in theorem 9.2.2 instead.

Theorem 9.2.2 — Luenberger observer with separate predict/update.

Predict step
$$\hat{x}^-_{k+1} = A\hat{x}^-_k + Bu_k \tag{9.5}$$
Update step
$$\hat{x}^+_{k+1} = \hat{x}^-_{k+1} + A^{-1}L(y_k - \hat{y}_k) \tag{9.6}$$
$$\hat{y}_k = C\hat{x}^-_k \tag{9.7}$$

See appendix D.2.1 for a derivation.


Eigenvalues of closed-loop observer
The eigenvalues of the system matrix can be used to determine whether a state ob-
server’s estimate will converge to the true state.
Plugging equation (9.4) into equation (9.3) gives
x̂k+1 = Ax̂k + Buk + L(yk − ŷk )
x̂k+1 = Ax̂k + Buk + L(yk − (Cx̂k + Duk ))
x̂k+1 = Ax̂k + Buk + L(yk − Cx̂k − Duk )
Plugging in yk = Cxk + Duk gives
x̂k+1 = Ax̂k + Buk + L((Cxk + Duk ) − Cx̂k − Duk )
x̂k+1 = Ax̂k + Buk + L(Cxk + Duk − Cx̂k − Duk )
x̂k+1 = Ax̂k + Buk + L(Cxk − Cx̂k )
x̂k+1 = Ax̂k + Buk + LC(xk − x̂k )
Let ek = xk − x̂k be the error in the estimate x̂k .
x̂k+1 = Ax̂k + Buk + LCek

Subtracting this from xk+1 = Axk + Buk gives


xk+1 − x̂k+1 = Axk + Buk − (Ax̂k + Buk + LCek )

ek+1 = Axk + Buk − (Ax̂k + Buk + LCek )


ek+1 = Axk + Buk − Ax̂k − Buk − LCek
ek+1 = Axk − Ax̂k − LCek
ek+1 = A(xk − x̂k ) − LCek
ek+1 = Aek − LCek
ek+1 = (A − LC)ek (9.8)

For equation (9.8) to have a bounded output, the eigenvalues of A − LC must be


within the unit circle. These eigenvalues represent how fast the estimator converges
to the true state of the given model. A fast estimator converges quickly while a slow
estimator avoids amplifying noise in the measurements used to produce a state estimate.
The effect of noise can be seen if it is modeled stochastically as

x̂k+1 = Ax̂k + Buk + L((yk + νk ) − ŷk )

where νk is the measurement noise. Rearranging this equation yields

x̂k+1 = Ax̂k + Buk + L(yk − ŷk + νk )


x̂k+1 = Ax̂k + Buk + L(yk − ŷk ) + Lνk

As L increases, the measurement noise is amplified.
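Since the estimation error evolves as e_{k+1} = (A − LC)e_k, a quick numerical check of whether an observer gain is stable is to confirm that every eigenvalue of A − LC lies inside the unit circle. A sketch:

import numpy as np


def observer_converges(A, L, C):
    """Returns True if the error dynamics e_{k+1} = (A - LC) e_k decay to zero."""
    return bool(np.all(np.abs(np.linalg.eigvals(A - L @ C)) < 1))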

9.2.2 Separation principle


The separation principle for linear stochastic systems states that the optimal controller
and optimal observer for the stochastic system can be designed independently, and the
combination of a stable controller and a stable observer is itself stable.
This means that designing the optimal feedback controller for the stochastic system
can be decomposed into designing the optimal state observer, then feeding that into
the optimal controller for the deterministic system.
Consider the following state-space model.

xk+1 = Axk + Buk


yk = Cxk + Duk

We’ll use the following controller for state feedback.

uk = −Kx̂k

With the estimation error defined as ek = xk − x̂k , we get the observer dynamics
derived in equation (9.8).
ek+1 = (A − LC)ek

Also, after rearranging the error equation to be x̂k = xk − ek , the controller can be
rewritten as
uk = −K(xk − ek )

Substitute this into the model.

xk+1 = Axk + Buk


xk+1 = Axk − BK(xk − ek )
xk+1 = (A − BK)xk + BKek

Now, we can write the closed-loop dynamics as

$$\begin{bmatrix} x_{k+1} \\ e_{k+1} \end{bmatrix} = \begin{bmatrix} A - BK & BK \\ 0 & A - LC \end{bmatrix} \begin{bmatrix} x_k \\ e_k \end{bmatrix}$$

Since the closed-loop system matrix is triangular, the eigenvalues are those of A−BK
and A − LC. Therefore, the stability of the feedback controller and observer are
independent.
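One can verify the separation principle numerically: the eigenvalues of the combined closed-loop matrix above are the union of those of A − BK and A − LC. A sketch of that check is below.

import numpy as np


def closed_loop_eigenvalues(A, B, C, K, L):
    """Eigenvalues of the combined controller/observer dynamics."""
    top = np.hstack([A - B @ K, B @ K])
    bottom = np.hstack([np.zeros_like(A), A - L @ C])
    return np.linalg.eigvals(np.vstack([top, bottom]))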

9.3 Introduction to probability


Now we’ll begin establishing probability concepts we need to describe and manipulate
stochastic systems.

9.3.1 Random variables


A random variable is a variable whose values are the outcomes of a random phe-
nomenon, like dice rolls or noisy process measurements. As such, a random variable is
defined as a function that maps the outcomes of an unpredictable process to numerical
quantities. A particular output of this function is called a sample. The sample space is
the set of possible values taken by the random variable.
A probability density function (PDF) is a function of the random variable whose value
at a given sample (measured value) in the sample space (the range of possible measured
values) is the probability of that sample occurring. The area under the function over a
range gives the probability that the sample falls within that range. Let x be a random
variable, and let p(x) denote the probability density function of x. The probability that

Figure 9.1: Probability density function

the value of x will be in the interval x ∈ [x1 , x1 + dx] is p(x1 ) dx. In other words, the
probability is the area under the PDF within the region [x1 , x1 + dx] (see figure 9.1).
A probability of zero means that the sample will not occur and a probability of one
means that the sample will always occur. Probability density functions require that no
probabilities are negative and that the sum of all probabilities is 1. If the probabilities
sum to 1, that means one of those outcomes must happen. In other words,
$$p(x) \ge 0, \qquad \int_{-\infty}^{\infty} p(x)\,dx = 1$$

or given that the probability of a given sample is greater than or equal to zero, the sum
of probabilities for all possible input values is equal to one.

9.3.2 Expected value


Expected value or expectation is a weighted average of the values the PDF can produce
where the weight for each value is the probability of that value occurring. This can be
written mathematically as
$$\bar{x} = E[x] = \int_{-\infty}^{\infty} x\,p(x)\,dx$$

The expectation can be applied to random functions as well as random variables.


$$E[f(x)] = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx$$

The mean of a random variable is denoted by an overbar (e.g., x) pronounced x-bar.


The expectation of the difference between a random variable and its mean x − x con-
verges to zero. In other words, the expectation of a random variable is its mean. Here’s
a proof.
$$\begin{aligned} E[x - \bar{x}] &= \int_{-\infty}^{\infty} (x - \bar{x})\,p(x)\,dx \\ E[x - \bar{x}] &= \int_{-\infty}^{\infty} x\,p(x)\,dx - \int_{-\infty}^{\infty} \bar{x}\,p(x)\,dx \\ E[x - \bar{x}] &= \int_{-\infty}^{\infty} x\,p(x)\,dx - \bar{x}\int_{-\infty}^{\infty} p(x)\,dx \\ E[x - \bar{x}] &= \bar{x} - \bar{x} \cdot 1 \\ E[x - \bar{x}] &= 0 \end{aligned}$$

9.3.3 Variance
Informally, variance is a measure of how far the outcome of a random variable deviates
from its mean. Later, we will use variance to quantify how confident we are in the
estimate of a random variable; we can’t know the true value of that variable without
randomness, but we can give a bound on its randomness.
$$\operatorname{var}(x) = \sigma^2 = E[(x - \bar{x})^2] = \int_{-\infty}^{\infty} (x - \bar{x})^2\,p(x)\,dx$$

The standard deviation is the square root of the variance.

$$\operatorname{std}[x] = \sigma = \sqrt{\operatorname{var}(x)}$$
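For example, the sample statistics of a large set of draws approach these quantities. The snippet below is only an illustration; the distribution parameters are arbitrary.

import numpy as np

rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=5.0, scale=2.0, size=100_000)

print(np.mean(samples))  # ≈ 5.0 (the mean)
print(np.var(samples))   # ≈ 4.0 (the variance, σ²)
print(np.std(samples))   # ≈ 2.0 (the standard deviation, σ)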

9.3.4 Joint probability density functions


Probability density functions can also include more than one variable. Let x and y be
random variables. The joint probability density function p(x, y) defines the probability
p(x, y) dx dy, so that x and y are in the intervals x ∈ [x, x + dx], y ∈ [y, y + dy].
In other words, the probability is the volume under a region of the PDF manifold (see
figure 9.2 for an example of a joint PDF).

Figure 9.2: Joint probability density function



Joint probability density functions also require that no probabilities are negative and
that the sum of all probabilities is 1.
$$p(x, y) \ge 0, \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x, y)\,dx\,dy = 1$$

The expected values for joint PDFs are as follows.

$$E[x] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x\,p(x, y)\,dx\,dy$$
$$E[y] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,p(x, y)\,dx\,dy$$
$$E[f(x, y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,p(x, y)\,dx\,dy$$

The variance of a joint PDF measures how a variable correlates with itself (we'll cover variances with respect to other variables shortly).

$$\operatorname{var}(x) = \Sigma_{xx} = E[(x - \bar{x})^2] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})^2\,p(x, y)\,dx\,dy$$
$$\operatorname{var}(y) = \Sigma_{yy} = E[(y - \bar{y})^2] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (y - \bar{y})^2\,p(x, y)\,dx\,dy$$

9.3.5 Covariance
A covariance is a measurement of how a variable correlates with another. If they vary
in the same direction, the covariance increases. If they vary in opposite directions, the
covariance decreases.
$$\operatorname{cov}(x, y) = \Sigma_{xy} = E[(x - \bar{x})(y - \bar{y})] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})(y - \bar{y})\,p(x, y)\,dx\,dy$$

9.3.6 Correlation
Two random variables are correlated if the result of one random variable affects the
result of another. Correlation is defined as
$$\rho(x, y) = \frac{\Sigma_{xy}}{\sqrt{\Sigma_{xx}\Sigma_{yy}}}, \qquad |\rho(x, y)| \le 1$$

So two variables' correlation is defined as their covariance over the geometric mean of
their variances. Uncorrelated sources have a covariance of zero.

9.3.7 Independence
Two random variables are independent if the following relation is true.
p(x, y) = p(x) p(y)

This means that the values of x do not correlate with the values of y. That is, the
outcome of one random variable does not affect another’s outcome. If we assume
independence,
$$E[xy] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,p(x, y)\,dx\,dy$$
$$E[xy] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,p(x)\,p(y)\,dx\,dy$$
$$E[xy] = \int_{-\infty}^{\infty} x\,p(x)\,dx \int_{-\infty}^{\infty} y\,p(y)\,dy$$
$$E[xy] = E[x]E[y]$$
$$E[xy] = \bar{x}\bar{y}$$

$$\begin{aligned} \operatorname{cov}(x, y) &= E[(x - \bar{x})(y - \bar{y})] \\ \operatorname{cov}(x, y) &= E[(x - \bar{x})]\,E[(y - \bar{y})] \\ \operatorname{cov}(x, y) &= 0 \cdot 0 \end{aligned}$$

Therefore, the covariance Σxy is zero, as expected. Furthermore, ρ(x, y) = 0, which


means they are uncorrelated.

9.3.8 Marginal probability density functions


Given two random variables x and y whose joint distribution is known, the marginal
PDF p(x) expresses the probability of x averaged over information about y. In other
words, it’s the PDF of x when y is unknown. This is calculated by integrating the joint
PDF over y.

$$p(x) = \int_{-\infty}^{\infty} p(x, y)\,dy$$

9.3.9 Conditional probability density functions


Let us assume that we know the joint PDF p(x, y) and the exact value for y. The
conditional PDF gives the probability of x in the interval [x, x + dx] for the given
value y. The probability of x given y is denoted by p(x|y).

If p(x, y) is known, then we also know p(x, y = y ∗ ). However, note that the latter is
not the conditional density p(x|y ∗ ), instead
$$C(y^*) = \int_{-\infty}^{\infty} p(x, y = y^*)\,dx$$
$$p(x|y^*) = \frac{1}{C(y^*)}\,p(x, y = y^*)$$

The scale factor $\frac{1}{C(y^*)}$ is used to scale the area under the PDF to 1.

9.3.10 Bayes’s rule


Bayes’s rule is used to determine the probability of an event based on prior knowledge
of conditions related to the event.
p(x, y) = p(x|y) p(y) = p(y|x) p(x)

If x and y are independent, then p(x|y) = p(x), p(y|x) = p(y), and p(x, y) =
p(x) p(y).

9.3.11 Conditional expectation


The concept of expectation can also be applied to conditional PDFs. This allows us to
determine what the mean of a variable is given prior knowledge of other variables.
$$E[x|y] = \int_{-\infty}^{\infty} x\,p(x|y)\,dx = f(y), \qquad E[x|y] \ne E[x]$$
$$E[y|x] = \int_{-\infty}^{\infty} y\,p(y|x)\,dy = f(x), \qquad E[y|x] \ne E[y]$$

9.3.12 Conditional variances

$$\operatorname{var}(x|y) = E[(x - E[x|y])^2\,|\,y]$$
$$\operatorname{var}(x|y) = \int_{-\infty}^{\infty} (x - E[x|y])^2\,p(x|y)\,dx$$

9.3.13 Random vectors


Now we will extend the probability concepts discussed so far to vectors where each
element has a PDF.

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

The elements of x are scalar variables jointly distributed with a joint density p(x1 , . . . , xn ).
The expectation is
$$E[x] = \bar{x} = \int_{-\infty}^{\infty} x\,p(x)\,dx$$
$$E[x] = \begin{bmatrix} E[x_1] \\ \vdots \\ E[x_n] \end{bmatrix}$$
$$E[x_i] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i\,p(x_1, \ldots, x_n)\,dx_1 \ldots dx_n$$
$$E[f(x)] = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx$$

9.3.14 Covariance matrix


The covariance matrix for a random vector x ∈ Rn is

$$\Sigma = \operatorname{cov}(x, x) = E[(x - \bar{x})(x - \bar{x})^T]$$
$$\Sigma = \begin{bmatrix} \operatorname{cov}(x_1, x_1) & \cdots & \operatorname{cov}(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(x_n, x_1) & \cdots & \operatorname{cov}(x_n, x_n) \end{bmatrix}$$

This n×n matrix is symmetric and positive semidefinite. A positive semidefinite matrix
satisfies the relation that for any v ∈ Rn for which v ̸= 0, vT Σv ≥ 0. In other words,
the eigenvalues of Σ are all greater than or equal to zero.
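As an illustration, a covariance matrix can be estimated from samples of a random vector; the distribution parameters in this snippet are arbitrary.

import numpy as np

rng = np.random.default_rng(seed=0)
true_cov = [[2.0, 0.5], [0.5, 1.0]]
xs = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=100_000)

# np.cov() expects each row to be one variable, so transpose the samples
print(np.cov(xs.T))  # ≈ true_cov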

9.3.15 Relations for independent random vectors


First, independent vectors imply linearity from p(x, y) = p(x) p(y).

E[Ax + By] = AE[x] + BE[y]


E[Ax + By] = Ax + By

Second, independent vectors being uncorrelated means their covariance is zero.

Σxy = cov(x, y)
Σxy = E[(x − x)(y − y)T ]
Σxy = E[xyT ] − E[xyT ] − E[xyT ] + E[xyT ]

Σxy = E[xyT ] − E[x]yT − xE[yT ] + xyT


Σxy = E[xyT ] − xyT − xyT + xyT
Σxy = E[xyT ] − xyT (9.9)

Now, compute E[xyT ].


Z Z
E[xyT ] = xyT p(x) p(y) dx dyT
X Y

Factor out constants from the inner integral. This includes variables which are held
constant for each inner integral evaluation.
Z Z
T
E[xy ] = p(x) x dx p(y) yT dyT
X Y

Each of these integrals is just the expected value of their respective integration variable.

E[xyT ] = xyT (9.10)

Substitute equation (9.10) into equation (9.9).

Σxy = (xyT ) − xyT


Σxy = 0

Using these results, we can compute the covariance of z = Ax + By.

Σz = cov(z, z)
Σz = E[(z − z)(z − z)T ]
Σz = E[(Ax + By − Ax − By)(Ax + By − Ax − By)T ]
Σz = E[(A(x − x) + B(y − y))(A(x − x) + B(y − y))T ]
Σz = E[(A(x − x) + B(y − y))((x − x)T AT + (y − y)T BT )]
Σz = E[A(x − x)(x − x)T AT + A(x − x)(y − y)T BT
+ B(y − y)(x − x)T AT + B(y − y)(y − y)T BT ]

Since x and y are independent, Σxy = 0 and the cross terms cancel out.

Σz = E[A(x − x)(x − x)T AT + 0 + 0 + B(y − y)(y − y)T BT ]



Σz = E[A(x − x)(x − x)T AT ] + E[B(y − y)(y − y)T BT ]


Σz = AE[(x − x)(x − x)T ]AT + BE[(y − y)(y − y)T ]BT

Recall that Σx = cov(x, x) and Σy = cov(y, y).

Σz = AΣx AT + BΣy BT

9.3.16 Gaussian random variables


A Gaussian random variable has the following properties:

$$E[x] = \bar{x}$$
$$\operatorname{var}(x) = \sigma^2$$
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x - \bar{x})^2}{2\sigma^2}}$$

While we could use any random variable to represent a random process, we use the
Gaussian random variable a lot in probability theory due to the central limit theo-
rem.

Definition 9.3.1 — Central limit theorem. When independent random variables


are added, their properly normalized sum tends toward a normal distribution (a
Gaussian distribution or “bell curve”).

This is the case even if the original variables themselves are not normally distributed.
The theorem is a key concept in probability theory because it implies that probabilistic
and statistical methods that work for normal distributions can be applicable to many
problems involving other types of distributions.
For example, suppose that a sample is obtained containing a large number of indepen-
dent observations, and that the arithmetic mean of the observed values is computed.
The central limit theorem says that the computed values of the mean will tend toward
being distributed according to a normal distribution.

"But what is the Central Limit Theorem?" (31 minutes)
3Blue1Brown
https://youtu.be/zeJD6dqJ5lo

"A pretty reason why Gaussian + Gaussian = Gaussian" (13 minutes)
3Blue1Brown
https://youtu.be/d_qvLDhkg00

9.4 Linear stochastic systems


Given the following stochastic system

xk+1 = Axk + Buk + wk


yk = Cxk + Duk + vk

where wk is the process noise and vk is the measurement noise,

E[wk ] = 0
E[wk wkT ] = Qk
E[vk ] = 0
E[vk vkT ] = Rk

where Qk is the process noise covariance matrix and Rk is the measurement noise
covariance matrix. We assume the noise samples are independent, so E[wk wjT ] = 0
and E[vk vjT ] = 0 where k ̸= j. Furthermore, process noise samples are independent
from measurement noise samples.
We’ll compute the expectation of these equations and their covariance matrices, which
we’ll use later for deriving the Kalman filter.

9.4.1 State vector expectation evolution


First, we will compute how the expectation of the system state evolves.

E[xk+1 ] = E[Axk + Buk + wk ]


E[xk+1 ] = E[Axk ] + E[Buk ] + E[wk ]
E[xk+1 ] = AE[xk ] + BE[uk ] + E[wk ]
E[xk+1 ] = AE[xk ] + Buk + 0

xk+1 = Axk + Buk

9.4.2 Error covariance matrix evolution


Now, we will use this to compute how the error covariance matrix P evolves.

xk+1 − xk+1 = Axk + Buk + wk − (Axk − Buk )


xk+1 − xk+1 = A(xk − xk ) + wk

E[(xk+1 − xk+1 )(xk+1 − xk+1 )T ] = E[(A(xk − xk ) + wk )(A(xk − xk ) + wk )T ]

Pk+1 = E[(A(xk − xk ) + wk )(A(xk − xk ) + wk )T ]


Pk+1 = E[(A(xk − xk )(xk − xk )T AT ] + E[A(xk − xk )wkT ]
+ E[wk (xk − xk )T AT ] + E[wk wkT ]
Pk+1 = AE[(xk − xk )(xk − xk )T ]AT + AE[(xk − xk )wkT ]
+ E[wk (xk − xk )T ]AT + E[wk wkT ]
Pk+1 = APk AT + AE[(xk − xk )wkT ] + E[wk (xk − xk )T ]AT + Qk

Since the error and noise are independent, the cross terms are zero.

$$P_{k+1} = AP_kA^T + A\underbrace{E[(x_k - \bar{x}_k)w_k^T]}_{0} + \underbrace{E[w_k(x_k - \bar{x}_k)^T]}_{0}\,A^T + Q_k$$
$$P_{k+1} = AP_kA^T + Q_k$$

9.4.3 Measurement vector expectation


Next, we will compute the expectation of the output y.

E[yk ] = E[Cxk + Duk + vk ]


E[yk ] = CE[xk ] + Duk + 0
yk = Cxk + Duk

9.4.4 Measurement covariance matrix


Now, we will use this to compute how the measurement covariance matrix S evolves.

yk − yk = Cxk + Duk + vk − (Cxk + Duk )


yk − yk = C(xk − xk ) + vk

E[(yk − yk )(yk − yk )T ] = E[(C(xk − xk ) + vk )(C(xk − xk ) + vk )T ]

Sk = E[(C(xk − xk ) + vk )(C(xk − xk ) + vk )T ]
Sk = E[(C(xk − xk ) + vk )((xk − xk )T CT + vkT )]
Sk = E[(C(xk − xk )(xk − xk )T CT ] + E[C(xk − xk )vkT ]
+ E[vk (xk − xk )T CT ] + E[vk vkT ]
Sk = CE[(xk − xk )(xk − xk )T ]CT + CE[(xk − xk )vkT ]
+ E[vk (xk − xk )T ]CT + E[vk vkT ]
Sk = CPk CT + CE[(xk − xk )vkT ] + E[vk (xk − xk )T ]CT + Rk

Since the error and noise are independent, the cross terms are zero.

$$S_k = CP_kC^T + C\underbrace{E[(x_k - \bar{x}_k)v_k^T]}_{0} + \underbrace{E[v_k(x_k - \bar{x}_k)^T]}_{0}\,C^T + R_k$$
$$S_k = CP_kC^T + R_k$$

9.5 Two-sensor problem


9.5.1 Two noisy (independent) observations
Variable z1 is the noisy measurement of the variable x. The value of the noisy mea-
surement is the random variable defined by the probability density function p(z1 |x).
Variable z2 is the noisy measurement of the same variable x and independent of z1 . If
we know the property of our sensor (measurement noise), then the probability density
function of the measurement is p(z2 |x).
We are interested in the probability density function of x given the measurements z1
and z2 .
$$p(x|z_1, z_2) = \frac{p(x, z_1, z_2)}{p(z_1, z_2)} = \frac{p(z_1, z_2|x)}{p(z_1, z_2)} = \frac{p(z_1|x)p(z_2|x)}{p(z_1, z_2)}$$

where $p(z_1, z_2) = \int_X p(z_1|x)p(z_2|x)\,dx$, but $z_1$ and $z_2$ are given and we can write

$$p(x|z_1, z_2) = \frac{1}{C}\,p(z_1|x)p(z_2|x) \quad \text{or} \quad p(x|z_1, z_2) \sim p(z_1|x)p(z_2|x)$$

where C is a normalizing constant providing that $\int_X p(x|z_1, z_2)\,dx = 1$. The probability density $p(x|z_1, z_2)$ summarizes our complete knowledge about x.

9.5.2 Single noisy observations


Variable z1 is the noisy measurement of the variable x. The value of the noisy mea-
surement is the random variable defined by the probability density function p(z1 |x).
The probability density function of x given the measurement $z_1$ is

$$p(x|z_1) = \frac{p(x, z_1)}{p(z_1)} = \frac{p(z_1|x)p(x)}{p(z_1)}$$

where $p(z_1) = \int_X p(z_1|x)\,dx$, but $z_1$ is given so we can write

$$p(x|z_1) = \frac{1}{C}\,p(z_1|x)p(x) \quad \text{or} \quad p(x|z_1) \sim p(z_1|x)p(x)$$

The probability density p(x|z1 ) summarizes our complete knowledge about x.

R Even in the single measurement case, the estimation of the variable x is a data/in-
formation fusion problem. We combine prior probability with the probability
resulting from the measurement.

9.5.3 Single noisy observations: a Gaussian case


$$p(z_1|x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{z_1 - x}{\sigma}\right)^2} \quad \text{and} \quad p(x) = \frac{1}{\sigma_0\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - x_0}{\sigma_0}\right)^2}$$

z1 , x0 , σ 2 , and σ02 are given.

p(x, z1 )
p(x|z1 ) =
p(z1 )
p(z1 |x)p(x)
p(x|z1 ) =
p(z1 )
1
p(x|z1 ) = p(z1 |x)p(x)
C
( )
x−x0 2
1 1 1 z1 −x 2 1 −1
p(x|z1 ) = √ e− 2 ( σ ) √ e 2 σ0
C σ 2π σ0 2π
Absorb the leading coefficients of the two probability distributions into a new constant
C1 .
( )2
1 − 12 ( z1σ−x )2 − 12 x−x0
p(x|z1 ) = e e σ0

C1

Combine the exponents.


( )
(z1 −x)2 (x−x0 )2
1 − 12 σ2
+ 2
σ0
p(x|z1 ) = e
C1
Expand the exponent into separate terms.
( 2
)
z1 2z1 x x2 2 2xx0 x2
1 − 12 σ2
− σ2
+σ 2 + x2 −
σ0 2
σ0
+ 0
2
σ0
p(x|z1 ) = e
C1
σ2 σ02
Multiply in σ2 or σ02
as appropriate to give all terms a denominator of σ 2 σ02 .
( 2 z2 2z 2
)
σ0 σ0 σ0 σ2 σ 2 x0 σ 2 x2
1 − 12 1
2
σ 2 σ0
−2 1
2
σ 2 σ0
x+
σ 2 σ02 x2 + 2
σ 2 σ0
x2 −2 2
σ 2 σ0
x+ 0
2
σ 2 σ0
p(x|z1 ) = e
C1
Reorder terms in the exponent.
( 2 2z 2 z2
)
σ0 σ2 σ0 σ 2 x0 σ0 σ 2 x2
1 − 12 σ 2 σ02 x2 + 2
σ 2 σ0
x2 −2 1
2
σ 2 σ0
x−2 2
σ 2 σ0
x+ 1
2
σ 2 σ0
+ 0
2
σ 2 σ0
p(x|z1 ) = e
C1
Combine like terms.
( 2 +σ 2 2 z +σ 2 x 2 z 2 +σ 2 x2
)
σ0 σ0 σ0
1 − 12 2
σ 2 σ0
x2 −2 1
2
σ 2 σ0
0
x+ 1
2
σ 2 σ0
0
p(x|z1 ) = e
C1
σ02 +σ 2
Factor out σ 2 σ02
.
2 +σ 2
( 2 z +σ 2 x 2 z 2 +σ 2 x2
)
σ0 σ0 σ0
1 − 12 2
σ 2 σ0
x2 −2 1
2 +σ 2
σ0
0
x+ 1
σ02 +σ 2
0
p(x|z1 ) = e
C1
2 +σ 2
( ( 2
) )
σ0 σ0 σ2 σ 2 z 2 +σ 2 x2
1 − 12 2
σ 2 σ0
x2 −2 2 +σ 2
σ0
z1 + 2 +σ 2
σ0
x0 x+ 0 12 2 0
σ0 +σ
p(x|z1 ) = e
C1
σ02 2
z + σ2σ+σ2 x0
σ 2 +σ02 1
is the mean of the combined probability distribution, which we’ll
0
denote as µ.
( )
σ 2 +σ02 2 z 2 +σ 2 x2
σ0
1 − 12 2
σ 2 σ0
x2 −2µx+ 1
σ02 +σ 2
0
p(x|z1 ) = e
C1
Add in µ2 − µ2 to perform some factoring.
2 +σ 2
( 2 z 2 +σ 2 x2
)
σ0 σ0
1 − 12 2
σ 2 σ0
x2 −2µx+µ2 −µ2 + 1
σ02 +σ 2
0
p(x|z1 ) = e
C1
2 +σ 2
( 2 z 2 +σ 2 x2
)
σ0 σ0
1 − 12 2
σ 2 σ0
(x−µ)2 −µ2 + 1
σ02 +σ 2
0
p(x|z1 ) = e
C1
Pull out all constant terms in the exponent and combine them with C1 to make a new
constant C2 . We’re basically doing c1 ex+a → c1 ea ex → c2 ex .
2 2
1 − 12 σσ02+σ
2
σ0
(x−µ)2
p(x|z1 ) = e
C2

This means that if we’re given an initial estimate x0 and a measurement z1 with associ-
ated means and variances represented by Gaussian distributions, this information can
be combined into a third Gaussian distribution with its own mean value and variance.
The expected value of x given $z_1$ is

$$E[x|z_1] = \mu = \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}\,z_1 + \frac{\sigma^2}{\sigma_0^2 + \sigma^2}\,x_0 \tag{9.11}$$

The variance of x given $z_1$ is

$$E[(x - \mu)^2|z_1] = \frac{\sigma^2\sigma_0^2}{\sigma_0^2 + \sigma^2} \tag{9.12}$$

The expected value, which is also the maximum likelihood value, is the linear combination of the prior expected (maximum likelihood) value and the measurement. The expected value is a reasonable estimator of x.

$$\hat{x} = E[x|z_1] = \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}\,z_1 + \frac{\sigma^2}{\sigma_0^2 + \sigma^2}\,x_0 \tag{9.13}$$
$$\hat{x} = w_1 z_1 + w_2 x_0$$

Note that the weights $w_1$ and $w_2$ sum to 1. When the prior (i.e., prior knowledge of state) is uninformative (a large variance),

$$w_1 = \lim_{\sigma_0^2 \to \infty} \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} = 1 \tag{9.14}$$
$$w_2 = \lim_{\sigma_0^2 \to \infty} \frac{\sigma^2}{\sigma_0^2 + \sigma^2} = 0 \tag{9.15}$$

and x̂ = z1 . That is, the weight is on the observations and the estimate is equal to the
measurement.

Let us assume we have a model providing an almost exact prior for x. In that case, σ02
approaches 0 and

$$w_1 = \lim_{\sigma_0^2 \to 0} \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} = 0 \tag{9.16}$$
$$w_2 = \lim_{\sigma_0^2 \to 0} \frac{\sigma^2}{\sigma_0^2 + \sigma^2} = 1 \tag{9.17}$$

The Kalman filter uses this optimal fusion as the basis for its operation.
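A sketch of this fusion step for scalar Gaussians, following equations (9.11) and (9.12), is shown below.

def fuse(x0, var0, z1, var1):
    """Fuses a prior (x0, var0) with a measurement (z1, var1).

    Returns the mean and variance of the combined Gaussian per equations
    (9.11) and (9.12).
    """
    mean = (var0 * z1 + var1 * x0) / (var0 + var1)
    var = (var0 * var1) / (var0 + var1)
    return mean, var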

9.6 Kalman filter


So far, we’ve derived equations for updating the expected value and state covariance
without measurements and how to incorporate measurements into an initial state opti-
mally. Now, we’ll combine these concepts to produce an estimator which minimizes
the error covariance for linear systems.

9.6.1 Derivations

k+1 = x̂k+1 + Kk+1 (yk+1 − ŷk+1 ), we want
Given the a posteriori update equation x̂+
to find the value of Kk+1 that minimizes the a posteriori estimate covariance (the error
covariance) because this minimizes the estimation error.
a posteriori estimate covariance update equation
The following is the definition of the a posteriori estimate covariance matrix.

k+1 = cov(xk+1 − x̂k+1 )


P+ +

Substitute in the a posteriori update equation and expand the measurement equations.

k+1 = cov(xk+1 − (x̂k+1 + Kk+1 (yk+1 − ŷk+1 )))
P+

k+1 = cov(xk+1 − x̂k+1 − Kk+1 (yk+1 − ŷk+1 ))
P+

k+1 = cov(xk+1 − x̂k+1 − Kk+1 (
P+
(Ck+1 xk+1 + Dk+1 uk+1 + vk+1 )
− (Ck+1 x̂−
k+1 + Dk+1 uk+1 )))

k+1 = cov(xk+1 − x̂k+1 − Kk+1 (
P+
Ck+1 xk+1 + Dk+1 uk+1 + vk+1
− Ck+1 x̂−
k+1 − Dk+1 uk+1 ))

Reorder terms.

k+1 = cov(xk+1 − x̂k+1 − Kk+1 (
P+
Ck+1 xk+1 − Ck+1 x̂−
k+1
+ Dk+1 uk+1 − Dk+1 uk+1 + vk+1 ))

The Dk+1 uk+1 terms cancel.


− −
k+1 = cov(xk+1 − x̂k+1 − Kk+1 (Ck+1 xk+1 − Ck+1 x̂k+1 + vk+1 ))
P+

Distribute Kk+1 to vk+1 .


− −
k+1 = cov(xk+1 − x̂k+1 − Kk+1 (Ck+1 xk+1 − Ck+1 x̂k+1 ) − Kk+1 vk+1 )
P+

Factor out Ck+1 .


− −
k+1 = cov(xk+1 − x̂k+1 − Kk+1 Ck+1 (xk+1 − x̂k+1 ) − Kk+1 vk+1 )
P+

Factor out xk+1 − x̂−


k+1 to the right.


k+1 = cov((I − Kk+1 Ck+1 )(xk+1 − x̂k+1 ) − Kk+1 vk+1 )
P+

Covariance is a linear operator, so it can be applied to each term separately. Covariance


squares terms internally, so the negative sign on Kk+1 vk+1 is removed.

k+1 = cov((I − Kk+1 Ck+1 )(xk+1 − x̂k+1 )) + cov(Kk+1 vk+1 )
P+

Now just evaluate the covariances.

$$P^+_{k+1} = (I - K_{k+1}C_{k+1})\operatorname{cov}(x_{k+1} - \hat{x}^-_{k+1})(I - K_{k+1}C_{k+1})^T + K_{k+1}\operatorname{cov}(v_{k+1})K^T_{k+1}$$
$$P^+_{k+1} = (I - K_{k+1}C_{k+1})P^-_{k+1}(I - K_{k+1}C_{k+1})^T + K_{k+1}R_{k+1}K^T_{k+1} \tag{9.18}$$

This is the Joseph form of the covariance update equation, which is valid for all Kalman
gains.

R The Joseph form is helpful in floating point implementations of the Kalman


filter since it has better numerical stability than the optimal Kalman gain form
discussed later.

Finding the optimal Kalman gain


The error in the a posteriori state estimation is xk+1 − x̂+ k+1 . We want to minimize
the expected value of the square of the magnitude of this vector, which can be written
as E[(xk+1 − x̂+ k+1 )(xk+1 − x̂k+1 ) ]. By definition, this expected value is the a
+ T
+
posteriori error covariance Pk+1 . Remember that the eigenvectors of a matrix are
the fundamental transformation directions of that matrix, and the eigenvalues are the
magnitudes of those transformations. By minimizing the eigenvalues, we minimize the
error variance (the uncertainty in the state estimate) for each state.
P+k+1 is positive definite, so we know all the eigenvalues are positive. Therefore, a
reasonable quantity to minimize with our choice of Kalman gain is the sum of the
eigenvalues. We don’t have direct access to the eigenvalues, but we can use the fact that
[2]
the sum of the eigenvalues is equal to the trace of P+
k+1 and minimize that instead.

We’ll start with the equation for P+


k+1 .


k+1 = (I − Kk+1 Ck+1 )Pk+1 (I − Kk+1 Ck+1 ) + Kk+1 Rk+1 Kk+1
P+ T T

We’re going to expand the equation for P+


k+1 and collect terms. First, multiply in
P−
k+1 .
− −
k+1 = (Pk+1 − Kk+1 Ck+1 Pk+1 )(I − Kk+1 Ck+1 ) + Kk+1 Rk+1 Kk+1
P+ T T

Transpose each term in I − Kk+1 Ck+1 . I is symmetric, so its transpose is dropped.


− −
k+1 = (Pk+1 − Kk+1 Ck+1 Pk+1 )(I − Ck+1 Kk+1 ) + Kk+1 Rk+1 Kk+1
P+ T T T

Multiply in I − CT T
k+1 Kk+1 .

− −
k+1 = Pk+1 (I − Ck+1 Kk+1 ) − Kk+1 Ck+1 Pk+1 (I − Ck+1 Kk+1 )
P+ T T T T

+ Kk+1 Rk+1 KT
k+1

Expand terms.
− − −
k+1 = Pk+1 − Pk+1 Ck+1 Kk+1 − Kk+1 Ck+1 Pk+1
P+ T T

+ Kk+1 Ck+1 P− T T T
k+1 Ck+1 Kk+1 + Kk+1 Rk+1 Kk+1
[2] By definition, the characteristic polynomial of an n × n matrix P is given by
p(t) = det(tI − P) = tn − tr(P)tn−1 + · · · + (−1)n det(P)
as well as p(t) = (t − λ1 ) . . . (t − λn ) where λ1 , . . . , λn are the eigenvalues of P. The coefficient for
tn−1 in the second polynomial’s expansion is −(λ1 + · · · + λn ). Therefore, by matching coefficients for
tn−1 , we get tr(P) = λ1 + · · · + λn .

Factor out Kk+1 and KT


k+1 .

− − −
k+1 = Pk+1 − Pk+1 Ck+1 Kk+1 − Kk+1 Ck+1 Pk+1
P+ T T

+ Kk+1 (Ck+1 P− T T
k+1 Ck+1 + Rk+1 )Kk+1

Ck+1 P− T
k+1 Ck+1 +Rk+1 is the innovation (measurement residual) covariance at timestep
k + 1. We’ll let this expression equal Sk+1 . We won’t need it in the final theorem, but
it makes the derivations after this point more concise.
− − −
k+1 = Pk+1 − Pk+1 Ck+1 Kk+1 − Kk+1 Ck+1 Pk+1 + Kk+1 Sk+1 Kk+1 (9.19)
P+ T T T

Now take the trace.


− − −
k+1 ) = tr(Pk+1 ) − tr(Pk+1 Ck+1 Kk+1 ) − tr(Kk+1 Ck+1 Pk+1 )
tr(P+ T T

+ tr(Kk+1 Sk+1 KT
k+1 )

Transpose one of the terms twice.


− −T T −
k+1 ) = tr(Pk+1 ) − tr((Kk+1 Ck+1 Pk+1 ) ) − tr(Kk+1 Ck+1 Pk+1 )
tr(P+
+ tr(Kk+1 Sk+1 KT
k+1 )

P−
k+1 is symmetric, so we can drop the transpose.

− − −
k+1 ) = tr(Pk+1 ) − tr((Kk+1 Ck+1 Pk+1 ) ) − tr(Kk+1 Ck+1 Pk+1 )
tr(P+ T

+ tr(Kk+1 Sk+1 KT
k+1 )

The trace of a matrix is equal to the trace of its transpose since the elements used in
the trace are on the diagonal.
− − −
k+1 ) = tr(Pk+1 ) − tr(Kk+1 Ck+1 Pk+1 ) − tr(Kk+1 Ck+1 Pk+1 )
tr(P+
+ tr(Kk+1 Sk+1 KT
k+1 )
− −
k+1 ) = tr(Pk+1 ) − 2 tr(Kk+1 Ck+1 Pk+1 ) + tr(Kk+1 Sk+1 Kk+1 )
tr(P+ T

Given theorems 9.6.1 and 9.6.2


Theorem 9.6.1 ∂A tr(ABAT ) = 2AB where B is symmetric.


Theorem 9.6.2 ∂A tr(AC) = CT

find the minimum of the trace of $P^+_{k+1}$ by taking the partial derivative with respect to K and setting the result to 0.

$$\begin{aligned} \frac{\partial\,\operatorname{tr}(P^+_{k+1})}{\partial K} &= 0 - 2(C_{k+1}P^-_{k+1})^T + 2K_{k+1}S_{k+1} \\ \frac{\partial\,\operatorname{tr}(P^+_{k+1})}{\partial K} &= -2P^{-T}_{k+1}C^T_{k+1} + 2K_{k+1}S_{k+1} \\ \frac{\partial\,\operatorname{tr}(P^+_{k+1})}{\partial K} &= -2P^-_{k+1}C^T_{k+1} + 2K_{k+1}S_{k+1} \\ 0 &= -2P^-_{k+1}C^T_{k+1} + 2K_{k+1}S_{k+1} \\ 2K_{k+1}S_{k+1} &= 2P^-_{k+1}C^T_{k+1} \\ K_{k+1}S_{k+1} &= P^-_{k+1}C^T_{k+1} \\ K_{k+1} &= P^-_{k+1}C^T_{k+1}S^{-1}_{k+1} \end{aligned}$$

This is the optimal Kalman gain.


Simplified a posteriori estimate covariance update equation
If the optimal Kalman gain is used, the a posteriori estimate covariance matrix update
equation can be simplified. First, we’ll manipulate the equation for the optimal Kalman
gain.

Kk+1 = P− T −1
k+1 Ck+1 Sk+1
Kk+1 Sk+1 = P− T
k+1 Ck+1

Kk+1 Sk+1 KT T T
k+1 = Pk+1 Ck+1 Kk+1

Now we’ll substitute it into equation (9.19).


− − −
k+1 = Pk+1 − Pk+1 Ck+1 Kk+1 − Kk+1 Ck+1 Pk+1 + Kk+1 Sk+1 Kk+1
P+ T T T

− − − −
k+1 = Pk+1 − Pk+1 Ck+1 Kk+1 − Kk+1 Ck+1 Pk+1 + (Pk+1 Ck+1 Kk+1 )
P+ T T T T

− −
k+1 = Pk+1 − Kk+1 Ck+1 Pk+1
P+

Factor out P−
k+1 to the right.


k+1 = (I − Kk+1 Ck+1 )Pk+1
P+

9.6.2 Predict and update equations


Now that we’ve derived all the pieces we need, we can finally write all the equations for
a Kalman filter. Theorem 9.6.3 shows the predict and update steps for a Kalman filter
at the k th timestep.
Intuitively, the predict step projects the error covariance forward in an increasing parabolic
shape because you become less certain in your state estimate as you go longer since a
measurement. In your state-space (think like 3D space but each state is an axis), the
error ellipsoid grows over time.
The correct step decreases the error covariance again by injecting new information.
The input has no effect on this because it has no noise associated with it. In fact, input
cancels out in the Kalman filter expectation and covariance update equation derivations.

A Kalman filter chooses the Kalman gain Kk+1 in the correct equation x̂+ k+1 = x̂k+1 +
Kk+1 (yk+1 − ŷk+1 ) such that the eigenvalues of P are minimized. This minimizes
the error variances and thus the dimensions of the uncertainty ellipsoid over time.
If the update period is constant, the predict and correct steps are run in a loop as op-
posed to sporadically skipping correct steps, and the (A, C) pair is detectable, the
error covariance matrix P will approach a steady-state. This steady-state P results
in a steady-state Kalman gain that can be used instead of the error covariance update
equations to conserve computational resources.

Theorem 9.6.3 — Kalman filter.

Predict step
$$\hat{x}^-_{k+1} = A\hat{x}^+_k + Bu_k \tag{9.20}$$
$$P^-_{k+1} = AP^-_kA^T + Q \tag{9.21}$$
Update step
$$K_{k+1} = P^-_{k+1}C^T(CP^-_{k+1}C^T + R)^{-1} \tag{9.22}$$
$$\hat{x}^+_{k+1} = \hat{x}^-_{k+1} + K_{k+1}(y_{k+1} - C\hat{x}^-_{k+1} - Du_{k+1}) \tag{9.23}$$
$$P^+_{k+1} = (I - K_{k+1}C_{k+1})P^-_{k+1}(I - K_{k+1}C_{k+1})^T + K_{k+1}R_{k+1}K^T_{k+1} \tag{9.24}$$

A system matrix x̂ state estimate vector


B input matrix u input vector
C output matrix y output vector
D feedthrough matrix Q process noise covariance
P error covariance matrix R measurement noise covariance
K Kalman gain matrix
where a superscript of minus denotes a priori and plus denotes a posteriori estimate
(before and after update respectively).

C, D, Q, and R from the equations derived earlier are made constants here.

R To implement a discrete time Kalman filter from a continuous model, the model
and continuous time Q and R matrices can be discretized using theorem 7.3.1.

Unknown states in a Kalman filter are generally represented by a Wiener (pronounced


VEE-ner) process.[3] This process has the property that its variance increases linearly
with time t.

[3] Explaining why we use the Wiener process would require going much more in depth into stochastic

processes and Itô calculus, which is outside the scope of this book.

Matrix Rows × Columns Matrix Rows × Columns


A states × states x̂ states × 1
B states × inputs u inputs × 1
C outputs × states y outputs × 1
D outputs × inputs Q states × states
P states × states R outputs × outputs
K states × outputs

Table 9.2: Kalman filter matrix dimensions
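A sketch of these steps with NumPy is shown below; it follows theorem 9.6.3 directly and uses the Joseph form for the covariance update.

import numpy as np


def kf_predict(xhat, P, A, B, u, Q):
    """Kalman filter predict step, equations (9.20) and (9.21)."""
    xhat = A @ xhat + B @ u
    P = A @ P @ A.T + Q
    return xhat, P


def kf_update(xhat, P, C, D, u, y, R):
    """Kalman filter update step, equations (9.22) through (9.24)."""
    S = C @ P @ C.T + R
    K = np.linalg.solve(S.T, C @ P.T).T  # K = P Cᵀ S⁻¹
    xhat = xhat + K @ (y - C @ xhat - D @ u)
    I = np.eye(P.shape[0])
    P = (I - K @ C) @ P @ (I - K @ C).T + K @ R @ K.T  # Joseph form
    return xhat, P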

9.6.3 Setup
Equations to model
The following example system will be used to describe how to define and initialize the
matrices for a Kalman filter.
A robot is between two parallel walls. It starts driving from one wall to the other at a
velocity of 0.8 cm/s and uses ultrasonic sensors to provide noisy measurements of the
distances to the walls in front of and behind it. To estimate the distance between the
walls, we will define three states: robot position, robot velocity, and distance between
the walls.

$$x_{k+1} = x_k + v_k \Delta T \tag{9.25}$$
$$v_{k+1} = v_k \tag{9.26}$$
$$x^w_{k+1} = x^w_k \tag{9.27}$$

This can be converted to the following state-space model.


 
xk
x k =  vk  (9.28)
xw
k
     
1 1 0 0 0
xk+1 = 0 0 0 xk + 0.8 + 0.1 wk (9.29)
0 0 1 0 0

where the Gaussian random variable wk has a mean of 0 and a variance of 1. The

observation model is

$$y_k = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 0 & 1 \end{bmatrix} x_k + \theta_k \tag{9.30}$$

where the covariance matrix of Gaussian measurement noise θ is a 2 × 2 matrix with


both diagonals 10 cm2 .
The state vector is usually initialized using the first measurement or two. The covariance
matrix entries are assigned by calculating the covariance of the expressions used when
assigning the state vector. Let k = 2.
 
$$Q = \begin{bmatrix} 1 \end{bmatrix} \tag{9.31}$$
$$R = \begin{bmatrix} 10 & 0 \\ 0 & 10 \end{bmatrix} \tag{9.32}$$
$$\hat{x} = \begin{bmatrix} y_{k,1} \\ (y_{k,1} - y_{k-1,1})/dt \\ y_{k,1} + y_{k,2} \end{bmatrix} \tag{9.33}$$
$$P = \begin{bmatrix} 10 & 10/dt & 10 \\ 10/dt & 20/dt^2 & 10/dt \\ 10 & 10/dt & 20 \end{bmatrix} \tag{9.34}$$

Initial conditions
To fill in the P matrix, we calculate the covariance of each combination of state vari-
ables. The resulting value is a measure of how much those variables are correlated.
Due to how the covariance calculation works out, the covariance between two vari-
ables is the sum of the variance of matching terms which aren’t constants multiplied by
any constants the two have. If no terms match, the variables are uncorrelated and the
covariance is zero.
In P11 , the terms in x1 correlate with itself. Therefore, P11 is x1 ’s variance, or P11 =
10. For P21 , One term correlates between x1 and x2 , so P21 = 10 dt . The constants
from each are simply multiplied together. For P22 , both measurements are correlated,
20
so the variances add together. Therefore, P22 = dt 2 . It continues in this fashion until

the matrix is filled up. Order doesn’t matter for correlation, so the matrix is symmetric.
Selection of priors
Choosing good priors is important for a well performing filter, even if little information
is known. This applies to both the measurement noise and the noise model. The act
of giving a state variable a large variance means you know something about the system.
Namely, you aren’t sure whether your initial guess is close to the true state. If you make
a guess and specify a small variance, you are telling the filter that you are very confident

in your guess. If that guess is incorrect, it will take the filter a long time to move away
from your guess to the true value.
Covariance selection
While one could assume no correlation between the state variables and set the covari-
ance matrix entries to zero, this may not reflect reality. The Kalman filter is still guar-
enteed to converge to the steady-state covariance after an infinite time, but it will take
longer than otherwise.
Noise model selection
We typically use a Gaussian distribution for the noise model because the sum of many
independent random variables produces a normal distribution by the central limit the-
orem. Kalman filters only require that the noise is zero-mean. If the true value has an
equal probability of being anywhere within a certain range, use a uniform distribution
instead. Each of these communicates information regarding what you know about a
system in addition to what you do not.
Process noise and measurement noise covariance selection
Recall that the process noise covariance is Q and the measurement noise covariance is
R. To tune the elements of these, it can be helpful to take a collection of measurements,
then run the Kalman filter on them offline to evaluate its performance.
The diagonal elements of R are the variances of each measurement, which can be
easily determined from the offline measurements. The diagonal elements of Q are the
variances of each state. They represent how much each state is expected to deviate
from the model.
Selecting Q is more difficult. If the data is trusted too much over the model, one risks
overfitting the data. One should balance estimating any hidden states sufficiently with
actually filtering out the noise.
Modeling other noise colors
The Kalman filter assumes a model with zero-mean white noise. If the model is incom-
plete in some way, whether it’s missing dynamics or assumes an incorrect noise model,
e = y − Cx̂ over time will have probability characteristics not indicative
the residual y
of white noise (e.g., it isn’t zero-mean).
To handle other colors of noise in a Kalman filter, define that color of noise in terms
of white noise and augment the model with it.

9.6.4 Simulation
Figure 9.3 shows the state estimates and measurements of the Kalman filter over time.
Figure 9.4 shows the position estimate and variance over time. Figure 9.5 shows the

Figure 9.3: State estimates and measurements with Kalman filter

wall position estimate and variance over time. Notice how the variances decrease over
time as the filter gathers more measurements. This means that the filter becomes more
confident in its state estimates.
The final precisions in estimating the position of the robot and the wall are the square
roots of the corresponding elements in the covariance matrix. That is, 0.5188 m and
0.4491 m respectively. They are smaller than the precision of the raw measurements,
$\sqrt{10} = 3.1623$ m. As expected, combining the information from several measure-
ments produces a better estimate than any one measurement alone.

9.6.5 Kalman filter as Luenberger observer


A Kalman filter can be represented as a Luenberger observer by letting L = AKk (see
appendix D.2). The Luenberger observer has a constant observer gain matrix L, so the
steady-state Kalman gain is used to calculate it. We will demonstrate how to find this
shortly.
Kalman filter theory provides a way to place the poles of the Luenberger observer opti-
mally in the same way we placed the poles of the controller optimally with LQR. The
eigenvalues of the Kalman filter are

eig(A(I − Kk C)) (9.35)



Figure 9.4: Robot position estimate and variance with Kalman filter

Figure 9.5: Wall position estimate and variance with Kalman filter

Steady-state Kalman gain


One may have noticed that the error covariance matrix can be updated independently
of the rest of the model. The error covariance matrix tends toward a steady-state value,
and this matrix can be obtained via the discrete algebraic Riccati equation. This can
then be used to compute a steady-state Kalman gain.
General matrix inverses like S−1 are expensive. We want to put K = PCT S−1 into
Ax = b form so we can solve it more efficiently.
K = PCT S−1
KS = PCT
(KS)T = (PCT )T
ST KT = CPT
The solution of Ax = b can be found via x = solve(A, b).
KT = solve(ST , CPT )
K = solve(ST , CPT )T

Snippet 9.1 computes the steady-state Kalman gain matrix.


"""Function for computing the steady-state Kalman gain matrix."""

import numpy as np
import scipy as sp

def kalmd(A, C, Q, R):


"""Solves for the discrete steady-state Kalman gain.

Keyword arguments:
A -- numpy.array(states x states), system matrix.
C -- numpy.array(outputs x states), output matrix.
Q -- numpy.array(states x states), process noise covariance matrix.
R -- numpy.array(outputs x outputs), measurement noise covariance matrix.

Returns:
K -- numpy.array(outputs x states), Kalman gain matrix.
"""
P = sp.linalg.solve_discrete_are(a=A.T, b=C.T, q=Q, r=R)
return np.linalg.solve(C @ P @ C.T + R, C @ P.T).T

Snippet 9.1. Steady-state Kalman gain matrix calculation in Python
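A short usage sketch follows, with an arbitrary double-integrator model whose position is measured. The numbers are only illustrative of the call signature.

import numpy as np

dt = 0.02
A = np.array([[1.0, dt], [0.0, 1.0]])  # discrete system matrix
C = np.array([[1.0, 0.0]])             # measure position only
Q = np.diag([0.001, 0.01])             # process noise covariance
R = np.array([[0.1]])                  # measurement noise covariance

K = kalmd(A, C, Q, R)  # steady-state Kalman gain, states x outputs
print(K)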

9.7 Kalman smoother


The Kalman filter uses the data up to the current time to produce an optimal estimate of the system state. If data beyond the current time is available, it can be run through a Kalman smoother to produce a better estimate. This is done by recording measurements, then applying the smoother to them offline.

The Kalman smoother does a forward pass on the available data, then a backward pass through the system dynamics so it takes into account the data before and after the current time. This produces state variances that are lower than those of a Kalman filter.

9.7.1 Predict and update equations


One first does a forward pass with the typical Kalman filter equations and stores the results. Then one can use the Rauch-Tung-Striebel (RTS) algorithm to do the backward pass. Theorem 9.7.1 shows the predict and update steps for the forward and backward passes of a Kalman smoother at the kth timestep.
See section 3 of https://users.aalto.fi/~ssarkka/course_k2011/pdf/handout7.pdf for a
derivation of the Rauch-Tung-Striebel smoother.

Theorem 9.7.1 — Kalman smoother.

Forward predict step

$$\hat{x}_{k+1}^- = A\hat{x}_k^+ + Bu_k \quad (9.36)$$
$$P_{k+1}^- = AP_k^+A^T + Q \quad (9.37)$$

Forward update step

$$K_{k+1} = P_{k+1}^-C^T(CP_{k+1}^-C^T + R)^{-1} \quad (9.38)$$
$$\hat{x}_{k+1}^+ = \hat{x}_{k+1}^- + K_{k+1}(y_{k+1} - C\hat{x}_{k+1}^- - Du_{k+1}) \quad (9.39)$$
$$P_{k+1}^+ = (I - K_{k+1}C)P_{k+1}^- \quad (9.40)$$

Backward update step

$$K_k = P_k^+A_k^T(P_{k+1}^-)^{-1} \quad (9.41)$$
$$\hat{x}_{k|N} = \hat{x}_k^+ + K_k(\hat{x}_{k+1|N} - \hat{x}_{k+1}^-) \quad (9.42)$$
$$P_{k|N} = P_k^+ + K_k(P_{k+1|N} - P_{k+1}^-)K_k^T \quad (9.43)$$

Backward initial conditions

$$\hat{x}_{N|N} = \hat{x}_N^+ \quad (9.44)$$
$$P_{N|N} = P_N^+ \quad (9.45)$$
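A minimal sketch of the backward pass is below. It assumes the forward Kalman filter already ran and stored the a priori and a posteriori estimates and covariances in lists; all names are placeholders for whatever your filter implementation uses.

"""Sketch of the RTS backward pass from theorem 9.7.1."""

import numpy as np


def rts_smooth(A, x_prior, P_prior, x_post, P_post):
    """x_prior[k], P_prior[k]: a priori estimate/covariance at step k.
    x_post[k], P_post[k]: a posteriori estimate/covariance at step k.
    Returns smoothed estimates and covariances.
    """
    N = len(x_post) - 1
    x_smooth = [None] * (N + 1)
    P_smooth = [None] * (N + 1)

    # Backward initial conditions, equations (9.44) and (9.45)
    x_smooth[N] = x_post[N]
    P_smooth[N] = P_post[N]

    for k in range(N - 1, -1, -1):
        # Smoother gain, equation (9.41)
        K = P_post[k] @ A.T @ np.linalg.inv(P_prior[k + 1])
        # Backward update, equations (9.42) and (9.43)
        x_smooth[k] = x_post[k] + K @ (x_smooth[k + 1] - x_prior[k + 1])
        P_smooth[k] = P_post[k] + K @ (P_smooth[k + 1] - P_prior[k + 1]) @ K.T
    return x_smooth, P_smooth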

Figure 9.6: Robot position with Kalman smoother

9.7.2 Example
We will modify the robot model so that instead of a velocity of 0.8 cm/s with random
noise, the velocity is modeled as a random walk from the current velocity.

$$\mathbf{x}_k = \begin{bmatrix} x_k \\ v_k \\ x_k^w \end{bmatrix} \quad (9.46)$$

$$\mathbf{x}_{k+1} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \mathbf{x}_k + \begin{bmatrix} 0 \\ 0.1 \\ 0 \end{bmatrix} w_k \quad (9.47)$$

We will use the same observation model as before.


Using the same data from subsection 9.6.4, figures 9.6, 9.7, and 9.8 show the improved
state estimates and figure 9.9 shows the improved robot position covariance with a
Kalman smoother.
Notice how the wall position produced by the smoother is a constant. This is because
that state has no dynamics, so the final estimate from the Kalman filter is already the
best estimate.

Figure 9.7: Robot velocity with Kalman smoother

Figure 9.8: Wall position with Kalman smoother



Figure 9.9: Robot position variance with Kalman smoother

See Roger Labbe's book Kalman and Bayesian Filters in Python for more on smoothing.[4]

[4] https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/13-Smoothing.ipynb

9.8 Extended Kalman filter


In this book, we have covered the Kalman filter, which is the optimal unbiased estimator
for linear systems. One method for extending it to nonlinear systems is the extended
Kalman filter.
The extended Kalman filter (EKF) linearizes the dynamics and measurement models
during the prediction and correction steps respectively. Then, the linear Kalman filter
equations are used to compute the error covariance matrix P and Kalman gain matrix
K.
Theorem 9.8.1 shows the predict and update steps for an extended Kalman filter at the
k th timestep.

Theorem 9.8.1 — Extended Kalman filter.

Predict step

$$A = \left.\frac{\partial f(x, u)}{\partial x}\right|_{\hat{x}_k^+, u_k} \quad \text{(linearize } f(x, u)\text{)} \quad (9.48)$$
$$A_k = e^{AT} \quad \text{(discretize } A\text{)} \quad (9.49)$$
$$\hat{x}_{k+1}^- = \text{RK4}(f, \hat{x}_k^+, u_k, T) \quad \text{(numerical integration)} \quad (9.50)$$
$$P_{k+1}^- = A_k P_k^+ A_k^T + Q_k \quad (9.51)$$

Update step

$$C_{k+1} = \left.\frac{\partial h(x, u)}{\partial x}\right|_{\hat{x}_{k+1}^-, u_{k+1}} \quad \text{(linearize } h(x, u)\text{)} \quad (9.52)$$
$$K_{k+1} = P_{k+1}^- C_{k+1}^T (C_{k+1} P_{k+1}^- C_{k+1}^T + R_{k+1})^{-1} \quad (9.53)$$
$$\hat{x}_{k+1}^+ = \hat{x}_{k+1}^- + K_{k+1}(y_{k+1} - h(\hat{x}_{k+1}^-, u_{k+1})) \quad (9.54)$$
$$P_{k+1}^+ = (I - K_{k+1}C_{k+1})P_{k+1}^-(I - K_{k+1}C_{k+1})^T + K_{k+1}R_{k+1}K_{k+1}^T \quad (9.55)$$

f(x, u)  continuous dynamics model       x̂  state estimate vector
h(x, u)  measurement model               u   input vector
T        sample timestep duration        y   output vector
P        error covariance matrix         Q   process noise covariance
K        Kalman gain matrix              R   measurement noise covariance

where a superscript of minus denotes an a priori estimate and plus denotes an a posteriori
estimate (before and after the update respectively).

R To implement a discrete time extended Kalman filter from a continuous model,


the dynamics model can be numerically integrated via a method from section
7.9 and the continuous time Q and R matrices can be discretized using theorem
7.3.1.

Matrix   Rows × Columns       Matrix   Rows × Columns
A        states × states      x̂        states × 1
C        outputs × states     u        inputs × 1
P        states × states      y        outputs × 1
K        states × outputs     Q        states × states
R        outputs × outputs

Table 9.3: Extended Kalman filter matrix dimensions
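A minimal sketch of one EKF iteration per theorem 9.8.1 is below, using finite-difference Jacobians and RK4. f(x, u) is the continuous dynamics model and h(x, u) the measurement model; everything else here (function names, step sizes) is an assumption of this sketch, not a library API.

"""Sketch of one EKF predict/update iteration."""

import numpy as np
from scipy.linalg import expm


def numerical_jacobian(func, x, u, eps=1e-5):
    """Jacobian of func with respect to x, evaluated at (x, u)."""
    y0 = func(x, u)
    J = np.zeros((y0.shape[0], x.shape[0]))
    for i in range(x.shape[0]):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (func(x + dx, u) - y0) / eps
    return J


def rk4(f, x, u, T):
    """Fourth-order Runge-Kutta integration of f over one sample period T."""
    k1 = f(x, u)
    k2 = f(x + T / 2 * k1, u)
    k3 = f(x + T / 2 * k2, u)
    k4 = f(x + T * k3, u)
    return x + T / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def ekf_step(f, h, x, P, u, y_next, u_next, Q, R, T):
    # Predict: linearize and discretize f around the a posteriori estimate
    A = numerical_jacobian(f, x, u)
    A_d = expm(A * T)
    x_pred = rk4(f, x, u, T)
    P_pred = A_d @ P @ A_d.T + Q

    # Update: linearize h around the a priori estimate
    C = numerical_jacobian(h, x_pred, u_next)
    S = C @ P_pred @ C.T + R
    K = np.linalg.solve(S.T, C @ P_pred.T).T
    x_new = x_pred + K @ (y_next - h(x_pred, u_next))
    I_KC = np.eye(P.shape[0]) - K @ C
    P_new = I_KC @ P_pred @ I_KC.T + K @ R @ K.T  # Joseph form, equation (9.55)
    return x_new, P_new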

9.9 Unscented Kalman filter


In this book, we have covered the Kalman filter, which is the optimal unbiased estimator
for linear systems. One method for extending it to nonlinear systems is the unscented
Kalman filter.
The unscented Kalman filter (UKF) propagates carefully chosen points called sigma
points through the nonlinear state and measurement models to obtain estimates of the
true covariances (as opposed to linearized versions of them). We recommend reading
Roger Labbe’s book Kalman and Bayesian Filters in Python for more on UKFs.[5]
The original paper on the UKF is also an option [6]. See also the equations for van der
Merwe’s sigma point algorithm [22]. Here’s a paper on a quaternion-based Unscented
Kalman filter for orientation tracking [8].
Here’s an interview about the origin of the UKF with its creator [21].

9.9.1 Square-root UKF


The UKF uses a matrix square root (Cholesky decomposition) to compute the scaling
for the sigma points. This operation can introduce numerical instability. The square-
root UKF avoids this by propagating the square-root of the error covariance directly
instead of the error covariance; the Cholesky decomposition is replaced with a QR
decomposition and a Cholesky rank-1 update.
See the original paper [14] and this implementation tutorial [19].

[5] https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/10-Unscented-Kalman-Filter.ipynb

9.10 Multiple model adaptive estimation


Multiple model adaptive estimation (MMAE) runs multiple Kalman filters with dif-
ferent models on the same data. The Kalman filter with the lowest residual has the
highest likelihood of accurately reflecting reality. This can be used to detect certain
system states like an aircraft engine failing without needing to invest in costly sensors
to determine this directly.
For example, say you have three Kalman filters: one for turning left, one for turning
right, and one for going straight. If the control input is attempting to fly the plane
straight and the Kalman filter for going left has the lowest residual, the aircraft’s left
engine probably failed.
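A minimal sketch of the residual-comparison step is below: each filter is scored by the Gaussian likelihood of its residual under its own innovation covariance. The filter objects and their fields are placeholders for whatever your implementation exposes, not a real API.

"""Sketch: scoring a bank of Kalman filters for MMAE."""

import numpy as np


def mmae_likelihoods(filters, y):
    """filters: list of Kalman filters, each exposing x_prior, P_prior, C, R.
    y: the measurement shared by every filter.
    Returns normalized likelihoods, one per filter.
    """
    likelihoods = []
    for f in filters:
        residual = y - f.C @ f.x_prior
        S = f.C @ f.P_prior @ f.C.T + f.R  # innovation covariance
        n = residual.shape[0]
        exponent = -0.5 * residual.T @ np.linalg.solve(S, residual)
        norm = 1.0 / np.sqrt((2 * np.pi) ** n * np.linalg.det(S))
        likelihoods.append(float(norm * np.exp(exponent)))
    likelihoods = np.array(likelihoods)
    return likelihoods / likelihoods.sum()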
See Roger Labbe’s book Kalman and Bayesian Filters in Python for more on MMAE.[6]

[6] MMAE section of https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/14-Adaptive-Filtering.ipynb
Road next to Stevenson Academic building at UCSC

10. Pose estimation

Pose is defined as the position and orientation of an agent (a system with a controller).
The plant usually includes the pose in its state vector. We’ll cover several methods for
estimating an agent’s pose from local measurements such as encoders and gyroscope
heading.

10.1 Forward Euler integration


The simplest way to perform pose estimation via dead reckoning (that is, no direct mea-
surements of the pose are used) is to integrate the velocity in each orthogonal direction
over time. In two dimensions, one could use

xk+1 = xk + vk cos θk T
yk+1 = yk + vk sin θk T
θk+1 = θgyro,k+1

where T is the sample period. This odometry approach assumes that the robot follows
a straight path between samples (that is, ω = 0 at all but the sample times).
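A minimal sketch of this update is below. v is the measured chassis speed over the sample period and theta_gyro is the latest gyro heading; the names are ours, not a particular library's.

"""Sketch of a forward Euler odometry update."""

import math


def forward_euler_update(x, y, theta, v, theta_gyro, T):
    x += v * math.cos(theta) * T
    y += v * math.sin(theta) * T
    theta = theta_gyro
    return x, y, theta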

10.2 Pose exponential


We can obtain a more accurate approximation of the pose by including first-order dy-
namics for the heading θ. To provide a rationale for the math we’re about to do, we

need to cover some aspects of group theory.

10.2.1 What is a group?


In mathematics, a group is a set equipped with a binary operation (an operation with
two arguments) that combines any two elements (of the set) to form a third element in
such a way that four conditions called group axioms are satisfied: closure, associativity,
identity, and invertibility.
Closure means that the result is in the same set as the arguments.
Associativity means that within an expression containing two or more occurrences in a
row of the same associative operator, the order in which the operations are performed
does not matter as long as the sequence of the operands is not changed. In other words,
different groupings of the operations produce the same result.
Identity, or an identity element, is a special type of element of a set with respect to
a binary operation on that set, which leaves any element of the set unchanged when
combined with it. For example, the additive identity of the set of integers is zero,
which means that any integer summed with zero produces that integer.
Invertibility means there is an element that can “undo” the effect of combination with
another given element. For integers and the addition operator, the inverse element
would be the negation.

10.2.2 What is a pose?


To develop what a pose is in group theory, we need to define a few key groups. SO(2) is the special orthogonal group in dimension 2. Its elements represent 2D rotations.
SE(2) is the special Euclidean group in dimension 2. Its elements represent a 2D rotation combined with a 2D translation, which we call a 2D pose. In other words, a pose is an element of SE(2).

10.2.3 What is a twist?


A 2D twist is an element of the tangent space of SE(2) (like the tangential distance
traveled by the robot along an arc in SE(2)). We use the “pose exponential” to map a
twist (an element of the tangent space) to an element of SE(2). In other words, we map
a twist to a pose.
We call it a pose exponential because it's an exponential map onto a pose. The term exponential is used because the exponential function is the solution to a differential equation whose derivative is proportional to the value itself. For example, $\frac{dx}{dt} = ax$ has the solution $x = x_0 e^{at}$.

We use the pose exponential to take encoder measurement deltas and gyro angle deltas

(which are in the tangent space and are thus a twist) and turn them into a change in
pose. This gets added to the pose from the last update.

10.2.4 Derivation
We can obtain a more accurate approximation of the pose than Euler integration by
including first-order dynamics for the heading θ.

$$\mathbf{x} = \begin{bmatrix} x \\ y \\ \theta \end{bmatrix}$$

vx, vy, and ω are the x velocity, y velocity, and angular velocity of the robot within its
local coordinate frame, which will be treated as constants over one sample period.

R There are two coordinate frames used here: robot and global. A superscript
on the left side of a matrix denotes the coordinate frame in which that matrix
is represented. The robot’s coordinate frame is denoted by R and the global
coordinate frame is denoted by G.

In the robot frame (the tangent space)

$${}^R dx = {}^R v_x\,dt$$
$${}^R dy = {}^R v_y\,dt$$
$${}^R d\theta = {}^R \omega\,dt$$

To transform this into the global frame SE(2), we apply a counterclockwise rotation
matrix where θ changes over time.

$${}^G\!\begin{bmatrix} dx \\ dy \\ d\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta(t) & -\sin\theta(t) & 0 \\ \sin\theta(t) & \cos\theta(t) & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} dt$$

$${}^G\!\begin{bmatrix} dx \\ dy \\ d\theta \end{bmatrix} =
\begin{bmatrix} \cos\omega t & -\sin\omega t & 0 \\ \sin\omega t & \cos\omega t & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} dt$$

Now, integrate the matrix equation (matrices are integrated element-wise). This derivation
heavily utilizes the integration method described in section 4.3.

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\left.\begin{bmatrix} \frac{\sin\omega t}{\omega} & \frac{\cos\omega t}{\omega} & 0 \\ -\frac{\cos\omega t}{\omega} & \frac{\sin\omega t}{\omega} & 0 \\ 0 & 0 & t \end{bmatrix}\right|_0^t
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix}$$

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \frac{\sin\omega t}{\omega} & \frac{\cos\omega t - 1}{\omega} & 0 \\ \frac{1 - \cos\omega t}{\omega} & \frac{\sin\omega t}{\omega} & 0 \\ 0 & 0 & t \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix}$$

This equation assumes a starting orientation of θ = 0. For nonzero starting orientations,
we can apply a counterclockwise rotation by θ.

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\omega t}{\omega} & \frac{\cos\omega t - 1}{\omega} & 0 \\ \frac{1 - \cos\omega t}{\omega} & \frac{\sin\omega t}{\omega} & 0 \\ 0 & 0 & t \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} \quad (10.1)$$

R Control system implementations will generally have a model update and a con-
troller update in a given iteration. Equation (10.1) (the model update) uses the
current velocity to advance the state to the next timestep (into the future). Since
controllers use the current state, the controller update should be run before the
model update.

If we factor out a t, we can use the change in pose between updates instead of velocities.

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\omega t}{\omega} & \frac{\cos\omega t - 1}{\omega} & 0 \\ \frac{1 - \cos\omega t}{\omega} & \frac{\sin\omega t}{\omega} & 0 \\ 0 & 0 & t \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix}$$

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\omega t}{\omega t} & \frac{\cos\omega t - 1}{\omega t} & 0 \\ \frac{1 - \cos\omega t}{\omega t} & \frac{\sin\omega t}{\omega t} & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} t$$

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\omega t}{\omega t} & \frac{\cos\omega t - 1}{\omega t} & 0 \\ \frac{1 - \cos\omega t}{\omega t} & \frac{\sin\omega t}{\omega t} & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} v_x t \\ v_y t \\ \omega t \end{bmatrix}$$

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\Delta\theta}{\Delta\theta} & \frac{\cos\Delta\theta - 1}{\Delta\theta} & 0 \\ \frac{1 - \cos\Delta\theta}{\Delta\theta} & \frac{\sin\Delta\theta}{\Delta\theta} & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} \quad (10.2)$$

R The vector $[\Delta x \;\; \Delta y \;\; \Delta\theta]^T$ is a twist because it's an element of the tangent
space (the robot's local coordinate frame).

R Control system implementations will generally have a model update and a controller
update in a given iteration. Equation (10.2) (the model update) uses local distance and
heading deltas between the previous and current timestep, so it advances the state to
the current timestep. Since controllers use the current state, the controller update
should be run after the model update.

When the robot is traveling on a straight trajectory (∆θ = 0), some expressions within
the equation above are indeterminate. We can approximate these with Taylor series
expansions.

$$\frac{\sin\Delta\theta}{\Delta\theta} = 1 - \frac{\Delta\theta^2}{6} + \ldots \approx 1 - \frac{\Delta\theta^2}{6}$$
$$\frac{\cos\Delta\theta - 1}{\Delta\theta} = -\frac{\Delta\theta}{2} + \frac{\Delta\theta^3}{24} - \ldots \approx -\frac{\Delta\theta}{2}$$
$$\frac{1 - \cos\Delta\theta}{\Delta\theta} = \frac{\Delta\theta}{2} - \frac{\Delta\theta^3}{24} + \ldots \approx \frac{\Delta\theta}{2}$$

Theorem 10.2.1 — Pose exponential.

$${}^G\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{\sin\Delta\theta}{\Delta\theta} & \frac{\cos\Delta\theta - 1}{\Delta\theta} & 0 \\ \frac{1 - \cos\Delta\theta}{\Delta\theta} & \frac{\sin\Delta\theta}{\Delta\theta} & 0 \\ 0 & 0 & 1 \end{bmatrix}
{}^R\!\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix} \quad (10.3)$$

where G denotes the global coordinate frame and R denotes the robot's coordinate frame.
For sufficiently small ∆θ:

$$\frac{\sin\Delta\theta}{\Delta\theta} = 1 - \frac{\Delta\theta^2}{6} \qquad
\frac{\cos\Delta\theta - 1}{\Delta\theta} = -\frac{\Delta\theta}{2} \qquad
\frac{1 - \cos\Delta\theta}{\Delta\theta} = \frac{\Delta\theta}{2} \quad (10.4)$$

∆x  change in pose's x    ∆y  change in pose's y
∆θ  change in pose's θ    θ   starting angle in the global coordinate frame

This change in pose can be added directly to the previous pose estimate to update it.
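A minimal sketch of a pose exponential odometry update per theorem 10.2.1 is below. dx, dy, and dtheta are the twist components measured in the robot frame since the last update (e.g., encoder and gyro deltas), and theta is the previous global heading; the function and variable names are ours, not WPILib's.

"""Sketch of a pose exponential odometry update."""

import math


def pose_exp_update(x, y, theta, dx, dy, dtheta):
    if abs(dtheta) < 1e-9:
        # Small-angle approximations from equation (10.4)
        s = 1.0 - dtheta**2 / 6.0
        c = -dtheta / 2.0
    else:
        s = math.sin(dtheta) / dtheta
        c = (math.cos(dtheta) - 1.0) / dtheta

    # Apply the middle matrix of equation (10.3) to the robot-frame twist
    dx_tangent = s * dx + c * dy
    dy_tangent = -c * dx + s * dy

    # Rotate the result into the global frame by the starting heading
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    dx_global = cos_t * dx_tangent - sin_t * dy_tangent
    dy_global = sin_t * dx_tangent + cos_t * dy_tangent

    return x + dx_global, y + dy_global, theta + dtheta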

Figures 10.1 through 10.4 show the pose estimation errors of forward Euler odometry
and pose exponential odometry for a feedforward S-curve trajectory (dt = 20 ms).
The highest errors for the 8.84 m by 5.0 m trajectory are 3.415 cm in x, 0.158 cm in
y, and −0.644 deg in heading. The difference would be even more noticeable for paths
with higher curvatures and longer durations. The error returns to near zero in this case
because the curvature is symmetric, so the second half cancels the error accrued in the
first half.

Figure 10.1: Pose estimation comparison (y vs x)
Figure 10.2: Pose estimation comparison (x error vs time)
Figure 10.3: Pose estimation comparison (y error vs time)
Figure 10.4: Pose estimation comparison (heading error vs time)

Using a smaller update period somewhat mitigates the forward Euler pose estimation
error. However, there are bigger sources of error, like turning scrub on skid steer robots,
that should be dealt with before odometry numerical accuracy.

10.2.5 Lie groups


While we avoided the topic in our explanation, pose is what's known as a Lie group (a
group that is also a differentiable manifold). There are a lot of mathematical and controls
results developed around Lie groups, so we're mentioning the connection here in case
you want to search the Internet for more information.

10.3 Pose correction


The previous methods for pose estimation have assumed no direct pose measurements
are available. At best, only heading is available. To augment any of them with cor-
rections from full pose measurements, an Extended Kalman filter (section 9.8) or Un-
scented Kalman filter (section 9.9) can be used.
IV System modeling

11 Dynamics
12 Newtonian mechanics examples
13 Lagrangian mechanics examples
14 System identification
Hills by northbound freeway between Santa Maria and Ventura

11. Dynamics

11.1 Linear motion

$$\sum F = ma$$

where $\sum F$ is the sum of all forces applied to an object in Newtons, m is the mass of
the object in kg, and a is the net acceleration of the object in m/s².

$$x(t) = x_0 + v_0 t + \frac{1}{2}at^2$$

where x(t) is an object's position at time t, x0 is the initial position, v0 is the initial
velocity, and a is the acceleration.

11.2 Angular motion

$$\sum \tau = I\alpha$$

where $\sum \tau$ is the sum of all torques applied to an object in Newton-meters, I is the
moment of inertia of the object in kg-m² (also called the rotational mass), and α is the
net angular acceleration of the object in rad/s².

$$\theta(t) = \theta_0 + \omega_0 t + \frac{1}{2}\alpha t^2$$

where θ(t) is an object’s angle at time t, θ0 is the initial angle, ω0 is the initial angular
velocity, and α is the angular acceleration.

11.3 Vectors
Vectors are quantities with a magnitude and a direction. Vectors in three-dimensional
space have a coordinate for each spatial direction x, y, and z. Let’s take the vector
⃗a = ⟨1, 2, 3⟩. ⃗a is a three-dimensional vector that describes a movement 1 unit in the
x direction, 2 units in the y direction, and 3 units in the z direction.
We define î, ĵ, and k̂ as vectors that represent the fundamental movements one can
make in three-dimensional space: 1 unit of movement in the x direction, 1 unit of
movement in the y direction, and 1 unit of movement in the z direction respectively.
These three vectors form a basis of three-dimensional space because copies of them can
be added together to reach any point in three-dimensional space.

î = ⟨1, 0, 0⟩
ĵ = ⟨0, 1, 0⟩
k̂ = ⟨0, 0, 1⟩

We can also write the vector ⃗a in terms of these basis vectors.

⃗a = 1î + 2ĵ + 3k̂

11.3.1 Basic vector operations


We will now show this is equivalent to the original notation through some vector math-
ematics. First, we’ll substitute in the values for the basis vectors.

⃗a = 1⟨1, 0, 0⟩ + 2⟨0, 1, 0⟩ + 3⟨0, 0, 1⟩

Scalars are multiplied component-wise with vectors.

⃗a = ⟨1, 0, 0⟩ + ⟨0, 2, 0⟩ + ⟨0, 0, 3⟩

Vectors are added by summing each of their components.

⃗a = ⟨1, 2, 3⟩

11.3.2 Cross product


The cross product is denoted by ×. The cross product of the basis vectors are computed
as follows.

î × ĵ = k̂
ĵ × k̂ = î
k̂ × î = ĵ

They proceed in a cyclic fashion through i, j, and k. If a vector is crossed with itself, it
produces the zero vector (a scalar zero for each coordinate). The cross products of the
basis vectors in the opposite order progress backwards and include a negative sign.

î × k̂ = −ĵ
k̂ × ĵ = −î
ĵ × î = −k̂

Given vectors ⃗u = aî + bĵ + ck̂ and ⃗v = dî + eĵ + f k̂, ⃗u × ⃗v is computed using the
distributive property.

⃗u × ⃗v = (aî + bĵ + ck̂) × (dî + eĵ + f k̂)


⃗u × ⃗v = ad(î × î) + ae(î × ĵ) + af (î × k̂)
+ bd(ĵ × î) + be(ĵ × ĵ)
+ bf (ĵ × k̂)
+ cd(k̂ × î) + ce(k̂ × ĵ) + cf (k̂ × k̂)
⃗u × ⃗v = aek̂ + af (−ĵ) + bd(−k̂) + bf î + cdĵ + ce(−î)
⃗u × ⃗v = aek̂ − af ĵ − bdk̂ + bf î + cdĵ − ceî
⃗u × ⃗v = (bf − ce)î + (cd − af )ĵ + (ae − bd)k̂

11.4 Curvilinear motion


Curvilinear motion describes the motion of an object along a fixed curve. This motion
has both linear and angular components. For derivations involving curvilinear motion,
we’ll assume positive x (î) is forward, positive y (ĵ) is to the left, positive z (k̂) is up, and
the robot is facing in the x direction. This axes convention is known as North-West-Up
(NWU), and is shown in figure 11.1.


Figure 11.1: 2D projection of North-West-Up (NWU) axes convention. The positive


z-axis is pointed out of the page toward the reader.

The main equation we’ll need is the following.

⃗vB = ⃗vA + ωA × ⃗rB|A

where ⃗vB is the velocity vector at point B, ⃗vA is the velocity vector at point A, ωA is
the angular velocity vector at point A, and ⃗rB|A is the distance vector from point A to
point B (also described as the “distance to B relative to A”).

11.5 Differential drive kinematics


A differential drive has two wheels, one on each side, separated by some distance 2rb .
The forces they generate when moving forward are shown in figure 11.2.


Figure 11.2: Differential drive free body diagram



11.5.1 Inverse kinematics


The mapping from v and ω to the left and right wheel velocities vl and vr is derived as
follows. Let ⃗vc be the velocity vector of the center of rotation, ⃗vl be the velocity vector
of the left wheel, ⃗vr be the velocity vector of the right wheel, rb is the distance from
the center of rotation to each wheel, and ω is the counterclockwise turning rate around
the center of rotation.
Once we have the vector equation representing the wheel’s velocity, we’ll project it onto
the wheel direction vector using the dot product.
First, we’ll derive vl .

⃗vl = vc î + ω k̂ × rb ĵ
⃗vl = vc î − ωrb î
⃗vl = (vc − ωrb )î

Now, project this vector onto the left wheel, which is pointed in the î direction.


vl = (vc − ωrb )î ·
∥î∥

The magnitude of î is 1, so the denominator cancels.

vl = (vc − ωrb )î · î


vl = vc − ωrb (11.1)

Next, we’ll derive vr .

⃗vr = vc î + ω k̂ × rb ĵ
⃗vr = vc î + ωrb î
⃗vr = (vc + ωrb )î

Now, project this vector onto the right wheel, which is pointed in the î direction.


vr = (vc + ωrb )î ·
∥î∥

The magnitude of î is 1, so the denominator cancels.

vr = (vc + ωrb )î · î


192 Chapter 11. Dynamics

vr = vc + ωrb (11.2)

So the two inverse kinematic equations are as follows.


vl = vc − ωrb (11.3)
vr = vc + ωrb (11.4)

11.5.2 Forward kinematics


The mapping from the left and right wheel velocities vl and vr to v and ω is derived as
follows.

$$v_r = v_c + \omega r_b$$
$$v_c = v_r - \omega r_b \quad (11.5)$$

Substitute equation (11.5) into the equation for vl.

$$v_l = v_c - \omega r_b$$
$$v_l = (v_r - \omega r_b) - \omega r_b$$
$$v_l = v_r - 2\omega r_b$$
$$2\omega r_b = v_r - v_l$$
$$\omega = \frac{v_r - v_l}{2r_b}$$

Substitute this back into equation (11.5).

$$v_c = v_r - \left(\frac{v_r - v_l}{2r_b}\right)r_b$$
$$v_c = v_r - \frac{v_r - v_l}{2}$$
$$v_c = v_r - \frac{v_r}{2} + \frac{v_l}{2}$$
$$v_c = \frac{v_r + v_l}{2}$$

So the two forward kinematic equations are as follows.

$$v_c = \frac{v_r + v_l}{2} \quad (11.6)$$
$$\omega = \frac{v_r - v_l}{2r_b} \quad (11.7)$$
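A minimal sketch of both mappings is below; rb is the distance from the center of rotation to each wheel, and the function names are ours.

"""Sketch of differential drive kinematics, equations (11.3)-(11.7)."""


def diff_drive_inverse_kinematics(v_c, omega, rb):
    """Chassis speed and turn rate -> left/right wheel velocities."""
    v_l = v_c - omega * rb
    v_r = v_c + omega * rb
    return v_l, v_r


def diff_drive_forward_kinematics(v_l, v_r, rb):
    """Left/right wheel velocities -> chassis speed and turn rate."""
    v_c = (v_r + v_l) / 2.0
    omega = (v_r - v_l) / (2.0 * rb)
    return v_c, omega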

11.6 Mecanum drive kinematics


A mecanum drive has four wheels, one on each corner of a rectangular chassis. The
wheels have rollers offset at 45 degrees (whether it’s clockwise or not varies per wheel).
The forces they generate when moving forward are shown in figure 11.3.


Figure 11.3: Mecanum drive free body diagram

Note that the velocity of the wheel is the same as the velocity in the diagram for the
purposes of feedback control. The rollers on the wheel redirect the velocity vector.

11.6.1 Inverse kinematics


First, we’ll derive the front-left wheel kinematics.

⃗vf l = vx î + vy ĵ + ω k̂ × (rf lx î + rf ly ĵ)


⃗vf l = vx î + vy ĵ + ωrf lx ĵ − ωrf ly î
⃗vf l = (vx − ωrf ly )î + (vy + ωrf lx )ĵ

Project the front-left wheel onto its wheel vector.

î − ĵ
vf l = ((vx − ωrf ly )î + (vy + ωrf lx )ĵ) · √
2
1
vf l = ((vx − ωrf ly ) − (vy + ωrf lx )) √
2
1
vf l = (vx − ωrf ly − vy − ωrf lx ) √
2
1
vf l = (vx − vy − ωrf ly − ωrf lx ) √
2

1
vf l = (vx − vy − ω(rf lx + rf ly )) √
2
1 1 1
vf l = √ vx − √ vy − √ ω(rf lx + rf ly ) (11.8)
2 2 2

Next, we’ll derive the front-right wheel kinematics.

⃗vf r = vx î + vy ĵ + ω k̂ × (rf rx î + rf ry ĵ)


⃗vf r = vx î + vy ĵ + ωrf rx ĵ − ωrf ry î
⃗vf r = (vx − ωrf ry )î + (vy + ωrf rx )ĵ

Project the front-right wheel onto its wheel vector.


1
vf r = ((vx − ωrf ry )î + (vy + ωrf rx )ĵ) · (î + ĵ) √
2
1
vf r = ((vx − ωrf ry ) + (vy + ωrf rx )) √
2
1
vf r = (vx − ωrf ry + vy + ωrf rx ) √
2
1
vf r = (vx + vy − ωrf ry + ωrf rx ) √
2
1
vf r = (vx + vy + ω(rf rx − rf ry )) √
2
1 1 1
vf r = √ vx + √ vy + √ ω(rf rx − rf ry ) (11.9)
2 2 2

Next, we’ll derive the rear-left wheel kinematics.

⃗vrl = vx î + vy ĵ + ω k̂ × (rrlx î + rrly ĵ)


⃗vrl = vx î + vy ĵ + ωrrlx ĵ − ωrrly î
⃗vrl = (vx − ωrrly )î + (vy + ωrrlx )ĵ

Project the rear-left wheel onto its wheel vector.


1
vrl = ((vx − ωrrly )î + (vy + ωrrlx )ĵ) · (î + ĵ) √
2
1
vrl = ((vx − ωrrly ) + (vy + ωrrlx )) √
2

1
vrl = (vx − ωrrly + vy + ωrrlx ) √
2
1
vrl = (vx + vy − ωrrly + ωrrlx ) √
2
1
vrl = (vx + vy + ω(rrlx − rrly )) √
2
1 1 1
vrl = √ vx + √ vy + √ ω(rrlx − rrly ) (11.10)
2 2 2

Next, we’ll derive the rear-right wheel kinematics.

⃗vrr = vx î + vy ĵ + ω k̂ × (rrrx î + rrry ĵ)


⃗vrr = vx î + vy ĵ + ωrrrx ĵ − ωrrry î
⃗vrr = (vx − ωrrry )î + (vy + ωrrrx )ĵ

Project the rear-right wheel onto its wheel vector.

î − ĵ
vrr = ((vx − ωrrry )î + (vy + ωrrrx )ĵ) · √
2
1
vrr = ((vx − ωrrry ) − (vy + ωrrrx )) √
2
1
vrr = (vx − ωrrry − vy − ωrrrx ) √
2
1
vrr = (vx − vy − ωrrry − ωrrrx ) √
2
1
vrr = (vx − vy − ω(rrrx + rrry )) √
2
1 1 1
vrr = √ vx − √ vy − √ ω(rrrx + rrry ) (11.11)
2 2 2

This gives the following inverse kinematic equations.

$$v_{fl} = \frac{1}{\sqrt{2}}v_x - \frac{1}{\sqrt{2}}v_y - \frac{1}{\sqrt{2}}\omega(r_{fl_x} + r_{fl_y})$$
$$v_{fr} = \frac{1}{\sqrt{2}}v_x + \frac{1}{\sqrt{2}}v_y + \frac{1}{\sqrt{2}}\omega(r_{fr_x} - r_{fr_y})$$
$$v_{rl} = \frac{1}{\sqrt{2}}v_x + \frac{1}{\sqrt{2}}v_y + \frac{1}{\sqrt{2}}\omega(r_{rl_x} - r_{rl_y})$$
$$v_{rr} = \frac{1}{\sqrt{2}}v_x - \frac{1}{\sqrt{2}}v_y - \frac{1}{\sqrt{2}}\omega(r_{rr_x} + r_{rr_y})$$

Now we'll factor them out into matrices.

$$\begin{bmatrix} v_{fl} \\ v_{fr} \\ v_{rl} \\ v_{rr} \end{bmatrix} =
\begin{bmatrix}
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}(r_{fl_x} + r_{fl_y}) \\
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}(r_{fr_x} - r_{fr_y}) \\
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}(r_{rl_x} - r_{rl_y}) \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}(r_{rr_x} + r_{rr_y})
\end{bmatrix}
\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix}$$

$$\begin{bmatrix} v_{fl} \\ v_{fr} \\ v_{rl} \\ v_{rr} \end{bmatrix} =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & -1 & -(r_{fl_x} + r_{fl_y}) \\
1 & 1 & r_{fr_x} - r_{fr_y} \\
1 & 1 & r_{rl_x} - r_{rl_y} \\
1 & -1 & -(r_{rr_x} + r_{rr_y})
\end{bmatrix}
\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} \quad (11.12)$$

11.6.2 Forward kinematics


Let M be the 4 × 3 inverse kinematics matrix above, including the 1/√2 factor. The
forward kinematics are

$$\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} = M^+
\begin{bmatrix} v_{fl} \\ v_{fr} \\ v_{rl} \\ v_{rr} \end{bmatrix} \quad (11.13)$$

where M+ is the pseudoinverse of M.
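A minimal sketch of both directions is below. The wheel positions are each wheel's (x, y) location relative to the center of rotation; the numbers and function name are placeholders.

"""Sketch of mecanum kinematics via equations (11.12) and (11.13)."""

import numpy as np


def mecanum_ik_matrix(r_fl, r_fr, r_rl, r_rr):
    """Each r is an (x, y) wheel position. Returns the 4x3 matrix M."""
    return (1.0 / np.sqrt(2.0)) * np.array(
        [
            [1.0, -1.0, -(r_fl[0] + r_fl[1])],
            [1.0, 1.0, r_fr[0] - r_fr[1]],
            [1.0, 1.0, r_rl[0] - r_rl[1]],
            [1.0, -1.0, -(r_rr[0] + r_rr[1])],
        ]
    )


M = mecanum_ik_matrix((0.3, 0.3), (0.3, -0.3), (-0.3, 0.3), (-0.3, -0.3))

# Inverse kinematics: chassis velocities [vx, vy, omega] -> wheel velocities
wheel_vels = M @ np.array([1.0, 0.5, 0.2])

# Forward kinematics: wheel velocities -> chassis velocities via the pseudoinverse
chassis_vels = np.linalg.pinv(M) @ wheel_vels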

11.7 Swerve drive kinematics


A swerve drive has an arbitrary number of wheels which can rotate in place independent
of the chassis. The forces they generate are shown in figure 11.4.

11.7.1 Inverse kinematics

⃗ robot × ⃗rrobot2wheel1
⃗vwheel1 = ⃗vrobot + ω
⃗ robot × ⃗rrobot2wheel2
⃗vwheel2 = ⃗vrobot + ω
⃗ robot × ⃗rrobot2wheel3
⃗vwheel3 = ⃗vrobot + ω
⃗ robot × ⃗rrobot2wheel4
⃗vwheel4 = ⃗vrobot + ω

where ⃗vwheel is the wheel velocity vector, ⃗vrobot is the robot velocity vector, ω
⃗ robot
is the robot angular velocity vector, ⃗rrobot2wheel is the displacement vector from the


Figure 11.4: Swerve drive free body diagram

robot’s center of rotation to the wheel,[1] ⃗vrobot = vx î + vy ĵ, and ⃗rrobot2wheel =


rx î + ry ĵ. The number suffixes denote a specific wheel in figure 11.4.

⃗v1 = vx î + vy ĵ + ω k̂ × (r1x î + r1y ĵ)


⃗v2 = vx î + vy ĵ + ω k̂ × (r2x î + r2y ĵ)
⃗v3 = vx î + vy ĵ + ω k̂ × (r3x î + r3y ĵ)
⃗v4 = vx î + vy ĵ + ω k̂ × (r4x î + r4y ĵ)

⃗v1 = vx î + vy ĵ + (ωr1x ĵ − ωr1y î)


⃗v2 = vx î + vy ĵ + (ωr2x ĵ − ωr2y î)
⃗v3 = vx î + vy ĵ + (ωr3x ĵ − ωr3y î)
⃗v4 = vx î + vy ĵ + (ωr4x ĵ − ωr4y î)

⃗v1 = vx î + vy ĵ + ωr1x ĵ − ωr1y î


⃗v2 = vx î + vy ĵ + ωr2x ĵ − ωr2y î
⃗v3 = vx î + vy ĵ + ωr3x ĵ − ωr3y î
⃗v4 = vx î + vy ĵ + ωr4x ĵ − ωr4y î

⃗v1 = vx î − ωr1y î + vy ĵ + ωr1x ĵ


⃗v2 = vx î − ωr2y î + vy ĵ + ωr2x ĵ
[1] The robot’s center of rotation need not coincide with the robot’s geometric center.

⃗v3 = vx î − ωr3y î + vy ĵ + ωr3x ĵ


⃗v4 = vx î − ωr4y î + vy ĵ + ωr4x ĵ

⃗v1 = (vx − ωr1y )î + (vy + ωr1x )ĵ


⃗v2 = (vx − ωr2y )î + (vy + ωr2x )ĵ
⃗v3 = (vx − ωr3y )î + (vy + ωr3x )ĵ
⃗v4 = (vx − ωr4y )î + (vy + ωr4x )ĵ

Separate the i-hat components into independent equations.

v1x = vx − ωr1y
v2x = vx − ωr2y
v3x = vx − ωr3y
v4x = vx − ωr4y

Separate the j-hat components into independent equations.

v1y = vy + ωr1x
v2y = vy + ωr2x
v3y = vy + ωr3x
v4y = vy + ωr4x

Now we’ll factor them out into matrices.


   
v1x 1 0 −r1y
v2x  1 0 −r2y 
   
v3x  1 0 −r3y   
    vx
v4x  1 0 −r4y 
 =   vy 
v1y  0 1 r1x 
    ω
v2y  0 1 r2x 
   
v3y  0 1 r3x 
v4y 0 1 r4x
11.7 Swerve drive kinematics 199

Rearrange the rows so the x and y components are in adjacent rows.


   
v1x 1 0 −r1y
v1y  0 1 r1x 
   
v2x  1 0 −r2y   
   
v2y  0 1 r2x  vx
 =   (11.14)
v3x  1 0 −r3y  vy
   
v3y  0 1 r3x  ω
   
v4x  1 0 −r4y 
v4y 0 1 r4x

To convert the swerve module x and y velocity components to a velocity and head-
ing, use the Pythagorean theorem and arctangent respectively. Here’s an example for
module 1.
q
v1 = v1x 2 + v2 (11.15)
1y
 
v1y
θ1 = tan−1 (11.16)
v1x

11.7.2 Forward kinematics


Let M be the 8 × 3 inverse kinematics matrix from equation (11.14). The forward
kinematics are

$$\begin{bmatrix} v_x \\ v_y \\ \omega \end{bmatrix} = M^+
\begin{bmatrix} v_{1x} \\ v_{1y} \\ v_{2x} \\ v_{2y} \\ v_{3x} \\ v_{3y} \\ v_{4x} \\ v_{4y} \end{bmatrix} \quad (11.17)$$

where M+ is the pseudoinverse of M.
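A minimal sketch of the swerve kinematics is below, including the module speed and heading conversion from equations (11.15) and (11.16). The module positions are relative to the center of rotation, and the names and numbers are placeholders.

"""Sketch of swerve kinematics via equations (11.14)-(11.17)."""

import numpy as np


def swerve_ik_matrix(module_positions):
    """module_positions: list of (x, y) module positions. Returns the 2n x 3 M."""
    rows = []
    for rx, ry in module_positions:
        rows.append([1.0, 0.0, -ry])
        rows.append([0.0, 1.0, rx])
    return np.array(rows)


M = swerve_ik_matrix([(0.3, 0.3), (0.3, -0.3), (-0.3, 0.3), (-0.3, -0.3)])

# Inverse kinematics: stacked (vx, vy) per module
v = M @ np.array([1.0, 0.5, 0.2])

# Module speed and heading per equations (11.15) and (11.16)
speeds = np.hypot(v[0::2], v[1::2])
angles = np.arctan2(v[1::2], v[0::2])

# Forward kinematics via the pseudoinverse
chassis_vels = np.linalg.pinv(M) @ v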


Hills by northbound freeway between Santa Maria and Ventura

12. Newtonian mechanics examples

A model is a set of differential equations describing how the system behaves over time.
There are two common approaches for developing them.
1. Collecting data on the physical system’s behavior and performing system identi-
fication with it.
2. Using physics to derive the system’s model from first principles.
This chapter covers the second approach using Newtonian mechanics.
The models derived here should cover most types of motion seen on an FRC robot.
Furthermore, they can be easily tweaked to describe many types of mechanisms just
by pattern-matching. There are only so many ways to hook up a mass to a motor in FRC.
The flywheel model can be used for spinning mechanisms, the elevator model can be
used for spinning mechanisms transformed to linear motion, and the single-jointed
arm model can be used for rotating servo mechanisms (it’s just the flywheel model
augmented with a position state).
These models assume all motor controllers driving DC brushed motors are set to brake
mode instead of coast mode. Brake mode behaves the same as coast mode except
where the applied voltage is zero. In brake mode, the motor leads are shorted together
to prevent movement. In coast mode, the motor leads are an open circuit.

12.1 DC brushed motor


We will be deriving a first-order model for a DC brushed motor. A second-order model
would include the inductance of the motor windings as well, but we’re assuming the
time constant of the inductor is small enough that its effect on the model behavior is
negligible for FRC use cases (see subsection 6.10.7 for a demonstration of this for a
real DC brushed motor).
The first-order model will only require numbers from the motor’s datasheet. The second-
order model would require measuring the motor inductance as well, which generally
isn’t in the datasheet. It can be difficult to measure accurately without the right equip-
ment.

12.1.1 Equations of motion


The circuit for a DC brushed motor is shown in figure 12.1.

Figure 12.1: DC brushed motor circuit (a voltage source V driving the motor resistance
in series with the back-EMF voltage Vemf = ω/Kv)

V is the voltage applied to the motor, I is the current through the motor in Amps, R
is the resistance across the motor in Ohms, ω is the angular velocity of the motor in
radians per second, and Kv is the angular velocity constant in radians per second per
Volt. This circuit reflects the following relation.

$$V = IR + \frac{\omega}{K_v} \quad (12.1)$$

The mechanical relation for a DC brushed motor is

τ = Kt I (12.2)

where τ is the torque produced by the motor in Newton-meters and Kt is the torque

Figure 12.2: Example motor datasheet for 775pro

constant in Newton-meters per Amp. Therefore

$$I = \frac{\tau}{K_t}$$

Substitute this into equation (12.1).

$$V = \frac{\tau}{K_t}R + \frac{\omega}{K_v} \quad (12.3)$$

12.1.2 Calculating constants


A typical motor’s datasheet should include graphs of the motor’s measured torque and
current for different angular velocities for a given voltage applied to the motor. Figure
12.2 is an example. Data for the most common motors in FRC can be found at https:
//motors.vex.com/.
Torque constant Kt

$$\tau = K_t I$$
$$K_t = \frac{\tau}{I}$$
$$K_t = \frac{\tau_{stall}}{I_{stall}} \quad (12.4)$$

where τstall is the stall torque and Istall is the stall current of the motor from its
datasheet.

Resistance R

Recall equation (12.1).

$$V = IR + \frac{\omega}{K_v}$$

When the motor is stalled, ω = 0.

$$V = I_{stall}R$$
$$R = \frac{V}{I_{stall}} \quad (12.5)$$

where Istall is the stall current of the motor and V is the voltage applied to the motor
at stall.

Angular velocity constant Kv

Recall equation (12.1).

$$V = IR + \frac{\omega}{K_v}$$
$$V - IR = \frac{\omega}{K_v}$$
$$K_v = \frac{\omega}{V - IR}$$

When the motor is spinning under no load,

$$K_v = \frac{\omega_{free}}{V - I_{free}R} \quad (12.6)$$

where ωfree is the angular velocity of the motor under no load (also known as the free
speed), V is the voltage applied to the motor when it's spinning at ωfree, and Ifree is
the current drawn by the motor under no load.
If several identical motors are being used in one gearbox for a mechanism, multiply the
stall torque by the number of motors.
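A minimal sketch of computing these constants from datasheet values is below. The numbers are roughly those of a 775pro at 12 V and are only illustrative; check your motor's datasheet for real values.

"""Sketch of equations (12.4)-(12.6) applied to datasheet numbers."""

import math

stall_torque = 0.71                              # N-m
stall_current = 134.0                            # A
free_speed = 18730.0 * 2.0 * math.pi / 60.0      # RPM converted to rad/s
free_current = 0.7                               # A
nominal_voltage = 12.0                           # V

Kt = stall_torque / stall_current                       # equation (12.4)
R = nominal_voltage / stall_current                     # equation (12.5)
Kv = free_speed / (nominal_voltage - free_current * R)  # equation (12.6)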

12.1.3 Current limiting


Current limiting of a DC brushed motor reduces the maximum input voltage to avoid
exceeding a current threshold. Predictive current limiting uses a projected estimate of

the current, so the voltage is reduced before the current threshold is exceeded. Reactive
current limiting uses an actual current measurement, so the voltage is reduced after the
current threshold is exceeded.
The following pseudocode demonstrates each type of current limiting.
# Normal feedback control
V = K @ (r - x)

# Calculations for predictive current limiting
omega = angular_velocity_measurement
I = V / R - omega / (Kv * R)

# Calculations for reactive current limiting
I = current_measurement
omega = Kv * V - I * R * Kv  # or can be the angular velocity measurement

# If the predicted/actual current is above the maximum, limit it by reducing voltage
if I > I_max:
    V = I_max * R + omega / Kv

Snippet 12.1. Limits current of DC motor to Imax

12.2 Elevator
This elevator consists of a DC brushed motor attached to a pulley that drives a mass up
or down.


Figure 12.3: Elevator system diagram

Gear ratios are written as output over input, so G is greater than one in figure 12.3.

12.2.1 Equations of motion


We want to derive an equation for the carriage acceleration a (the derivative of v) given an
input voltage V, which we can integrate to get carriage velocity and position.

First, we'll find a torque to substitute into the equation for a DC brushed motor. Based
on figure 12.3,

$$\tau_m G = \tau_p \quad (12.7)$$

where G is the gear ratio between the motor and the pulley and τp is the torque produced
by the pulley.

$$rF_m = \tau_p \quad (12.8)$$

where r is the radius of the pulley. Substitute equation (12.7) into τm in the DC brushed
motor equation (12.3).

$$V = \frac{\tau_p/G}{K_t}R + \frac{\omega_m}{K_v}$$
$$V = \frac{\tau_p}{GK_t}R + \frac{\omega_m}{K_v}$$

Substitute in equation (12.8) for τp.

$$V = \frac{rF_m}{GK_t}R + \frac{\omega_m}{K_v} \quad (12.9)$$

The angular velocity of the motor armature ωm is

$$\omega_m = G\omega_p \quad (12.10)$$

where ωp is the angular velocity of the pulley. The velocity of the mass (the elevator
carriage) is

$$v = r\omega_p$$
$$\omega_p = \frac{v}{r} \quad (12.11)$$

Substitute equation (12.11) into equation (12.10).

$$\omega_m = G\frac{v}{r} \quad (12.12)$$

Substitute equation (12.12) into equation (12.9) for ωm.

$$V = \frac{rF_m}{GK_t}R + \frac{G\frac{v}{r}}{K_v}$$
$$V = \frac{RrF_m}{GK_t} + \frac{G}{rK_v}v$$

Solve for Fm.

$$\frac{RrF_m}{GK_t} = V - \frac{G}{rK_v}v$$
$$F_m = \left(V - \frac{G}{rK_v}v\right)\frac{GK_t}{Rr}$$
$$F_m = \frac{GK_t}{Rr}V - \frac{G^2K_t}{Rr^2K_v}v \quad (12.13)$$

We need to find the acceleration of the elevator carriage. Note that

$$\sum F = ma \quad (12.14)$$

where $\sum F$ is the sum of forces applied to the elevator carriage, m is the mass of the
elevator carriage in kilograms, and a is the acceleration of the elevator carriage.

$$ma = F_m$$
$$ma = \frac{GK_t}{Rr}V - \frac{G^2K_t}{Rr^2K_v}v$$
$$a = \frac{GK_t}{Rrm}V - \frac{G^2K_t}{Rr^2mK_v}v$$
$$a = -\frac{G^2K_t}{Rr^2mK_v}v + \frac{GK_t}{Rrm}V \quad (12.15)$$

R Gravity is not part of the modeled dynamics because it complicates the state-
space model and the controller will behave well enough without it.

This model will be converted to state-space notation in section 6.9.

12.3 Flywheel
This flywheel consists of a DC brushed motor attached to a spinning mass of non-
negligible moment of inertia.
Gear ratios are written as output over input, so G is greater than one in figure 12.4.

Figure 12.4: Flywheel system diagram

12.3.1 Equations of motion


We want to derive an equation for the flywheel angular acceleration ω̇f given an input
voltage V, which we can integrate to get flywheel angular velocity.

We will start with the equation derived earlier for a DC brushed motor, equation (12.3).

$$V = \frac{\tau_m}{K_t}R + \frac{\omega_m}{K_v}$$

Solve for the angular acceleration. First, we'll rearrange the terms because, from
inspection, V is the model input, ωm is the state, and τm contains the angular
acceleration.

$$V = \frac{R}{K_t}\tau_m + \frac{1}{K_v}\omega_m$$

Solve for τm.

$$\frac{R}{K_t}\tau_m = V - \frac{1}{K_v}\omega_m$$
$$\tau_m = \frac{K_t}{R}V - \frac{K_t}{K_vR}\omega_m$$

Since τmG = τf and ωm = Gωf,

$$\frac{\tau_f}{G} = \frac{K_t}{R}V - \frac{K_t}{K_vR}(G\omega_f)$$
$$\frac{\tau_f}{G} = \frac{K_t}{R}V - \frac{GK_t}{K_vR}\omega_f$$
$$\tau_f = \frac{GK_t}{R}V - \frac{G^2K_t}{K_vR}\omega_f \quad (12.16)$$

The torque applied to the flywheel is defined as

$$\tau_f = J\dot{\omega}_f \quad (12.17)$$

where J is the moment of inertia of the flywheel and ω̇f is the angular acceleration.
Substitute equation (12.17) into equation (12.16).

$$J\dot{\omega}_f = \frac{GK_t}{R}V - \frac{G^2K_t}{K_vR}\omega_f$$
$$\dot{\omega}_f = \frac{GK_t}{RJ}V - \frac{G^2K_t}{K_vRJ}\omega_f$$
$$\dot{\omega}_f = -\frac{G^2K_t}{K_vRJ}\omega_f + \frac{GK_t}{RJ}V$$

We'll relabel ωf as ω for convenience.

$$\dot{\omega} = -\frac{G^2K_t}{K_vRJ}\omega + \frac{GK_t}{RJ}V \quad (12.18)$$

This model will be converted to state-space notation in section 6.10.

12.3.2 Calculating constants


Moment of inertia J

Given the simplicity of this mechanism, it may be easier to compute this value
theoretically using material properties in CAD. A procedure for measuring it
experimentally is presented below.

First, rearrange equation (12.18) into the form y = mx + b such that J is in the
numerator of m.

$$\dot{\omega} = -\frac{G^2K_t}{K_vRJ}\omega + \frac{GK_t}{RJ}V$$
$$J\dot{\omega} = -\frac{G^2K_t}{K_vR}\omega + \frac{GK_t}{R}V$$

Multiply by $\frac{K_vR}{G^2K_t}$ on both sides.

$$\frac{JK_vR}{G^2K_t}\dot{\omega} = -\omega + \frac{GK_t}{R}\cdot\frac{K_vR}{G^2K_t}V$$
$$\frac{JK_vR}{G^2K_t}\dot{\omega} = -\omega + \frac{K_v}{G}V$$
$$\omega = -\frac{JK_vR}{G^2K_t}\dot{\omega} + \frac{K_v}{G}V \quad (12.19)$$

The test procedure is as follows (a sketch of the regression appears after this list).

1. Run the flywheel forward at a constant voltage. Record the angular velocity over
   time.
2. Compute the angular acceleration from the angular velocity data as the difference
   between each sample divided by the time between them.
3. Perform a linear regression of angular velocity versus angular acceleration. The
   slope of this line has the form $-\frac{JK_vR}{G^2K_t}$ as per equation (12.19).
4. Multiply the slope by $-\frac{G^2K_t}{K_vR}$ to obtain a least squares estimate of J.
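The following sketch implements that procedure. The angular velocity data here is synthetic, and the gearbox and motor constants are arbitrary examples; substitute your logged data and your mechanism's constants.

"""Sketch of the flywheel moment of inertia regression."""

import numpy as np

# Placeholder data: replace with logged angular velocity at a constant voltage
T = 0.02
t = np.arange(0.0, 2.0, T)
omega = 500.0 * (1.0 - np.exp(-t / 0.3))  # synthetic first-order response

# Step 2: numerically differentiate to get angular acceleration
alpha = np.diff(omega) / T

# Step 3: linear regression of angular velocity versus angular acceleration
slope, intercept = np.polyfit(alpha, omega[:-1], 1)

# Step 4: the slope is -J*Kv*R/(G^2*Kt), so solve for J
G, Kt, Kv, R = 10.0, 0.0053, 163.0, 0.09  # example gearbox/motor constants
J = -slope * G**2 * Kt / (Kv * R)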

12.4 Single-jointed arm


This single-jointed arm consists of a DC brushed motor attached to a pulley that spins
a straight bar in pitch.


Figure 12.5: Single-jointed arm system diagram

Gear ratios are written as output over input, so G is greater than one in figure 12.5.

12.4.1 Equations of motion


We want to derive an equation for the arm angular acceleration ω̇arm given an input
voltage V , which we can integrate to get arm angular velocity and angle.

We will start with the equation derived earlier for a DC brushed motor, equation (12.3).
τm ωm
V = R+
Kt Kv
Solve for the angular acceleration. First, we’ll rearrange the terms because from inspec-
tion, V is the model input, ωm is the state, and τm contains the angular acceleration.
R 1
V = τm + ωm
Kt Kv
Solve for τm .
R 1
V = τm + ωm
Kt Kv
R 1
τm =V − ωm
Kt Kv
Kt Kt
τm = V − ωm (12.20)
R Kv R
Since τm G = τarm and ωm = Gωarm ,
τ  K Kt
arm t
= V − (Gωarm )
G R Kv R
τarm Kt GKt
= V − ωarm
G R Kv R
GKt G2 K t
τarm = V − ωarm (12.21)
R Kv R
The torque applied to the arm is defined as
τarm = J ω̇arm (12.22)
where J is the moment of inertia of the arm and ω̇arm is the angular acceleration.
Substitute equation (12.22) into equation (12.21).
GKt G2 K t
(J ω̇arm ) = V − ωarm
R Kv R
G2 Kt GKt
ω̇arm =− ωarm + V
Kv RJ RJ
We’ll relabel ωarm as ω for convenience.
G2 Kt GKt
ω̇ = − ω+ V (12.23)
Kv RJ RJ

This model will be converted to state-space notation in section 6.11.



12.4.2 Calculating constants


Moment of inertia J
Given the simplicity of this mechanism, it may be easier to compute this value theoret-
ically using material properties in CAD. J can also be approximated as the moment of
inertia of a thin rod rotating around one end. Therefore
1 2
J= ml (12.24)
3

where m is the mass of the arm and l is the length of the arm. Otherwise, a procedure
for measuring it experimentally is presented below.
First, rearrange equation (12.23) into the form y = mx + b such that J is in the
numerator of m.
G2 Kt GKt
ω̇ = − ω+ V
Kv RJ RJ
G2 K t GKt
J ω̇ = − ω+ V
Kv R R
Kv R
Multiply by G2 K t on both sides.

JKv R GKt Kv R
ω̇ = −ω + · V
G2 K t R G2 Kt
JKv R Kv
2
ω̇ = −ω + V
G Kt G
JKv R Kv
ω=− 2 ω̇ + V (12.25)
G Kt G

The test procedure is as follows.


1. Orient the arm such that its axis of rotation is aligned with gravity (i.e., the arm
is on its side). This avoids gravity affecting the measurements.
2. Run the arm forward at a constant voltage. Record the angular velocity over
time.
3. Compute the angular acceleration from the angular velocity data as the difference
between each sample divided by the time between them.
4. Perform a linear regression of angular velocity versus angular acceleration. The
slope of this line has the form − JK vR
G2 Kt as per equation (12.25).
2
5. Multiply the slope by − GKvKRt to obtain a least squares estimate of J.

12.5 Pendulum
Kinematics and dynamics are rather large topics, so for now, we'll just focus on the
basics required for working with the models in this book. We’ll derive the same model,
a pendulum, using three approaches: sum of forces, sum of torques, and conservation
of energy.


(a) Force diagram of a pendulum (b) Trigonometry of a pendulum

Figure 12.6: Pendulum force diagrams

12.5.1 Force derivation


Consider figure 12.6a, which shows the forces acting on a pendulum.
Note that the path of the pendulum sweeps out an arc of a circle. The angle θ is
measured in radians. The blue arrow is the gravitational force acting on the bob, and the
violet arrows are that same force resolved into components parallel and perpendicular
to the bob’s instantaneous motion. The direction of the bob’s instantaneous velocity
always points along the red axis, which is considered the tangential axis because its
direction is always tangent to the circle. Consider Newton’s second law

F = ma

where F is the sum of forces on the object, m is mass, and a is the acceleration. Because
we are only concerned with changes in speed, and because the bob is forced to stay in a
circular path, we apply Newton’s equation to the tangential axis only. The short violet
arrow represents the component of the gravitational force in the tangential axis, and
trigonometry can be used to determine its magnitude. Therefore

−mg sin θ = ma

a = −g sin θ

where g is the acceleration due to gravity near the surface of the earth. The negative
sign on the right hand side implies that θ and a always point in opposite directions. This
makes sense because when a pendulum swings further to the left, we would expect it
to accelerate back toward the right.
This linear acceleration a along the red axis can be related to the change in angle θ by
the arc length formulas; s is arc length and l is the length of the pendulum.

s = lθ (12.26)
ds dθ
v= =l
dt dt
d2 s d2 θ
a= 2 =l 2
dt dt
Therefore,
d2 θ
l = −g sin θ
dt2
2
d θ g
2
= − sin θ
dt l
g
θ̈ = − sin θ
l

12.5.2 Torque derivation


The equation can be obtained using two definitions for torque.

τ =r×F

First start by defining the torque on the pendulum bob using the force due to gravity.

τ = l × Fg

where l is the length vector of the pendulum and Fg is the force due to gravity.
For now just consider the magnitude of the torque on the pendulum.

|τ | = −mgl sin θ

where m is the mass of the pendulum, g is the acceleration due to gravity, l is the
length of the pendulum and θ is the angle between the length vector and the force due
to gravity.

Next rewrite the angular momentum.

$$L = r \times p = mr \times (\omega \times r)$$

Again just consider the magnitude of the angular momentum.

$$|L| = mr^2\omega$$
$$|L| = ml^2\frac{d\theta}{dt}$$
$$\frac{d}{dt}|L| = ml^2\frac{d^2\theta}{dt^2}$$

According to $\tau = \frac{dL}{dt}$, we can just compare the magnitudes.

d2 θ
−mgl sin θ = ml2
dt2
g d2 θ
− sin θ = 2
l dt
g
θ̈ = − sin θ
l

which is the same result from force analysis.

12.5.3 Energy derivation


The equation can also be obtained via the conservation of mechanical energy principle:
any object falling a vertical distance h would acquire kinetic energy equal to that which
it lost to the fall. In other words, gravitational potential energy is converted into kinetic
energy. Change in potential energy is given by

∆U = mgh

The change in kinetic energy (body started from rest) is given by


1
∆K = mv 2
2
Since no energy is lost, the gain in one must be equal to the loss in the other
1
mv 2 = mgh
2
The change in velocity for a given change in height can be expressed as
p
v = 2gh

Using equation (12.26), this equation can be rewritten in terms of dθ/dt.

$$v = l\frac{d\theta}{dt} = \sqrt{2gh}$$
$$\frac{d\theta}{dt} = \frac{\sqrt{2gh}}{l} \quad (12.27)$$

where h is the vertical distance the pendulum fell. Look at figure 12.6b, which presents
the trigonometry of a pendulum. If the pendulum starts its swing from some initial angle
θ0 , then y0 , the vertical distance from the pivot point, is given by
y0 = l cos θ0
Similarly for y1 , we have
y1 = l cos θ
Then h is the difference of the two
h = l(cos θ − cos θ0 )
Substituting this into equation (12.27) gives
r
dθ 2g
= (cos θ − cos θ0 )
dt l
This equation is known as the first integral of motion. It gives the velocity in terms of
the location and includes an integration constant related to the initial displacement (θ0 ).
We can differentiate by applying the chain rule with respect to time. Doing so gives the
acceleration.
r
d dθ d 2g
= (cos θ − cos θ0 )
dt dt dt l
d2 θ 1 − 2g
l sin θ dθ
2
= q
dt 2 2g (cos θ − cos θ ) dt
l 0
r
d2 θ 1 − 2g
l sin θ 2g
2
= q (cos θ − cos θ0 )
dt 2 2g (cos θ − cos θ ) l
l 0

d2 θ g
= − sin θ
dt2 l
g
θ̈ = − sin θ
l

which is the same result from force analysis.



12.6 Differential drive


This drivetrain consists of two DC brushed motors per side which are chained together
on their respective sides and drive wheels which are assumed to be massless.

Figure 12.7: Differential drive system diagram

12.6.1 Equations of motion


We want to derive equations for the accelerations of the left and right sides of the robot
v̇l and v̇r given left and right input voltages Vl and Vr .
From equation (12.16) of the flywheel model derivations
GKt G2 Kt
τ= V − ω (12.28)
R Kv R

where τ is the torque applied by one wheel of the differential drive, G is the gear ratio
of the differential drive, Kt is the torque constant of the motor, R is the resistance of
the motor, and Kv is the angular velocity constant. Since τ = rF and ω = vr where v
is the velocity of a given drive side along the ground and r is the drive wheel radius
GKt G2 Kt  v 
(rF ) = V −
R Kv R r
GKt G2 Kt
rF = V − v
R Kv Rr
GKt G2 K t
F = V − v
Rr Kv Rr2
G2 Kt GKt
F =− 2
v+ V
Kv Rr Rr

Therefore, for each side of the robot,


G2l Kt Gl Kt
Fl = − 2
vl + Vl
Kv Rr Rr
G2 Kt Gr Kt
Fr = − r 2 v r + Vr
Kv Rr Rr

where the l and r subscripts denote the side of the robot to which each variable corre-
sponds.
G2 K G2 K
Let C1 = − KvlRrt2 , C2 = Gl K t
Rr , C3 = − KvrRrt2 , and C4 = Gr K t
Rr .

Fl = C 1 v l + C 2 V l (12.29)
Fr = C 3 v r + C 4 V r (12.30)

First, find the sum of forces.


X
F = ma
Fl + Fr = mv̇
v̇l + v̇r
Fl + Fr = m
2
2
(Fl + Fr ) = v̇l + v̇r
m
2
v̇l = (Fl + Fr ) − v̇r (12.31)
m

Next, find the sum of torques.


X
τ = J ω̇
 
v̇r − v̇l
τl + τr = J
2rb

where rb is the radius of the differential drive.


v̇r − v̇l
(−rb Fl ) + (rb Fr ) = J
2rb
J
−rb Fl + rb Fr = (v̇r − v̇l )
2rb
J
−Fl + Fr = 2 (v̇r − v̇l )
2rb

2rb2
(−Fl + Fr ) = v̇r − v̇l
J
2r2
v̇r = v̇l + b (−Fl + Fr )
J

Substitute in equation (12.31) for v̇l to obtain an expression for v̇r .


 
2 2r2
v̇r = (Fl + Fr ) − v̇r + b (−Fl + Fr )
m J
2 2rb2
2v̇r = (Fl + Fr ) + (−Fl + Fr )
m J
1 r2
v̇r = (Fl + Fr ) + b (−Fl + Fr ) (12.32)
m J
1 1 r2 r2
v̇r = Fl + Fr − b Fl + b Fr
m
 m J J 
1 rb2 1 r2
v̇r = − Fl + + b Fr (12.33)
m J m J

Substitute equation (12.32) back into equation (12.31) to obtain an expression for v̇l .
 
2 1 rb2
v̇l = (Fl + Fr ) − (Fl + Fr ) + (−Fl + Fr )
m m J
2
1 r
v̇l = (Fl + Fr ) − b (−Fl + Fr )
m J
1 rb2
v̇l = (Fl + Fr ) + (Fl − Fr )
m J
1 1 r2 r2
v̇l = Fl + Fr + b Fl − b Fr
m
 m J J 
1 rb2 1 r2
v̇l = + Fl + − b Fr (12.34)
m J m J

Now, plug the expressions for Fl and Fr into equation (12.33).


   
1 r2 1 r2
v̇r = − b Fl + + b Fr
m J m J
 2
  
1 r 1 r2
v̇r = − b (C1 vl + C2 Vl ) + + b (C3 vr + C4 Vr ) (12.35)
m J m J

Now, plug the expressions for Fl and Fr into equation (12.34).


   
1 rb2 1 rb2
v̇l = + Fl + − Fr
m J m J
   
1 rb2 1 rb2
v̇l = + (C1 vl + C2 Vl ) + − (C3 vr + C4 Vr ) (12.36)
m J m J

This model will be converted to state-space notation in section 8.7.

12.6.2 Calculating constants


Moment of inertia J
We’ll use empirical measurements of linear and angular velocity to determine J. First,
we’ll derive the equation required to perform a linear regression using velocity test data.

τ1 = r × F
τ1 = rma

where τ1 is the torque applied by a drive motor during only linear acceleration, r is the
wheel radius, m is the robot mass, and a is the linear acceleration.

τ2 = Iα

where τ2 is the torque applied by a drive motor during only angular acceleration, I is the
moment of inertia (same as J), and α is the angular acceleration. If a constant voltage
is applied during both the linear and angular acceleration tests, τ1 = τ2 . Therefore,

rma = Iα

Integrate with respect to time.

rmv + C1 = Iω + C2
rmv = Iω + C3
I
v= ω + C3 (12.37)
rm

where v is linear velocity and ω is angular velocity. C1 , C2 , and C3 are arbitrary


constants of integration that won’t be needed. The test procedure is as follows.
1. Run the drivetrain forward at a constant voltage. Record the linear velocity over
time using encoders.

2. Rotate the drivetrain around its center by applying the same voltage as the lin-
ear acceleration test with the motors driving in opposite directions. Record the
angular velocity over time using a gyroscope.
3. Perform a linear regression of linear velocity versus angular velocity. The slope
I
of this line has the form rm as per equation (12.37).
4. Multiply the slope by rm to obtain a least squares estimate of I.
Hills by northbound freeway between Santa Maria and Ventura

13. Lagrangian mechanics examples

A model is a set of differential equations describing how the system behaves over time.
There are two common approaches for developing them.
1. Collecting data on the physical system’s behavior and performing system identi-
fication with it.
2. Using physics to derive the system’s model from first principles.
This chapter covers the second approach using Lagrangian mechanics.
We suggest reading An introduction to Lagrangian and Hamiltonian Mechanics by Simon J.A. Malham for the basics [13].

13.1 Single-jointed arm


See https://underactuated.mit.edu/pend.html for the dynamics of a single-jointed arm.

13.2 Double-jointed arm


See https://underactuated.mit.edu/multibody.html for a derivation of the kinematics
and dynamics of a double-jointed arm via Lagrangian mechanics.

13.3 Cart-pole
See https://underactuated.mit.edu/acrobot.html#cart_pole for a derivation of the kine-
matics and dynamics of a cart-pole via Lagrangian mechanics.
Hills by northbound freeway between Santa Maria and Ventura

14. System identification

First, we’ll cover a system identification procedure for generating a velocity system
feedforward model from performance data. Then, we’ll use that feedforward model to
create state-space models for several common mechanisms.

14.1 Ordinary least squares


The parameter estimation in this chapter will be performed using ordinary least squares
(OLS). Let there be a linear equation of the form $y = \beta_1 x_1 + \ldots + \beta_p x_p$. We want to
find the constants β1, ..., βp that best fit a set of observed (x1, ..., xp, y) tuples. Consider
the overdetermined system of n linear equations y = Xβ where

$$\mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \qquad
\mathbf{X} = \begin{bmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{np} \end{bmatrix} \qquad
\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}$$

each row corresponds to a datapoint, and n > p (more datapoints than unknowns).
OLS is a type of linear least squares method that estimates the unknown parameters β
in a linear regression model y = Xβ + ϵ where ϵ is the error in the observations y.
β is chosen by the method of least squares: minimizing the sum of the squares of the
difference between y (observations of the dependent variable) and Xβ (predictions of
y using a linear function of the independent variable β).

Geometrically, this is the sum of the squared distances, parallel to the y-axis, between
each value in y and the corresponding value in Xβ. Smaller differences mean a better
model fit.
To find the β that fits best, we can solve the following quadratic minimization problem

$$\hat{\beta} = \underset{\beta}{\arg\min}\,(y - X\beta)^T(y - X\beta)$$

where $\arg\min_\beta$ means "find the value of β that minimizes the following function of β".
Given corollary 5.12.3, take the partial derivative of the cost function with respect to
β and set it equal to zero, then solve for β̂.

$$0 = -2X^T(y - X\hat{\beta})$$
$$0 = X^T(y - X\hat{\beta})$$
$$0 = X^Ty - X^TX\hat{\beta}$$
$$X^TX\hat{\beta} = X^Ty$$
$$\hat{\beta} = (X^TX)^{-1}X^Ty \quad (14.1)$$

14.1.1 Examples
While this is a linear regression, we can fit nonlinear functions by making the contents
of X nonlinear functions of the independent variables. For example, we can find the
quadratic equation y = ax² + bx + c that best fits a set of x-y tuples. Let's assume we
have a set of observations as follows.

$$y_1 = ax_1^2 + bx_1 + c$$
$$\vdots$$
$$y_n = ax_n^2 + bx_n + c$$

We want to find a, b, and c, so let's factor those out.

$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} =
\begin{bmatrix} x_1^2 & x_1 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

Plug these matrices into equation (14.1) to obtain the coefficients a, b, and c.

$$\mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \qquad
\mathbf{X} = \begin{bmatrix} x_1^2 & x_1 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix} \qquad
\boldsymbol{\beta} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$
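A minimal sketch of that quadratic fit is below. The data is synthetic and only illustrates building X from nonlinear functions of the independent variable; numpy's least-squares solver computes the same β̂ as equation (14.1) without forming the inverse explicitly.

"""Sketch of the quadratic OLS fit example."""

import numpy as np

x = np.linspace(-2.0, 2.0, 50)
y = 3.0 * x**2 - 1.5 * x + 0.5 + np.random.normal(0.0, 0.1, x.shape)

# Columns of X are nonlinear functions of the independent variable
X = np.column_stack([x**2, x, np.ones_like(x)])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b, c = beta_hat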

14.2 1-DOF mechanism feedforward model


14.2.1 Introduction
The feedforward model we’ll be using is
u = Ks sgn(v) + Kv v + Ka a (14.2)

where u is input voltage, v is velocity, a is acceleration, and Ks , Kv , and Ka are


constants reflecting how much each variable contributes to the required voltage.
Since this is a linear multiple regression, the most obvious approach for obtaining the
feedforward gains Ks , Kv , and Ka is to use ordinary least squares (OLS) on 4-tuples
of recorded voltage, velocity sign, velocity, and acceleration. The problem is we can’t
cleanly measure acceleration directly, and numerical differentiation of velocity is noisy.
Ideally, we could find the feedforward gains with samples of current velocity, current
input, next velocity, and the nominal sample period instead.

14.2.2 Overview
First, we’ll put equation (14.2) into the form dx
dt = Ax + Bu + c. Then, we’ll discretize
it to obtain an equation of the form xk+1 = αxk + βuk + γ sgn(xk ). Then, we’ll
perform OLS on xk+1 = αxk + βuk + γ sgn(xk ) using 4-tuples of current velocity,
next velocity, input voltage, and velocity sign to obtain α, β, and γ. Since we solved
for those in the discretization step earlier, we have a system of three equations we can
solve for Ks , Kv , and Ka .

14.2.3 Derivation
First, solve u = K_s sgn(v) + K_v v + K_a a for a.

K_a a = u − K_s sgn(v) − K_v v
a = (1/K_a) u − (K_s/K_a) sgn(v) − (K_v/K_a) v
a = −(K_v/K_a) v + (1/K_a) u − (K_s/K_a) sgn(v)

Let x = v, dx/dt = a, A = −K_v/K_a, B = 1/K_a, and c = −(K_s/K_a) sgn(x).

dx/dt = Ax + Bu + c

Since Bu + c is a constant over the time interval [0, T) where T is the sample period,
this equation can be integrated according to appendix D.1, which gives

x_{k+1} = e^{AT} x_k + A^{-1}(e^{AT} − 1)(B u_k + c)
x_{k+1} = e^{AT} x_k + A^{-1}(e^{AT} − 1)B u_k + A^{-1}(e^{AT} − 1)c

Substitute c back in.

x_{k+1} = e^{AT} x_k + A^{-1}(e^{AT} − 1)B u_k + A^{-1}(e^{AT} − 1)(−(K_s/K_a) sgn(x_k))
x_{k+1} = e^{AT} x_k + A^{-1}(e^{AT} − 1)B u_k − (K_s/K_a) A^{-1}(e^{AT} − 1) sgn(x_k)

This equation has the form x_{k+1} = α x_k + β u_k + γ sgn(x_k). Running OLS with
(x_{k+1}, x_k, u_k, sgn(x_k)) 4-tuples will give α, β, and γ. To obtain K_s, K_v, and K_a,
solve the following system.

α = e^{AT}
β = A^{-1}(e^{AT} − 1)B
γ = −(K_s/K_a) A^{-1}(e^{AT} − 1)

Substitute α into the second and third equation.

α = e^{AT}
β = A^{-1}(α − 1)B          (14.3)
γ = −(K_s/K_a) A^{-1}(α − 1)

Divide the second equation by the third equation and solve for K_s.

β/γ = (A^{-1}(α − 1)B) / (−(K_s/K_a) A^{-1}(α − 1))
β/γ = −K_a B / K_s
β/γ = −K_a (1/K_a) / K_s
β/γ = −1/K_s
K_s = −γ/β          (14.4)
Solve the second equation in (14.3) for K_v.

β = A^{-1}(α − 1)B
β = (−K_v/K_a)^{-1}(α − 1)(1/K_a)
β = −(K_a/K_v)(α − 1)(1/K_a)
β = (1/K_v)(1 − α)
K_v β = 1 − α
K_v = (1 − α)/β          (14.5)
Solve the first equation in (14.3) for K_a.

α = e^{−(K_v/K_a)T}
ln α = −(K_v/K_a)T
K_a ln α = −K_v T
K_a = −K_v T / ln α

Substitute in K_v from equation (14.5).

K_a = −((1 − α)/β) T / ln α
K_a = −(1 − α)T / (β ln α)
K_a = (α − 1)T / (β ln α)          (14.6)

14.2.4 Results
The new system identification process is as follows.
1. Gather input voltage and velocity from a dynamic acceleration event
2. Given an equation of the form xk+1 = αxk + βuk + γ sgn(xk ), perform OLS
on (xk+1 , xk , uk , sgn(xk )) 4-tuples to obtain α, β, and γ
The feedforward gains are

K_s = −γ/β          (14.7)
K_v = (1 − α)/β          (14.8)
K_a = (α − 1)T / (β ln α)          (14.9)

where α > 0, β ≠ 0, and T is the sample period of the velocity data.


For Ks , Kv , and Ka to be defined, the dataset must include robot movement, and there
must be a discernible relationship between a change in the input voltage and a change
in the velocity (in other words, the robot must have accelerated).
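As a minimal sketch, the procedure above can be written as a short Python function. The logging of velocity and voltage samples is assumed to have happened elsewhere; the function name and array handling are illustrative, while the regression and equations (14.7) through (14.9) follow this section.

"""Sketch: identify Ks, Kv, Ka from logged velocity and voltage data."""

import math

import numpy as np


def identify_feedforward(velocities, voltages, T):
    """Returns (Ks, Kv, Ka) from samples of a dynamic acceleration event.

    Keyword arguments:
    velocities -- list of velocity samples
    voltages -- list of voltage samples (same length as velocities)
    T -- sample period in seconds
    """
    v = np.asarray(velocities)
    u = np.asarray(voltages)

    # Regress x_{k+1} = alpha x_k + beta u_k + gamma sgn(x_k)
    X = np.column_stack((v[:-1], u[:-1], np.sign(v[:-1])))
    y = v[1:]
    alpha, beta, gamma = np.linalg.solve(X.T @ X, X.T @ y)

    # Equations (14.7)-(14.9); requires alpha > 0 and beta != 0
    Ks = -gamma / beta
    Kv = (1.0 - alpha) / beta
    Ka = (alpha - 1.0) * T / (beta * math.log(alpha))
    return Ks, Kv, Ka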

14.3 1-DOF mechanism state-space model


ẋ = Ax + Bu
x = velocity ẋ = acceleration u = voltage

We want to derive what A and B are from the following feedforward model
u = Kv x + Ka ẋ

Since this equation is a linear multiple regression, the constants Kv and Ka can be de-
termined by applying ordinary least squares (OLS) to vectors of recorded input voltage,
velocity, and acceleration data from quasistatic velocity tests and acceleration tests.
Kv is a proportional constant that describes how much voltage is required to maintain
a given constant velocity by offsetting the back-EMF of the motor and
any friction that increases linearly with speed (viscous drag). The relationship between
speed and voltage (for a given initial acceleration) is linear for permanent-magnet DC
motors in the FRC operating regime.
Ka is a proportional constant that describes how much voltage is required to induce
a given acceleration in the motor shaft. As with Kv , the relationship between voltage
and acceleration (for a given initial velocity) is linear.
To convert u = Kv x + Ka ẋ to state-space form, simply solve for ẋ.
u = Kv x + Ka ẋ
Ka ẋ = u − Kv x
Ka ẋ = −Kv x + u
ẋ = −(K_v/K_a) x + (1/K_a) u

By inspection, A = −K_v/K_a and B = 1/K_a. A model with position and velocity states
would be

Theorem 14.3.1 — 1-DOF mechanism position model.

ẋ = Ax + Bu
x = [position; velocity]    u = [voltage]

ẋ = [0, 1; 0, −K_v/K_a] x + [0; 1/K_a] u          (14.10)

The model in theorem 14.3.1 is undefined when Ka = 0. If one wants to design an LQR
for such a system, use the model in theorem 14.3.2. As Ka tends to zero, acceleration
requires no effort and velocity becomes an input for position control.

Theorem 14.3.2 — 1-DOF mechanism position model (K_a = 0).

ẋ = Ax + Bu
x = [position]    u = [velocity]

ẋ = [0] x + [1] u          (14.11)
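Here's a minimal sketch that builds these models in Python; the function name and the handling of the K_a = 0 case are illustrative choices based on theorems 14.3.1 and 14.3.2.

"""Sketch: build 1-DOF mechanism position models from Kv and Ka."""

import numpy as np


def position_model(Kv, Ka):
    """Returns (A, B) for the 1-DOF position model.

    For Ka != 0, the state is [position, velocity] and the input is voltage
    (theorem 14.3.1). For Ka = 0, the state is [position] and the input is
    velocity (theorem 14.3.2).
    """
    if Ka != 0.0:
        A = np.array([[0.0, 1.0], [0.0, -Kv / Ka]])
        B = np.array([[0.0], [1.0 / Ka]])
    else:
        A = np.array([[0.0]])
        B = np.array([[1.0]])
    return A, B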

14.4 Drivetrain left/right velocity state-space model


ẋ = Ax + Bu
x = [left velocity; right velocity]    ẋ = [left acceleration; right acceleration]    u = [left voltage; right voltage]

We want to derive what A and B are from linear and angular feedforward models.
Since the left and right dynamics are symmetric, we'll guess the model has the form

A = [A_1, A_2; A_2, A_1]    B = [B_1, B_2; B_2, B_1]

Let K_{v,lin} be the linear velocity gain, K_{a,lin} be the linear acceleration gain, K_{v,ang}
be the angular velocity gain, and K_{a,ang} be the angular acceleration gain. Let
u = [K_{v,lin} v; K_{v,lin} v] be the input that makes both sides of the drivetrain move at a
constant velocity v. Therefore, x = [v; v] and ẋ = [0; 0]. Substitute these into the
state-space model.

[0; 0] = [A_1, A_2; A_2, A_1][v; v] + [B_1, B_2; B_2, B_1][K_{v,lin} v; K_{v,lin} v]

Since the column vectors contain the same element, the elements in the second row can
be rearranged.

[0; 0] = [A_1, A_2; A_1, A_2][v; v] + [B_1, B_2; B_1, B_2][K_{v,lin} v; K_{v,lin} v]

Since the rows are linearly dependent, we can use just one of them.

0 = [A_1, A_2][v; v] + [B_1, B_2] K_{v,lin} v
0 = [v, v, K_{v,lin} v, K_{v,lin} v][A_1; A_2; B_1; B_2]
0 = [1, 1, K_{v,lin}, K_{v,lin}][A_1; A_2; B_1; B_2]

Let u = [K_{v,lin} v + K_{a,lin} a; K_{v,lin} v + K_{a,lin} a] be the input that accelerates both
sides of the drivetrain by a from an initial velocity of v. Therefore, x = [v; v] and
ẋ = [a; a]. Substitute these into the state-space model.

[a; a] = [A_1, A_2; A_2, A_1][v; v] + [B_1, B_2; B_2, B_1][K_{v,lin} v + K_{a,lin} a; K_{v,lin} v + K_{a,lin} a]

Since the column vectors contain the same element, the elements in the second row can
be rearranged.

[a; a] = [A_1, A_2; A_1, A_2][v; v] + [B_1, B_2; B_1, B_2][K_{v,lin} v + K_{a,lin} a; K_{v,lin} v + K_{a,lin} a]

Since the rows are linearly dependent, we can use just one of them.

a = [A_1, A_2][v; v] + [B_1, B_2](K_{v,lin} v + K_{a,lin} a)
a = [v, v, K_{v,lin} v + K_{a,lin} a, K_{v,lin} v + K_{a,lin} a][A_1; A_2; B_1; B_2]
Let u = [−K_{v,ang} v; K_{v,ang} v] be the input that rotates the drivetrain in place where
each wheel has a constant velocity v. Therefore, x = [−v; v] and ẋ = [0; 0].

[0; 0] = [A_1, A_2; A_2, A_1][−v; v] + [B_1, B_2; B_2, B_1][−K_{v,ang} v; K_{v,ang} v]
[0; 0] = [−A_1, A_2; −A_2, A_1][v; v] + [−B_1, B_2; −B_2, B_1][K_{v,ang} v; K_{v,ang} v]
[0; 0] = [−A_1, A_2; A_1, −A_2][v; v] + [−B_1, B_2; B_1, −B_2][K_{v,ang} v; K_{v,ang} v]

Since the column vectors contain the same element, the elements in the second row can
be rearranged.

[0; 0] = [−A_1, A_2; −A_1, A_2][v; v] + [−B_1, B_2; −B_1, B_2][K_{v,ang} v; K_{v,ang} v]

Since the rows are linearly dependent, we can use just one of them.

0 = [−A_1, A_2][v; v] + [−B_1, B_2] K_{v,ang} v
0 = −v A_1 + v A_2 − K_{v,ang} v B_1 + K_{v,ang} v B_2
0 = [−v, v, −K_{v,ang} v, K_{v,ang} v][A_1; A_2; B_1; B_2]
0 = [−1, 1, −K_{v,ang}, K_{v,ang}][A_1; A_2; B_1; B_2]

Let u = [−K_{v,ang} v − K_{a,ang} a; K_{v,ang} v + K_{a,ang} a] be the input that rotates the
drivetrain in place where each wheel has an initial speed of v and accelerates by a.
Therefore, x = [−v; v] and ẋ = [−a; a].

[−a; a] = [A_1, A_2; A_2, A_1][−v; v] + [B_1, B_2; B_2, B_1][−K_{v,ang} v − K_{a,ang} a; K_{v,ang} v + K_{a,ang} a]
[−a; a] = [−A_1, A_2; −A_2, A_1][v; v] + [−B_1, B_2; −B_2, B_1][K_{v,ang} v + K_{a,ang} a; K_{v,ang} v + K_{a,ang} a]
[−a; a] = [−A_1, A_2; A_1, −A_2][v; v] + [−B_1, B_2; B_1, −B_2][K_{v,ang} v + K_{a,ang} a; K_{v,ang} v + K_{a,ang} a]

Since the column vectors contain the same element, the elements in the second row can
be rearranged.

[−a; −a] = [−A_1, A_2; −A_1, A_2][v; v] + [−B_1, B_2; −B_1, B_2][K_{v,ang} v + K_{a,ang} a; K_{v,ang} v + K_{a,ang} a]

Since the rows are linearly dependent, we can use just one of them.

−a = [−A_1, A_2][v; v] + [−B_1, B_2](K_{v,ang} v + K_{a,ang} a)
−a = −v A_1 + v A_2 − (K_{v,ang} v + K_{a,ang} a) B_1 + (K_{v,ang} v + K_{a,ang} a) B_2
−a = [−v, v, −(K_{v,ang} v + K_{a,ang} a), K_{v,ang} v + K_{a,ang} a][A_1; A_2; B_1; B_2]
a = [v, −v, K_{v,ang} v + K_{a,ang} a, −(K_{v,ang} v + K_{a,ang} a)][A_1; A_2; B_1; B_2]

Now stack the rows.

[0; a; 0; a] = [1, 1, K_{v,lin}, K_{v,lin};
                v, v, K_{v,lin} v + K_{a,lin} a, K_{v,lin} v + K_{a,lin} a;
                −1, 1, −K_{v,ang}, K_{v,ang};
                v, −v, K_{v,ang} v + K_{a,ang} a, −(K_{v,ang} v + K_{a,ang} a)][A_1; A_2; B_1; B_2]

Solve for the matrix elements with Wolfram Alpha. Let b = K_{v,lin}, c = K_{a,lin}, d = K_{v,ang},
and f = K_{a,ang}.

inverse of {{1, 1, b, b},
{v, v, b v + c a, b v + c a},
{-1, 1, -d, d},
{v, -v, d v + f a, -(d v + f a)}}
* {{0}, {a}, {0}, {a}}

[A_1; A_2; B_1; B_2] = 1/(2cf) [−cd − bf; cd − bf; c + f; f − c]

Substitute the original gains back in and simplify.

[A_1; A_2; B_1; B_2] = 1/(2 K_{a,lin} K_{a,ang}) [−K_{a,lin} K_{v,ang} − K_{v,lin} K_{a,ang};
                                                  K_{a,lin} K_{v,ang} − K_{v,lin} K_{a,ang};
                                                  K_{a,lin} + K_{a,ang};
                                                  K_{a,ang} − K_{a,lin}]

[A_1; A_2; B_1; B_2] = 1/2 [−K_{a,lin} K_{v,ang}/(K_{a,lin} K_{a,ang}) − K_{v,lin} K_{a,ang}/(K_{a,lin} K_{a,ang});
                            K_{a,lin} K_{v,ang}/(K_{a,lin} K_{a,ang}) − K_{v,lin} K_{a,ang}/(K_{a,lin} K_{a,ang});
                            K_{a,lin}/(K_{a,lin} K_{a,ang}) + K_{a,ang}/(K_{a,lin} K_{a,ang});
                            K_{a,ang}/(K_{a,lin} K_{a,ang}) − K_{a,lin}/(K_{a,lin} K_{a,ang})]

[A_1; A_2; B_1; B_2] = 1/2 [−K_{v,ang}/K_{a,ang} − K_{v,lin}/K_{a,lin};
                            K_{v,ang}/K_{a,ang} − K_{v,lin}/K_{a,lin};
                            1/K_{a,ang} + 1/K_{a,lin};
                            1/K_{a,lin} − 1/K_{a,ang}]

[A_1; A_2; B_1; B_2] = 1/2 [−(K_{v,lin}/K_{a,lin} + K_{v,ang}/K_{a,ang});
                            −(K_{v,lin}/K_{a,lin} − K_{v,ang}/K_{a,ang});
                            1/K_{a,lin} + 1/K_{a,ang};
                            1/K_{a,lin} − 1/K_{a,ang}]
To summarize,

Theorem 14.4.1 — Drivetrain left/right velocity model.

ẋ = Ax + Bu
x = [left velocity; right velocity]    u = [left voltage; right voltage]

ẋ = [A_1, A_2; A_2, A_1] x + [B_1, B_2; B_2, B_1] u

[A_1; A_2; B_1; B_2] = 1/2 [−(K_{v,lin}/K_{a,lin} + K_{v,ang}/K_{a,ang});
                            −(K_{v,lin}/K_{a,lin} − K_{v,ang}/K_{a,ang});
                            1/K_{a,lin} + 1/K_{a,ang};
                            1/K_{a,lin} − 1/K_{a,ang}]          (14.12)

K_{v,ang} and K_{a,ang} have units of V/(L/T) and V/(L/T^2) respectively, where V means
voltage units, L means length units, and T means time units.

If Kv and Ka are the same for both the linear and angular cases, it devolves to the
one-dimensional case. This means the left and right sides are decoupled.

[A_1; A_2; B_1; B_2] = 1/2 [−K_v/K_a − K_v/K_a; −K_v/K_a + K_v/K_a; 1/K_a + 1/K_a; 1/K_a − 1/K_a]
[A_1; A_2; B_1; B_2] = 1/2 [−2 K_v/K_a; 0; 2/K_a; 0]
[A_1; A_2; B_1; B_2] = [−K_v/K_a; 0; 1/K_a; 0]
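A minimal sketch that evaluates equation (14.12) numerically is shown below; the function name and example gains are illustrative.

"""Sketch: drivetrain left/right velocity model from linear/angular gains."""

import numpy as np


def drivetrain_lr_model(Kv_lin, Ka_lin, Kv_ang, Ka_ang):
    """Returns (A, B) for x = [left velocity, right velocity],
    u = [left voltage, right voltage] per theorem 14.4.1."""
    A1 = -0.5 * (Kv_lin / Ka_lin + Kv_ang / Ka_ang)
    A2 = -0.5 * (Kv_lin / Ka_lin - Kv_ang / Ka_ang)
    B1 = 0.5 * (1.0 / Ka_lin + 1.0 / Ka_ang)
    B2 = 0.5 * (1.0 / Ka_lin - 1.0 / Ka_ang)
    A = np.array([[A1, A2], [A2, A1]])
    B = np.array([[B1, B2], [B2, B1]])
    return A, B


# Example gains (illustrative only)
A, B = drivetrain_lr_model(Kv_lin=2.0, Ka_lin=0.5, Kv_ang=2.5, Ka_ang=0.3)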

14.5 Drivetrain linear/angular velocity state-space model


Let the linear and angular voltage contributions be

u_lin = K_{v,lin} v + K_{a,lin} a
u_ang = K_{v,ang} ω + K_{a,ang} α

Solve for the accelerations.

a = −(K_{v,lin}/K_{a,lin}) v + (1/K_{a,lin}) u_lin
α = −(K_{v,ang}/K_{a,ang}) ω + (1/K_{a,ang}) u_ang

Factor them into a matrix equation.

ẋ = [−K_{v,lin}/K_{a,lin}, 0; 0, −K_{v,ang}/K_{a,ang}][v; ω] + [1/K_{a,lin}, 0; 0, 1/K_{a,ang}][u_lin; u_ang]

We want to write this model in terms of left and right voltages. The linear and angular
voltages have the following relations.

u_left = u_lin − u_ang
u_right = u_lin + u_ang

Factor them into a matrix equation, then solve for the linear and angular voltages.

[u_left; u_right] = [1, −1; 1, 1][u_lin; u_ang]
[u_lin; u_ang] = [0.5, 0.5; −0.5, 0.5][u_left; u_right]

Plug this into the model input.

ẋ = [−K_{v,lin}/K_{a,lin}, 0; 0, −K_{v,ang}/K_{a,ang}][v; ω] + [1/K_{a,lin}, 0; 0, 1/K_{a,ang}][0.5, 0.5; −0.5, 0.5][u_left; u_right]
ẋ = [−K_{v,lin}/K_{a,lin}, 0; 0, −K_{v,ang}/K_{a,ang}][v; ω] + [0.5/K_{a,lin}, 0.5/K_{a,lin}; −0.5/K_{a,ang}, 0.5/K_{a,ang}][u_left; u_right]

Theorem 14.5.1 — Drivetrain linear/angular velocity model.

ẋ = Ax + Bu
x = [linear velocity; angular velocity]    u = [left voltage; right voltage]

ẋ = [−K_{v,lin}/K_{a,lin}, 0; 0, −K_{v,ang}/K_{a,ang}] x + [0.5/K_{a,lin}, 0.5/K_{a,lin}; −0.5/K_{a,ang}, 0.5/K_{a,ang}] u

K_{v,ang} and K_{a,ang} have units of V/(A/T) and V/(A/T^2) respectively, where V means
voltage units, A means angle units, and T means time units.
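A corresponding sketch of theorem 14.5.1, again with an illustrative function name:

"""Sketch: drivetrain linear/angular velocity model."""

import numpy as np


def drivetrain_la_model(Kv_lin, Ka_lin, Kv_ang, Ka_ang):
    """Returns (A, B) for x = [linear velocity, angular velocity],
    u = [left voltage, right voltage] per theorem 14.5.1."""
    A = np.array([[-Kv_lin / Ka_lin, 0.0], [0.0, -Kv_ang / Ka_ang]])
    B = np.array([[0.5 / Ka_lin, 0.5 / Ka_lin], [-0.5 / Ka_ang, 0.5 / Ka_ang]])
    return A, B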


V
Motion planning

15 Motion profiles . . . . . . . . . . . . . . . . . . . . . . 241

16 Configuration spaces . . . . . . . . . . . . . . . . 249

17 Trajectory optimization . . . . . . . . . . . . . . 255



15. Motion profiles

If smooth, predictable motion of a system over time is desired, it’s best to only change a
system’s reference as fast as the system is able to physically move. Motion profiles, also
known as trajectories, are used for this purpose. For multi-state systems, each state is
given its own trajectory. Since these states are usually position and velocity, they are
simply different derivatives of the same profile.

15.1 1-DOF motion profiles


Trapezoid profiles (figure 15.1) and S-curve profiles (figure 15.2) are the most com-
monly used motion profiles in FRC for point-to-point movements with one degree of
freedom (1 DOF). These profiles accelerate the system to a maximum velocity from
rest, then decelerate it later such that the final acceleration velocity, are zero at the
moment the system arrives at the desired location.
These profiles are given their names based on the shape of their velocity trajectory. The
trapezoid profile has a velocity trajectory shaped like a trapezoid and the S-curve profile
has a velocity trajectory shaped like an S-curve.
In the context of a point-to-point move, a full S-curve consists of seven distinct phases
of motion. Phase I starts moving the system from rest at a linearly increasing accel-
eration until it reaches the maximum acceleration. In phase II, the profile accelerates
at this maximum acceleration rate until it must start decreasing as it approaches the
maximum velocity. This occurs in phase III when the acceleration linearly decreases

Figure 15.1: Trapezoid profile
Figure 15.2: S-curve profile

until it reaches zero. In phase IV, the velocity is constant until deceleration begins, at
which point the profile decelerates in a manner symmetric to phases I, II, and III.
A trapezoid profile, on the other hand, has three phases. It is a subset of an S-curve pro-
file, having only the phases corresponding to phase II of the S-curve profile (constant
acceleration), phase IV (constant velocity), and phase VI (constant deceleration). This
reduced number of phases underscores the difference between these two profiles: the
S-curve profile has extra motion phases which transition between periods of accelera-
tion and periods of nonacceleration; the trapezoid profile has instantaneous transitions
between these phases. This can be seen in the acceleration graphs of the corresponding
velocity profiles for these two profile types.

15.1.1 Jerk
The motion characteristic that defines the change in acceleration, or transitional period,
is known as “jerk”. Jerk is defined as the rate of change of acceleration with time. In
a trapezoid profile, the jerk (change in acceleration) is infinite at the phase transitions,
while in the S-curve profile the jerk is a constant value, spreading the change in accel-
eration over a period of time.
From figures 15.1 and 15.2, we can see S-curve profiles are smoother than trapezoid
profiles. Why, however, does the S-curve profile result in less load oscillation? For a
given load, the higher the jerk, the greater the amount of unwanted vibration energy
will be generated, and the broader the frequency spectrum of the vibration’s energy will
be.
This means that more rapid changes in acceleration induce more powerful vibrations,
and more vibrational modes will be excited. Because vibrational energy is absorbed in

the system mechanics, it may cause an increase in settling time or reduced accuracy if
the vibration frequency matches resonances in the mechanical system.

15.1.2 Profile selection


Since trapezoid profiles spend their time at full acceleration or full deceleration, they
are, from the standpoint of profile execution, faster than S-curve profiles. However, if
this “all on”/“all off” approach causes an increase in settling time, the advantage is lost.
Often, only a small amount of “S” (transition between acceleration and no acceleration)
can substantially reduce induced vibration. Therefore, to optimize throughput, the S-
curve profile must be tuned for each given load and desired transfer speed.
What S-curve form is right for a given system? On an application by application basis,
the specific choice of the form of the S-curve will depend on the mechanical nature of
the system and the desired performance specifications. For example, in medical appli-
cations which involve liquid transfers that should not be jostled, it would be appropriate
to choose a profile with no phase II and VI segment at all. Instead the acceleration tran-
sitions would be spread out as far as possible, thereby maximizing smoothness.
In other applications involving high speed pick and place, overall transfer speed is most
important, so a good choice might be an S-curve with transition phases (phases I, III,
V, and VII) that are five to fifteen percent of phase II and VI. In this case, the S-curve
profile will add a small amount of time to the overall transfer time. However, the
reduced load oscillation at the end of the move considerably decreases the total effective
transfer time. Trial and error using a motion measurement system is generally the best
way to determine the right amount of “S” because modeling high frequency dynamics
is difficult to do accurately.
Another consideration is whether that “S” segment will actually lead to smoother control
of the system. If the high frequency dynamics at play are negligible, one can use the
simpler trapezoid profile.

R S-curve profiles are unnecessary for FRC mechanisms. Motors in FRC effec-
tively have first-order velocity dynamics because the inductance effects are on
the order of microseconds; FRC dynamics operate on the order of milliseconds.
First-order motor models can achieve the instantaneous acceleration changes of
trapezoid profiles because voltage is electromotive force, which is analogous to
acceleration. That is, we can instantaneously achieve any desired acceleration
with a finite voltage, and we can follow any trapezoid profile perfectly with only
feedforward control.

15.1.3 Profile equations


Trapezoid profile
The trapezoid profile uses the following equations.
1
x(t) = x0 + v0 t + at2
2
v(t) = v0 + at

where x(t) is the position at time t, x0 is the initial position, v0 is the initial velocity,
and a is the acceleration at time t.
Snippet 15.1 shows a Python implementation.
"""Function for generating a trapezoid profile."""

import math

def generate_trapezoid_profile(max_v, time_to_max_v, dt, goal):


"""Creates a trapezoid profile with the given constraints.

Returns:
t_rec -- list of timestamps
x_rec -- list of positions at each timestep
v_rec -- list of velocities at each timestep
a_rec -- list of accelerations at each timestep

Keyword arguments:
max_v -- maximum velocity of profile
time_to_max_v -- time from rest to maximum velocity
dt -- timestep
goal -- final position when the profile is at rest
"""
t_rec = [0.0]
x_rec = [0.0]
v_rec = [0.0]
a_rec = [0.0]

a = max_v / time_to_max_v
time_at_max_v = goal / max_v - time_to_max_v

# If profile is short
if max_v * time_to_max_v > goal:
time_to_max_v = math.sqrt(goal / a)
time_from_max_v = time_to_max_v
time_total = 2.0 * time_to_max_v
profile_max_v = a * time_to_max_v
else:
time_from_max_v = time_to_max_v + time_at_max_v
time_total = time_from_max_v + time_to_max_v
profile_max_v = max_v

while t_rec[-1] < time_total:


t = t_rec[-1] + dt
t_rec.append(t)

if t < time_to_max_v:
# Accelerate up
a_rec.append(a)
v_rec.append(a * t)
elif t < time_from_max_v:
# Maintain max velocity
a_rec.append(0.0)
v_rec.append(profile_max_v)
elif t < time_total:
# Accelerate down
decel_time = t - time_from_max_v
a_rec.append(-a)
v_rec.append(profile_max_v - a * decel_time)
else:
a_rec.append(0.0)
v_rec.append(0.0)
x_rec.append(x_rec[-1] + v_rec[-1] * dt)
return t_rec, x_rec, v_rec, a_rec

Snippet 15.1. Trapezoid profile implementation in Python

S-curve profile
The S-curve profile equations also include jerk.
x(t) = x_0 + v_0 t + (1/2) a t^2 + (1/6) j t^3
v(t) = v_0 + a t + (1/2) j t^2
a(t) = a_0 + j t

where j is the jerk at time t, a(t) is the acceleration at time t, and a0 is the initial
acceleration.
Snippet 15.2 shows a Python implementation.
"""Function for generating an S-curve profile."""

import math

def generate_s_curve_profile(max_v, max_a, time_to_max_a, dt, goal):


"""Returns an s-curve profile with the given constraints.

Returns:
t_rec -- list of timestamps
x_rec -- list of positions at each timestep
v_rec -- list of velocities at each timestep
a_rec -- list of accelerations at each timestep

Keyword arguments:
max_v -- maximum velocity of profile
max_a -- maximum acceleration of profile
time_to_max_a -- time from rest to maximum acceleration

dt -- timestep
goal -- final position when the profile is at rest
"""
t_rec = [0.0]
x_rec = [0.0]
v_rec = [0.0]
a_rec = [0.0]

j = max_a / time_to_max_a
short_profile = max_v * (time_to_max_a + max_v / max_a) > goal

if short_profile:
profile_max_v = max_a * (
math.sqrt(goal / max_a - 0.75 * time_to_max_a**2) - 0.5 *
time_to_max_a
)
else:
profile_max_v = max_v

# Find times at critical points


t2 = profile_max_v / max_a
t3 = t2 + time_to_max_a
if short_profile:
t4 = t3
else:
t4 = goal / profile_max_v
t5 = t4 + time_to_max_a
t6 = t4 + t2
t7 = t6 + time_to_max_a
time_total = t7

while t_rec[-1] < time_total:


t = t_rec[-1] + dt
t_rec.append(t)
if t < time_to_max_a:
# Ramp up acceleration
a_rec.append(j * t)
v_rec.append(0.5 * j * t**2)
elif t < t2:
# Increase speed at max acceleration
a_rec.append(max_a)
v_rec.append(max_a * (t - 0.5 * time_to_max_a))
elif t < t3:
# Ramp down acceleration
a_rec.append(max_a - j * (t - t2))
v_rec.append(max_a * (t - 0.5 * time_to_max_a) - 0.5 * j * (t -
t2) ** 2)
elif t < t4:
# Maintain max velocity
a_rec.append(0.0)
v_rec.append(profile_max_v)
elif t < t5:
# Ramp down acceleration
a_rec.append(-j * (t - t4))
v_rec.append(profile_max_v - 0.5 * j * (t - t4) ** 2)
elif t < t6:
# Decrease speed at max acceleration
a_rec.append(-max_a)
v_rec.append(max_a * (t2 + t5 - t - 0.5 * time_to_max_a))
elif t < t7:

# Ramp up acceleration
a_rec.append(-max_a + j * (t - t6))
v_rec.append(
max_a * (t2 + t5 - t - 0.5 * time_to_max_a) + 0.5 * j * (t -
t6) ** 2
)
else:
a_rec.append(0.0)
v_rec.append(0.0)
x_rec.append(x_rec[-1] + v_rec[-1] * dt)
return t_rec, x_rec, v_rec, a_rec

Snippet 15.2. S-curve profile implementation in Python

15.1.4 Other profile types


The ultimate goal of any profile is to match the profile’s motion characteristics to the
desired application. Trapezoid and S-curve profiles work well when the system’s torque
response curve is fairly flat. In other words, when the output torque does not vary that
much over the range of velocities the system will be experiencing. This is true for most
servo motor systems, whether DC brushed or DC brushless.
Step motors, however, do not have flat torque/speed curves. Torque output is nonlinear,
sometimes has a large drop at a location called the “mid-range instability”, and generally
drops off at higher velocities.
Mid-range instability occurs at the step frequency when the motor’s natural resonance
frequency matches the current step rate. To address mid-range instability, the most
common technique is to use a nonzero starting velocity. This means that the profile
instantly “jumps” to a programmed velocity upon initial acceleration, and while decel-
erating. While crude, this technique sometimes provides better results than a smooth
ramp from zero, particularly for systems that do not use a microstepping drive technique.
To address torque drop-off at higher velocities, a parabolic profile can be used. The cor-
responding acceleration curve has the smallest acceleration when the velocity is highest.
This is a good match for stepper motors because there is less torque available at higher
speeds. However, notice that starting and ending accelerations are very high, and there
is no “S” phase where the acceleration smoothly transitions to zero. If load oscillation
is a problem, parabolic profiles may not work as well as an S-curve despite the fact that
a standard S-curve profile is not optimized for a stepper motor from the standpoint of
the torque/speed curve.

15.1.5 Further reading


FRC teams 254 and 971 gave a talk at FIRST World Championships in 2015 about
motion profiles.

“Motion Planning and Control in FRC” (59 minutes)


Team 254: The Cheesy Poofs
https://youtu.be/8319J1BEHwM

15.2 2-DOF motion profiles


In FRC, point-to-point movements with two degrees of freedom (2 DOFs) are almost
always within the context of drivetrains where the two degrees of freedom are the x
and y axes.
A path is a set of (x, y) points for the drivetrain to follow. A drivetrain trajectory is a
path that includes both the states (e.g., position and velocity) and control inputs (e.g.,
voltage) of the drivetrain as functions of time.
Currently, the most common form of multidimensional trajectory planning in FRC is
based on polynomial splines. The splines determine the path of the trajectory, and the
path is parameterized by a trapezoidal motion profile to create a trajectory. A Dive into
WPILib Trajectories by Declan Freeman-Gleason introduces 2-DOF trajectory planning
in FRC using polynomial splines [4].
Planning Motion Trajectories for Mobile Robots Using Splines by Christoph Sprunk pro-
vides a more general treatment of spline-based trajectory generation [16].

16. Configuration spaces

16.1 Introduction
When a robot can interfere with itself, we need a motion profiling solution that en-
sures the robot doesn’t destroy itself. Now that we’ve demonstrated various methods
of motion profiling, how can we incorporate safe zones into the planning process or
implement mechanism collision avoidance?
1-DOF systems are straightforward. We define the minimum and maximum position
constraints for the arm, elevator, etc. and ensure we stay between them. To do so, we
might use some combination of:
1. Rejecting, coercing, or clamping setpoints so that they are always within the valid
range
2. Soft limits on the speed controller
3. Limit switches, and/or physical hard stops
Multi-DOF systems are more complex. In the 2-DOF case (e.g., an arm with a shoulder
and a wrist, or an elevator and an arm), each DOF has some limits that are indepen-
dent of the rest of the system (e.g., min and max angles), but there are also dependent
constraints (e.g., if the shoulder is angled down, the wrist must be angled up to avoid
hitting the ground). Examples of such constraints could include:
• Minimum/maximum angles
• Maximum extension rules

• Robot chassis
• Angles that require an infeasible amount of holding torque
One intuitive way to think about this is to envision an N-D space (2D for 2 DOFs, etc.)
where the x and y axes are the positions of the degrees of freedom. A given point (x, y)
can either be a valid configuration or an invalid configuration (meaning it violates one
or more constraints). So, our 2D plot may have regions (half-planes, rectangles, etc.)
representing constraints.

16.2 Waypoint planning


Our first example has 2 DOFs where all of the constraints are independent
(minimum/maximum limits on each axis), as shown in figure 16.1.
Motion planning and control in this example is simple; just make sure all controller
references are always inside the box of valid states.
Now let’s make it more interesting. Say that if the arm is near the middle of travel
(pointing straight down), the elevator must be above a certain height. Otherwise, the
arm intersects the robot chassis. This configuration space would look like figure 16.2.
Let’s consider motion planning in the second example. Say the current state of our
system is x and our goal state is r as shown in figure 16.3.
Notice that the elevator height at the current state and the goal is identical. Do we
just have to move the arm? Moving this way will sweep the system into the constraint
we defined for preventing self-intersection with the chassis. Instead, we need to plan
a sequence of references to execute that only pass through valid states. Figure 16.4
shows an example.
a and b are intermediate waypoints we added to ensure that the whole system moves in
a safe manner. Considering again our toy example, the elevator goes up, then the arm
rotates underneath, then the elevator goes back down. There are many possible ways to
generate this path (or other valid paths), including some sort of search (discretize the
plot into a grid and find a valid path using breadth-first search or A*) or hard-coded
rules to handle different regions of the constraint space.
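To make the grid search idea concrete, here's a minimal breadth-first search sketch over a discretized 2-DOF configuration space. The grid resolution, the is_valid predicate, and the 4-connected neighbor set are illustrative assumptions; a real implementation would also need to map between joint positions and grid cells.

"""Sketch: BFS over a discretized 2-DOF configuration space."""

from collections import deque


def plan_waypoints(start, goal, is_valid, width, height):
    """Returns a list of grid cells from start to goal that only passes
    through cells where is_valid(cell) is True, or None if no path exists.

    start, goal -- (i, j) grid cells
    is_valid -- function returning True if a cell is a legal configuration
    width, height -- grid dimensions
    """
    queue = deque([start])
    parent = {start: None}

    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent pointers back to the start to recover the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        i, j = cell
        for neighbor in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            ni, nj = neighbor
            if 0 <= ni < width and 0 <= nj < height:
                if neighbor not in parent and is_valid(neighbor):
                    parent[neighbor] = cell
                    queue.append(neighbor)
    return None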

Figure 16.1: Elevator-arm configuration space with independent constraints

Figure 16.2: Elevator-arm configuration space requiring elevator to be above a certain
height when arm is pointing down

Figure 16.3: Elevator-arm configuration space with initial state x and goal state r

Figure 16.4: Elevator-arm configuration space with path between initial state x and
final state r

16.3 Waypoint traversal


Once we have a sequence of waypoints, there are a few options for traversing them:
1. Generate trajectories for each DOF and time-synchronize them. This ensures
the mechanism follows the “diagonal lines” in the path above. The upside of
this approach is it can yield the fastest (time-optimal) coordinated movements
possible. The downside is that the motion planning is “open loop” – if one of
the DOFs lags behind its trajectory, it’s possible that the system may still end up
in an invalid state. In practice, we include some buffer around our references to
ensure that tracking error doesn’t immediately lead to violated constraints. We
are making our trajectories more robust to uncertainty.
2. Perform movements sequentially. That is, move from x to a, and only start
moving toward b once the path from the current system state to b is a straight
line that is collision-free. This gives robustness to control error, but may result
in unnecessary stops or jerky movement compared to the optimized, open-loop
trajectory.
This planning approach generalizes to an arbitrary number of DOFs and is essentially
how industrial 7-DOF arms are operated.

16.4 State machine validation with safe sets


Configuration spaces can also be used for state machine validation. The states of all the
mechanisms, including solenoid actuation states, would be axes in a configuration space.
A state machine unit test would define a valid path through the configuration space, send
the desired events to the state machine, then ensure the robot took the correct path in
response to each event for all iterations.

17. Trajectory optimization

A trajectory is a collection of samples defined over some time interval. A drivetrain


trajectory would include its states (e.g., x position, y position, heading, and velocity)
and control inputs (e.g., voltage).
Trajectory optimization finds the best choice of trajectory (for some mathematical def-
inition of “best”) by formulating and solving a constrained optimization problem.

17.1 Solving optimization problems with Sleipnir


Sleipnir is a C++ library that facilitates the specification of constrained nonlinear opti-
mization problems with natural mathematical notation, then efficiently solves them.
Sleipnir supports problems of the form:

minimize_x    f(x)
subject to    c_e(x) = 0
              c_i(x) ≥ 0

where f (x) is the scalar cost function, x is the vector of decision variables (variables
the solver can tweak to minimize the cost function), ce (x) is the vector-valued function
whose rows are equality constraints, and ci (x) is the vector-valued function whose
rows are inequality constraints. Constraints are equations or inequalities of the decision

variables that constrain what values the solver is allowed to use when searching for an
optimal solution.
The nice thing about Sleipnir is users don’t have to put their system in the form shown
above manually; they can write it in natural mathematical form and it’ll be converted
for them. We’ll cover some examples next.

17.1.1 Double integrator minimum time


A system with position and velocity states and an acceleration input is an example of a
double integrator. We want to go from 0 m at rest to 10 m at rest in the minimum time
while obeying the velocity limit (−1, 1) and the acceleration limit (−1, 1).
The model for our double integrator is ẍ = u where x = [position; velocity] is the state
vector and u is the acceleration. The velocity constraints are −1 ≤ x(1) ≤ 1 and the
acceleration constraints are −1 ≤ u ≤ 1.

Importing required libraries


Jormungandr is the Python version of Sleipnir and can be installed via pip install
sleipnirgroup-jormungandr.
from jormungandr.optimization import OptimizationProblem
import numpy as np

Initializing a problem instance


First, we need to make a problem instance.
T = 3.5 # s
dt = 0.005 # 5 ms
N = int(T / dt)

r = 2.0

problem = OptimizationProblem()

Creating decision variables


Next, we need to make decision variables for our state and input.
# 2x1 state vector with N + 1 timesteps (includes last state)
X = problem.decision_variable(2, N + 1)

# 1x1 input vector with N timesteps (input at last state doesn't matter)
U = problem.decision_variable(1, N)

By convention, we use capital letters for the variables to designate matrices.



Applying constraints
Now, we need to apply dynamics constraints between timesteps.
# Kinematics constraint assuming constant acceleration between timesteps
for k in range(N):
p_k1 = X[0, k + 1]
v_k1 = X[1, k + 1]
p_k = X[0, k]
v_k = X[1, k]
a_k = U[0, k]

problem.subject_to(p_k1 == p_k + v_k * dt)


problem.subject_to(v_k1 == v_k + a_k * dt)

Next, we’ll apply the state and input constraints.


# Start and end at rest
problem.subject_to(X[:, 0] == np.array([[0.0], [0.0]]))
problem.subject_to(X[:, N] == np.array([[r], [0.0]]))

# Limit velocity
problem.subject_to(-1 <= X[1, :])
problem.subject_to(X[1, :] <= 1)

# Limit acceleration
problem.subject_to(-1 <= U)
problem.subject_to(U <= 1)

Specifying a cost function


Next, we’ll create a cost function for minimizing position error.
# Cost function - minimize position error
J = 0.0
for k in range(N + 1):
J += (r - X[0, k]) ** 2
problem.minimize(J)

The cost function passed to minimize() should produce a scalar output.


Solving the problem
Now we can solve the problem.
problem.solve()

The solver will find the decision variable values that minimize the cost function while
satisfying the constraints.

Accessing the solution


You can obtain the solution by querying the values of the variables like so.
position = X.value[0, 0]
velocity = X.value[1, 0]
acceleration = U.value(0)

Figure 17.1: Double integrator position
Figure 17.2: Double integrator velocity
Figure 17.3: Double integrator acceleration

Other applications
In retrospect, the solution here seems obvious: if you want to reach the desired position
in the minimum time, you just apply positive max input to accelerate to the max speed,
coast for a while, then apply negative max input to decelerate to a stop at the desired
position. Optimization problems can get more complex than this though. In fact, we can
use this same framework to design optimal trajectories for a drivetrain while satisfying
dynamics constraints, avoiding obstacles, and driving through points of interest.

17.1.2 Optimizing the problem formulation


Cost functions and constraints can have the following orders:
• none (i.e., there is no cost function or are no constraints)
• constant
• linear
• quadratic
• nonlinear
For nonlinear problems, the solver calculates the Hessian of the cost function and the
Jacobians of the constraints at each iteration. However, problems with lower order cost
functions and constraints can be solved faster. For example, the following only need to
be computed once because they’re constant:
• the Hessian of a quadratic or lower cost function
• the Jacobian of linear or lower constraints
A problem is constant if:
• the cost function is constant or lower
• the equality constraints are constant or lower
• the inequality constraints are constant or lower
A problem is linear if:
• the cost function is linear
• the equality constraints are linear or lower
• the inequality constraints are linear or lower
A problem is quadratic if:
• the cost function is quadratic
• the equality constraints are linear or lower
• the inequality constraints are linear or lower
All other problems are nonlinear.

17.2 Further reading


Sleipnir's authors recommend "Numerical Optimization, 2nd Ed." by Nocedal and
Wright as a useful introductory resource for implementing optimization problem solvers
[15]. A list of resources used to implement Sleipnir is located at
https://sleipnirgroup.github.io/Sleipnir/md_docs_2resources.html.

Matthew Kelly has an approachable introduction to trajectory optimization on his web-


site [7].
VI
Appendices

A Simplifying block diagrams . . . . . . . . . . 263

B Linear-quadratic regulator . . . . . . . . . . . 267

C Feedforwards . . . . . . . . . . . . . . . . . . . . . . . 281

D Derivations . . . . . . . . . . . . . . . . . . . . . . . . . 289

E Classical control theory . . . . . . . . . . . . . 293

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . 327

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331

A. Simplifying block diagrams

A.1 Cascaded blocks


Y = (P1 P2 )X (A.1)

Figure A.1: Cascaded blocks

Figure A.2: Simplified cascaded blocks

A.2 Combining blocks in parallel


Y = P1 X ± P2 X (A.2)

Figure A.3: Parallel blocks

Figure A.4: Simplified parallel blocks

A.3 Removing a block from a feedforward loop


Y = P1 X ± P2 X (A.3)

Figure A.5: Feedforward loop

Figure A.6: Transformed feedforward loop



A.4 Eliminating a feedback loop


Y = P1 (X ∓ P2 Y ) (A.4)

Figure A.7: Feedback loop

Figure A.8: Simplified feedback loop

A.5 Removing a block from a feedback loop


Y = P1 (X ∓ P2 Y ) (A.5)

Figure A.9: Feedback loop

Figure A.10: Transformed feedback loop



B. Linear-quadratic regulator

This appendix will go into more detail on the linear-quadratic regulator’s derivation and
interesting applications.

B.1 Derivation
Let there be a discrete time linear system defined as

xk+1 = Axk + Buk (B.1)

with the cost functional


J = Σ_{k=0}^∞ [x_k; u_k]^T [Q, N; N^T, R] [x_k; u_k]

where J represents a trade-off between state excursion and control effort with the
weighting factors Q, R, and N. Q is the weight matrix for error, R is the weight
matrix for control effort, and N is a cross weight matrix between error and control
effort. N is commonly utilized when penalizing the output in addition to the state and
input.
J = Σ_{k=0}^∞ [x_k; u_k]^T [Q x_k + N u_k; N^T x_k + R u_k]
J = Σ_{k=0}^∞ [x_k^T, u_k^T] [Q x_k + N u_k; N^T x_k + R u_k]
J = Σ_{k=0}^∞ (x_k^T (Q x_k + N u_k) + u_k^T (N^T x_k + R u_k))
J = Σ_{k=0}^∞ (x_k^T Q x_k + x_k^T N u_k + u_k^T N^T x_k + u_k^T R u_k)
J = Σ_{k=0}^∞ (x_k^T Q x_k + x_k^T N u_k + x_k^T N u_k + u_k^T R u_k)
J = Σ_{k=0}^∞ (x_k^T Q x_k + 2 x_k^T N u_k + u_k^T R u_k)
J = Σ_{k=0}^∞ (x_k^T Q x_k + u_k^T R u_k + 2 x_k^T N u_k)

The feedback control law which minimizes J subject to the constraint xk+1 = Axk +
Buk is
uk = −Kxk

where K is given by

K = (R + BT SB)−1 (BT SA + NT )

and S is found by solving the discrete time algebraic Riccati equation defined as

AT SA − S − (AT SB + N)(R + BT SB)−1 (BT SA + NT ) + Q = 0

or alternatively

AT SA − S − AT SB(R + BT SB)−1 BT SA + Q = 0

with

A = A − BR−1 NT
Q = Q − NR−1 NT

If there is no cross-correlation between error and control effort, N is a zero matrix and
the cost functional simplifies to

J = Σ_{k=0}^∞ (x_k^T Q x_k + u_k^T R u_k)

The feedback control law which minimizes this J subject to xk+1 = Axk + Buk is

uk = −Kxk

where K is given by
K = (R + BT SB)−1 BT SA

and S is found by solving the discrete time algebraic Riccati equation defined as

AT SA − S − AT SB(R + BT SB)−1 BT SA + Q = 0

Snippet B.1 computes the infinite horizon, discrete time LQR.


"""Function for computing the infinite horizon LQR."""

import numpy as np
import scipy as sp

def lqr(A, B, Q, R, N):


"""Solves for the optimal discrete linear-quadratic regulator (LQR).

Keyword arguments:
A -- numpy.array(states x states), system matrix.
B -- numpy.array(states x inputs), input matrix.
Q -- numpy.array(states x states), state cost matrix.
R -- numpy.array(inputs x inputs), control effort cost matrix.
N -- numpy.array(states x inputs), cross weight matrix.

Returns:
K -- numpy.array(inputs x states), controller gain matrix.
"""
P = sp.linalg.solve_discrete_are(a=A, b=B, q=Q, r=R, s=N)
return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)

Snippet B.1. Infinite horizon, discrete time LQR solver in Python

Other formulations of LQR for finite horizon and discrete time can be seen on Wikipedia
[2].
MIT OpenCourseWare has a rigorous proof of the results shown above [18].

B.2 State feedback with output cost


LQR is normally used for state feedback on

xk+1 = Axk + Buk


yk = Cxk + Duk

with the cost functional



J = Σ_{k=0}^∞ (x_k^T Q x_k + u_k^T R u_k)

However, we may not know how to select costs for some of the states, or we don’t
care what certain internal states are doing. We can address this by writing the cost
functional in terms of the output vector instead of the state vector. Not only can we
make our output contain a subset of states, but we can use any other cost metric we can
think of as long as it’s representable as a linear combination of the states and inputs.[1]
For state feedback with an output cost, we want to minimize the following cost func-
tional.

J = Σ_{k=0}^∞ (y_k^T Q y_k + u_k^T R u_k)

Substitute in the expression for y_k.

J = Σ_{k=0}^∞ ((C x_k + D u_k)^T Q (C x_k + D u_k) + u_k^T R u_k)

Apply the transpose to the left-hand side of the Q term.

J = Σ_{k=0}^∞ ((x_k^T C^T + u_k^T D^T) Q (C x_k + D u_k) + u_k^T R u_k)
Factor out [x_k; u_k]^T from the left side and [x_k; u_k] from the right side of each term.

J = Σ_{k=0}^∞ ([x_k; u_k]^T [C^T; D^T] Q [C, D] [x_k; u_k] + [x_k; u_k]^T [0, 0; 0, R] [x_k; u_k])
[1]We’ll see this later on in section B.3 when we define the cost metric as deviation from the behavior of

another model.

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([C^T; D^T] Q [C, D] + [0, 0; 0, R]) [x_k; u_k])

Multiply in Q.

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([C^T Q; D^T Q] [C, D] + [0, 0; 0, R]) [x_k; u_k])

Multiply matrices in the left term together.

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([C^T Q C, C^T Q D; D^T Q C, D^T Q D] + [0, 0; 0, R]) [x_k; u_k])

Add the terms together.

J = Σ_{k=0}^∞ [x_k; u_k]^T [C^T Q C, C^T Q D; D^T Q C, D^T Q D + R] [x_k; u_k]          (B.2)

where C^T Q C plays the role of Q, C^T Q D plays the role of N, and D^T Q D + R plays the
role of R in the LQR cost functional.

Thus, state feedback with an output cost can be defined as the following optimization
problem.

Theorem B.2.1 — Linear-quadratic regulator with output cost.

u_k* = arg min_{u_k} Σ_{k=0}^∞ [x_k; u_k]^T [C^T Q C, C^T Q D; D^T Q C, D^T Q D + R] [x_k; u_k]
subject to x_{k+1} = A x_k + B u_k          (B.3)

The optimal control policy u_k* is K(r_k − x_k) where r_k is the desired state. Note that
the Q in C^T Q C is outputs × outputs instead of states × states. K can be computed
via the typical LQR equations based on the algebraic Riccati equation.

If the output is just the state vector, then C = I, D = 0, and the cost functional
simplifies to that of LQR with a state cost.

J = Σ_{k=0}^∞ [x_k; u_k]^T [Q, 0; 0, R] [x_k; u_k]
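As a minimal sketch, the augmented cost matrices from theorem B.2.1 can be assembled and handed to the same Riccati solver used in snippet B.1; the function name below is illustrative.

"""Sketch: LQR gain for an output cost using theorem B.2.1."""

import numpy as np
import scipy as sp


def lqr_output_cost(A, B, C, D, Q, R):
    """Returns the LQR gain K that penalizes y = Cx + Du instead of x.

    Q is outputs x outputs and R is inputs x inputs.
    """
    Q_aug = C.T @ Q @ C
    N_aug = C.T @ Q @ D
    R_aug = D.T @ Q @ D + R
    S = sp.linalg.solve_discrete_are(a=A, b=B, q=Q_aug, r=R_aug, s=N_aug)
    return np.linalg.solve(R_aug + B.T @ S @ B, B.T @ S @ A + N_aug.T)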

B.3 Implicit model following


If we want to design a feedback controller that erases the dynamics of our system and
makes it behave like some other system, we can use implicit model following. This is
used on the Blackhawk helicopter at NASA Ames research center when they want to
make it fly like experimental aircraft (within the limits of the helicopter’s actuators, of
course).

B.3.1 Following reference system matrix


Let the original system dynamics be

xk+1 = Axk + Buk


yk = Cxk

and the desired system dynamics be zk+1 = Aref zk .

yk+1 = Cxk+1
yk+1 = C(Axk + Buk )
yk+1 = CAxk + CBuk

We want to minimize the following cost functional.



J = Σ_{k=0}^∞ ((y_{k+1} − z_{k+1})^T Q (y_{k+1} − z_{k+1}) + u_k^T R u_k)

We’ll be measuring the desired system’s state, so let y = z.

zk+1 = Aref yk
zk+1 = Aref Cxk

Therefore,

yk+1 − zk+1 = CAxk + CBuk − (Aref Cxk )


yk+1 − zk+1 = (CA − Aref C)xk + CBuk

Substitute this into the cost functional.



J = Σ_{k=0}^∞ ((y_{k+1} − z_{k+1})^T Q (y_{k+1} − z_{k+1}) + u_k^T R u_k)
J = Σ_{k=0}^∞ (((CA − A_ref C) x_k + CB u_k)^T Q ((CA − A_ref C) x_k + CB u_k) + u_k^T R u_k)

Apply the transpose to the left-hand side of the Q term.

J = Σ_{k=0}^∞ ((x_k^T (CA − A_ref C)^T + u_k^T (CB)^T) Q ((CA − A_ref C) x_k + CB u_k) + u_k^T R u_k)

Factor out [x_k; u_k]^T from the left side and [x_k; u_k] from the right side of each term.

J = Σ_{k=0}^∞ ([x_k; u_k]^T [(CA − A_ref C)^T; (CB)^T] Q [CA − A_ref C, CB] [x_k; u_k]
    + [x_k; u_k]^T [0, 0; 0, R] [x_k; u_k])

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([(CA − A_ref C)^T; (CB)^T] Q [CA − A_ref C, CB] + [0, 0; 0, R]) [x_k; u_k])

Multiply in Q.

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([(CA − A_ref C)^T Q; (CB)^T Q] [CA − A_ref C, CB] + [0, 0; 0, R]) [x_k; u_k])

Multiply matrices in the left term together.

J = Σ_{k=0}^∞ ([x_k; u_k]^T ([(CA − A_ref C)^T Q (CA − A_ref C), (CA − A_ref C)^T Q (CB);
    (CB)^T Q (CA − A_ref C), (CB)^T Q (CB)] + [0, 0; 0, R]) [x_k; u_k])

Add the terms together.

J = Σ_{k=0}^∞ [x_k; u_k]^T [(CA − A_ref C)^T Q (CA − A_ref C), (CA − A_ref C)^T Q (CB);
    (CB)^T Q (CA − A_ref C), (CB)^T Q (CB) + R] [x_k; u_k]          (B.4)

where the top-left block plays the role of Q, the top-right block plays the role of N, and
the bottom-right block plays the role of R.

Thus, implicit model following can be defined as the following optimization prob-
lem.

Theorem B.3.1 — Implicit model following.

u_k* = arg min_{u_k} Σ_{k=0}^∞ [x_k; u_k]^T [Q_imf, N_imf; N_imf^T, R_imf] [x_k; u_k]
subject to x_{k+1} = A x_k + B u_k          (B.5)

where

Q_imf = (CA − A_ref C)^T Q (CA − A_ref C)
N_imf = (CA − A_ref C)^T Q (CB)
R_imf = (CB)^T Q (CB) + R

The optimal control policy u_k* is −K x_k. K can be computed via the typical LQR
equations based on the algebraic Riccati equation.

The control law u_{imf,k} = −K x_k makes x_{k+1} = A x_k + B u_{imf,k} match the
open-loop response of z_{k+1} = A_ref z_k.
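A minimal sketch of assembling the implicit model following cost matrices and solving for K is shown below; the function name is illustrative.

"""Sketch: implicit model following cost matrices (theorem B.3.1)."""

import numpy as np
import scipy as sp


def imf_lqr(A, B, C, A_ref, Q, R):
    """Returns the gain K for u = -Kx that makes y = Cx behave like
    z_{k+1} = A_ref z_k."""
    Q_imf = (C @ A - A_ref @ C).T @ Q @ (C @ A - A_ref @ C)
    N_imf = (C @ A - A_ref @ C).T @ Q @ (C @ B)
    R_imf = (C @ B).T @ Q @ (C @ B) + R
    S = sp.linalg.solve_discrete_are(a=A, b=B, q=Q_imf, r=R_imf, s=N_imf)
    return np.linalg.solve(R_imf + B.T @ S @ B, B.T @ S @ A + N_imf.T)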
If the original and desired system have the same states, then C = I and the cost
functional simplifies to

J = Σ_{k=0}^∞ [x_k; u_k]^T [(A − A_ref)^T Q (A − A_ref), (A − A_ref)^T Q B;
    B^T Q (A − A_ref), B^T Q B + R] [x_k; u_k]          (B.6)

B.3.2 Following reference input matrix


The feedback control law above makes the open-loop system behave like Aref , but
the input dynamics are still that of the original system. Here’s how to make the input
dynamics behave like Bref . We want to find the uimf,k that makes the real model
follow the reference model.

xk+1 = Axk + Buimf,k


zk+1 = Aref zk + Bref uk

Let x = z.

xk+1 = zk+1
Axk + Buimf,k = Aref xk + Bref uk
Buimf,k = Aref xk − Axk + Bref uk
Buimf,k = (Aref − A)xk + Bref uk
uimf,k = B+ ((Aref − A)xk + Bref uk )
uimf,k = −B+ (A − Aref )xk + B+ Bref uk

The first term makes the open-loop poles match that of the reference model, and the
second term makes the input behave like that of the reference model.

B.4 Time delay compensation


Linear-Quadratic regulator controller gains tend to be aggressive. If sensor measure-
ments are time-delayed too long, the LQR may be unstable (see figure B.1). However,
if we know the amount of delay, we can compute the control based on where the system
will be after the time delay.
We can compensate for the time delay if we know the control law we’re applying in
future timesteps (u = −Kx) and the duration of the time delay. To get the true state
at the current time for control purposes, we project our delayed state forward by the

time delay using our model and the aforementioned control law. Figure B.2 shows
improved control with the predicted state.[2]

Figure B.1: Elevator response at 5 ms sample period with 50 ms of output lag
Figure B.2: Elevator response at 5 ms sample period with 50 ms of output lag (delay-compensated)

For steady-state controller gains, this method of delay compensation seems to work
better for second-order systems than first-order systems. Figures B.3 and B.5 show time
delay for a drivetrain velocity system and flywheel system respectively. Figures B.4 and
B.6 show that compensating the controller gain significantly reduces the feedback gain.
For systems with fast dynamics and a long time delay, the delay-compensated controller
has an almost open-loop response because only feedforward has a significant effect; this
has poor disturbance rejection. Fixing the source of the time delay is always preferred
for these systems.
Since we are computing the control based on future states and the state exponentially
converges to zero over time, the control action we apply at the current timestep also
converges to zero for longer time delays. During startup, the inputs we use to predict
the future state are zero because there’s initially no input history. This means the initial
inputs are larger to give the system a kick in the right direction. As the input delay
buffer fills up, the controller gain converges to a smaller steady-state value. If one uses
the steady-state controller gain during startup, the transient response may be slow.
All figures shown here use the steady-state control law (the second case in equation
(B.12)).

[2] Input delay and output delay have the same effect on the system, so the time delay can be simulated

with either an input delay buffer or a measurement delay buffer.



We’ll show how to derive this controller gain compensation for continuous and discrete
systems.

B.4.1 Continuous case


The undelayed continuous linear system is defined as ẋ = Ax(t) + Bu(t) with the
controller u(t) = −Kx(t). Let L be the amount of time delay in seconds. We can
avoid the time delay if we compute the control based on the plant L seconds in the
future.
u(t) = −Kx(t + L)

We need to find x(t + L) given x(t). Since we know u(t) = −Kx(t) will be applied
over the time interval [t, t + L), substitute it into the continuous model.
ẋ = Ax(t) + Bu(t)
ẋ = Ax(t) + B(−Kx(t))
ẋ = Ax(t) − BKx(t)
ẋ = (A − BK)x(t)

We now have a differential equation for the closed-loop system dynamics. Take the
matrix exponential from the current time t to L in the future to obtain x(t + L).
x(t + L) = e(A−BK)L x(t) (B.7)

This works for t ≥ L, but if t < L, we have no input history for the time interval [t, L).
If we assume the inputs for [t, L) are zero, the state prediction for that interval is
x(L) = eA(L−t) x(t)

The time interval [0, t) has nonzero inputs since it’s in the past and was using the normal
control law.
x(t + L) = e(A−BK)t x(L)
x(t + L) = e(A−BK)t eA(L−t) x(t) (B.8)

Therefore, equations (B.7) and (B.8) give the latency-compensated control law for all
t ≥ 0.
u(t) = −K x(t + L)

u(t) = −K e^{(A−BK)t} e^{A(L−t)} x(t)   if 0 ≤ t < L
u(t) = −K e^{(A−BK)L} x(t)              if t ≥ L          (B.9)

B.4.2 Discrete case


The undelayed discrete linear system is defined as xk+1 = Axk + Buk with the
controller uk = −Kxk . Let L be the amount of time delay in seconds. We can avoid
the time delay if we compute the control based on the plant L seconds in the future.

uk = −Kxk+L/T

We need to find xk+L/T given xk . Since we know uk = −Kxk will be applied for
the timesteps k through k + TL , substitute it into the discrete model.

xk+1 = Axk + Buk


xk+1 = Axk + B(−Kxk )
xk+1 = Axk − BKxk
xk+1 = (A − BK)xk

Let T be the duration between timesteps in seconds and L be the amount of time delay
in seconds. TL gives the number of timesteps represented by L.
L
xk+L/T = (A − BK) T xk (B.10)

This works for kT ≥ L where kT is the current time, but if kT < L, we have no input
history for the time interval [kT, L). If we assume the inputs for [kT, L) are zero, the
state prediction for that interval is
x_{L/T} = A^{(L−kT)/T} x_k
x_{L/T} = A^{L/T − k} x_k

The time interval [0, kT ) has nonzero inputs since it’s in the past and was using the
normal control law.

x_{k+L/T} = (A − BK)^k x_{L/T}
x_{k+L/T} = (A − BK)^k A^{L/T − k} x_k          (B.11)

Therefore, equations (B.10) and (B.11) give the latency-compensated control law for
all t ≥ 0.

u_k = −K x_{k+L/T}

u_k = −K (A − BK)^k A^{L/T − k} x_k   if 0 ≤ k < L/T
u_k = −K (A − BK)^{L/T} x_k           if k ≥ L/T          (B.12)

If the delay L isn’t a multiple of the sample period T in equation (B.12), we have to
evaluate fractional matrix powers, which can be tricky. Eigen (a C++ library) supports
fractional powers with the pow() member function provided by
<unsupported/Eigen/MatrixFunctions>. SciPy (a Python library) supports fractional
powers with the free function
scipy.linalg.fractional_matrix_power(). If the language you’re using doesn’t provide
such a function, you can try the following approach instead.
Let there be a matrix M raised to a fractional power n. If M is diagonalizable, we can
obtain an exact answer for M^n by decomposing M into PDP^{-1} where D is a diagonal
matrix, computing D^n as each diagonal element raised to n, then recomposing M^n as
PD^n P^{-1}.
If a matrix raised to a fractional power in equation (B.12) isn't diagonalizable, we have
to approximate by rounding L/T to the nearest integer. This approximation gets worse
as L mod T approaches T/2.
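For the steady-state case of equation (B.12), the compensated gain is a one-liner with SciPy's fractional matrix power. This sketch assumes a discrete model and an already-computed LQR gain; the function name is illustrative.

"""Sketch: steady-state latency-compensated LQR gain (equation (B.12))."""

from scipy.linalg import fractional_matrix_power


def latency_compensate(A, B, K, T, L):
    """Returns K (A - BK)^(L/T) for the steady-state (k >= L/T) case.

    A, B -- discrete system matrices
    K -- uncompensated LQR gain
    T -- sample period in seconds
    L -- measurement delay in seconds
    """
    return K @ fractional_matrix_power(A - B @ K, L / T)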

Figure B.3: Drivetrain velocity response at 1 ms sample period with 40 ms of output lag
Figure B.4: Drivetrain velocity response at 1 ms sample period with 40 ms of output lag (delay-compensated)
Figure B.5: Flywheel response at 1 ms sample period with 80 ms of output lag
Figure B.6: Flywheel response at 1 ms sample period with 80 ms of output lag (delay-compensated)

C. Feedforwards

This appendix will show the derivations for alternate feedforward formulations.

C.1 QR-weighted linear plant inversion


C.1.1 Setup
Let’s start with the equation for the reference dynamics

rk+1 = Ark + Buk

where uk is the feedforward input. Note that this feedforward equation does not and
should not take into account any feedback terms. We want to find the optimal uk such
that we minimize the tracking error between rk+1 and rk .

rk+1 − Ark = Buk

To solve for uk , we need to take the inverse of the nonsquare matrix B. This isn’t
possible, but we can find the pseudoinverse given some constraints on the state tracking
error and control effort. To find the optimal solution for these sorts of trade-offs, one
can define a cost function and attempt to minimize it. To do this, we’ll first solve the
expression for 0.
0 = Buk − (rk+1 − Ark )

This expression will be the state tracking cost we use in our cost function.
Our cost function will use an H2 norm with Q as the state cost matrix with dimen-
sionality states × states and R as the control input cost matrix with dimensionality
inputs × inputs.

J = (B u_k − (r_{k+1} − A r_k))^T Q (B u_k − (r_{k+1} − A r_k)) + u_k^T R u_k

C.1.2 Minimization
Given theorem 5.12.1 and corollary 5.12.3, find the minimum of J by taking the partial
derivative with respect to uk and setting the result to 0.

∂J/∂u_k = 2 B^T Q (B u_k − (r_{k+1} − A r_k)) + 2 R u_k
0 = 2BT Q(Buk − (rk+1 − Ark )) + 2Ruk
0 = BT Q(Buk − (rk+1 − Ark )) + Ruk
0 = BT QBuk − BT Q(rk+1 − Ark ) + Ruk
BT QBuk + Ruk = BT Q(rk+1 − Ark )
(BT QB + R)uk = BT Q(rk+1 − Ark )
uk = (BT QB + R)−1 BT Q(rk+1 − Ark )

Theorem C.1.1 — QR-weighted linear plant inversion. Given the discrete
model x_{k+1} = Ax_k + Bu_k, the plant inversion feedforward is

u_k = K_{ff}(r_{k+1} − Ar_k)

where K_{ff} = (B^T QB + R)^{−1} B^T Q, r_{k+1} is the reference at the next timestep,
and r_k is the reference at the current timestep.
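Here's a minimal NumPy sketch of theorem C.1.1. The A, B, Q, and R matrices below are placeholders rather than a model from this book.

```python
import numpy as np

# Placeholder discrete model x_{k+1} = A x_k + B u_k and LQR-style weights.
A = np.array([[1.0, 0.02],
              [0.0, 0.95]])
B = np.array([[0.0],
              [0.01]])
Q = np.diag([10.0, 1.0])  # state tracking cost
R = np.diag([0.1])        # control effort cost

# K_ff = (B^T Q B + R)^-1 B^T Q
K_ff = np.linalg.solve(B.T @ Q @ B + R, B.T @ Q)

# u_k = K_ff (r_{k+1} - A r_k)
r_k = np.array([[0.0], [0.0]])
r_k1 = np.array([[0.02], [1.0]])
u_k = K_ff @ (r_k1 - A @ r_k)
print(u_k)
```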

Figure C.1 shows plant inversion applied to a second-order CIM motor model.
Plant inversion isn’t as effective with both Q and R cost because the R matrix penalized
control effort. The reference tracking with no cost matrices is much better.

Figure C.1: Second-order CIM motor response with plant inversion

C.2 Steady-state feedforward


Steady-state feedforwards apply the control effort required to keep a system at the ref-
erence if it is no longer moving (i.e., the system is at steady-state). The first steady-state
feedforward converts desired outputs to desired states.

x_c = N_x y_c

Nx converts desired outputs yc to desired states xc (also known as r). For steady-state,
that is
xss = Nx yss (C.1)

The second steady-state feedforward converts the desired outputs y to the control input
required at steady-state.
u_c = N_u y_c

Nu converts the desired outputs y to the control input u required at steady-state. For
steady-state, that is
uss = Nu yss (C.2)

C.2.1 Continuous case


To find the control input required at steady-state, set equation (6.3) to zero.

ẋ = Ax + Bu
y = Cx + Du

0 = Axss + Buss
yss = Cxss + Duss

0 = AN_x y_{ss} + BN_u y_{ss}
y_{ss} = CN_x y_{ss} + DN_u y_{ss}

\begin{bmatrix} 0 \\ y_{ss} \end{bmatrix} = \begin{bmatrix} AN_x + BN_u \\ CN_x + DN_u \end{bmatrix} y_{ss}

\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} AN_x + BN_u \\ CN_x + DN_u \end{bmatrix}

\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} N_x \\ N_u \end{bmatrix}

\begin{bmatrix} N_x \\ N_u \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^+ \begin{bmatrix} 0 \\ 1 \end{bmatrix}

where + denotes the Moore-Penrose pseudoinverse.

C.2.2 Discrete case


Now, we’ll do the same thing for the discrete system. To find the control input required
at steady-state, set xk+1 = Axk + Buk to zero.

xk+1 = Axk + Buk


yk = Cxk + Duk

xss = Axss + Buss


yss = Cxss + Duss

0 = (A − I)xss + Buss
yss = Cxss + Duss

0 = (A − I)N_x y_{ss} + BN_u y_{ss}
y_{ss} = CN_x y_{ss} + DN_u y_{ss}

\begin{bmatrix} 0 \\ y_{ss} \end{bmatrix} = \begin{bmatrix} (A − I)N_x + BN_u \\ CN_x + DN_u \end{bmatrix} y_{ss}

\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} (A − I)N_x + BN_u \\ CN_x + DN_u \end{bmatrix}

\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} A − I & B \\ C & D \end{bmatrix} \begin{bmatrix} N_x \\ N_u \end{bmatrix}

\begin{bmatrix} N_x \\ N_u \end{bmatrix} = \begin{bmatrix} A − I & B \\ C & D \end{bmatrix}^+ \begin{bmatrix} 0 \\ 1 \end{bmatrix}

where + denotes the Moore-Penrose pseudoinverse.

C.2.3 Deriving steady-state input


Now, we’ll find an expression that uses Nx and Nu to convert the reference r to a
control input feedforward uf f . Let’s start with equation (C.1).

x_{ss} = N_x y_{ss}
N_x^+ x_{ss} = y_{ss}

Now substitute this into equation (C.2).

u_{ss} = N_u y_{ss}
u_{ss} = N_u (N_x^+ x_{ss})
u_{ss} = N_u N_x^+ x_{ss}

u_{ss} and x_{ss} are also known as u_{ff} and r respectively.

u_{ff} = N_u N_x^+ r

So all together, we get theorem C.2.1.



Theorem C.2.1 — Steady-state feedforward.

Continuous:
\begin{bmatrix} N_x \\ N_u \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^+ \begin{bmatrix} 0 \\ 1 \end{bmatrix}   (C.3)
u_{ff} = N_u N_x^+ r   (C.4)

Discrete:
\begin{bmatrix} N_x \\ N_u \end{bmatrix} = \begin{bmatrix} A − I & B \\ C & D \end{bmatrix}^+ \begin{bmatrix} 0 \\ 1 \end{bmatrix}   (C.5)
u_{ff,k} = N_u N_x^+ r_k   (C.6)

In the augmented matrix, B should contain one column corresponding to an actuator
and C should contain one row whose output will be driven by that actuator. More
than one actuator or output can be included in the computation at once, but the result
won't be the same as if they were computed independently and summed afterward.
After computing the feedforward for each actuator-output pair, the respective
collections of Nx and Nu matrices can be summed to produce the combined feedforward.

If the augmented matrix in theorem C.2.1 is square (number of inputs = number of


outputs), the normal matrix inverse can be used instead.
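Below is a minimal NumPy sketch of theorem C.2.1 for a hypothetical mass-spring-damper (not a plant from this book); np.linalg.pinv computes the Moore-Penrose pseudoinverse.

```python
import numpy as np

# Hypothetical mass-spring-damper: x = [position, velocity], one actuator, one output.
A = np.array([[0.0, 1.0],
              [-4.0, -1.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

states, outputs = A.shape[0], C.shape[0]
augmented = np.block([[A, B],
                      [C, D]])
rhs = np.vstack([np.zeros((states, outputs)), np.eye(outputs)])

N = np.linalg.pinv(augmented) @ rhs
Nx, Nu = N[:states, :], N[states:, :]

# u_ff = Nu Nx^+ r; holding position 1 against the spring requires a nonzero input.
r = np.array([[1.0], [0.0]])
u_ff = Nu @ np.linalg.pinv(Nx) @ r
print(u_ff)  # [[4.]] for this made-up plant (spring constant of 4)
```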
Case study: second-order CIM motor model
Each feedforward implementation has advantages. The steady-state feedforward al-
lows using specific actuators to maintain the reference at steady-state. Plant inversion
doesn’t support this, but can be used for reference tracking as well with the same tuning
parameters as LQR design. Figure C.2 shows both types of feedforwards applied to a
second-order CIM motor model.
Plant inversion isn’t as effective in figure C.2 because the R matrix penalized control
effort. If the R cost matrix is removed from the plant inversion calculation, the refer-
ence tracking is much better (see figure C.3).

Figure C.2: Second-order CIM motor response with various feedforwards
Figure C.3: Second-order CIM motor response with plant inversions

D. Derivations

D.1 Linear system zero-order hold


Starting with the continuous model
ẋ(t) = Ax(t) + Bu(t)

by premultiplying the model by e−At , we get


e−At ẋ(t) = e−At Ax(t) + e−At Bu(t)
e−At ẋ(t) − e−At Ax(t) = e−At Bu(t)

The derivative of the matrix exponential is


\frac{d}{dt} e^{At} = Ae^{At} = e^{At}A

so we recognize the previous equation as


\frac{d}{dt}\left(e^{−At} x(t)\right) = e^{−At} Bu(t)

By integrating this equation, we get


e^{−At} x(t) − x(0) = \int_0^t e^{−Aτ} Bu(τ) dτ

x(t) = e^{At} x(0) + \int_0^t e^{A(t−τ)} Bu(τ) dτ

which is an analytical solution to the continuous model. Now we want to discretize it.
x_k \stackrel{\text{def}}{=} x(kT)

x_k = e^{AkT} x(0) + \int_0^{kT} e^{A(kT−τ)} Bu(τ) dτ

x_{k+1} = e^{A(k+1)T} x(0) + \int_0^{(k+1)T} e^{A((k+1)T−τ)} Bu(τ) dτ

x_{k+1} = e^{A(k+1)T} x(0) + \int_0^{kT} e^{A((k+1)T−τ)} Bu(τ) dτ + \int_{kT}^{(k+1)T} e^{A((k+1)T−τ)} Bu(τ) dτ

x_{k+1} = e^{A(k+1)T} x(0) + \int_0^{kT} e^{A((k+1)T−τ)} Bu(τ) dτ + \int_{kT}^{(k+1)T} e^{A(kT+T−τ)} Bu(τ) dτ

x_{k+1} = e^{AT} \underbrace{\left(e^{AkT} x(0) + \int_0^{kT} e^{A(kT−τ)} Bu(τ) dτ\right)}_{x_k} + \int_{kT}^{(k+1)T} e^{A(kT+T−τ)} Bu(τ) dτ

We assume that u is constant during each timestep, so it can be pulled out of the integral.

x_{k+1} = e^{AT} x_k + \left(\int_{kT}^{(k+1)T} e^{A(kT+T−τ)} dτ\right) Bu_k

The second term can be simplified by substituting it with the function v(τ) = kT +
T − τ. Note that dτ = −dv.

x_{k+1} = e^{AT} x_k − \left(\int_{v(kT)}^{v((k+1)T)} e^{Av} dv\right) Bu_k
x_{k+1} = e^{AT} x_k − \left(\int_T^0 e^{Av} dv\right) Bu_k
x_{k+1} = e^{AT} x_k + \left(\int_0^T e^{Av} dv\right) Bu_k
x_{k+1} = e^{AT} x_k + A^{−1} e^{Av} \big|_0^T Bu_k
x_{k+1} = e^{AT} x_k + A^{−1}(e^{AT} − e^{A0})Bu_k
x_{k+1} = e^{AT} x_k + A^{−1}(e^{AT} − I)Bu_k

which is an exact solution to the discretization problem.
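As a quick numerical sanity check of this result (not from the book), the sketch below discretizes a made-up invertible A with the formula above and cross-checks it against the augmented-matrix form of the same integral.

```python
import numpy as np
from scipy.linalg import expm

# Made-up continuous model xdot = A x + B u with invertible A, and sample period T.
A = np.array([[-1.0, 1.0],
              [0.0, -3.0]])
B = np.array([[0.0],
              [2.0]])
T = 0.02

# A_d = e^{AT}, B_d = A^{-1}(e^{AT} - I)B
A_d = expm(A * T)
B_d = np.linalg.solve(A, A_d - np.eye(2)) @ B

# Cross-check against the augmented-matrix form of the same integral.
M = np.zeros((3, 3))
M[:2, :2] = A
M[:2, 2:] = B
M_d = expm(M * T)
print(np.allclose(A_d, M_d[:2, :2]), np.allclose(B_d, M_d[:2, 2:]))  # True True
```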

D.2 Kalman filter as Luenberger observer


A Luenberger observer is defined as

x̂^+_{k+1} = Ax̂^−_k + Bu_k + L(y_k − ŷ_k)   (D.1)
ŷ_k = Cx̂^−_k   (D.2)

where a superscript of minus denotes a priori and plus denotes a posteriori estimate.
Combining equation (D.1) and equation (D.2) gives

x̂^+_{k+1} = Ax̂^−_k + Bu_k + L(y_k − Cx̂^−_k)   (D.3)

The following is a Kalman filter that considers the current update step and the next
predict step together rather than the current predict step and current update step.

Update step
K_k = P^−_k C^T (CP^−_k C^T + R)^{−1}   (D.4)
x̂^+_k = x̂^−_k + K_k(y_k − Cx̂^−_k)   (D.5)
P^+_k = (I − K_k C)P^−_k   (D.6)
Predict step
x̂^+_{k+1} = Ax̂^+_k + Bu_k   (D.7)
P^−_{k+1} = AP^+_k A^T + Q   (D.8)

Substitute equation (D.5) into equation (D.7).


x̂^+_{k+1} = A(x̂^−_k + K_k(y_k − Cx̂^−_k)) + Bu_k
x̂^+_{k+1} = Ax̂^−_k + AK_k(y_k − Cx̂^−_k) + Bu_k
x̂^+_{k+1} = Ax̂^−_k + Bu_k + AK_k(y_k − Cx̂^−_k)

Let L = AK_k.

x̂^+_{k+1} = Ax̂^−_k + Bu_k + L(y_k − Cx̂^−_k)   (D.9)

which matches equation (D.3). Therefore, the eigenvalues of the Kalman filter observer
can be obtained by

eig(A − LC)
eig(A − (AKk )(C))
eig(A(I − Kk C)) (D.10)
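A small sketch of equation (D.10) with made-up A, C, and steady-state Kalman gain K; both expressions should print the same eigenvalues.

```python
import numpy as np

# Made-up system matrices and steady-state Kalman gain for illustration.
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
C = np.array([[1.0, 0.0]])
K = np.array([[0.5],
              [0.3]])

L = A @ K  # Luenberger gain corresponding to the Kalman gain

# Both forms of equation (D.10) give the same observer eigenvalues.
print(np.linalg.eigvals(A - L @ C))
print(np.linalg.eigvals(A @ (np.eye(2) - K @ C)))
```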

D.2.1 Luenberger observer with separate prediction and update


To run a Luenberger observer with separate prediction and update steps, substitute the
relationship between the Luenberger observer and Kalman filter matrices derived above
into the Kalman filter equations.
Appendix D.2 shows that L = AKk . Since L and A are constant, one must assume
Kk has reached steady-state. Then, K = A−1 L. Substitute this into the Kalman filter
update equation.
x̂^+_{k+1} = x̂^−_{k+1} + K(y_{k+1} − Cx̂^−_{k+1})

x̂^+_{k+1} = x̂^−_{k+1} + A^{−1}L(y_{k+1} − Cx̂^−_{k+1})

Substitute in equation (9.4).


x̂^+_{k+1} = x̂^−_{k+1} + A^{−1}L(y_{k+1} − ŷ_{k+1})

The predict step is the same as the Kalman filter’s. Therefore, a Luenberger observer
run with prediction and update steps is written as follows.

Predict step
x̂^−_{k+1} = Ax̂^−_k + Bu_k   (D.11)
Update step
x̂^+_{k+1} = x̂^−_{k+1} + A^{−1}L(y_{k+1} − ŷ_{k+1})   (D.12)
ŷ_{k+1} = Cx̂^−_{k+1}   (D.13)

E. Classical control theory

This appendix is provided for those who are curious about a lower-level interpretation
of control systems. It describes what a transfer function is and shows how they can be
used to analyze dynamical systems. Emphasis is placed on the geometric intuition of
this analysis rather than the frequency domain math. Many tools exclusive to classical
control theory (root locus, Bode plots, Nyquist plots, etc.) aren’t useful for or relevant
to the examples presented in the main chapters, so they would only complicate the
learning process.
With classical control theory’s geometric interpretation, one can perform stability and
robustness analyses and design reasonable controllers for systems on the back of a nap-
kin. However, computing power is much more plentiful nowadays; we should take
advantage of the modern tools this enables to better express the controls designer’s
intent (for example, via optimal control criteria rather than frequency response charac-
teristics).
Classical control theory should only be used when the controls designer cares about
directly shaping the frequency response of a system. Electrical engineers do this a lot,
but all the other university controls students have been forced to learn it too regardless
of its utility in other disciplines.

E.1 Classical vs modern control theory


State-space notation provides a more convenient and compact way to model and ana-
lyze systems with multiple inputs and outputs. For a system with p inputs and q outputs,
we would have to write q × p transfer functions to represent it. Not only is the resulting
algebra unwieldy, but it only works for linear systems. Including nonzero initial con-
ditions complicates the algebra even more. State-space representation uses the time
domain instead of the Laplace domain, so it can model nonlinear systems[1] and triv-
ially supports nonzero initial conditions.
If modern control theory is so great and classical control theory isn’t needed to use it,
why learn classical control theory at all? We teach classical control theory because it
provides a framework within which to understand results from the mathematical ma-
chinery of modern control as well as vocabulary with which to communicate that un-
derstanding. For example, faster poles (poles moved to the left in the s-plane) mean
faster decay, and oscillation means there is at least one pair of complex conjugate poles.
Not only can you describe what happened succinctly, but you know why it happened
from a theoretical perspective.
This book uses LQR and modern control over, say, loop shaping with Bode and Nyquist
plots because we have accurate dynamical models to leverage, and LQR allows directly
expressing what the author is concerned with optimizing: state excursion relative to
control effort. Applying lead and lag compensators, while effective for robust controller
design, doesn’t provide the same expressive power.

E.2 Transfer functions


We will briefly discuss what transfer functions are, how the locations of poles and zeroes
affect system response and stability, and how controllers affect pole locations.

E.2.1 Laplace transform


For an introduction to Laplace transforms and the geometric intuition behind transfer
functions, we recommend the YouTube video “What does the Laplace Transform really
tell us?” by Zach Star.
A more mathematical introduction is presented later in chapter E.3.

E.2.2 Parts of a transfer function


A transfer function maps an input coordinate to an output coordinate in the Laplace
domain. These can be obtained by applying the Laplace transform to a differential
[1] This book primarily focuses on analysis and control of linear systems. See chapter 8 for more on

nonlinear control.

“What does the Laplace Transform really tell us?” (21 minutes)
Zach Star
https://youtu.be/n2y7n6jw5d0

equation and rearranging the terms to obtain a ratio of the output variable to the input
variable. Equation (E.1) is an example of a transfer function.
H(s) = \frac{(s − 9 + 9i)(s − 9 − 9i)}{s(s + 10)}   (E.1)

where the factors in the numerator are the zeroes and the factors in the denominator are the poles.

Poles and zeroes


The roots of factors in the numerator of a transfer function are called zeroes because
they make the transfer function approach zero. Likewise, the roots of factors in the de-
nominator of a transfer function are called poles because they make the transfer function
approach infinity; on a 3D graph, these look like the poles of a circus tent (see figure
E.1).
When the factors of the denominator are broken apart using partial fraction expansion
into something like \frac{A}{s+a} + \frac{B}{s+b}, the constants A and B are called residues, which
determine how much each pole contributes to the system response.
The factors representing poles are each the Laplace transform of a decaying exponen-
tial.[2] That means the time domain responses of systems comprise decaying exponen-
tials (e.g., y = e−t ).

Remark: Imaginary poles and zeroes always come in complex conjugate pairs (e.g., −2 + 3i, −2 − 3i).

The locations of the closed-loop poles in the complex plane determine the stability of
the system. Each pole represents a frequency mode of the system, and their location
determines how much of each response is induced for a given input frequency. Figure
[2]We are handwaving Laplace transform derivations because they are complicated and neither relevant

nor useful.

Figure E.1: Equation (E.1) plotted in 3D



E.2 shows the impulse responses in the time domain for transfer functions with various
pole locations. They all have an initial condition of 1.

Figure E.2: Impulse response vs pole location

Poles in the left half-plane (LHP) are stable; the system’s output may oscillate but
it converges to steady-state. Poles on the imaginary axis are marginally stable; the
system’s output oscillates at a constant amplitude forever. Poles in the right half-plane
(RHP) are unstable; the system’s output grows without bound.
Nonminimum phase zeroes
While poles in the RHP are unstable, the same is not true for zeroes. They can be
characterized by the system initially moving in the wrong direction before heading to-
ward the reference. Since the poles always move toward the zeroes, zeroes impose a
“speed limit” on the system response because it takes a finite amount of time to move
the wrong direction, then change directions.
One example of this type of system is bicycle steering. Try riding a bicycle without
holding the handle bars, then poke the right handle; the bicycle turns right. Further-
more, if one is holding the handlebars and wants to turn left, rotating the handlebars
counterclockwise will make the bicycle fall toward the right. The rider has to lean into
the turn and overpower the nonminimum phase dynamics to go the desired direction.
Another example is a Segway. To move forward by some distance, the Segway must
first roll backward to rotate the Segway forward. Once the Segway starts falling in

that direction, it begins rolling forward to avoid falling over until it reaches the target
distance. At that point, the Segway increases its forward speed to pitch backward and
slow itself down. To come to a stop, the Segway rolls backward again to level itself out.
Pole-zero cancellation
Pole-zero cancellation occurs when a pole and zero are located at the same place in the
s-plane. This effectively eliminates the contribution of each to the system dynamics.
By placing poles and zeroes at various locations (this is done by placing transfer func-
tions in series), we can eliminate undesired system dynamics. While this may appear
to be a useful design tool at first, there are major caveats. Most of these are due to
model uncertainty resulting in poles which aren’t in the locations the controls designer
expected.
Notch filters are typically used to dampen a specific range of frequencies in the system
response. If its band is made too narrow, it can still leave the undesirable dynamics,
but now you can no longer measure them in the response. They are still happening, but
they are what’s called unobservable.
Never pole-zero cancel unstable or nonminimum phase dynamics. If the model doesn’t
quite reflect reality, an attempted pole cancellation by placing a nonminimum phase
zero results in the pole still moving to the zero placed next to it. You have the same
dynamics as before, but the pole is also stuck where it is no matter how much feedback
gain is applied. For an attempted nonminimum phase zero cancellation, you have ef-
fectively placed an unstable pole that’s unobservable. This means the system will be
going unstable and blowing up, but you won’t be able to detect this and react to it.
Keep in mind when making design decisions that the model likely isn’t perfect. The
whole point of feedback control is to be robust to this kind of uncertainty.

E.2.3 Transfer functions in feedback


For controllers to regulate a system or track a reference, they must be placed in positive
or negative feedback with the plant (whether to use positive or negative depends on
the plant in question). Stable feedback loops attempt to make the output equal the
reference.
Derivation
Given the feedback network in figure E.3, find an expression for Y (s).
Y (s) = Z(s)G(s)
Z(s) = X(s) − Y (s)H(s)
X(s) = Z(s) + Y (s)H(s)
X(s) = Z(s) + Z(s)G(s)H(s)

Figure E.3: Closed-loop block diagram (input X(s), summing-junction output Z(s), plant G(s), feedback H(s), output Y(s))

\frac{Y(s)}{X(s)} = \frac{Z(s)G(s)}{Z(s) + Z(s)G(s)H(s)}

\frac{Y(s)}{X(s)} = \frac{G(s)}{1 + G(s)H(s)}   (E.2)

A more general form is


Y (s) G(s)
= (E.3)
X(s) 1 ∓ G(s)H(s)

where positive feedback uses the top sign and negative feedback uses the bottom sign.
Control system with feedback

Figure E.4: Feedback controller block diagram

X(s): input; K: controller gain; G: plant transfer function; H: measurement transfer function; Y(s): output

Following equation (E.3), the transfer function of figure E.4, a control system diagram
with negative feedback, from input to output is
G_{cl}(s) = \frac{Y(s)}{X(s)} = \frac{KG}{1 + KGH}   (E.4)

The numerator is the open-loop gain and the denominator is one plus the gain around
the feedback loop, which may include parts of the open-loop gain. As another example,
the transfer function from the input to the error is

G_{cl}(s) = \frac{E(s)}{X(s)} = \frac{1}{1 + KGH}   (E.5)

The roots of the denominator of Gcl (s) are different from those of the open-loop trans-
fer function KG(s). These are called the closed-loop poles.

DC motor transfer function


If poles are much farther left in the LHP than the typical system dynamics exhibit,
they can be considered negligible. Every system has some form of unmodeled high fre-
quency, nonlinear dynamics, but they can be safely ignored depending on the operating
regime.
To demonstrate this, consider the transfer function for a second-order DC brushed mo-
tor (a CIM motor) from voltage to velocity.

G(s) = \frac{K}{(Js + b)(Ls + R) + K^2}

where J = 3.2284 × 10^{−6} kg-m², b = 3.5077 × 10^{−6} N-m-s, K_e = K_t = 0.0181 V/rad/s, R = 0.0902 Ω, and L = 230 × 10^{−6} H.
This system is second-order because it has two poles; one corresponds to velocity and
the other corresponds to current. To make position the output instead, we’ll multiply
by 1/s because position is the integral of velocity.[3]

G(s) = \frac{K}{s((Js + b)(Ls + R) + K^2)}

Compare the step response of this system (figure E.5) with the step response of this
system with L set to zero (figure E.6). For small values of K, both systems are stable
and have nearly indistinguishable step responses due to the exceedingly small contribu-
tion from the fast pole. The high frequency dynamics only cause instability for large
values of K that induce fast system responses. In other words, the system responses of
the second-order model and its first-order approximation are similar for low frequency
operating regimes.

[3] This also has the effect of augmenting the system with a position state, making it a third-order system.

Figure E.5: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH)
Figure E.6: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH)
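For those who want to reproduce a comparison like figures E.5 and E.6, here's a hedged sketch using scipy.signal; the proportional gain Kp is a made-up placeholder, and the closed-loop transfer function Kp G/(1 + Kp G) follows equation (E.4) with H = 1.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

# CIM motor constants from the text.
J = 3.2284e-6   # rotor inertia, kg-m^2
b = 3.5077e-6   # viscous friction, N-m-s
K = 0.0181      # Ke = Kt
R = 0.0902      # resistance, ohms
Kp = 1.0        # hypothetical proportional gain (placeholder)

t = np.linspace(0, 0.25, 2000)
for L, label in [(230e-6, "L = 230 uH"), (0.0, "L = 0 uH")]:
    # Open loop: G(s) = K / (s((Js + b)(Ls + R) + K^2)).
    # Unity-feedback closed loop: Kp*K / (s((Js + b)(Ls + R) + K^2) + Kp*K).
    if L == 0.0:
        den = [J * R, b * R + K**2, Kp * K]
    else:
        den = [J * L, J * R + b * L, b * R + K**2, Kp * K]
    closed_loop = signal.TransferFunction([Kp * K], den)
    _, y = signal.step(closed_loop, T=t)
    plt.plot(t, y, label=label)

plt.xlabel("Time (s)")
plt.ylabel("Position (rad)")
plt.legend()
plt.show()
```

With a small gain like this, the two responses overlap almost exactly, which is the point the figures make.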

Why can’t unstable poles close to the origin be ignored in the same way? The response
of high frequency stable poles decays rapidly. Unstable poles, on the other hand, repre-
sent unstable dynamics which cause the system output to grow to infinity. Regardless
of how slow these unstable dynamics are, they will eventually dominate the response.

E.3 Laplace domain analysis


This chapter uses Laplace transforms and transfer functions to analyze properties of
control systems like steady-state error.
These case studies cover various aspects of PID control using the algebraic approach
of transfer functions. For this, we’ll be using equation (E.6), the transfer function for a
PID controller.
K(s) = K_p + \frac{K_i}{s} + K_d s   (E.6)
First, we will define mathematically what Laplace transforms and transfer functions are,
which is rooted in the concept of orthogonal projections.

E.3.1 Projections
Consider a two-dimensional Euclidean space R2 shown in figure E.7 (each R is a di-
mension whose domain is the set of real numbers, so R2 is the standard x-y plane).

Figure E.7: Euclidean space R2

Ordinarily, we notate points in this plane by their components in the set of basis vectors
{î, ĵ}, where î (pronounced i-hat) is the unit vector in the positive x direction and ĵ
is the unit vector in the positive y direction. Figure E.8 shows an example vector v in
this basis.

Figure E.8: v = 2î + 1.5ĵ with basis set {î, ĵ}

How do we find the coordinates of v in this basis mathematically? As long as the basis
is orthogonal (i.e., the basis vectors are at right angles to each other), we simply take
the orthogonal projection of v onto î and ĵ. Intuitively, this means finding “the amount
of v that points in the direction of î or ĵ”. Note that a set of orthogonal vectors have a
dot product of zero with respect to each other.
More formally, we can calculate projections with the dot product - the projection of v
onto any other vector w is as follows.
proj_w v = \frac{v · w}{|w|}

Since î and ĵ are unit vectors, their magnitudes are 1 so the coordinates of v are v · î
and v · ĵ.
We can use this same process to find the coordinates of v in any orthogonal basis. For
example, imagine the basis {î + ĵ, î − ĵ} - the coordinates in this basis are given by
v·(î+ĵ)

2
and v·(√î−
2
ĵ)
. Let’s “unwrap” the formula for dot product and look a bit more
closely.
1 X
n
v · (î + ĵ)
√ =√ vi (î + ĵ)i
2 2 i=0

where the subscript i denotes which component of each vector and n is the total number
304 Appendix E. Classical control theory

of components. To change coordinates, we expanded both v and î + ĵ in a basis,


multiplied their components, and added them up. Here’s a more concrete example.
Let v = 2î + 1.5ĵ from figure E.8. First, we’ll project v onto the î + ĵ basis vector.

\frac{v · (î + ĵ)}{\sqrt{2}} = \frac{1}{\sqrt{2}}(2î · î + 1.5ĵ · ĵ)
\frac{v · (î + ĵ)}{\sqrt{2}} = \frac{1}{\sqrt{2}}(2 + 1.5)
\frac{v · (î + ĵ)}{\sqrt{2}} = \frac{3.5}{\sqrt{2}}
\frac{v · (î + ĵ)}{\sqrt{2}} = \frac{3.5\sqrt{2}}{2}
\frac{v · (î + ĵ)}{\sqrt{2}} = 1.75\sqrt{2}

Next, we’ll project v onto the î − ĵ basis vector.

\frac{v · (î − ĵ)}{\sqrt{2}} = \frac{1}{\sqrt{2}}(2î · î − 1.5ĵ · ĵ)
\frac{v · (î − ĵ)}{\sqrt{2}} = \frac{1}{\sqrt{2}}(2 − 1.5)
\frac{v · (î − ĵ)}{\sqrt{2}} = \frac{0.5}{\sqrt{2}}
\frac{v · (î − ĵ)}{\sqrt{2}} = \frac{0.5\sqrt{2}}{2}
\frac{v · (î − ĵ)}{\sqrt{2}} = 0.25\sqrt{2}
Figure E.9 shows this result geometrically with respect to the basis {î + ĵ, î − ĵ}.
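A quick NumPy check of the numbers above (this snippet is illustrative, not from the book):

```python
import numpy as np

v = np.array([2.0, 1.5])
b1 = np.array([1.0, 1.0])    # i-hat + j-hat
b2 = np.array([1.0, -1.0])   # i-hat - j-hat

# Projection of v onto each basis vector: (v . w) / |w|
print(np.dot(v, b1) / np.linalg.norm(b1), 1.75 * np.sqrt(2))  # ~2.475 both
print(np.dot(v, b2) / np.linalg.norm(b2), 0.25 * np.sqrt(2))  # ~0.354 both
```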
The previous example was only a change of coordinates in a finite-dimensional vector
space. However, as we will see, the core idea does not change much when we move to
more complicated structures. Observe the formula for the Fourier transform.
\hat{f}(ξ) = \int_{−∞}^{∞} f(x) e^{−2πixξ} dx \quad \text{where } ξ ∈ \mathbb{R}


0.25 2
v

î + ĵ √
1.75 2

x
î − ĵ

Figure E.9: v with basis {î + ĵ, î − ĵ}

This is fundamentally the same formula we had before. f (x) has taken the place of
vn , e−2πixξ has taken the place of (î + ĵ)i , and the sum over i has turned into an
integral over dx, but the underlying concept is the same. To change coordinates in a
function space, we simply take the orthogonal projection onto our new basis functions.
In the case of the Fourier transform, the function basis is the family of functions of the
form f (x) = e−2πixξ for ξ ∈ R. Since these functions are oscillatory at a frequency
determined by ξ, we can think of this as a “frequency basis”.

Remark: Watch the "Abstract vector spaces" video from 3Blue1Brown's Essence of linear
algebra series for a more geometric introduction to using functions as a basis.

“Abstract vector spaces” (17 minutes)


3Blue1Brown
https://www.3blue1brown.com/lessons/abstract-vector-spaces

Now, the Laplace transform is somewhat more complicated - as it turns out, the Fourier

basis is orthogonal, so the analogy to the simpler vector space holds almost-precisely.
The Laplace transform is not orthogonal, so we can’t interpret it strictly as a change of
coordinates in the traditional sense. However, the intuition is the same: we are taking
the orthogonal projection of our original function onto the functions of our new basis
set.

F(s) = \int_0^∞ f(t) e^{−st} dt, \quad \text{where } s ∈ \mathbb{C}

Here, it becomes obvious that the Laplace transform is a generalization of the Fourier
transform in that the basis family is strictly larger (we have allowed the “frequency”
parameter to take complex values, as opposed to merely real values). As a result, the
Laplace basis contains functions that grow and decay, while the Fourier basis does not.

E.3.2 Fourier transform


The Fourier transform decomposes a function of time into its component frequencies.
Each of these frequencies is part of what’s called a basis. These basis waveforms can be
multiplied by their respective contribution amount and summed to produce the original
signal (this weighted sum is called a linear combination). In other words, the Fourier
transform provides a way for us to determine, given some signal, what frequencies can
we add together and in what amounts to produce the original signal.
Think of an Fmajor4 chord which has the notes F4 (349.23 Hz), A4 (440 Hz), and C4
(261.63 Hz). The waveform over time looks like figure E.10.
Notice how this complex waveform can be represented just by three frequencies. They
show up as Dirac delta functions[4] in the frequency domain with the area underneath
them equal to their contribution (see figure E.11).

E.3.3 Laplace transform


The Laplace domain is a generalization of the frequency domain that has the frequency
(jω) on the imaginary y-axis and a real number on the x-axis, yielding a two-dimensional
coordinate system. We represent coordinates in this space as a complex number s =
σ+jω. The real part σ corresponds to the x-axis and the imaginary part jω corresponds
to the y-axis (see figure E.12).
To extend our analogy of each coordinate being represented by some basis, we now
have the y coordinate representing the oscillation frequency of the system response
(the frequency domain) and also the x coordinate representing the speed at which that

[4] The Dirac delta function is zero everywhere except at the origin. The nonzero region has an infinitesimal

width and has a height such that the area within that region is 1.

Figure E.10: Frequency decomposition of Fmajor4 chord

Figure E.11: Fourier transform of Fmajor4 chord



Figure E.12: Laplace domain (horizontal axis Re(σ), vertical axis Im(jω))

oscillation decays and the system converges to zero (i.e., a decaying exponential). Fig-
ure E.2 shows this for various points.
If we move the component frequencies in the Fmajor4 chord example parallel to the
real axis to σ = −25, the resulting time domain response attenuates according to the
decaying exponential e−25t (see figure E.13).
Note that this explanation as a basis isn’t exact because the Laplace basis isn’t orthogonal
(that is, the x and y coordinates affect each other and have cross-talk). In the frequency
domain, we had a basis of sine waves that we represented as delta functions in the
frequency domain. Each frequency contribution was independent of the others. In the
Laplace domain, this is not the case; a pure exponential is \frac{1}{s−a} (a rational function
where a is a real number) instead of a delta function. This function is nonzero at points
that aren’t actually frequencies present in the time domain. Figure E.14 demonstrates
this, which shows the Laplace transform of the Fmajor4 chord plotted in 3D.
Notice how the values of the function around each component frequency decrease
according to \frac{1}{\sqrt{x^2 + y^2}} in the x and y directions (in just the x direction, it would be \frac{1}{x}).

Figure E.13: Fmajor4 chord at σ = 0 and σ = −25

Figure E.14: Laplace transform of Fmajor4 chord plotted in 3D



E.3.4 Laplace transform definition


The Laplace transform of a function f (t) is defined as
\mathcal{L}\{f(t)\} = F(s) = \int_0^∞ f(t) e^{−st} dt

We won’t be computing any Laplace transforms by hand using this formula (everyone in
the real world looks these up in a table anyway). Common Laplace transforms (assum-
ing zero initial conditions) are shown in table E.1. Of particular note are the Laplace
transforms for the derivative, unit step,[5] and exponential decay. We can see that a
derivative is equivalent to multiplying by s, and an integral is equivalent to multiplying
by 1/s.

                     Time domain           Laplace domain
Linearity            a f(t) + b g(t)       a F(s) + b G(s)
Convolution          (f ∗ g)(t)            F(s) G(s)
Derivative           ḟ(t)                  s F(s)
nth derivative       f^(n)(t)              s^n F(s)
Unit step            u(t)                  1/s
Ramp                 t u(t)                1/s^2
Exponential decay    e^{−αt} u(t)          1/(s + α)

Table E.1: Common Laplace transforms and Laplace transform properties with zero
initial conditions

E.3.5 Steady-state error


To demonstrate the problem of steady-state error, we will use a DC brushed motor
controlled by a velocity PID controller. A DC brushed motor has a transfer function
from voltage (V ) to angular velocity (θ̇) of
G(s) = \frac{\dot{Θ}(s)}{V(s)} = \frac{K}{(Js + b)(Ls + R) + K^2}   (E.7)

First, we’ll try controlling it with a P controller defined as


K(s) = Kp
[5] The unit step u(t) is defined as 0 for t < 0 and 1 for t ≥ 0.

When these are in unity feedback, the transfer function from the input voltage to the
error is
\frac{E(s)}{V(s)} = \frac{1}{1 + K(s)G(s)}
E(s) = \frac{1}{1 + K(s)G(s)} V(s)
E(s) = \frac{1}{1 + (K_p)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)} V(s)
E(s) = \frac{1}{1 + \frac{K_p K}{(Js+b)(Ls+R)+K^2}} V(s)

The steady-state of a transfer function can be found via


\lim_{s \to 0} sH(s)   (E.8)

since steady-state has an input frequency of zero.


e_{ss} = \lim_{s \to 0} sE(s)
e_{ss} = \lim_{s \to 0} s \frac{1}{1 + \frac{K_p K}{(Js+b)(Ls+R)+K^2}} V(s)
e_{ss} = \lim_{s \to 0} s \frac{1}{1 + \frac{K_p K}{(Js+b)(Ls+R)+K^2}} \frac{1}{s}
e_{ss} = \lim_{s \to 0} \frac{1}{1 + \frac{K_p K}{(Js+b)(Ls+R)+K^2}}
e_{ss} = \frac{1}{1 + \frac{K_p K}{(J(0)+b)(L(0)+R)+K^2}}
e_{ss} = \frac{1}{1 + \frac{K_p K}{bR+K^2}}   (E.9)

Notice that the steady-state error is nonzero. To fix this, an integrator must be included
in the controller.
K(s) = K_p + \frac{K_i}{s}
The same steady-state calculations are performed as before with the new controller.
\frac{E(s)}{V(s)} = \frac{1}{1 + K(s)G(s)}
E(s) = \frac{1}{1 + K(s)G(s)} V(s)
E(s) = \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)} \frac{1}{s}
e_{ss} = \lim_{s \to 0} s \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)} \frac{1}{s}
e_{ss} = \lim_{s \to 0} \frac{1}{1 + \left(K_p + \frac{K_i}{s}\right)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)}
e_{ss} = \lim_{s \to 0} \frac{s}{s\left(1 + \left(K_p + \frac{K_i}{s}\right)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)\right)}
e_{ss} = \lim_{s \to 0} \frac{s}{s + (K_p s + K_i)\left(\frac{K}{(Js+b)(Ls+R)+K^2}\right)}
e_{ss} = \frac{0}{0 + (K_p(0) + K_i)\left(\frac{K}{(J(0)+b)(L(0)+R)+K^2}\right)}
e_{ss} = \frac{0}{K_i \frac{K}{bR+K^2}}

The denominator is nonzero, so ess = 0. Therefore, an integrator is required to elimi-


nate steady-state error in all cases for this model.
It should be noted that ess in equation (E.9) approaches zero for Kp = ∞. This is
known as a bang-bang controller. In practice, an infinite switching frequency cannot
be achieved, but it may be close enough for some performance specifications.
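A short SymPy sketch (not from the book) that reproduces both limits symbolically:

```python
import sympy as sp

s, J, b, K, L, R, Kp, Ki = sp.symbols("s J b K L R K_p K_i", positive=True)

G = K / ((J * s + b) * (L * s + R) + K**2)    # velocity plant, equation (E.7)

for K_controller in (Kp, Kp + Ki / s):        # P controller, then PI controller
    E = 1 / (1 + K_controller * G) * (1 / s)  # error for a unit step input
    e_ss = sp.limit(s * E, s, 0)
    print(sp.simplify(e_ss))
# P controller: nonzero error (b*R + K**2)/(b*R + K**2 + K*K_p); PI controller: 0
```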

E.3.6 Do flywheels need PD control?


PID controllers typically control voltage to a motor in FRC independent of the equa-
tions of motion of that motor. For position PID control, large values of Kp can lead
to overshoot and Kd is commonly used to reduce overshoots. Let’s consider a flywheel
controlled with a standard PID controller. Why wouldn’t Kd provide damping for ve-
locity overshoots in this case?
PID control is designed to control second-order and first-order systems well. It can be
used to control a lot of things, but struggles when given higher order systems. It has
three degrees of freedom. Two are used to place the two poles of the system, and the
third is used to remove steady-state error. With higher order systems like a one input,

seven state system, there aren’t enough degrees of freedom to place the system’s poles
in desired locations. This will result in poor control.
The math for PID doesn’t assume voltage, a motor, etc. It defines an output based on
derivatives and integrals of its input. We happen to use it for motors because it
works pretty well for them, as motors are second-order systems.
The following math will be in continuous time, but the same ideas apply to discrete
time. This is all assuming a velocity controller.
Our simple motor model hooked up to a mass is
V = IR + \frac{ω}{K_v}   (E.10)
τ = IK_t   (E.11)
τ = J\frac{dω}{dt}   (E.12)

For an explanation of where these equations come from, read section 12.1.

First, we’ll solve for dt in terms of V .
Substitute equation (E.11) into equation (E.10).

V = IR + \frac{ω}{K_v}
V = \left(\frac{τ}{K_t}\right)R + \frac{ω}{K_v}

Substitute in equation (E.12).

V = \frac{J\frac{dω}{dt}}{K_t}R + \frac{ω}{K_v}

Solve for \frac{dω}{dt}.

V = \frac{J\frac{dω}{dt}}{K_t}R + \frac{ω}{K_v}
V − \frac{ω}{K_v} = \frac{J\frac{dω}{dt}}{K_t}R
\frac{dω}{dt} = \frac{K_t}{JR}\left(V − \frac{ω}{K_v}\right)
\frac{dω}{dt} = −\frac{K_t}{JRK_v}ω + \frac{K_t}{JR}V

Now take the Laplace transform. Because the Laplace transform is a linear operator,
we can take the Laplace transform of each term individually. Based on table E.1, dω/dt
becomes sω, and ω(t) and V(t) become ω(s) and V(s) respectively (the parenthetical
notation has been dropped for clarity).

sω = −\frac{K_t}{JRK_v}ω + \frac{K_t}{JR}V   (E.13)

Solve for the transfer function H(s) = \frac{ω}{V}.

sω = −\frac{K_t}{JRK_v}ω + \frac{K_t}{JR}V
\left(s + \frac{K_t}{JRK_v}\right)ω = \frac{K_t}{JR}V
\frac{ω}{V} = \frac{\frac{K_t}{JR}}{s + \frac{K_t}{JRK_v}}

That gives us a pole at −\frac{K_t}{JRK_v}, which is actually stable. Notice that there is only one
pole.
First, we’ll use a simple P controller.

V = Kp (ωgoal − ω)

Substitute this controller into equation (E.13).

sω = −\frac{K_t}{JRK_v}ω + \frac{K_t}{JR}K_p(ω_{goal} − ω)

Solve for the transfer function H(s) = \frac{ω}{ω_{goal}}.

sω = −\frac{K_t}{JRK_v}ω + \frac{K_t K_p}{JR}(ω_{goal} − ω)
sω = −\frac{K_t}{JRK_v}ω + \frac{K_t K_p}{JR}ω_{goal} − \frac{K_t K_p}{JR}ω
\left(s + \frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}\right)ω = \frac{K_t K_p}{JR}ω_{goal}
\frac{ω}{ω_{goal}} = \frac{\frac{K_t K_p}{JR}}{s + \left(\frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}\right)}

This has a pole at −\left(\frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}\right). Assuming that that quantity is negative (i.e.,
we are stable), that pole corresponds to a time constant of \frac{1}{\frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}}.

As can be seen above, a flywheel has a single pole. It therefore only needs a single pole
controller to place that pole anywhere on the real axis.
This analysis assumes that the motor is well coupled to the mass and that the time
constant of the inductor is small enough that it doesn’t factor into the motor equations.
The latter is a pretty good assumption for a CIM motor (see figures E.15 and E.16). If
more mass is added to the motor armature, the response timescales increase and the
inductance matters even less.

Figure E.15: Step response of second-order DC brushed motor plant augmented with position (L = 230 μH)
Figure E.16: Step response of first-order DC brushed motor plant augmented with position (L = 0 μH)

Next, we’ll try a PD loop. (This will use a perfect derivative, but anyone following along
closely already knows that we can’t really take a derivative here, so the math will need
to be updated at some point. We could switch to discrete time and pick a differentiation
method, or pick some other way of modeling the derivative.)

V = Kp (ωgoal − ω) + Kd s(ωgoal − ω)

Substitute this controller into equation (E.13).

sω = −\frac{K_t}{JRK_v}ω + \frac{K_t}{JR}(K_p(ω_{goal} − ω) + K_d s(ω_{goal} − ω))
sω = −\frac{K_t}{JRK_v}ω + \frac{K_t K_p}{JR}(ω_{goal} − ω) + \frac{K_t K_d s}{JR}(ω_{goal} − ω)
sω = −\frac{K_t}{JRK_v}ω + \frac{K_t K_p}{JR}ω_{goal} − \frac{K_t K_p}{JR}ω + \frac{K_t K_d s}{JR}ω_{goal} − \frac{K_t K_d s}{JR}ω

Collect the common terms on separate sides and refactor.

sω + \frac{K_t K_d s}{JR}ω + \frac{K_t}{JRK_v}ω + \frac{K_t K_p}{JR}ω = \frac{K_t K_p}{JR}ω_{goal} + \frac{K_t K_d s}{JR}ω_{goal}
\left(s + \frac{K_t K_d s}{JR} + \frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}\right)ω = \left(\frac{K_t K_p}{JR} + \frac{K_t K_d s}{JR}\right)ω_{goal}
\left(s\left(1 + \frac{K_t K_d}{JR}\right) + \frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}\right)ω = \frac{K_t}{JR}(K_p + K_d s)ω_{goal}

Solve for \frac{ω}{ω_{goal}}.

\frac{ω}{ω_{goal}} = \frac{\frac{K_t}{JR}(K_p + K_d s)}{s\left(1 + \frac{K_t K_d}{JR}\right) + \frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}}

So, we added a zero at −\frac{K_p}{K_d} and moved our pole to −\frac{\frac{K_t}{JRK_v} + \frac{K_t K_p}{JR}}{1 + \frac{K_t K_d}{JR}}. This isn't progress.
We've added more complexity to our system and, practically speaking, gotten nothing
good in return. Zeroes should be avoided if at all possible because they amplify un-
wanted high frequency modes of the system and are noisier the faster the system is
sampled. At least this is a stable zero, but it's still undesirable.
In summary, derivative doesn’t help on an ideal flywheel. Kd may compensate for
unmodeled dynamics such as accelerating projectiles slowing the flywheel down, but
that effect may also increase recovery time; Kd drives the acceleration to zero in the
undesired case of negative acceleration as well as the actually desired case of
positive acceleration.
Subsection 6.7.2 covers a superior compensation method that avoids zeroes in the con-
troller, doesn’t act against the desired control action, and facilitates better tracking.

E.3.7 Gain margin and phase margin


One generally needs to learn about Bode plots and Nyquist plots to truly understand
gain and phase margin and their origins, but those plots are large topics unto themselves.
Since we won’t be using either of them for controller design, we’ll just cover what gain
and phase margin are in a general sense and how they are used.

Gain margin and phase margin are two metrics for measuring a system’s relative stabil-
ity. Gain and phase margin are the amounts by which the closed-loop gain and phase
can be varied respectively before the system becomes unstable. In a sense, they are
safety margins for when unmodeled dynamics affect the system response.
For a more thorough explanation of gain and phase margin, watch Brian Douglas’s
video on them.

“Gain and Phase Margins Explained!” (14 minutes)


Brian Douglas
https://youtu.be/ThoA4amCAX4

He has other videos too on classical control methods like Bode and Nyquist plots that
we recommend.

E.4 s-plane to z-plane


Transfer functions are converted to impulse responses using the Z-transform. The s-
plane’s LHP maps to the inside of a unit circle in the z-plane. Table E.2 contains a few
common points and figure E.17 shows the mapping visually.

s-plane z-plane
(0, 0) (1, 0)
imaginary axis edge of unit circle
(−∞, 0) (0, 0)

Table E.2: Mapping from s-plane to z-plane

E.4.1 Discrete system stability


Poles of a system that are within the unit circle are stable, but why is that? Let’s consider
a scalar equation xk+1 = axk . a < 1 makes xk+1 converge to zero. The same applies
to a complex number like z = x+yi for xk+1 = zxk . If the magnitude of the complex

Figure E.17: Mapping of complex plane from s-plane (left) to z-plane (right)

number z is less than one, xk+1 will converge to zero. Values with a magnitude of 1
oscillate forever because xk+1 never decays.
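A tiny numerical illustration with made-up poles z:

```python
import numpy as np

# Iterate x_{k+1} = z x_k for a few made-up poles z.
for z in (0.5, 0.9 + 0.3j, 1j, 1.1):
    x = 1.0
    for _ in range(50):
        x = z * x
    print(f"|z| = {abs(z):.3f}, |x_50| = {abs(x):.3g}")
# |z| < 1 decays toward zero, |z| = 1 holds its magnitude, |z| > 1 grows without bound.
```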

E.4.2 Discrete system behavior


As ω increases in s = jω, a pole in the z-plane moves around the perimeter of the unit
circle. Once it hits ω_s/2 (half the sampling frequency) at (−1, 0), the pole wraps around.
This is due to poles faster than the sample frequency folding down to below the sample
frequency (that is, higher frequency signals alias to lower frequency ones).
You may notice that poles can be placed at (0, 0) in the z-plane. This is known as a
deadbeat controller. An Nth -order deadbeat controller decays to the reference in N
timesteps. While this sounds great, there are other considerations like control effort,
robustness, and noise immunity.
If poles from (1, 0) to (0, 0) on the x-axis approach infinity, then what do poles from
(−1, 0) to (0, 0) represent? Them being faster than infinity doesn’t make sense. Poles
in this location exhibit oscillatory behavior similar to complex conjugate pairs. See
figures E.18 and E.19. The jaggedness of these signals is due to the frequency of the
system dynamics being above the Nyquist frequency (half the sample frequency). The
discretized signal doesn’t have enough samples to reconstruct the continuous system’s
dynamics.

Figure E.18: Single poles in various locations in z-plane
Figure E.19: Complex conjugate poles in various locations in z-plane

E.5 Discretization methods


Discretization is done using a zero-order hold. That is, the input is only updated at
discrete intervals and it’s held constant between samples (see figure E.20). The exact
method of applying this uses the matrix exponential, but this can be computationally
expensive. Instead, approximations such as the following are used.
1. Forward Euler method. This is defined as y_{n+1} = y_n + f(t_n, y_n)Δt.
2. Backward Euler method. This is defined as y_{n+1} = y_n + f(t_{n+1}, y_{n+1})Δt.
3. Bilinear transform. The first-order bilinear approximation is s = \frac{2}{T}\frac{1 − z^{−1}}{1 + z^{−1}}.

where the function f (tn , yn ) is the slope of y at n and T is the sample period for the
discrete system. Each of these methods is essentially finding the area underneath a
curve. The forward and backward Euler methods use rectangles to approximate that
area while the bilinear transform uses trapezoids (see figures E.21 and E.22). Since
these are approximations, there is distortion between the real discrete system’s poles and
the approximate poles. This is in addition to the phase loss introduced by discretizing
at a given sample rate in the first place. For fast-changing systems, this distortion can
quickly lead to instability.
Figures E.23, E.24, and E.25 show simulations of the same controller for different
sampling methods and sample rates, which have varying levels of fidelity to the real
system.
Forward Euler is numerically unstable for low sample rates. The bilinear transform is a
significant improvement due to it being a second-order approximation, but zero-order
hold performs best due to the matrix exponential including much higher orders.

Figure E.20: Zero-order hold of a system response
Figure E.21: Discretization methods applied to velocity data
Figure E.22: Position plot of discretization methods applied to velocity data

Table E.3 compares the Taylor series expansions of several common discretization
methods (these are found using polynomial division). The bilinear transform does best
with accuracy trailing off after the third-order term. Forward Euler has no second-
order or higher terms, so it undershoots. Backward Euler has twice the second-order
term and overshoots the remaining higher order terms as well.

Figure E.23: Sampling methods for system simulation with T = 0.1 s
Figure E.24: Sampling methods for system simulation with T = 0.05 s
Figure E.25: Sampling methods for system simulation with T = 0.01 s

Method             Conversion to z                              Taylor series expansion
Zero-order hold    e^{Ts}                                       1 + Ts + (1/2)T^2 s^2 + (1/6)T^3 s^3 + ...
Bilinear           (1 + (1/2)Ts)/(1 − (1/2)Ts)                  1 + Ts + (1/2)T^2 s^2 + (1/4)T^3 s^3 + ...
Forward Euler      1 + Ts                                       1 + Ts
Backward Euler     1/(1 − Ts)                                   1 + Ts + T^2 s^2 + T^3 s^3 + ...

Table E.3: Taylor series expansions of discretization methods (scalar case). The zero-
order hold discretization method is exact.
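To see these conversions numerically, here's a small sketch (not from the book) applying each one to a made-up scalar continuous pole:

```python
import numpy as np

# Map a made-up continuous pole s = a to the z-plane with each method.
a = -5.0
T = 0.1  # deliberately coarse sample period

zoh = np.exp(a * T)                        # exact (zero-order hold)
forward_euler = 1 + a * T
backward_euler = 1 / (1 - a * T)
bilinear = (1 + a * T / 2) / (1 - a * T / 2)

print(zoh, forward_euler, backward_euler, bilinear)
# ~0.607 exact; forward Euler undershoots (0.5), backward Euler overshoots (0.667),
# and bilinear lands closest (0.6), matching the Taylor series comparison above.
```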

E.6 Phase loss


Implementing a discrete control system is easier than implementing a continuous one,
but discretization has drawbacks. A microcontroller updates the system input in dis-
crete intervals of duration T ; it’s held constant between updates. This introduces an
average sample delay of T2 that leads to phase loss in the controller. Phase loss is the re-
duction of phase margin that occurs in digital implementations of feedback controllers
from sampling the continuous system at discrete time intervals. As the sample rate of
the controller decreases, the phase margin decreases according to − T2 ω where T is
the sample period and ω is the frequency of the system dynamics. Instability occurs if
the phase margin of the system reaches zero. Large amounts of phase loss can make
a stable controller in the continuous domain become unstable in the discrete domain.
Here are a few ways to combat this.
• Run the controller with a high sample rate.
• Design the controller in the analog domain with enough phase margin to compensate
for any phase loss that occurs as part of discretization.
• Convert the plant to the digital domain and design the controller completely in
the digital domain.

E.7 System identification with Bode plots


It’s straightforward to perform system identification by feeding a system sine waves of
increasing frequency, recording the amplitude of the output oscillations, then collating
that information into a Bode plot. This plot can be used to create a transfer function,
or lead and lag compensators can be applied directly based on the Bode plot.

Glossary

agent An independent actor being controlled through autonomy or human-in-the-loop


(e.g., a robot, aircraft, etc.).

control effort A term describing how much force, pressure, etc. an actuator is exerting.
control input The input of a plant used for the purpose of controlling it.
control law Also known as control policy, is a mathematical formula used by the con-
troller to determine the input u that is sent to the plant. This control law
is designed to drive the system from its current state to some other desired
state.
control system Monitors and controls the behavior of a system.
controller Applies an input to a plant to bring about a desired system state by driving
the difference between a reference signal and the output to zero.

discretization The process by which a continuous (e.g., analog) system or controller


design is converted to discrete (e.g., digital).
disturbance An external force acting on a system that isn’t included in the system’s
model.
disturbance rejection The quality of a feedback control system to compensate for
external forces to reach a desired reference.

error Reference minus an output or state.

feedback controller Used in positive or negative feedback with a plant to bring about

a desired system state by driving the difference between a reference signal


and the output to zero.
feedback gain The gain from the output to an earlier point in a control system diagram.
feedforward controller A controller that injects information about the system’s dy-
namics (like a model does) or the desired movement. The feedforward han-
dles parts of the control actions we already know must be applied to make a
system track a reference, then the feedback controller compensates for what
we do not or cannot know about the system’s behavior at runtime.

gain A proportional value that shows the relationship between the magnitude of an
input signal to the magnitude of an output signal at steady-state.
gain margin See section E.3.7 on gain and phase margin.

impulse response The response of a system to the Dirac delta function.


input An input to the plant (hence the name) that can be used to change the plant’s
state.

linearization A method by which a nonlinear system’s dynamics are approximated by


a linear system.
localization The process of using measurements of the environment to determine an
agent’s pose.

model A set of mathematical equations that reflects some aspect of a physical system’s
behavior.

noise immunity The quality of a system to have its performance or stability unaffected
by noise in the outputs (see also: robustness).

observer In control theory, a system that estimates the internal state of a given real
system from measurements of the input and output of the real system.
open-loop gain The gain directly from the input to the output, ignoring loops.
output Measurements from sensors.
output-based control Controls the system’s state via the outputs.
overshoot The amount by which a system’s state surpasses the reference after rising
toward it.

phase margin See section E.3.7 on gain and phase margin.


plant The system or collection of actuators being controlled.
pose The position and orientation of an agent in the world, which is represented by all
or part of the agent’s state.
process variable The term used to describe the output of a plant in the context of PID
control.

realization In control theory, this is an implementation of a given input-output behavior


as a state-space model.
reference The desired state. This value is used as the reference point for a controller’s
error calculation.
regulator A controller that attempts to minimize the error from a constant reference
in the presence of disturbances.
rise time The time a system takes to initially reach the reference after applying a step
input.
robustness The quality of a feedback control system to remain stable in response to
disturbances and uncertainty.

setpoint The term used to describe the reference of a PID controller.


settling time The time a system takes to settle at the reference after a step input is
applied.
state A characteristic of a system (e.g., velocity) that can be used to determine the
system’s future behavior.
state feedback Uses state instead of output in feedback.
steady-state error Error after system reaches equilibrium.
step input A system input that is 0 for t < 0 and a constant greater than 0 for t ≥ 0.
A step input that is 1 for t ≥ 0 is called a unit step input.
step response The response of a system to a step input.
stochastic process A process whose model is partially or completely defined by random
variables.
system A term encompassing a plant and its interaction with a controller and observer,
which is treated as a single entity. Mathematically speaking, a system maps
inputs to outputs through a linear combination of states.
system response The behavior of a system over time for a given input.

time-invariant The system’s fundamental response does not change over time.
tracking In control theory, the process of making the output of a control system follow
the reference.

unity feedback A feedback network in a control system diagram with a feedback gain
of 1.


Index

algebraic Riccati equation digital signal processing


continuous, 40 aliasing, 77
discrete, 41 Nyquist frequency, 77
discretization
block diagrams, 8 backward Euler method, 319
simplification, 263 bilinear transform, 319
forward Euler method, 319
linear system zero-order hold, 79
controller design
matrix exponential, 82
actuator saturation, 24
Taylor series, 80
controllability, 72
controllability Gramian, 72 feedforward
linear time-varying control, 114 linear plant inversion, 94
linear-quadratic regulator, 87 steady-state feedforward, 285
Bryson’s rule, 90 FRC models
definition, 90 DC brushed motor equations,
implicit model following, 272 203
state feedback with output differential drive equations, 110
cost, 270 elevator equations, 57
time delay compensation, 275 flywheel equations, 60
pole placement, 86 holonomic drivetrains, 109
stabilizability, 73 single-jointed arm equations, 68

gain, 8 Extended Kalman filter, 173


linear time-varying control, 114
integral control linearization, 104, 107
input error estimation, 55 Lyapunov stability, 104
plant augmentation, 54 Unscented Kalman filter, 175
numerical integration
linear algebra Dormand-Prince, 100
basis vectors, 36 Forward Euler, 97
linear combination, 36 Runge-Kutta fourth-order, 98
vectors, 35
Lyapunov equation observer design
continuous, 42 detectability, 74
discrete, 42 observability, 73
observability Gramian, 74
matrices optimal control
determinant, 37 linear plant inversion, 94
diagonal, 39 linear time-varying control, 114
eigenvalues, 38 linear-quadratic regulator, 87
Hessian, 43 Bryson’s rule, 90
inverse, 37 definition, 90
Jacobian, 42
linear systems, 37 physics
linear transformation, 36 conservation of energy, 215
multiplication, 36 sum of forces, 213
pseudoinverse, 39 sum of torques, 214
rank, 37 PID control, 13, 45
trace, 39 flywheel (classical control), 312
transpose, 38 flywheel (modern control), 65
model augmentation probability, 140
of controller, 52 Bayes’s rule, 146
of observer, 52 central limit theorem, 149
of output, 52 conditional expectation, 146
of plant, 52 conditional probability density
motion profiles functions, 145
1-DOF, 241 conditional variances, 146
2-DOF, 248 covariance, 144
S-curve, 241 covariance matrix, 147
trajectory optimization, 255 expected value, 141
trapezoid, 242 marginal probability density
functions, 145
nonlinear control probability density function, 140

probability density functions, Kalman filter


142 as Luenberger observer, 166
random variables, 140 derivations, 150, 156
variance, 142 equations, 161
setup, 163
stability Kalman smoother, 168
eigenvalues, 51, 86, 138 equations, 169
gain margin, 316 Luenberger observer, 136
phase margin, 316 multiple model adaptive
poles, 49 estimation, 176
poles and zeroes, 295 Unscented Kalman filter, 175
state-space controllers steady-state error
continuous closed-loop, 50 classical, 310
continuous open-loop, 48 stochastic
discrete closed-loop, 85 linear systems, 150
discrete open-loop, 85 measurement noise, 150
state-space observers process noise, 150
Extended Kalman filter, 173 two-sensor problem, 152
When I was a high school student on FIRST Robotics Competition (FRC) team 3512, I
had to learn control theory from scattered internet sources that either weren’t rigorous
enough or assumed too much prior knowledge. After I took graduate-level control
theory courses from University of California, Santa Cruz for my bachelor’s degree, I
realized that the concepts weren’t difficult when presented well, but the information
wasn’t broadly accessible outside academia.

I wanted to fix the information disparity so more people could appreciate the beauty
and elegance I saw in control theory. This book streamlines the learning process to
make that possible.

I wrote the initial draft of this book as a final project for an undergraduate technical
writing class I took at UCSC in Spring 2017 (CMPE 185). It was a 13-page IEEE-
formatted paper intended as a reference manual and guide to state-space control
that summarized the three graduate controls classes I had taken that year. I kept
working on it the following year to flesh it out, and it eventually became long enough
to make into a proper book. I’ve been adding to it ever since as I learn new things.

I contextualized the material within FRC because it’s always been a significant part
of my life, and it’s a useful application sandbox. I maintain implementations of many
of this book’s tools in the FRC standard library (WPILib).

– Tyler Veness
