Handout 1 PDF

Download as pdf or txt
Download as pdf or txt
You are on page 1of 48

Chapter 1

State Space Models

After a first course in control system design one learns that intuition is a
starting point in control design, but that intuition may fail for complex
systems such as those with multiple inputs and outputs, or systems with
nonlinearities. In order to augment our intuition to deal with problems as
complex as high speed flight control, or flow control for high speed commu-
nication networks, one must start with a useful mathematical model of the
system to be controlled. It is not necessary to “take the system apart” -
to model every screw, valve, and axle. In fact, to use the methods to be
developed in this book it is frequently more useful to find a simple model
which gives a reasonably accurate description of system behavior. It may
be difficult to verify a complex model, in which case it will be difficult to
trust that the model accurately describes the system to be controlled. More
crucial in the context of this course is that the control design methodology
developed in this text is model based. Consequently, a complex model of
the system will result in a complex control solution, which is usually highly
undesirable in practice.
Although this text treats nonlinear models in some detail, the most
far reaching approximation made is linearity. It is likely that no physical
system is truly linear. The reasons that control theory is effective in practice
even when this ubiquitous assumption fails are that (i) physical systems can
frequently be approximated by linear models; and (ii) control systems are
designed to be robust with respect to inaccuracies in the system model. It
is not simply luck that physical systems can be modeled with reasonable
accuracy using linear models. Newton’s laws, Kirchoff’s voltage and current
laws, and many other laws of physics give rise to linear models. Moreover,
in this chapter we show that a generalized Taylor series expansion allows

1
2 CHAPTER 1. STATE SPACE MODELS

the approximation of even a grossly nonlinear model by a linear one.


The linear models we consider are primarily of the state space form
ẋ = Ax + Bu
y = Cx + Du, (1.1)
where x is a vector signal in Rn , y and u are the output and input of the
system evolving in Rp and Rm , respectively, and A, B, C, D are matrices
of appropriate dimensions. If these matrices do not depend on the time
variable, t, the corresponding linear system will be referred to as linear
time invariant (LTI), whereas if any one of these matrices has time-varying
entries, then the underlying linear system will be called linear time varying
(LTV). If both the input and the output are scalar, then we refer to the
system as single input-single output (SISO); if either a control or output are
of dimension higher than one, then the system is multi input-multi output
(MIMO).
Our first example now is a simple nonlinear system which we approxi-
mate by a linear state space model of the form (1.1) through linearization.

u(t)

y(t)
Reference
Height r

Figure 1.1: Magnetically Suspended Ball

1.1 An electromechanical system


The magnetically suspended metallic ball illustrated in Figure 1.1 is a simple
example which illustrates some of the important modeling issues addressed
1.2. LINEARIZATION ABOUT AN EQUILIBRIUM STATE 3

in this chapter. The input u is the current applied to the electro-magnet,


and the output y is the distance between the center of the ball and some
reference height. Since positive and negative inputs are indistinguishable at
the output of this system, it follows that this cannot be a linear system.
The upward force due to the current input is approximately proportional to
u2 /y 2 , and hence from Newton’s law for translational motion we have

u2
ma = mÿ = mg − c ,
y2
where g is the gravitational constant and c is some constant depending on
the physical properties of the magnet and ball. This input-output model
can be converted to (nonlinear) state space form using x1 = y and x2 = ẏ:

c u2
ẋ1 = x2 , ẋ2 = g −
m x21
where the latter equation follows from the formula ẋ2 = ÿ. This pair of
equations forms a two-dimensional state space model

ẋ1 = x2 = f1 (x1 , x2 , u) (1.2)


c u2
ẋ2 = g − = f2 (x1 , x2 , u) (1.3)
m x21
 
It is nonlinear, since f2 is a nonlinear function of xx12 . Letting x = x1
x2 ,

and f = ff12 , the state equations may be written succinctly as

ẋ = f (x, u).

The motion of a typical solution to a nonlinear state space model in R2 is


illustrated in Figure 1.2.

1.2 Linearization about an equilibrium state


Suppose that a fixed value ue , say positive, is applied, and that the state
xe = ( xxe1
e2 ) has the property that
 
f1 (xe1 ,xe2 ,ue )
f (xe1 , xe2 , ue ) = f2 (xe1 ,xe2 ,ue ) = ϑ.

From the definition of f we must have xe2 = 0, and


r
c
xe1 = ue
mg
4 CHAPTER 1. STATE SPACE MODELS

x(0)

f(x(t),u(t))
x(t)

Figure 1.2: Trajectory of a nonlinear state space model in two dimensions:


ẋ = f (x, u)

which is unique when restricted to be positive. The state xe is called an


equilibrium state since the velocity vector f (x, u) vanishes when x = xe ,
and u = ue . If the signals x1 (t), x2 (t) and u(t) remain close to the fixed
point (xe1 , xe2 , ue ), then we may write

x1 (t) = xe1 + δx1 (t)


x2 (t) = xe2 + δx2 (t)
u(t) = ue + δu(t),

where δx1 (t), δx2 (t), and δu(t) are small-amplitude signals. From the state
equations (1.2) and (1.3) we then have

δẋ1 = xe2 + δx2 (t) = δx2 (t)


δẋ2 = f2 (xe1 + δx1 , xe2 + δx2 , ue + δu)

Applying a Taylor series expansion to the right hand side (RHS) of the
second equation above gives
∂f2 ∂f2
δẋ2 = f2 (xe1 , xe2 , ue ) + δx 1 + δx2
∂x1 (xe1 ,xe2 ,ue ) ∂x2 (xe1 ,xe2 ,ue )
∂f2
+ δu + H.O.T.
∂u (xe1 ,xe2 ,ue )
After computing partial derivatives we obtain the formulae

δx˙ 1 = δx2 .
c u2e 2c ue
δẋ2 = 2 3 δx1 − δu + H.O.T.
m xe1 m x2e1
1.3. LINEARIZATION ABOUT A TRAJECTORY 5
 
x (t)
Letting x denote the bivariate signal x(t) = x12 (t) we may write the lin-
earized system in state space form:
   
0 1 0
δẋ = δx + δu
α 0 β
δy = δx1 ,

where
c u2e c ue
α=2 , β = −2 .
m x3e1 m x2e1

This linearized system is only an approximation, but one that is reasonable


and useful in control design as long as the state δx and the control δu remain
small. For example, we will see in Chapter 4 that “local” stability of the
nonlinear system is guaranteed if the simpler linear system is stable.

1.3 Linearization about a trajectory


Consider now a general nonlinear model

ẋ (t) = f (x(t), u(t), t)

where f is a continuously differentiable (C 1 ) function of its arguments. Sup-


pose that for an initial condition x0 given at time t0 (i.e., x(t0 ) = x0 ), and a
given input un (t), the solution of the nonlinear state equation above exists
and is denoted by xn (t), which we will refer to as the nominal trajectory
corresponding to the nominal input un (t), for the given initial state. For ex-
ample, in the control of a satellite orbiting the earth, the nominal trajectory
xn (t) might represent a desired orbit (see Exercise 7 below). We assume
that the input and state approximately follow these nominal trajectories,
and we again write

x(t) = xn (t) + δx(t)


u(t) = un (t) + δu(t)

From a Taylor series expansion we then have


 ∂f   ∂f 

ẋ = ẋn + δẋ = f (xn , un , t) + n n δx + n n δu + H.O.T.
∂x (x ,u ,t) ∂u (x ,u ,t)
| {z } | {z }
A(t) B(t)
6 CHAPTER 1. STATE SPACE MODELS

Since we must have ẋn = f (xn , un , t), this gives the state space description

δẋ = A(t)δx + B(t)δu,

where the higher order terms have been ignored.

1.4 A two link inverted pendulum


Below is a photograph of the pendubot found at the robotics laboratory at
the University of Illinois, and a sketch indicating its component parts. The
pendubot consists of two rigid aluminum links: link 1 is directly coupled to
the shaft of a DC motor mounted to the end of a table. Link 1 also includes
the bearing housing for the second joint. Two optical encoders provide
position measurements: one is attached at the elbow joint and the other
is attached to the motor. Note that no motor is directly connected to link
2 - this makes vertical control of the system, as shown in the photograph,
extremely difficult!

Link 1
Encoder 1
Motor

Table
Link 1
Encoder 2

Encoder 2

Link 2
Link 2

Motor
and
Encoder 1

Figure 1.3: The Pendubot

Since the pendubot is a two link robot with a single actuator, its dynamic
equations can be derived using the so-called Euler-Lagrange equations found
1.4. A TWO LINK INVERTED PENDULUM 7

in numerous robotics textbooks [11]. Referring to the figure, the equations


of motion are

d11 q̈1 + d12 q̈2 + h1 + φ1 = τ (1.4)


d21 q̈1 + d22 q̈2 + h2 + φ2 = 0 (1.5)

where q1 , q2 are the joint angles and

d11 = m1 ℓ2c1 + m2 (ℓ21 + ℓ2c2 + 2ℓ1 ℓc2 cos(q2 )) + I1 + I2


d22 = m2 ℓ2c2 + I2
d12 = d21 = m2 (ℓ2c2 + ℓ1 ℓc2 cos(q2 )) + I2
h1 = −m2 ℓ1 ℓc2 sin(q2 )q̇22 − 2m2 ℓ1 ℓc2 sin(q2 )q̇2 q̇1
h2 = m2 ℓ1 ℓc2 sin(q2 )q̇12
φ1 = (m1 ℓc1 + m2 ℓ1 )g cos(q1 ) + m2 ℓc2 g cos(q1 + q2 )
φ2 = m2 ℓc2 g cos(q1 + q2 )

The definitions of the variables qi , ℓ1 , ℓci can be deduced from Figure 1.4.

q
2

l c2

q
1

x
lc1
l1

Figure 1.4: Coordinate description of the pendubot: ℓ1 is the length of the


first link, and ℓc1 , ℓc2 are the distances to the center of mass of the respective
links. The variables q1 , q2 are joint angles of the respective links.
8 CHAPTER 1. STATE SPACE MODELS

This model may be written in state space form as a nonlinear vector


differential equation, ẋ = f (x, u), where x = (q1 , q2 , q̇1 , q̇2 )′ , and f is defined
from the above equations. For a range of different constant torque inputs τ ,
this model admits various equilibria. For example, when τ = 0, the vertical
downward position xe = (−π/2, 0, 0, 0) is an equilibrium, as illustrated in
the right hand side of Figure 1.3. When τ = 0 it follows from the equations
of motion that the upright vertical position xe = (+π/2, 0, 0, 0) is also an
equilibrium. It is clear from the photograph given in the left hand side of
Figure 1.3 that the upright equilibrium is strongly unstable in the sense
that with τ = 0, it is unlikely that the physical system will remain at rest.
Nevertheless, the velocity vector vanishes, f (xe , 0) = 0, so by definition the
upright position is an equilibrium when τ = 0. Although complex, we may
again linearize these equations about the vertical equilibrium. The control
design techniques introduced later in the book will provide the tools for
stabilization of the pendubot in this unstable vertical configuration via an
appropriate controller.

Figure 1.5: There is a continuum of different equilibrium positions for the


Pendubot corresponding to different constant torque inputs τ .
1.5. AN ELECTRICAL CIRCUIT 9

1.5 An electrical circuit


A simple electrical circuit is illustrated in Figure 1.6. Using Kirchoff’s volt-
age and current laws we may obtain a state space model in which the current
through the inductor and the capacitor voltage become state variables.

x (t)
1
+ - + -
C
L
+
u(t)
R
-
x2 (t)

Figure 1.6: A simple RLC circuit

u x2 x1
1 1
∑ ∑
L C
- -

1
RC

Figure 1.7: A simulation diagram for the corresponding state space model
with x1 = the voltage across the capacitor, and x2 = the current through
the inductor.

Kirchoff’s Current Law (KCL) gives


1
x2 = C ẋ1 + x1 ,
R
which may be written as
1 1
ẋ1 = − x1 + x2 (1.6)
RC C
From Kirchoff’s Voltage Law (KVL) we have

x1 + Lẋ2 = +u,
10 CHAPTER 1. STATE SPACE MODELS

or,
1 1
ẋ2 = − x1 + u. (1.7)
L L
Equations (1.7) and (1.6) thus give the system of state space equations
 1 1
  
− RC C 0
ẋ = x+ u
− L1 0 1
L

1.6 Transfer Functions & State Space Models


In the previous examples we began with a physical description of the system
to be controlled, and through physical laws obtained a state space model
of the system. In the first and second examples, a linear model could be
obtained through linearization, while in the last example, a linear model
was directly obtained through the KVL and KCL circuit laws. For a given
LTI system, a state space model which is constructed from a given transfer
function G(s) is called a realization of G. Realization of a given G is not
unique, and in fact there are infinitely many such realizations, all of which
are equivalent however from an input-output point of view. In this section
we show how to obtain some of the most common realizations starting from
a transfer function model of a system.
Consider the third-order model
...
y + a2 ÿ + a1 ẏ + a0 y = b2 ü + b1 u̇ + b0 u (1.8)

where the coefficients {ai , bi } are arbitrary real numbers. The corresponding
transfer function description is

Y (s) B(s) b2 s 2 + b1 s + b0
G(s) = = = 3
U (s) A(s) s + a2 s2 + a1 s + a0

where Y and U are the Laplace transforms of the signals y and u, respec-
tively. The numerator term B(s) = b2 s2 + b1 s + b0 can create complexity
in determining a state space model. So, as a starting point, we consider the
“zero-free system” where B(s) ≡ 1, which results in the model
...
w + a2 ẅ + a1 ẇ + a0 w = u

An equivalent simulation diagram description for this system is:


1.6. TRANSFER FUNCTIONS & STATE SPACE MODELS 11

... .. .
u w w w w

- - -
a2

a1

a0

With zero initial conditions one has the relation Y (s) = B(s)W (s), so
that the signal y can be obtained by adding several interconnections to
the above simulation diagram, yielding the simulation diagram depicted in
Figure 1.8. Letting the outputs of the integrators form states for the system
we obtain the state space model

x1 = w ẋ1 = x2
x2 = ẇ ẋ2 = x3
x3 = ẅ ẋ3 = −a2 x3 − a1 x2 − a0 x1 + u

and

y = b0 x1 + b1 x2 + b2 x3

This may be written in matrix form as


   
0 1 0 0
ẋ =  0 0 1  
x + 0 u
−a0 −a1 −a2 1
y = [b0 b1 b2 ]x + [0]u.

This final state space model is called the controllable canonical form (CCF)
- one of the most important system descriptions from the point of view of
analysis.
Several alternative descriptions can be obtained by manipulating the
transfer function G(s) before converting to the time domain, or by defining
states in different ways. One possibility is to take the description

(s3 + a2 s2 + a1 s + a0 )Y (s) = (b0 + b1 s + b2 s2 )U (s) (1.9)

and divide throughout by s3 to obtain


  
a2 a1 a0  b0 b1 b2
1+ + 2 + 3 Y (s) = + + U (s)
s s s s3 s2 s
12 CHAPTER 1. STATE SPACE MODELS

b2

b1

b0

u w
∑ ∑

-
- -
y
a2

a1

a0

Figure 1.8: Controllable Canonical Form

a0
a1
a2

u
- - - y
b0 ∑ ∑ ∑

b1
b2

Figure 1.9: Observable Canonical Form


1.6. TRANSFER FUNCTIONS & STATE SPACE MODELS 13

Rearranging terms then gives


1 1
Y = 3
(b0 U − a0 Y ) + 2 (b1 U − a1 Y )
s s
1
+ (b2 U − a2 Y )
s
We may again describe this equation using a simulation diagram, as given
in Figure 1.9. As before, by letting x1 , x2 , x3 denote the outputs of the
integrators we obtain a state space model which now takes the form
ẋ1 = x2 − a2 x1 + b2 u
ẋ2 = x3 − a1 x1 + b1 u
ẋ3 = −a0 x1 + b0 u
y = x1
or in matrix form
   
−a2 1 0 b2
ẋ =  −a1 0 1  x +  b1  u
−a0 0 0 b0
y = [1 0 0]x + [ 0 ]u.
This final form is called the observable canonical form (OCF).
In the example above, the degree n0 of the denominator of G(s) is 3,
and the degree m0 of the numerator is 2, so that n0 > m0 . In this case
the model is called strictly proper . In the case where n0 = m0 , the “D”
matrix in (1.1) will be non-zero in any state space realization. To see this,
try adding the term b3 s3 to the right hand side of (1.9), or solve Exercise 10
of this chapter.
Both controllable and observable canonical forms admit natural gen-
eralizations to the n-dimensional case. For a SISO LTI system, let the
input-output transfer function be given by
Y (s) B(s) bn−1 sn−1 + · · · + b1 s + b0
G(s) = = = n
U (s) A(s) s + an−1 sn−1 + · · · + a2 s2 + a1 s + a0
Then, the n-dimensional state space realization in controllable canonical
form is identified by the following A, B, and C matrices:
     T
0 1 0 0 ··· 0 0 b0
 0 0 1 0 ··· 0   ..   b1 
     
A= . . . .  ; B= . ; C= .  .
 . . .
. .
. . .   0  .
 . 
−a0 −a1 −a2 a3 · · · an−1 1 bn−1
14 CHAPTER 1. STATE SPACE MODELS

The observable canonical form, on the other hand, will have the following
A, B, and C matrices:
     T
−an−1 1 0 0 · · · 0 bn−1 1
 −an−2 0 1 0 · · · 0   ..   0 
     
A= .. .. .. .. ; B= . ; C= ..  .
 . . . .   b1   . 
−a0 0 0 0 · · · 0 b0 0

Another alternative is obtained by applying a partial fraction expansion


to the transfer function G:
X ki n
b(s)
G(s) = =d+ ,
a(s) s − pi
i=1

where {pi : 1 ≤ i ≤ n} are the poles of G, which are simply the roots of
a. A partial expansion of this form is always possible if all of the poles are
distinct. In general, a more complex partial fraction expansion must be em-
ployed. When a simple partial fraction expansion is possible, as above, the
system may be viewed as a parallel network of simple first order simulation
diagrams; see Figure 1.10.
The significance of this form is that it yields a strikingly simple system
description:

ẋ1 = p1 x1 + k1 u 

.. decoupled dynamic equations.
. 

ẋn = pn xn + kn u

This gives the state space model


   
p1 0 · · · 0 k1
 0 p2 · · · 0   k2 
   
ẋ =  . .. x +  .. u
 .. .   . 
0 0 · · · pn kn
y = [1, . . . , 1]x + [d]u.

This is often called the modal form, and the states xi (t) are then called
modes. It is important to note that this form is not always possible if the
roots of the denominator polynomial are not distinct. Exercises 12 and 13
below address some generalizations of the modal form.
1.6. TRANSFER FUNCTIONS & STATE SPACE MODELS 15

x1
k1 ∑

p1

x2 y
k2 ∑ ∑
u

p2

x3
k3 ∑

p3

Figure 1.10: Partial fraction expansion of a transfer function

Matlab Commands
Matlab is not well suited to nonlinear problems. However, the Matlab pro-
gram Simulink can be used for simulation of both linear and nonlinear
models. Some useful Matlab commands for system analysis of linear
systems are
RLOCUS calculates the root locus. e.g. rlocus(num,den), or rlocus(A,B,C,D).
STEP computes the step response of a linear system.
BODE computes the Bode plot.
NYQUIST produces the Nyquist plot.
TF2SS gives the CCF state space representation of a transfer function
model, but in a different form than given here.
SS2TF computes the transfer function of a model, given any state space
representation.
16 CHAPTER 1. STATE SPACE MODELS

RESIDUE may be used to obtain the partial fraction expansion of a trans-


fer function.
1.6. TRANSFER FUNCTIONS & STATE SPACE MODELS 17

Summary and References

State space models of the form

ẋ = f (x, u)
y = g(x, u)

occur naturally in the mathematical description of many physical systems,


where u is the input to the system, y is the output, and x is the state variable.
In such a state space model, each of these signals evolves in Euclidean space.
The functions f and g are often linear, so that the model takes the form

ẋ = Ax + Bu
y = Cx + Du.

If not, the functions f, g may be approximated using a Taylor series expan-


sion which again yields a linear state space model. The reader is referred to
[7] for more details on modeling from a control perspective.
A given linear model will have many different state space representa-
tions. Three methods for constructing a state space model from its transfer
function have been illustrated in this chapter:

(a) Construct a simulation diagram description, and define the outputs


of the integrators as states. This approach was used to obtain the
controllable canonical form, but it is also more generally applicable.
(b) Manipulate the transfer function to obtain a description which is more
easily described in state space form. For instance, a simple division
by sn led to the observable canonical form.
(c) Express the transfer function as a sum of simpler transfer functions –
This approach was used to obtain a modal canonical form.

These three system descriptions, the modal, controllable, and observable


canonical forms, will be applied in control analysis in later chapters.
Much more detail on the synthesis of state space models for linear sys-
tems may be found in Chapter 6 of [6], and Chapter 3 of [5].
18 CHAPTER 1. STATE SPACE MODELS

1.7 Exercises

1.7.1 You are given a nonlinear input-output system which satisfies the non-
linear differential equation:

ÿ(t) = 2y − (y 2 + 1)(ẏ + 1) + u.

(a) Obtain a nonlinear state-space representation.


(b) Linearize this system of equations around its equilibrium output
trajectory when u(·) ≡ 0, and write it in state space form.

1.7.2 Repeat Exercise 1 with the new system

ÿ(t) = 2y − (y 2 + 1)(ẏ + 1) + u + 2u̇.

1.7.3 Obtain state equations for the following circuit. For the states, use the
voltage across the capacitor, and the current through the inductor.

i L (t)
vC (t)
+ - + -
C
+ L +
u (t) u2 (t)
1
- R -

1.7.4 In each circuit below,

(a) Obtain a transfer function and a state space realization.


(b) Sketch a frequency response.
(c) Use the step command in Matlab to obtain a step response.
1.7. EXERCISES 19

+ +
R
(a) Lead network u(t) R y(t)
- -

+ +
R
(b) Lag network u(t) R y(t)
- C -

C C

+ +
R R
(c) Notch network u(t) y(t)
R/3 3C
- -

1.7.5 Consider the mass-spring system shown below

y(t)

u(t) = f(t)
k2 k1
m1 m2
k1 b

Assume that a force is acting on m1 , and let the horizontal position


of m2 represent the output of this system.

(a) Derive a set of differential equations which describes this input-


output system. To solve this problem you will require Newton’s
law of translational motion, and the following facts: (i) The force
exerted by a spring is proportional to its displacement, and (ii)
the force exerted by a frictional source is proportional to the
relative speed of the source and mass.
(b) Find the transfer function for the system.
(c) Obtain a state space description of the system.
20 CHAPTER 1. STATE SPACE MODELS

1.7.6 The n-dimensional nonlinear vector differential equation ẋ = f (x) has


a unique solution from any x ∈ Rn if the function f has continuous
partial derivatives. To see that just continuity of f is not sufficient for
uniqueness, and that some additional conditions are needed, consider
the scalar differential equation

p
ẋ = 1 − x2 , x(0) = 1.

Show that this differential equation with the given initial condition
has at least two solutions: One is x(t) ≡ 1, and another one is x(t) =
cos(t).

1.7.7 Consider a satellite in planar orbit about the earth. The situation is
modeled as a point mass m in an inverse square law force field, as
sketched below. The satellite is capable of thrusting (using gas jets,
for example) with a radial thrust u1 and a tangential (θ direction)
thrust u2 . Recalling that acceleration in polar coordinates has a radial
component (r̈ − r θ̇ 2 ), and a tangential component (r θ̈ + 2ṙ θ̇), Newton’s
Law gives

k
m(r̈ − r θ̇ 2 ) = − + u1
r2
m(r θ̈ + 2ṙ θ̇) = u2 ,

where k is a gravitational constant.

(a) Convert these equations to (nonlinear) state space form using x1 =


r, x2 = ṙ, x3 = θ, x4 = θ̇.
(b) Consider a nominal circular trajectory r(t) = r0 ; θ(t) = ω0 t, where
r0 and ω0 are constants. Using u1 (t) = u2 (t) = 0, obtain expres-
sions for the nominal state variables corresponding to the circular
trajectory. How are k, r0 , m, and ω0 related?
(c) Linearize the state equation in (a) about the state trajectory in
(b). Express the equations in matrix form in terms of r0 , ω0 and
m (eliminate k).
1.7. EXERCISES 21

r
θ

1.7.8 Using Matlab or Simulink, simulate the nonlinear model for the mag-
netically suspended ball.

(a) Using proportional feedback u = −k1 y + k2 r, can you stabilize


the ball to a given reference height r? Interpret your results by
examining a root locus diagram for the linearized system.
(b) Can you think of a better control law? Experiment with other
designs.

Transfer Functions & State Space Models


1.7.9 A SISO LTI system is described by the transfer function

s+4
G(s) =
(s + 1)(s + 2)(s + 3)

(a) Obtain a state space representation in the controllable canonical


form;
(b) Now obtain one in the observable canonical form;
(c) Use partial fractions to obtain a representation with a diagonal
state matrix A (modal form).

In each of (a)–(c) draw an all-integrator simulation diagram.


1.7.10 A SISO LTI system is described by the transfer function

s3 + 2
G(s) =
(s + 1)(s + 3)(s + 4)

(a) Obtain a state space representation in the controllable canonical


form;
(b) Now obtain one in the observable canonical form;
22 CHAPTER 1. STATE SPACE MODELS

(c) Use partial fractions to obtain a representation of this model with


a diagonal state matrix A.
In each of (a)–(c) draw an all-integrator simulation diagram.
1.7.11 For the multiple input-multiple output (MIMO) system described by
the pair of differential equations
...
y 1 + 2ẏ1 + 3y2 = u1 + ü1 + u̇2
ÿ2 − 3ẏ2 + ẏ1 + y2 + y1 = u2 + u̇3 − u3
obtain a state space realization by choosing y1 and y2 as state variables.
Draw the corresponding simulation diagram.
1.7.12 This exercise generalizes modal form to the case where some eigenval-
ues are repeated. For each of the following transfer functions, obtain a
state-space realization for the corresponding LTI system by breaking
H into simple additive terms, and drawing the corresponding simula-
tion diagrams for the sum. Choose the outputs of the integrators as
state variables.
2s2
(a) H(s) =
s3 − s2 + s − 1
s2 + s + 1
(b) H(s) = 3
s + 4s2 + 5s + 2
1.7.13 This exercise indicates that a useful generalization of the modal form
may be constructed when some eigenvalues are complex.
(a) Use partial fractions to obtain a diagonal state space representa-
tion for a SISO LTI system with transfer function
s+6
G(s) = .
s2 + 2s + 2
Note that complex gains appear in the corresponding all-integrator
diagram.
(b) Given a transfer function in the form
s+β
G(s) =
(s − λ1 )(s − λ2 )
and a corresponding state space realization with A matrix Com-
pute the eigenvalues λ1 , λ2 of the matrix
 
σ ω
A =
−ω σ
1.7. EXERCISES 23

where σ ≥ 0, ω > 0, find the relationships between λ1 , λ2 , β


and σ, ω. In view of this, complete the state space realization
by obtaining the B and C matrices, and draw the corresponding
simulation diagram.
s2 + s + 1
(c) For the transfer function H(s) = 3 , obtain a state-
s + 4s2 + 5s + 2
space realization for the corresponding LTI system by breaking
H into simple additive terms, and drawing the corresponding
simulation diagrams for the sum.
(d) Apply your answer in (b) to obtain a state space realization of
G(s) in (a) with only real coefficients.
24 CHAPTER 1. STATE SPACE MODELS
Chapter 2

Vector Spaces

Vectors and matrices, and the spaces where they belong, are fundamental to
the analysis and synthesis of multivariable control systems. The importance
of the theory of vector spaces in fact goes well beyond the subject of vectors
in finite-dimensional spaces, such as Rn . Input and output signals may be
viewed as vectors lying in infinite dimensional function spaces. A system is
then a mapping from one such vector to another, much like a matrix maps
one vector in a Euclidean space to another. Although abstract, this point of
view greatly simplifies the analysis of state space models and the synthesis
of control laws, and is the basis of much of current optimal control theory. In
this chapter we review the theory of vector spaces and matrices, and extend
this theory to the infinite-dimensional setting.

2.1 Fields
A field is any set of elements for which the operations of addition, subtrac-
tion, multiplication, and division are defined. It is also assumed that the
following axioms hold for any α, β, γ ∈ F

(a) α + β ∈ F and α · β ∈ F.
(b) Addition and multiplication are commutative:

α + β = β + α, α · β = β · α.

(c) Addition and multiplication are associative:

(α + β) + γ = α + (β + γ), (α · β) · γ = α · (β · γ).

25
26 CHAPTER 2. VECTOR SPACES

(d) Multiplication is distributive with respect to addition:

(α + β) · γ = α · γ + β · γ

(e) There exists a unique null element 0 such that 0 · α = 0 and 0 + α = α.


(f) There exists a unique identity element 1 such that 1 · α = α.
(g) For every α ∈ F there exists a unique element β ∈ F such that α + β =
0; this unique element is sometimes referred to as the additive inverse
or negative of α, and is denoted as −α.
(h) To every α ∈ F which is not the element 0 (i.e., α 6= 0), there corre-
sponds an element γ such that α · γ = 1; this element is referred to as
the multiplicative inverse of α, and is sometimes written as α−1 .

Fields are a generalization of R, the set of all real numbers. The next
example is the set of all complex numbers, denoted C. These are the only
examples of fields that will be used in the text, although we will identify
others in the exercises at the end of the chapter.

2.2 Vector Space


A set of vectors X is a set on which vector addition, and scalar multiplication
are defined, generalizing the concept of the vector space Rn . In this abstract
setting a vector space over a field F, denoted (X , F), is defined as follows:

(a) For every x1 , x2 ∈ X , the vector sum x1 + x2 ∈ X .


(b) Addition is commutative: For every x1 , x2 ∈ X , the sum x1 + x2 =
x2 + x1 .
(c) Addition is associative: For every x1 , x2 , x3 ∈ X ,

(x1 + x2 ) + x3 = x1 + (x2 + x3 )

(d) The set X contains a vector ϑ such that ϑ + x = x for all x ∈ X .


(e) For every x ∈ X , there is a vector y ∈ X such that x + y = ϑ.
(f) For every x ∈ X , α ∈ F, the scalar product α · x ∈ X .
(g) Scalar multiplication is associative: for every α, β ∈ F, and x ∈ X ,

α(βx) = (αβ)x.
2.3. BASES 27

Below is a list of some of the vector spaces which are most important in
applications

(Rn , R) – the real vector space of n dimensional real-valued vectors.

(Cn , C) – the complex vector space of n dimensional complex-valued vec-


tors.

(C n [a, b], R) – the vector space of real-valued continuous functions on the


interval [a, b], taking values in Rn .

(D n [a, b], R) – the vector space of real-valued piecewise-continuous func-


tions on the interval [a, b], taking values in Rn .

(Lnp [a, b], R) – the vector space of functions on the interval [a, b], taking
values in Rn , which satisfy the bound
Z b
|f (t)|p dt < ∞, f ∈ Lp [a, b].
a

b(s)
(R(C), R) – the vector space of rational functions a(s) of a complex variable
s, with real coefficients.

A subspace Y of a vector space X is a subset of X which is itself a


vector space with respect to the operations of vector addition and scalar
multiplication. For example, the set of complex n-dimensional vectors whose
first component is zero is a subspace of (Cn , C), but Rn is not a subspace of
(Cn , C).

2.3 Bases
A set of vectors S = (x1 , . . . , xn ) in (X , F) is said to be linearly independent
if the following equality

α1 x1 + α2 x2 + · · · + αn xn = 0

holds for a set of n elements {αi : 1 ≤ i ≤ n} ⊂ F, then α1 = α2 = · · · αn =


0. If the set S contains an infinite number of vectors, then we call S linearly
independent if every finite subset of S is linearly independent, as defined
above.
28 CHAPTER 2. VECTOR SPACES

In the case where (X , F) = (Rn , R), we will have linear independence of


{x1 , . . . , xn } if and only if
 1 
x1 xn1
 
det[x1 x2 · · · xn ] = det  ... · · · ...  6= 0.
x1n xnn

The maximum number of linearly independent vectors in (X , F) is called


the dimension of (X , F). For example, (Rn , R) and (Cn , C) both have di-
mension n. What is the dimension of (Cn , R)? (see Exercise 6).
More interesting examples can be found in function spaces. If for exam-
ple (X , F) = (C[0, 1], R), where C[0, 1] is the set of real-valued continuous
functions on [0, 1], then we can easily find a set S of infinite size which is
linearly independent. One such set is the collection of simple polynomials
S = {t, t2 , t3 , . . .}. To see that S is linearly independent, note that for any
n,
n
X
αi ti = ϑ only if αi = 0, 1 ≤ i ≤ n,
i=1

where ϑ ∈ C[0, 1] is the function which is identically zero on [0, 1]. We have
thus shown that the dimension of (C[0, 1], R) is infinite.
A set of linearly independent vectors S = {e1 , . . . , en } in (X , F) is said
to be a basis of X if every vector in X can be expressed as a unique linear
combination of these vectors. That is, for any x ∈ X , one can find {βi , 1 ≤
i ≤ n} such that

x = β1 e1 + β2 e2 + · · · + βn en .

Because the set S is linearly independent, one can show that for any vector
x, the scalars {βi } are uniquely specified in F. The n-tuple {β1 , . . . , βn } is
often called the representation of x with respect to the basis {e1 , . . . , en }.
We typically denote a vector x ∈ Rn by
 
x1
 
x =  ...  .
xn

There are two interpretations of this equation:

(a) x is a vector (in Rn ), independent of basis.


2.4. CHANGE OF BASIS 29

(b) x is a representation of a vector with respect to the natural basis:


     
1 0 0
 0   1   0 
     
     
x = x1  ...  + x2  0  + · · · + xn  ...  .
   ..   
 0   .   0 
0 0 1

The following theorem is easily proven in Rn using matrix manipulations,


and the general proof is similar:
Theorem 2.3.1. In any n-dimensional vector space, any set of n linearly
independent vectors qualifies as a basis. ⊓

In the case of Euclidean space (X , F) = (Rn , R), with {e1 , . . . , en } a
given basis, any vector x ∈ Rn may be expressed as

x = β1 e1 + · · · + βn en ,

where {βi } are all real scalars. This expression may be equivalently written
as x = Eβ, where
     
x1 e11 · · · e1n β1
 ..   .. ..  , β =  ... 
.  
x =  . , E =  . ,
xn en1 · · · enn βn

which through inversion shows that β is uniquely given by

β = E −1 x (2.1)

Here E −1 stands for the matrix inverse of E (i.e., E −1 E = EE −1 = I, where


I is the identity matrix), which exists since ei ’s are linearly independent.
Consider the numerical example with e1 = ( 11 ), e2 = ( 01 ), and x = ( 25 ).
Then we have x = β1 e1 + β2 e2 , with
   −1       
β1 1 0 2 1 0 2 2
= = =
β2 1 1 5 −1 1 5 3

2.4 Change of basis


Suppose now we are given two sets of basis vectors:

{e1 , · · · , en } {ē1 , · · · , ēn }


30 CHAPTER 2. VECTOR SPACES

A vector x in X can be represented in two possible ways, depending on


which basis is chosen:
n
X
x = β1 e1 + · · · βn en = βk ek (2.2)
k=1

or,
n
X
x = β̄1 ē1 + · · · β̄n ēn = β̄k ēk . (2.3)
k=1

Since {ei } ⊂ X , there exist scalars {pki : 1 ≤ k, i ≤ n} such that for any i,
n
X
ei = p1i ē1 + · · · + pni ēn = pki ēk .
k=1

From (2.2) it then follows that a vector x may be represented as


n n
! n n
!
X X X X
x= βi pki ēk = pki βi ēk . (2.4)
i=1 k=1 k=1 i=1

In view of (2.3) and (2.4) we have by subtraction


n
" n #
X X
pki βi − β̄k ēk = ϑ.
k=1 i=1

By linear independence of {ēk }, this implies that each coefficient in brackets


is 0. This gives a matrix relation between the coefficients {βi } and {β̄i }:
n
X
β̄k = pki βi , k = 1, . . . , n
i=1

or using compact matrix notation,


    
β̄1 p11 . . . p1n β1
 ..   ..   .. 
β̄ =  . = .   .  = P β.
β̄n pn1 . . . pnn βn

The transformation P maps F n → F n , and is one to one, and onto. It


therefore has an inverse P −1 , so that β can also be computed through β̄:

β = P −1 β̄.
2.5. LINEAR OPERATORS 31

For the special case where (X , F) = (Rn , R), the vectors {ei } can be
stacked to form a matrix to obtain as in (2.1),

x = Eβ = Ē β̄.

Hence the transformation P can be computed explicitly: β̄ = Ē −1 Eβ, so


that P = Ē −1 E. The inverse Ē −1 again exists by linear independence.

2.5 Linear Operators


A linear operator A is simply a function from one vector space (X , F) to
another (Y, F), which is linear . This means that for any scalars α1 , α2 , and
any vectors x1 , x2 ,

A(α1 x1 + α2 x2 ) = α1 A(x1 ) + α2 A(x2 ).

For a linear operator A or a general function from a set X into a set Y we


adopt the terminology

X : Domain of the mapping A


Y : Co-Domain of the mapping A

When A is applied to every x ∈ X , the resulting set of vectors in Y is


called the range (or image) of A, and is denoted by R(A):
[
R(A) := A(x)
x∈X

Pictorially, these notions are illustrated as follows:

R(A)

A(x)

X x Y

The rank of A is defined to be the dimension of R(A).


32 CHAPTER 2. VECTOR SPACES

In the special case of a linear operator A : Rn → Rm defined as A(x) =


Ax for an m × n matrix A,
 
x1
 
y = Ax = [a1 . . . an ]  ... 
xn
1 n
= x1 a + · · · + xn a ,

the range of A is the set of all possible linear combinations of the columns
{ai } of A. That is, the space spanned by the columns of A. The dimension
of R(A) is then the maximum number of linearly independent columns of
A.
Theorem 2.5.1. For a linear operator A, the set R(A) is a subspace of Y.
Proof To prove the theorem it is enough to check closure under addition
and scalar multiplication. Suppose that y 1 , y 2 ∈ R(A), and that α1 , α2 ∈ F.
Then by definition of R(A) there are vectors x1 , x2 such that

A(x1 ) = y 1 , A(x2 ) = y 2 ,

and then by linearity of the mapping A,

A(α1 x1 + α2 x2 ) = α1 A(x1 ) + α2 A(x2 ) = α1 y 1 + α2 y 2 .

Hence, α1 y 1 + α2 y 2 ∈ R(A), which establishes the desired closure property.




By assumption, the set Y contains a zero vector, which could be mapped
from numerous x ∈ X . The set of all such x ∈ X is called the nullspace of
A, denoted N (A):

N (A) := {x ∈ X such that A(x) = 0} .

Theorem 2.5.2. For any linear operator A : X → Y, the nullspace N (A)


is a subspace of X .
Proof Again we check that N (A) is closed under addition and scalar
multiplication. Suppose that x1 , x2 ∈ N (A), and that α1 , α2 ∈ F. Then by
definition,

A(x1 ) = A(x2 ) = ϑ.

Again by linearity of A it is clear that A(α1 x1 + α2 x2 ) = ϑ, which proves


the theorem. ⊓

2.6. LINEAR OPERATORS AND MATRICES 33

2.6 Linear operators and matrices


Suppose that V and W are two finite-dimensional vector spaces over the field
F, with bases {v 1 , . . . , v n } and {w1 , . . . , wm } respectively. If A : V → W,
then A may be represented by a matrix. To see this, take any v ∈ V, and
write
X n
v = αi v i
i=1

for scalars {αi } ⊂ F. By linearity we then have,


Xn  X n
A(v) = A αi v i = αi A(vi )
i=1 i=1

But for any i we have that A(v i ) ∈ W, which implies that for some scalars
{aji },
m
X
i
A(v ) = aji wj , 1 ≤ i ≤ n.
j=1

From the form of v we must therefore have


X n Xm
A(v) = αi aji wj
i=1 j=1
n X
X m 
= aji αi wj (2.5)
j=1 i=1

Recall that the vector w = A(v) in W has a unique representation


m
X
A(v) = βj wj
j=1

Consequently, the terms in parentheses in (2.5) are identical to the {βj }:


n
X
βj = aji αi j = 1, . . . , m.
i=1

From this we see how the representations of v and w are transformed through
the linear operator A:
    
β1 a11 . . . ain α1
   ..   ..  = Aα
β =  ...  =  ... .  . 
βm ami . . . amn αn
34 CHAPTER 2. VECTOR SPACES

so that A is the representation of A with respect to {v i } and {wi }.


The special case where A is a mapping of (X , F) into itself is of partic-
ular interest. A question that frequently arises is “if the linear operator is
represented by a matrix A of elements in F, how is A affected by a change
of basis?”. Let xb = A(xa ), and write
n
X n
X
xa = αi ei = ᾱi ēi
i=1 i=1
m
X m
X
xb = βi ei = β̄i ēi
i=1 i=1

where the α and β are related by

β = Aα β̄ = Āᾱ.

To see how A and Ā are related, recall that there is a matrix P such that

ᾱ = P α β̄ = P β

Combining these four equations gives

P Aα = P β = β̄ = Āᾱ = ĀP α.

Since α ∈ X is arbitrary, we conclude that P A = ĀP , and hence we can


also conclude that

Ā = P AP −1
A = P −1 ĀP

When these relationships hold, we say that the matrices A and Ā are similar .

2.7 Eigenvalues and Eigenvectors


For a linear operator A : X → X on an arbitrary vector space (X , F), a
scalar λ is called an eigenvalue of A if there exists a non-zero vector x for
which
A(x) = λx. (2.6)
The vector x in (2.6) is then called an eigenvector .
Let X = Cn , F = C, and A be a matrix representation for A. If an
eigenvalue λ of A does exist, then one may infer from the equation

(A − λI)x = 0,
2.7. EIGENVALUES AND EIGENVECTORS 35

that the matrix A − λI is singular. For nontrivial solutions, we must then


have
∆(λ) := det(λI − A) = 0. (2.7)
The function ∆( · ) is called the characteristic polynomial of the matrix A,
and (2.7) is known as the characteristic equation. The characteristic poly-
nomial is a polynomial of degree n, which must therefore have n roots. Any
root of ∆ is an eigenvalue, so at least in the case of operators on (Cn , C),
eigenvalues always exist. Note that if Ā is some other matrix representation
for A, since A and Ā are necessarily similar, Ā has the same characteris-
tic polynomial as A. Hence, the eigenvalues do not depend on the specific
representation picked.
If the roots of the characteristic polynomial are distinct, then the vector
space Cn admits a basis consisting entirely of eigenvectors:

Theorem 2.7.1. Suppose that λ1 , . . . , λn are the distinct eigenvalues of the


n × n matrix A, and let v 1 , . . . , v n be the associated eigenvectors. Then the
set {v i , i = 1 . . . n} is linearly independent over C.

When the eigenvalues of A are distinct, the modal matrix defined as

M := [v 1 . . . v n ]

is nonsingular. It satisfies the equation AM = M Λ, where


 
λ1 0
 .. 
Λ= . 
0 λn

and therefore A is similar to the diagonal matrix Λ:

Λ = M −1 AM. (2.8)

Unfortunately, not every matrix can be diagonalized, and hence the eigen-
vectors do not span Cn in general. Consider the matrix
 
1 1 2
A =  0 1 3 
0 0 2

The characteristic polynomial for A is

∆(λ) = det(λI − A) = (λ − 1)(λ − 1)(λ − 2)


36 CHAPTER 2. VECTOR SPACES

So, eigenvalues are λ1 = 1, λ2 = 1, λ3 = 2.


Considering the eigenvector equation
 
0 1 2
(A − λ1 I)x =  0 0 3  x = ϑ,
0 0 1
 
1
we see that x =  0  and its constant multiples are the only eigenvec-
0
tors associated with λ1 . It then follows that there does not exist a state
transformation P for which
 
1 0 0
Λ =  0 1 0  = P AP −1 .
0 0 2

To obtain a nearly diagonal representation of A we use generalized eigen-


vectors, defined as follows. Search for a solution to

(A − λ1 I)k x = 0

such that (A − λ1 I)k−1 x 6= 0. Then x is called a generalized eigenvector of


grade k.
Note that it then follows that the vector (A − λ1 I)x is a generalized
eigenvector of grade k − 1. In the example,
  
0 1 2 0 1 2
(A − λ1 I)2 x =  0 0 3   0 0 3  x
0 0 1 0 0 1
 
0 0 5
=  0 0 3  x.
0 0 1
 
0
So x =  1  is a generalized eigenvector of grade k.
0
Letting y = (A − λ1 I)x, we have

(A − λ1 I)y = (A − λ1 I)2 x = 0.

So y is an eigenvector to A, and x, y are linearly independent. In fact, y =


(1, 0, 0)′ is the eigenvector computed earlier. To obtain an approximately
2.8. INNER PRODUCTS 37

diagonal form, let x1 = y, x2 = x, x3 any eigenvector corresponding to


λ3 = 2. We then have

Ax1 = λ1 x1
Ax2 = Ax = y + λ1 x = x1 + λ1 x2
Ax3 = λ3 x3

Letting M = [x1 |x2 |x3 ] it follows that


        
λ1 1 0 λ1 1 0

AM = M  0 λ1 0  = M  0  M  λ1  M  0 

0 0 λ3 0 0 λ3
= MJ

where
 
λ1 1 0
J =  0 λ1 0  = M −1 AM
0 0 λ3

This representation of A with respect to a basis of generalized eigenvectors


is known as the Jordan form.

2.8 Inner Products


Inner products are frequently applied in the solution of optimization prob-
lems because they give a natural measure of distance between vectors. This
abstract notion of distance can often be interpreted as a cost in applications
to finance, or as energy in mechanical systems. Inner products also provide
a way of defining angles between vectors, and thereby introduce geometry to
even infinite dimensional models where any geometric structure is far from
obvious at first glance.
To define an inner product we restrict our attention to a vector space X
over the complex field C. An inner product is then a complex-valued function
of two vectors, denoted h · , · i, such that the following three properties hold:

(a) hx, yi = hy, xi (complex conjugate).


(b) hx, α1 y 1 +α2 y 2 i = α1 hx, y 1 i+α2 hx, y 2 i, for all x, y 1 , y 2 ∈ X , α1 , α2 ∈
C.
(c) hx, xi ≥ 0 for all x ∈ X , and hx, xi = 0 if and only if x = 0.
38 CHAPTER 2. VECTOR SPACES

In the special case where X = Cn we typically define

hx, yi = x∗ y

where x∗ denotes the complex conjugate transpose of the vector x. Another


important example is the function space Lp [a, b] with p = 2. It can be shown
that the formula
Z b
hf, gi := f (t)g(t) dt, f, g ∈ L2 [a, b],
a

defines on inner product on L2 [a, b].


The most obvious application of the inner product is in the formulation
of a norm. In general, the norm of a vector x in a vector space (X , C),
denoted kxk, is a real-valued function of a vector x such that

1. kxk ≥ 0, and kxk = 0 if and only if x = 0.

2. kαxk = |α| kxk, for any α ∈ C.

3. kx + yk ≤ kxk + kyk.

The third defining property is known as the triangle inequality.


In Rn we usually define the norm as the usual Euclidean norm,
v
u n
√ uX
kxk = x x = t
T x2i ,
i=1

which we will henceforth write as |x|, and reserve the notation k · k for norm
of an infinite-dimensional vector. This Euclidean norm can also be defined
using the inner product: p
|x| = hx, xi (2.9)
In fact, one can show that the expression (2.9) defines a norm in an arbitrary
(finite- or infinite-dimensional) inner product space.
We define the norm of a vector f ∈ Lp [a, b] as
Z b 1/p
kf kLp := |f (t)|p dt .
a

In the case p = 2, this norm is derived from the inner product on L2 [a, b],
but for general p this norm is not consistent with any inner product.
2.9. ORTHOGONAL VECTORS AND RECIPROCAL BASIS VECTORS39

2.9 Orthogonal vectors and reciprocal basis vec-


tors
Two vectors x, y in an inner product space (X , C) are said to be orthogonal
if hx, yi = 0. This concept has many applications in optimization, and
orthogonality is also valuable in computing representations of vectors. To
see the latter point, write
n
X
x= αi v i ,
i=1

where {v i , i = 1 . . . n} is a basis for (X , C). By orthogonality we then have


n
X n
X
hv j , xi = hv j , αi v i i = αi hv j , v i i j = 1 . . . n
i=1 i=1

This may be written explicitly as

hv 1 , xi = α1 hv 1 , v 1 i + α2 hv 1 , v 2 i + · · · + αn hv 1 , v n i
..
.
hv n , xi = α1 hv n , v 1 i + α2 hv 1 , v 2 i + · · · + αn hv n , v n i

or in matrix form
 1   1 1  
hv , xi hv , v i · · · hv 1 , v n i α1
 ..   ..   .. 
 .  =  .  . 
hv n , xi hv n , v 1 i · · · hv n , v n i αn
| {z }
G

The n × n matrix G is called the Grammian. Its inverse gives a formula for
the representation α:
 1 
hv , xi
 .. 
α = G−1  . 
hv n , xi
If the basis is orthogonal, then G is diagonal, in which case the computation
of the inverse G−1 is straightforward.
A basis is said to be orthonormal if

j i 1, i = j
hv , v i = δij =
0, i 6= j
40 CHAPTER 2. VECTOR SPACES

In this case G = I (the identity matrix), so that G−1 = G.


A basis {r i } is said to be reciprocal to (or dual to) the basis {v i } if

hr i , v j i = δij , i = 1, . . . , n, j = 1, . . . , n. (2.10)

If a dual basis is available, then again the representation of x with respect


to the basis {v i } is easily computed. For suppose that x is represented by
α:
n
X
x= αi v i
i=1

Then, by the dual property and linearity we have


n
X n
X
j j i
hr , xi = hr , αi v i = αi hr j , v i i.
i=1 i=1
Pn
Since hr j , v i i = δij , this shows that x = i i
i=1 hr , xiv . Of course, to to
apply this formula we must have the reciprocal basis {r i }, which may be as
difficult to find as the inverse Grammian.
In the vector space (Cn , C) define the matrices
 1∗ 
r
1 n  .. 
M = [v · · · v ] R= . 
r n∗

From the defining property (2.10) of the dual basis, we must have RM = I,
so that R = M −1 .

2.10 Adjoint transformations


Suppose that A is a linear operator from the vector space (X , C) to another
vector space (Y, C), that is, A : X → Y. Then the adjoint is a linear
operator working in the reverse direction:

A∗ : Y → X .

Its definition is subtle since it is not directly defined through A. We say


that A∗ is the adjoint of A, if for any x ∈ X and any y ∈ Y,

hA(x), yi = hx, A∗ (y)i.


2.10. ADJOINT TRANSFORMATIONS 41

To illustrate this concept, let us begin with the finite dimensional case

X = Cn , Y = Cm

and suppose that A is defined through an m × n matrix A, so that A(x) =


Ax, x ∈ X . We may then compute the adjoint using the definition of the
inner products on X and Y as follows

hA(x), yi = (Ax)∗ y = x̄T ĀT y = hx, ĀT yi

Thus, the adjoint of A is defined through the complex conjugate transpose


of A:

A∗ (y) = ĀT y = A∗ y

Matlab Commands

Matlab is virtually designed to deal with the numerical aspects of the vector
space concepts described in this chapter. Some useful commands are

INV to compute the inverse of a matrix.

DET to compute the determinant.

EIG finds eigenvalues and eigenvectors.

RANK computes the rank of a matrix.


42 CHAPTER 2. VECTOR SPACES

Summary and References

In this chapter we have provided a brief background on several different


topics in linear algebra, including

(a) Fields and vector spaces.


(b) Linear independence and bases.
(c) Representations of vectors, and how these representations change under
a change of basis.
(d) Linear operators and matrices.
(e) Inner products and norms.
(f) Adjoint operators.
(g) Eigenvectors.

Good surveys on linear algebra and matrix theory may be found in Chapter 2
of [6], or Chapters 4 and 5 of [5].
2.11. EXERCISES 43

2.11 Exercises
2.11.1 Determine conclusively which of the following are fields:
(a) The set of integers.
(b) The set of rational numbers.
(c) The set of polynomials of degree less than 3 with real coefficients.
(d) The set of all n × n nonsingular matrices.
(e) The set {0, 1} with addition being binary “exclusive-or” and mul-
tiplication being binary “and”.
2.11.2 Define rules of addition and multiplication such that the set consist-
ing of three elements {a, b, c} forms a field. Be sure to define the zero
and unit elements.
2.11.3 Let (X, F) be a vector space, and Y ⊂ X a subset of X. If Y satisfies
the closure property y1 , y2 ∈ Y, α1 , α2 ∈ F =⇒ α1 y1 + α2 y2 ∈ Y , show
carefully using the definitions that (Y, F) is a subspace of (X, F).

Linear independence and bases


2.11.4 Which of the following sets are linearly independent?
     
1 0 0
(a) 4 , 2 , 0 in (R3 , R).
2 3 1
       
1 4 7 7912
(b) 2 , 5 , 8 , −314 in (R3 , R).
3 6 9 0.098
     
1 0 j
(c) 4j , −2 , 0 in (C3 , C).
    
2 j 0
(d) sin(t), cos(t), t in (C[0, 1], R) - the set of real-valued continuous
functions on [0, 1] over the real field.
(e) ejt , sin(t), cos(t), t in (C[0, 1], C) - the set of complex-valued con-
tinuous functions on [0, 1] over the complex field.
2.11.5 Determine which of the following sets of vectors are linearly indepen-
dent in R3 by computing the determinant of an appropriate matrix.
     
1 2 0
(a) 0 , 0 , 5.
2 1 1
44 CHAPTER 2. VECTOR SPACES
     
4 1 2
(b) 5 ,  2  , 1.
1 −1 3

2.11.6 For the vector space (Cn , R),

(a) Verify that this is indeed a vector space.


(b) What is its dimension?
(c) Find a basis for (Cn , R).

2.11.7 Let R2×2 be the set of all 2 × 2 real matrices.

(a) Briefly verify that R2×2 is a vector space under the usual matrix
addition and scalar multiplication.
(b) What is the dimension of R2×2 ?
(c) Find a basis for R2×2 .

2.11.8 Is the set I, A, A2 linearly dependent or independent in (R2×2 , R),
 
1 1
with A = ?
0 2

Representations of vectors
      
 1 2 1 
2.11.9 Given the basis v 1 = 1 , v 2 = 0 , v 3 = 0 and the vector
 
1 0 1
 
3
x = 2 = α1 v 1 + α2 v 2 + α3 v 3
1

(a) Compute the reciprocal basis.


(b) Compute the Grammian.
(c) Compute the representation of x with respect to {v i } using your
answer to (a).
(d) Compute the representation of x with respect to {v i } using your
answer to (b).

Linear operators and matrices


2.11.10 Compute the null space, range space, and rank of the following
matrices.
2.11. EXERCISES 45
 
1 1 2
(a) A = 0 2 2 .
0 3 3
 
1 3 2 1
(b) A =  2 0 1 −1 .
−1 1 0 1

2.11.11 Let b ∈ Rn and A ∈ Rn×m . Give necessary and sufficient conditions


on A and b in order that the linear system of equations Ax = b has a
solution x ∈ Rm .
2.11.12 Let A : R3 → R3 be a linear operator. Consider the two sets
B = {b1 , b2 , b3 } and C = {c1 , c2 , c3 } below
            
 1 0 0   1 0 1 
B = 0 , 1 , 0 , C = 1 , 1 , 0
   
0 0 1 0 1 1

It should be clear that these are bases for R3 .

(a) Find the transformation P relating the two bases.


(b) Suppose the linear operator A maps
     
2 0 0
A(b1 ) = −1 , A(b1 ) = 0 , A(b1 ) = 4
0 0 2

Write down the matrix representation of A with respect to the


basis B and also with respect to the basis C.

2.11.13 Find the inverse of the matrix A, where B is a matrix.


 
I B
A=
0 I

2.11.14 Consider the set Pn of all polynomials of degree strictly less than
n, with real coefficients, where x ∈ Pn may be written x = a0 + a1 t +
· · · + an−1 tn−1 .

(a) Verify that Pn is a vector space, with the usual definitions of poly-
nomial addition, and scalar multiplication.
(b) Explain why {1, t, . . . , tn−1 } is a basis, and thus why Pn is n-
dimensional.
46 CHAPTER 2. VECTOR SPACES

(c) Suppose x = 10 − 2t + 2t2 − 3t3 . Find the representation α of x


with respect to the basis in (b) for n = 4.
d
(d) Consider differentiation, dt , as an operator A : Pn → Pn−1 . That
d
is, A(x) = dt x. Show that A is a linear operator, and compute
its null space and range space.
(e) Find A, the matrix representation of A for n = 4, using the basis
in (b). Use your A and α to compute the derivative of x in (c).

Inner products and norms


2.11.15 Let (V, C) be an inner product space.

(a) Let x, y ∈ V with x orthogonal to y. Prove the Pythagorean


theorem:

kx + yk2 = kxk2 + kyk2

(b) Prove that in an inner product space,

kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2

where k·k is the norm induced by the inner product. This is called
the Parallelogram law. Can you give a geometric interpretation
of this law in R2 ?

Adjoint operators
2.11.16 If A : X → Y where X and Y are inner product spaces, the adjoint
A∗ is a mapping A∗ : Y → X . Hence, the composition Z = A∗ ◦ A is
a mapping from X to itself. Prove that N (Z) = N (A).
2.11.17 ForR 1 ≤ p < ∞, let Lp denote functions f : (−∞, ∞) → C such

that −∞ |f (s)|p ds < ∞. For p = ∞, Lp denotes bounded functions
f : (−∞, ∞) → C. The set Lp is a vector space over the complex field.
Define the function A : Lp → Lp as A(f ) = a ∗ f , where “∗” denotes
convolution. We assume that for some constants C < ∞, c > 0, we
have the bound |a(t)| ≤ Ce−c|t| for all t ∈ R. This is sufficient to
ensure that A : Lp → Lp for any p.

(a) First consider the case where p = ∞, and let fω (t) = ejωt , where
ω ∈ R. Verify that fω ∈ L∞ is an eigenvector of A. What is the
corresponding eigenvalue?
2.11. EXERCISES 47

(b) In the special case p = 2, Lp is an inner product space with


Z
< f, g >L2 = f ∗ (s)g(s) ds, f, g ∈ L2 .

Compute the adjoint A∗ : L2 → L2 , and find conditions on a


under which A is self adjoint.

2.11.18 Let X = Rn with the usual inner product.


R∞ Let Y = Ln2 [0, ∞), the
set of functions f : [0, ∞) → R with 0 |f (s)|2 ds < ∞. We define
n

the inner product as before:


Z
< f, g >Y = f ⊤ (s)g(s) ds f, g ∈ Y

For an n × n Hurwitz matrix A, consider the differential equation


ẋ = Ax. By stability, for each initial condition x0 ∈ X, there exists a
unique solution x ∈ Y . Define A : X → Y to be the map which takes
the initial condition x0 ∈ X to the solution x ∈ Y .

(a) Explain why A is a linear operator.


(b) What is the null space N (A)? What is the rank of A?
(c) Compute the adjoint A∗ .

Eigenvectors
2.11.19 Find the eigenvalues of the matrix
 
1 1 0 0
0 1 0 0
A= 4 5 1

5
1 2 0 1

2.11.20 An n × n matrix A is called positive definite if it is symmetric,

A = A∗ = ĀT ,

and if for any x 6= ϑ, x ∈ Rn ,

x∗ Ax > 0.

The matrix A is positive semi-definite if the strict inequality in the


above equation is replaced by “≥”. Show that for a positive definite
matrix,
48 CHAPTER 2. VECTOR SPACES

(a) Every eigenvalue is real and strictly positive.


(b) If v 1 and v 2 are eigenvectors corresponding to different eigenvalues
λ1 and λ2 , then v 1 and v 2 are orthogonal.

2.11.21 For a square matrix X suppose that (i) all of the eigenvalues of X
are strictly positive, and (ii) the domain of X possesses an orthogonal
basis consisting entirely of eigenvectors of X. Show that X is a pos-
itive definite matrix (and hence that these two properties completely
characterize positive definite matrices).
Hint: Make a change of basis using the modal matrix M = [v 1 · · · v n ],
where {vi } is an orthonormal basis of eigenvectors.
2.11.22 Left eigenvectors {ω i } of an n × n matrix A are defined by ω i A =
λi ω i , where {ω i } are row vectors (1 × n matrices) and the {λi } are the
eigenvalues of A. Assume that the eigenvalues are distinct.

(a) How are the {ω i } related to the ordinary (right) eigenvectors of


A∗ ?
(b) How are the {ω i } related to the reciprocal (dual) eigenvectors of
A?

You might also like