Linear Systems

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

CS 70/170, Spring 2015

Professor Hany Farid


Department of Computer Science
Dartmouth College

Linear Systems
Linear algebra is a fantastic subject. On the one hand it is clean
and beautiful.
Gilbert Strang, Linear Algebra and its Applications
This wonderful branch of mathematics is both beautiful and useful, and is
the cornerstone upon which this course is built.

Vectors
At the heart of linear algebra is machinery for solving linear equations. In
the simplest case, the number of unknowns equals the number of equations.
For example, here are two equations in two unknowns:
2x y = 1
x + y = 5.

(1)

There are at least two ways in which we can think of solving these equations
for x and y. The first is to consider each equation as describing a line, with
the solution being at the intersection of the lines: in this case the point (2, 3),
Figure 1(a). This solution is termed a row solution because the equations
are considered in isolation of one another. This is in contrast to a column
solution in which the equations are rewritten in vector form:
 
 
 
2
1
1
x+
y =
.
(2)
1
1
5
1

(1,5)

y
(3,3)

2xy=1

(4,2)

(x,y)=(2,3)
(1,1)

(2,1)

x
x+y=5

(a)

(5,2,9)

(b)

(c)

Figure 1: (a) Row solution in 2-D; (b) column solution in 2-D; and (c)
column solution in 3-D.

The solution reduces to finding values for x and y that scale the vectors (2, 1)
and (1, 1) so that their sum is equal to the vector (1, 5), Figure 1(b). Of
course the solution is again x = 2 and y = 3.
These solutions generalize to higher dimensions. Here is an example with
n = 3 unknowns and equations:
2u + v + w = 5
4u 6v + 0w = 2
2u + 7v + 2w = 9.

(3)

Each equation now corresponds to a plane, and the row solution corresponds
to the intersection of the planes (i.e., the intersection of two planes is a line,
and that line intersects the third plane at a point: in this case, the point
u = 1, v = 1, w = 2). In vector form, the equations take the form:




2
1
1
5
4 u + 6 v + 0 w = 2 .
(4)
2
7
2
9
The solution again amounts to finding values for u, v, and w that scale the
vectors on the left so that their sum is equal to the vector on the right,
Figure 1(c).
In the context of solving linear equations we have introduced the notion
of a vector, scalar multiplication of a vector, and vector sum. In its most
general form, a n-dimensional column vector is represented as:

x1
x2

~x = .. ,
(5)
.
xn
2

and a n-dimensional row vector as:


~y =


y1 y2 . . . yn .

(6)

Scalar multiplication of a vector ~x by a scalar value c, scales the length of


the vector by an amount c, Figure 1(b), and is given by:

cv1
..
c~v = . .
(7)
cvn
The vector sum w
~ = ~x + ~y is computed via the parallelogram construction
or by stacking the vectors head to tail, Figure 1(b), and is computed by a
pairwise addition of the individual vector components:

w1
x1 + y1
w2
x2 + y2

(8)
.. = .. .
.
.
wn
xn + yn
The linear combination of vectors by vector addition and scalar multiplication
is one of the central ideas in linear algebra.

Matrices
In solving n linear equations in n unknowns there are three quantities to
consider. For example consider again the following set of equations:
2u + v + w = 5
4u 6v + 0w = 2
2u + 7v + 2w = 9.

(9)

On the right is the column vector:

5
2 ,
9

(10)

and on the left are the three unknowns that can also be written as a column
vector:

u
v .
(11)
w
The set of nine coefficients (3 rows, 3
form:

2
1
4 6
2 7

columns) can be written in matrix

1
0
2

(12)

Matrices, like vectors, can be added and scalar multiplied. Not surprising,
since we may think of a vector as a skinny matrix: a matrix with only one
column. Consider the following 3 3 matrix:

a1 a2 a3
A = a4 a5 a6 .
(13)
a7 a8 a9
The matrix cA, where c is a scalar value, is

ca1 ca2
cA = ca4 ca5
ca7 ca8

given by:

ca3
ca6 .
ca9

And the sum of two matrices, A = B + C,

a1 a2 a3
b1 + c 1
a4 a5 a6 = b4 + c4
a7 a8 a9
b7 + c 7

b2 + c 2 b3 + c 3
b5 + c 5 b6 + c 6 .
b8 + c 8 b9 + c 9

(14)

is given by:
(15)

With the vector and matrix notation we can rewrite the three equations
in the more compact form of A~x = ~b:



2
1 1
u
5
4 6 0 v = 2 .
(16)
2 7 2
w
9
Where the multiplication of the matrix A with vector ~x must be such that the
three original equations are reproduced. The first component of the product
4

comes from multiplying the first row of A (a row vector) with the column
vector ~x as follows:

 u

2 1 1 v = 2u + 1v + 1w .
(17)
w
This quantity is equal to 5, the first component of ~b, and is simply the first
of the three original equations. The full product is computed by multiplying
each row of the matrix A with the vector ~x as follows:



2
1 1
u
2u + 1v + 1w
5
4 6 0 v = 4u 6v + 0w = 2 .
(18)
2 7 2
w
2u + 7v + 2w
9
In its most general form the product of a m n matrix with a n dimensional column vector is a m dimensional column vector whose ith component
is:
n
X

aij xj ,

(19)

j=1

where aij is the matrix component in the ith row and j th column. The sum
along the ith row of the matrix is referred to as the inner product or dot
product between the matrix row (itself a vector) and the column vector ~x.
Inner products are another central idea in linear algebra. The computation
for multiplying two matrices extends naturally from that of multiplying a
matrix and a vector. Consider for example the following 3 4 and 4 2
matrices:

b11 b12
a11 a12 a13 a14
b21 b22

A = a21 a22 a23 a24 and B =


(20)
b31 b32 .
a31 a32 a33 a34
b41 b42
The product C = AB is a 3 2 matrix given by:
a11 b11 + a12 b21 + a13 b31 + a14 b41
a21 b11 + a22 b21 + a23 b31 + a24 b41
a31 b11 + a32 b21 + a33 b31 + a34 b41

a11 b12 + a12 b22 + a13 b32 + a14 b42


a21 b12 + a22 b22 + a23 b32 + a24 b42
a31 b12 + a32 b22 + a33 b32 + a34 b42

!
.

(21)

That is, the i, j component of the product C is computed from an inner


product of the ith row of matrix A and the j th column of matrix B. Notice
that this definition is completely consistent with the product of a matrix
and vector. In order to multiply two matrices A and B (or a matrix and a
vector), the column dimension of A must equal the row dimension of B. In
other words if A is of size mn, then B must be of size np (the product is of
size m p). This constraint immediately suggests that matrix multiplication
is not commutative: usually AB 6= BA. However matrix multiplication is
both associative (AB)C = A(BC) and distributive A(B + C) = AB + AC.
The identity matrix I is a special matrix with 1 on the diagonal and zero
elsewhere:

1 0 ... 0 0
0 1 . . . 0 0

(22)
I = ..
.. .
.
.
.
.
.
0 0 ... 0 1
Given the definition of matrix multiplication, it is easily seen that for any
vector ~x, I~x = ~x, and for any suitably sized matrix, IA = A and BI = B.
In the context of solving linear equations we have introduced the notion
of a vector and a matrix. The result is a compact notation for representing
linear equations, A~x = ~b. Multiplying both sides by the matrix inverse A1
yields the desired solution to the linear equations:
A1 A~x = A1~b
I~x = A1~b
~x = A1~b
(23)
A matrix A is invertible if there exists1 a matrix B such that BA = I and
AB = I, where I is the identity matrix. The matrix B is the inverse of
A and is denoted as A1 . Note that this commutative property limits the
discussion of matrix inverses to square matrices.
Not all matrices have inverses. Lets consider some
simple examples. The

1
inverse of a 1 1 matrix A = a is A = 1/a ; but the inverse does not
exist when a = 0. The inverse of a 2 2 matrix can be calculated as:



1
1
d b
a b
,
(24)
=
c d
ad bc c a
1

The inverse of a matrix is unique: assume that B and C are both the inverse of matrix
A, then by definition B = B(AC) = (BA)C = C, so that B must equal C.

but does not exist when ad bc = 0. Any diagonal matrix is invertible:

a1
1/a1

1
..
..
A=
(25)
and A =
,
.
.
an
1/an
as long as all the diagonal components are non-zero. The inverse of a product of matrices AB is (AB)1 = B 1 A1 . This is easily proved using the
associativity of matrix multiplication.2 The inverse of an arbitrary matrix,
if it exists, can itself be calculated by solving a collection of linear equations.
Consider for example a 3 3 matrix A whose inverse we know must satisfy
the constraint that AA1 = I:


1 0 0
2
1 1
4 6 0 ~x1 ~x2 ~x3 = ~e1 ~e2 ~e3 = 0 1 0 .
(26)
0 0 1
2 7 2
This matrix equation can be considered a column at a time yielding a system
of three equations Ax~1 = e~1 , Ax~2 = e~2 , and Ax~3 = e~3 . These can be solved
independently for the columns of the inverse matrix, or simultaneously using
the Gauss-Jordan method.
A system of linear equations A~x = ~b can be solved by simply left multiplying with the matrix inverse A1 (if it exists). We must naturally wonder the
fate of our solution if the matrix is not invertible. The answer to this question
is explored later. But before moving forward we need one last definition.
The transpose of a matrix A, denoted as AT , is constructed by placing
the ith row of A into the ith column of AT . For example:



1 4
1 2 1
and AT = 2 6 .
A=
(27)
4 6 0
1 0
In general, the transpose of a m n matrix is a n m matrix with (AT )ij =
Aji . The transpose of a sum of two matrices is the sum of the transposes:
(A + B)T = AT + B T . The transpose of a product of two matrices has
the familiar form (AB)T = B T AT . And the transpose of the inverse is the
In order to prove (AB)1 = B 1 A1 , we must show (AB)(B 1 A1 ) = I:
(AB)(B 1 A1 ) = A(BB 1 )A1 = AIA1 = AA1 = I, and that (B 1 A1 )(AB) = I:
(B 1 A1 )(AB) = B 1 (A1 A)B = B 1 IB = B 1 B = I.
2

inverse of the transpose: (A1 )T = (AT )1 . Of particular interest will be the


class of symmetric matrices that are equal to their own transpose AT = A.
Symmetric matrices are necessarily square, here is a 3 3 symmetric matrix:

2 1 4
A = 1 6 0 ,
(28)
4 0 3
notice that, by definition, aij = aji .

Vector Spaces
The most common vector space is that defined over the reals, denoted as Rn .
This space consists of all column vectors with n real-valued components,
with rules for vector addition and scalar multiplication. A vector space has
the property that the addition and multiplication of vectors always produces
vectors that lie within the vector space. In addition, a vector space must
satisfy the following properties, for any vectors ~x, ~y , ~z, and scalar c:
1.
2.
3.
4.
5.
6.
7.
8.

~x + ~y = ~y + ~x
(~x + ~y ) + ~z = ~x + (~y + ~z)
there exists a unique zero vector ~0 such that ~x + ~0 = ~x
there exists a unique inverse vector ~x such that
~x + (~x) = ~0
1~x = ~x
(c1 c2 )~x = c1 (c2~x)
c(~x + ~y ) = c~x + c~y
(c1 + c2 )~x = c1~x + c2~x

Vector spaces need not be finite dimensional, R is a vector space. Matrices


can also make up a vector space. For example the space of 3 3 matrices
can be thought of as R9 (imagine stringing out the nine components of the
matrix into a column vector).
A subspace of a vector space is a non-empty subset of vectors that is closed
under vector addition and scalar multiplication. That is, the following constraints are satisfied: (1) the sum of any two vectors in the subspace remains
in the subspace; (2) multiplication of any vector by a scalar yields a vector
in the subspace. With the closure property verified, the eight properties of a
vector space automatically hold for the subspace.
8

Example: Consider the set of all vectors in R2 whose components are greater than or equal to zero. The sum of any two
vectors in this space remains
of,
  in the space, but multiplication
 
1
1
for example, the vector
by 1 yields the vector
which
2
2
is no longer in the space. Therefore, this collection of vectors does
not form a vector space.
Vector subspaces play a critical role in understanding systems of linear
equations of the form A~x = ~b. Consider for example the following system:


u1 v1  
b1
u2 v2 x1
= b2 .
(29)
x2
u3 v3
b3
Unlike the earlier system of equations, this system is over-constrained, there
are more equations (three) than unknowns (two). A solution to this system
exists if the vector ~b lies in the subspace of the columns of matrix A. To see
why this is so, we rewrite the above system according to the rules of matrix
multiplication yielding an equivalent form:



u1
v1
b1

b2 .
x 1 u 2 + x2 v 2
=
(30)
u3
v3
b3
In this form, we see that a solution exists when the scaled columns of the
matrix sum to the vector ~b. This is simply the closure property necessary for
a vector subspace.
The vector subspace spanned by the columns of the matrix A is called
the column space of A. It is said that a solution to A~x = ~b exists if and only
if the vector ~b lies in the column space of A.
Example: Consider the following over-constrained system:
~
A~x = b
1 0  
b1
5 4 u
= b2
v
2 4
b3

The column space of A is the plane spanned by the vectors


T
T
1 5 2 and 0 4 4 . Therefore, the solution ~b cannot be
9

an arbitrary vector in R3 , but is constrained to lie in the plane


spanned by these two vectors.
At this point we have seen three seemingly different classes of linear equations of the form A~x = ~b, where the matrix A is either:
1. square and invertible (non-singular),
2. square but not invertible (singular),
3. over-constrained.
In each case solutions to the system exist if the vector ~b lies in the column
space of the matrix A. At one extreme is the invertible n n square matrix
whose solutions may be any vector in the whole of Rn . At the other extreme
is the zero matrix A = 0 with only the zero vector in its column space,
and hence the only possible solution. In between are the singular and overconstrained cases, where solutions lie in a subspace of the full vector space.
The second important vector space is the nullspace of a matrix. The
vectors that lie in the nullspace of a matrix consist of all solutions to the
system A~x = ~0. The zero vector is always in the nullspace.
Example: Consider the following system:
x = ~
0

A~

0
1 0 1
u
5 4 9 v = 0 .
0
2 4 6
w
T
T
The nullspace of A contains the zero vector u v w = 0 0 0 .
Notice also that the third column of A is the sum of the first two
columns, therefore the nullspace of A also contains all vectors of
T
T
the form u v w = c c c (i.e., all vectors lying on a
one-dimensional line in R3 ).

Basis
Recall that if the matrix A in the system A~x = ~b is invertible, then left
multiplying with A1 yields the desired solution: ~x = A1~b. In general it is
said that a n n matrix is invertible if it has rank n or is full rank, where the
10

rank of a matrix is the number of linearly independent rows in the matrix.


Formally, a set of vectors ~u1 , ~u2 , ..., ~un are linearly independent if:
c1~u1 + c2~u2 + ... + cn~un = ~0

(31)

is true only when c1 = c2 = ... = cn = 0. Otherwise, the vectors are linearly


dependent. In other words, a set of vectors are linearly dependent if at least
one of the vectors can be expressed as a sum of scaled copies of the remaining
vectors.
Linear independence is easy to visualize in lowerdimensional subspaces. In 2-D, two vectors are lin(2,2)
early dependent if they lie along a line, as shown
on the right. That is, there is a non-trivial combination of the vectors that yields the zero vec(1,1)
tor. In 2-D, any three vectors are guaranteed to
be linearly dependent. For example, as shown on
the right, the vector 1 2 can be expressed as a
sum of the
 remaining
 linearly independent vectors:
(2,2)
3
2
0
2
2
+
. In 3-D, three vectors are lin2
early dependent if they lie in the same plane. Also
in 3-D, any four vectors are guaranteed to be lin(2,0)
early dependent.
Linear independence is directly related to the
nullspace of a matrix. Specifically, the columns of
a matrix are linearly independent (i.e., the matrix (1,2)
(2,2)
is full rank) if the matrix nullspace contains only
the zero vector. For example, consider the following
system of linear equations:
(2,0)



u1 v1 w1
x1
0
u2 v2 w2 x2 = 0 .
(32)
u3 v3 w3
x3
0
Recall that the nullspace contains all vectors ~x such that x1~u +x2~v +x3 w
~ = 0.
Notice that this is also the condition for linear independence. If the only
solution is the zero vector then the vectors are linearly independent and the
matrix is full rank and invertible.
Linear independence is also related to the column space of a matrix. If
the column space of a n n matrix is all of Rn , then the matrix is full rank.
11

For example, consider the

u1
u 2
u3

following system of linear equations:




v1 w 1
x1
b1
v 2 w 2 x 2 = b2 .
v3 w3
x3
b3

(33)

If the the matrix is full rank, then the solution ~b can be any vector in R3 . In
such cases, the vectors ~u, ~v , w
~ are said to span the space.
Now, a linear basis of a vector space is a set of linearly independent vectors
that span the space. Both conditions are important. Given an n dimensional
vector space with n basis vectors ~v1 , ..., ~vn , any vector ~u in the space can be
written as a linear combination of these n vectors:
~u = a1 v~1 + ... + an v~n .

(34)

In addition, the linear independence guarantees that this linear combination


is unique. If there is another combination such that:
~u = b1 v~1 + ... + bn v~n ,

(35)

then the difference of these two representations yields


~0 = (a1 b1 )v~1 + ... + (an bn )v~n
= c1 v~1 + ... + cn v~n

(36)

which would violate the linear independence condition. While the representation is unique, the basis is not. A vector space has infinitely many different
bases. For example in R2 any two vectors that do not lie on a line form a
basis, and in R3 any three vectors that do not lie in a common plane or line
form a basis.


Examples: (1) The vectors 1 0 and 0 1 form the canonical
basis for R2 . These vectors are both linearly independent
and

span the entirevector space; (2) The vectors 1 0 0 , 0 1 0
and 1 0 0 do not form a basis for R3 . These vectors lie in
a 2-D plane and do not span the entire vector
space; and (3) The

vectors 1 0 0 , 0 1 0 , 0 0 2 , and 1 1 0 do not
form a basis. Although these vectors span the vector space, the
fourth vector is linearly dependent on the first two. Removing
the fourth vector leaves a basis for R3 .
12

You might also like