Matrix Algebra For Engineers: Jeffrey R. Chasnov
Matrix Algebra For Engineers: Jeffrey R. Chasnov
Matrix Algebra For Engineers: Jeffrey R. Chasnov
Jeffrey R. Chasnov
The Hong Kong University of Science and Technology
Department of Mathematics
Clear Water Bay, Kowloon
Hong Kong
Copyright ○
c 2018–2022 by Jeffrey Robert Chasnov
This work is licensed under the Creative Commons Attribution 3.0 Hong Kong License. To view
a copy of this license, visit http://creativecommons.org/licenses/by/3.0/hk/ or send a letter to
Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Preface
View the promotional video on YouTube
These are my lecture notes for my online Coursera course, Matrix Algebra for Engineers.
I have divided these notes into chapters called Lectures, with each Lecture corresponding
to a video on Coursera. I have also uploaded all my Coursera videos to YouTube, and
links are placed at the top of each Lecture.
There are problems at the end of each lecture chapter and I have tried to choose prob-
lems that exemplify the main idea of the lecture. Students taking a formal university
course in matrix or linear algebra will usually be assigned many more additional prob-
lems, but here I follow the philosophy that less is more. I give enough problems for
students to solidify their understanding of the material, but not too many problems that
students feel overwhelmed and drop out. I do encourage students to attempt the given
problems, but if they get stuck, full solutions can be found in the Appendix.
There are also additional problems at the end of coherent sections that are given as
practice quizzes on the Coursera platform. Again, students should attempt these quizzes
on the platform, but if a student has trouble obtaining a correct answer, full solutions are
also found in the Appendix.
The mathematics in this matrix algebra course is at the level of an advanced high school
student, but typically students would take this course after completing a university-level
single variable calculus course. There are no derivatives and integrals in this course, but
student’s are expected to have a certain level of mathematical maturity. Nevertheless,
anyone who wants to learn the basics of matrix algebra is welcome to join.
Jeffrey R. Chasnov
Hong Kong
July 2018
Contents
I Matrices 1
1 Definition of a matrix 2
3 Special matrices 6
4 Transpose matrix 9
6 Inverse matrix 13
7 Orthogonal matrices 16
8 Rotation matrices 18
9 Permutation matrices 20
10 Gaussian elimination 25
12 Computing inverses 30
13 Elementary matrices 34
14 LU decomposition 36
15 Solving (LU)x = b 38
iv
CONTENTS v
16 Vector spaces 44
17 Linear independence 46
19 Gram-Schmidt process 52
21 Null space 58
23 Column space 63
25 Orthogonal projections 69
29 Laplace expansion 81
30 Leibniz formula 84
31 Properties of a determinant 86
35 Matrix diagonalization 98
Matrices
In this week’s lectures, we learn about matrices. Matrices are rectangular arrays of numbers
or other mathematical objects and are fundamental to engineering mathematics. We will define
matrices and how to add and multiply them, discuss some special matrices such as the identity and
zero matrix, learn about transposes and inverses, and define orthogonal and permutation matrices.
1
Lecture 1 | Definition of a matrix
View this lecture on YouTube
An m-by-n matrix is a rectangular array of numbers (or other mathematical objects) with
m rows and n columns. For example, a two-by-two matrix A, with two rows and two
columns, looks like !
a b
A= .
c d
The first row has elements a and b, the second row has elements c and d. The first column
has elements a and c; the second column has elements b and d. As further examples,
two-by-three and three-by-two matrices look like
! a d
a b c
B= , C = b e .
d e f
c f
Of special importance are column matrices and row matrices. These matrices are also
called vectors. The column vector is in general n-by-one and the row vector is one-by-n.
For example, when n = 3, we would write a column vector as
a
x = b ,
Here, the matrix element of A in the ith row and the jth column is denoted as aij .
2
WEEK I. MATRICES 3
a) Write down the three-by-three matrix with ones on the diagonal and zeros else-
where.
b) Write down the three-by-four matrix with ones on the diagonal and zeros elsewhere.
c) Write down the four-by-three matrix with ones on the diagonal and zeros elsewhere.
Matrices can be added only if they have the same dimension. Addition proceeds element
by element. For example,
! ! !
a b e f a+e b+ f
+ = .
c d g h c+g d+h
Matrices can also be multiplied by a scalar. The rule is to just multiply every element of
the matrix. For example, ! !
a b ka kb
k = .
c d kc kd
Matrices (other than the scalar) can be multiplied only if the number of columns of the
left matrix equals the number of rows of the right matrix. In other words, an m-by-n
matrix on the left can only be multiplied by an n-by-k matrix on the right. The resulting
matrix will be m-by-k. Evidently, matrix multiplication is generally not commutative. We
illustrate multiplication using two 2-by-2 matrices:
! ! ! ! ! !
a b e f ae + bg a f + bh e f a b ae + c f be + d f
= , = .
c d g h ce + dg c f + dh g h c d ag + ch bg + dh
First, the first row of the left matrix is multiplied against and summed with the first
column of the right matrix to obtain the element in the first row and first column of the
product matrix. Second, the first row is multiplied against and summed with the second
column. Third, the second row is multiplied against and summed with the first column.
And fourth, the second row is multiplied against and summed with the second column.
In general, an element in the resulting product matrix, say in row i and column j, is
obtained by multiplying and summing the elements in row i of the left matrix with the
elements in column j of the right matrix. We can formally write matrix multiplication in
terms of the matrix elements. Let A be an m-by-n matrix with matrix elements aij and let
B be an n-by-p matrix with matrix elements bij . Then C = AB is an m-by-p matrix, and its
ij matrix element can be written as
n
cij = ∑ aik bkj .
k =1
Notice that the second index of a and the first index of b are summed over.
4
WEEK I. MATRICES 5
1 3 4 0 0 4
4. Prove the associative law for matrix multiplication. That is, let A be an m-by-n matrix,
B an n-by-p matrix, and C a p-by-q matrix. Then prove that A(BC) = (AB)C.
The zero matrix, denoted by 0, can be any size and is a matrix consisting of all zero
elements. Multiplication by a zero matrix results in a zero matrix. The identity matrix,
denoted by I, is a square matrix (number of rows equals number of columns) with ones
down the main diagonal. If A and I are the same sized square matrices, then
AI = IA = A,
and multiplication by the identity matrix leaves the matrix unchanged. The zero and
identity matrices play the role of the numbers zero and one in matrix multiplication. For
example, the two-by-two zero and identity matrices are given by
! !
0 0 1 0
0= , I= .
0 0 0 1
A diagonal matrix has its only nonzero elements on the diagonal. For example, a two-by-
two diagonal matrix is given by !
d1 0
D= .
0 d2
Usually, diagonal matrices refer to square matrices, but they can also be rectangular.
A band (or banded) matrix has nonzero elements only on diagonal bands. For ex-
ample, a three-by-three band matrix with nonzero diagonals one above and one below a
nonzero main diagonal (called a tridiagonal matrix) is given by
d1 a1 0
B = b1 d2 a2 .
0 b2 d3
An upper or lower triangular matrix is a square matrix that has zero elements below or
above the diagonal. For example, three-by-three upper and lower triangular matrices are
given by
a b c a 0 0
U = 0 d e , L = b d 0 .
0 0 f c e f
6
WEEK I. MATRICES 7
Construct a two-by-two matrix B such that AB is the zero matrix. Use two different
nonzero columns for B.
2. Verify that ! ! !
a1 0 b1 0 a1 b1 0
= .
0 a2 0 b2 0 a2 b2
Prove in general that the product of two diagonal matrices is a diagonal matrix, with
elements given by the product of the diagonal elements.
3. Verify that ! ! !
a1 a2 b1 b2 a1 b1 a1 b2 + a2 b3
= .
0 a3 0 b3 0 a3 b3
Prove in general that the product of two upper triangular matrices is an upper triangular
matrix, with the diagonal elements of the product given by the product of the diagonal
elements.
3. Let A and B be n-by-n matrices with (AB)ij = ∑nk=1 aik bkj . If A and B are upper trian-
gular matrices, then aik = 0 or bkj = 0 when
a) A and C only
b) A and D only
c) B and C only
d) B and D only
8
Lecture 4 | Transpose matrix
View this lecture on YouTube
A less obvious fact is that the transpose of the product of matrices is equal to the product
of the transposes with the order of multiplication reversed, i.e.,
(AB)T = BT AT .
c e f −c −e 0
9
WEEK I. MATRICES 10
2. Show using the transpose operator that any square matrix A can be written as the sum
of a symmetric and a skew-symmetric matrix.
The inner product (or dot product or scalar product) between two vectors is obtained from
the matrix product of a row vector times a column vector. A row vector can be obtained
from a column vector by the transpose operator. With the 3-by-1 column vectors u and v,
their inner product is given by
v1
uT v = u 1 u2 u3 v2 = u1 v1 + u2 v2 + u3 v3 .
v3
If the inner product between two nonzero vectors is zero, we say that the vectors are
orthogonal. The norm of a vector is defined by
1/2 1/2
||u|| = uT u = u21 + u22 + u23 .
If the norm of a vector is equal to one, we say that the vector is normalized. If a set of vectors
are mutually orthogonal and normalized, we say that these vectors are orthonormal.
An outer product is also defined, and is used in some applications. The outer product
between u and v is given by
u1 u1 v1 u1 v2 u1 v3
uvT = u2 v1 v2 v3 = u2 v1 u2 v2 u2 v3 .
u3 u3 v1 u3 v2 u3 v3
Notice that every column is a multiple of the single vector u, and every row is a multiple
of the single vector vT .
11
WEEK I. MATRICES 12
c f
is a symmetric square matrix and that the sum of its diagonal elements is the sum of the
squares of all the elements of A.
2. The trace of a square matrix B, denoted as Tr B, is the sum of the diagonal elements of
B. Prove that Tr(AT A) is the sum of the squares of all the elements of A.
Square matrices may have inverses. When a matrix A has an inverse, we say it is in-
vertible and denote its inverse by A−1 . The inverse matrix satisfies
AA−1 = A−1 A = I.
If A and B are invertible matrices, then (AB)−1 = B−1 A−1 . Furthermore, if A is invertible
then so is AT , and (AT )−1 = (A−1 )T .
It is illuminating to derive the inverse of a general 2-by-2 matrix. Write
! ! !
a b x1 x2 1 0
= ,
c d y1 y2 0 1
and try to solve for x1 , y1 , x2 and y2 in terms of a, b, c, and d. There are two inhomoge-
neous and two homogeneous linear equations:
To solve, we can eliminate y1 and y2 using the two homogeneous equations, and find x1
and x2 using the two inhomogeneous equations. The solution for the inverse matrix is
found to be ! −1 !
a b 1 d −b
= .
c d ad − bc −c a
The term ad − bc is just the definition of the determinant of the two-by-two matrix:
!
a b
det = ad − bc.
c d
The determinant of a two-by-two matrix is the product of the diagonals minus the prod-
uct of the off-diagonals. Evidently, a two-by-two matrix A is invertible only if det A ̸= 0.
Notice that the inverse of a two-by-two matrix, in words, is found by switching the di-
agonal elements of the matrix, negating the off-diagonal elements, and dividing by the
determinant.
Later, we will show that an n-by-n matrix is invertible if and only if its determinant is
nonzero. This will require a more general definition of the determinant.
13
WEEK I. MATRICES 14
2. Prove that if A and B are same-sized invertible matrices , then (AB)−1 = B−1 A−1 .
5. Consider the parallelogram constructed by the two lines drawn from the origin to the
points ( a, b) and (c, d), as drawn in the figure.
Show that the area of the parallelogram is given by the absolute value of the determinant
!
a b
Area = det .
c d
a) AT BT CT
b) AT CT BT
c) CT AT BT
d) CT BT AT
a) A + AT
b) AAT
c) A − AT
d) AT A
!
2 2
3. Which matrix is the inverse of ?
1 2
!
1 −2
2
a)
2 −1 2
!
1 −2 2
b)
2 1 −2
!
1 2 2
c)
2 −1 −2
!
1 −2 −2
d)
2 1 2
15
Lecture 7 | Orthogonal matrices
View this lecture on YouTube
Q − 1 = QT
QQT = I and QT Q = I.
where q1 and q2 are the two-by-one column vectors of the matrix Q. Then
! !
qT1 qT1 q1 qT1 q2
QT Q = q1 q2 = .
qT2 qT2 q1 qT2 q2
That is, the columns of Q form an orthonormal set of vectors. The same argument can
also be made for the rows of Q.
Therefore, an equivalent definition of an orthogonal matrix is a square matrix with
real entries whose columns (and also rows) form a set of orthonormal vectors.
There is a third equivalent definition of an orthogonal matrix. Let Q be an n-by-n
orthogonal matrix, and let x be an n-by-one column vector. Then the length squared of
the vector Qx is given by
The length of Qx is therefore equal to the length of x, and we say that an orthogonal
matrix is a matrix that preserves lengths. In the next lecture, an example of an orthogonal
matrix will be the matrix that rotates a two-dimensional vector in the plane.
16
WEEK I. MATRICES 17
A matrix that rotates a vector in space doesn’t change the vector’s length and so should
be an orthogonal matrix. Consider the two-by-two rotation matrix that rotates a vector
through an angle θ in the x-y plane, shown above. Trigonometry and the addition formula
for cosine and sine results in
x ′ = r cos (θ + ψ) y′ = r sin (θ + ψ)
= r (cos θ cos ψ − sin θ sin ψ) = r (sin θ cos ψ + cos θ sin ψ)
= x cos θ − y sin θ = x sin θ + y cos θ.
The above two-by-two matrix is a rotation matrix and we will denote it by Rθ . Observe
that the rows and columns of Rθ are orthonormal and that the inverse of Rθ is just its
transpose. The inverse of Rθ rotates a vector by −θ.
18
WEEK I. MATRICES 19
2. Find the three-by-three matrix that rotates a three-dimensional vector an angle θ coun-
terclockwise around the z-axis.
The rows of a three-by-three matrix have 3! = 6 possible permutations, namely {1, 2, 3},
{1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}. For example, the row permutation {3, 1, 2} is
achieved by
0 0 1 a b c g h i
1 0 0 d e f = a b c .
0 1 0 g h i d e f
Notice that the permutation matrix is obtained by permuting the corresponding rows of
the identity matrix, with the rows of the identity matrix permuted as {1, 2, 3} → {3, 1, 2}.
That a permutation matrix is just a row-permuted identity matix is made evident by
writing
PA = (PI)A,
where P is a permutation matrix and PI is the identity matrix with permuted rows. The
identity matrix is orthogonal, and so is the permutation matrix obtained by permuting
the rows of the identity matrix.
20
WEEK I. MATRICES 21
2. Find the inverses of all the three-by-three permutation matrices. Explain why some
matrices are their own inverses, and others are not.
0 sin θ cos θ
sin θ 0 cos θ
b) 0 1 0
cos θ 0 − sin θ
cos θ − sin θ 0
c) sin θ cos θ 0
0 0 1
cos θ sin θ 0
d) − sin θ cos θ 0
0 0 1
22
WEEK I. MATRICES 23
3. Which matrix, when left multiplying another matrix, moves row one to row two, row
two to row three, and row three to row one?
0 1 0
a) 0 0 1
1 0 0
0 0 1
b) 1 0 0
0 1 0
0 0 1
c) 0 1 0
1 0 0
1 0 0
d) 0 0 1
0 1 0
In this week’s lectures, we learn about solving a system of linear equations. A system of linear
equations can be written in matrix form, and we can solve using Gaussian elimination. We will
learn how to bring a matrix to reduced row echelon form, and how this can be used to compute a
matrix inverse. We will also learn how to find the LU decomposition of a matrix, and how to use
this decomposition to efficiently solve a system of linear equations.
24
Lecture 10 | Gaussian elimination
View this lecture on YouTube
3 −4 4 x3 −6
or symbolically as Ax = b.
The standard numerical algorithm used to solve a system of linear equations is called
Gaussian elimination. We first form what is called an augmented matrix by combining the
matrix A with the column vector b:
−3 2 −1 −1
6 −6 7 −7 .
3 −4 4 −6
Row reduction is then performed on this augmented matrix. Allowed operations are (1)
interchange the order of any rows, (2) multiply any row by a constant, (3) add a multiple
of one row to another row. These three operations do not change the solution of the
original equations. The goal here is to convert the matrix A into upper-triangular form,
and then use this form to quickly solve for the unknowns x.
We start with the first row of the matrix and work our way down as follows. First we
multiply the first row by 2 and add it to the second row. Then we add the first row to the
third row, to obtain
−3 2 −1 −1
0 −2 5 −9 .
0 −2 3 −7
We then go to the second row. We multiply this row by −1 and add it to the third row to
obtain
−3 2 −1 −1
0 −2 5 −9 .
0 0 −2 2
The original matrix A has been converted to an upper triangular matrix, and the trans-
25
WEEK II. SYSTEMS OF LINEAR EQUATIONS 26
These equations can be solved by back substitution, starting from the last equation and
working backwards. We have
x3 = −1,
1
x2 = − (−9 − 5x3 ) = 2,
2
1
x1 = − (−1 + x3 − 2x2 ) = 2.
3
x3 −1
When performing Gaussian elimination, the matrix element that is used during the elim-
ination procedure is called the pivot. To obtain the correct multiple, one uses the pivot
as the divisor to the matrix elements below the pivot. Gaussian elimination in the way
done here will fail if the pivot is zero. If the pivot is zero, a row interchange must first be
performed.
Even if no pivots are identically zero, small values can still result in an unstable nu-
merical computation. For very large matrices solved by a computer, the solution vector
will be inaccurate unless row interchanges are made. The resulting numerical technique
is called Gaussian elimination with partial pivoting, and is usually taught in a standard
numerical analysis course.
WEEK II. SYSTEMS OF LINEAR EQUATIONS 27
(a)
(b)
x1 − 2x2 + 3x3 = 1,
− x1 + 3x2 − x3 = −1,
2x1 − 5x2 + 5x3 = 1.
A matrix is said to be in reduced row echelon form if the first nonzero entry in every
row is a one, all the entries below and above this one are zero, and any zero rows occur
at the bottom of the matrix.
The row elimination procedure of Gaussian elimination can be continued to bring a
matrix to reduced row echelon form. We notate the reduced row echelon form of a matrix
A as rref(A). For example, consider the three-by-four matrix
1 2 3 4
A = 4 5 6 7 .
6 7 8 9
6 7 8 9 0 −5 −10 −15 0 1 2 3 0 0 0 0
0 0 0 0
We say that the matrix A has two pivot columns, that is, two columns that contain a pivot
position with a one in the reduced row echelon form.
Note that rows may need to be exchanged when computing the reduced row echelon
form. Also, the reduced row echelon form of a matrix A is unique, and if A is a square
invertible matrix, then rref(A) is the identity matrix.
28
WEEK II. SYSTEMS OF LINEAR EQUATIONS 29
(a)
−7 −2 −7
3
A = −3 5 1 5
6 −4 0 2
(b)
1 2 1
A = 2 4 1
3 6 2
By bringing an invertible matrix to reduced row echelon form, that is, to the identity
matrix, we can compute the matrix inverse. Given a matrix A, consider the equation
AA−1 = I,
for the unknown inverse A−1 . Let the columns of A−1 be given by the vectors a1−1 , a2−1 ,
and so on. The matrix A multiplying the first column of A−1 is the equation
T
Aa1−1 = e1 , with e1 = 1 0 ... 0 ,
Aai−1 = ei ,
3 −4 4 0 0 1 0 −2 3 1 0 1
−3 2−1 1 0 0 −3 0 4 3 1 0
0 −2 5 2 1 0 → 0 −2 5 2 1 0 →
0 0 −2 −1 −1 1 0 0 −2 −1 −1 1
−3 0 0 −1 1 2 1 0 0 −1/3 1/3 −2/3
0 −2 0 −1/2 −3/2 5/2 → 0 1 0 1/4 3/4 −5/4 ;
30
WEEK II. SYSTEMS OF LINEAR EQUATIONS 31
6 −4 0
4 −7 1 −2
1 −2 1 0
a) 0 1 −1 1
0 0 −2 −3
1 −2 1 0
b) 0 1 −1 1
0 0 −2 3
1 −2 1 0
c) 0 1 −1 1
0 0 −3 −2
1 −2 1 0
d) 0 1 −1 1
0 0 −3 2
0 0 1 2
1 2 0 0
b) 0 0 1 0
0 0 0 1
1 0 1 0
c) 0 1 0 0
0 0 1 1
1 0 0 0
d) 0 1 2 0
0 0 0 1
32
WEEK II. SYSTEMS OF LINEAR EQUATIONS 33
3 −7 −2
3. The inverse of −3 5 1 is
6 −4 0
4/3 2/3 1/2
a) 2 1 1/2
−3 −5 −1
2/3 1/2 4/3
b) 1 1/2 2
−3 −5 −1
2/3 4/3 1/2
c) 1 2 1/2
−5 −3 −1
2/3 4/3 1/2
d) 1 2 1/2
−3 −5 −1
3 −4 4 3 −4 4 0 0 1
To construct the elementary matrix M1 , the number two is placed in column-one, row-two.
This matrix multiplies the first row by two and adds the result to the second row.
The next step in row elimination is
−3 2 −1 −3 2 −1 1 0 0
0 −2 5 → 0 −2 5 = M2 M1 A, where M2 = 0 1 0 .
3 −4 4 0 −2 3 1 0 1
Here, to construct M2 the number one is placed in column-one, row-three, and the matrix
multiplies the first row by one and adds the result to the third row.
The last step in row elimination is
−3 2 −1 −3 2 −1 1 0 0
0 −2 5 → 0 −2 5 = M3 M2 M1 A, where M3 = 0 1 0 .
0 −2 3 0 0 −2 0 −1 1
where U is an upper triangular matrix. This discussion will be continued in the next
lecture.
34
WEEK II. SYSTEMS OF LINEAR EQUATIONS 35
In the last lecture, we have found that row reduction of a matrix A can be written as
M3 M2 M1 A = U,
Now, the matrix M1 multiples the first row by two and adds it to the second row. To invert
this operation, we simply need to multiply the first row by negative-two and add it to the
second row, so that
1 0 0 1 0 0
M1 = 2 1 0 , M1−1 = −2 1 0 .
0 0 1 0 0 1
Similarly,
1 0 0 1 0 0 1 0 0 1 0 0
M2 = 0 1 0 , M2−1 = 0 1 0 ; M3 = 0 1 0 , M3−1 = 0 1 0 .
1 0 1 −1 0 1 0 −1 1 0 1 1
Therefore,
L = M1−1 M2−1 M3−1
is given by
1 0 0 1 0 0 1 0 0 1 0 0
L = −2 1 0 0 1 0 0 1 0 = −2 1 0 ,
0 0 1 −1 0 1 0 1 1 −1 1 1
which is lower triangular. Also, the non-diagonal elements of the elementary inverse
matrices are simply combined to form L. Our LU decomposition of A is therefore
−3 2 −1 1 0 0 −3 2 −1
6 −6 7 = −2 1 0 0 −2 5 .
3 −4 4 −1 1 1 0 0 −2
36
WEEK II. SYSTEMS OF LINEAR EQUATIONS 37
6 −4 0
The LU decomposition is useful when one needs to solve Ax = b for many right-hand-
sides. With the LU decomposition in hand, one writes
(LU)x = L(Ux) = b,
and lets y = Ux. Then we solve Ly = b for y by forward substitution, and Ux = y for x by
backward substitution. It is possible to show that for large matrices, solving (LU)x = b is
substantially faster than solving Ax = b directly.
−1 1 1 0 0 −2 −6
−1 1 1 y3 −6
y1 = −1,
y2 = −7 + 2y1 = −9,
y3 = −6 + y1 − y2 = 2.
0 0 −2 x3 2
x3 = −1,
1
x2 = − (−9 − 5x3 ) = 2,
2
1
x1 = − (−1 − 2x2 + x3 ) = 2,
3
38
WEEK II. SYSTEMS OF LINEAR EQUATIONS 39
x3 −1
WEEK II. SYSTEMS OF LINEAR EQUATIONS 40
6 −4 0 2 −5 1 0 0 −1
2 1
41
WEEK II. SYSTEMS OF LINEAR EQUATIONS 42
3 −7 −2
2. Which of the following is the LU decomposition of −3 5 1?
6 −4 0
1 0 0 3 −7 −2
a) −1 1 0 0 −2 −1
2 −5 1/2 0 0 −2
1 0 0 3 −7 −2
b) −1 1 0 0 −2 −1
2 −5 1 0 0 −1
1 0 0 3 −7 −2
c) −1 2 −1 0 −1 −1
2 −10 6 0 0 −1
1 0 0 3 −7 −2
d) −1 1 0 0 −2 −1
4 −5 1 −6 14 3
1 0 0 3 −7 −2 1
3. Suppose L = −1 1 0, U = 0 −2 −1 and b = −1. Solve LUx = b
2 −5 1 0 0 −1 1
by letting y = Ux. The solutions for y and x are
−1 1/6
a) y = 0, x = 1/2
1 −1
1 −1/6
b) y = 0, x = −1/2
−1 1
1 1/6
c) y = 0, x = −1/2
−1 1
−1 −1/6
d) y = 0, x = 1/2
1 1
Vector Spaces
In this week’s lectures, we learn about vector spaces. A vector space consists of a set of vectors and
a set of scalars that is closed under vector addition and scalar multiplication and that satisfies the
usual rules of arithmetic. We will learn some of the vocabulary and phrases of linear algebra, such
as linear independence, span, basis and dimension. We will learn about the four fundamental sub-
spaces of a matrix, the Gram-Schmidt process, orthogonal projection, and the matrix formulation
of the least-squares problem of drawing a straight line to fit noisy data.
43
Lecture 16 | Vector spaces
View this lecture on YouTube
A vector space consists of a set of vectors and a set of scalars. Although vectors can be
quite general, for the purpose of this course we will only consider vectors that are real
column matrices, and scalars that are real numbers.
For the set of vectors and scalars to form a vector space, the set of vectors must be
closed under vector addition and scalar multiplication. That is, when you multiply any
two vectors in the set by real numbers and add them, the resulting vector must still be in
the set.
As an example, consider the set of vectors consisting of all three-by-one matrices, and
let u and v be two of these vectors. Let w = au + bv be the sum of these two vectors
multiplied by the real numbers a and b. If w is still a three-by-one matrix, then this set of
vectors is closed under scalar multiplication and vector addition, and is indeed a vector
space. The proof is rather simple. If we let
u1 v1
u = u2 , v = v2 ,
u3 v3
then
au1 + bv1
w = au + bv = au2 + bv2
au3 + bv3
44
WEEK III. VECTOR SPACES 45
2. Explain why the following sets of three-by-one matrices (with real number scalars) are
vector spaces:
(a) The set of three-by-one matrices with zero in the first row;
(b) The set of three-by-one matrices with first row equal to the second row;
(c) The set of three-by-one matrices with first row a constant multiple of the third row.
The vectors {u1 , u2 , . . . , un } are linearly independent if for any scalars c1 , c2 , . . . , cn , the
equation
c 1 u1 + c 2 u2 + · · · + c n u n = 0
has only the solution c1 = c2 = · · · = cn = 0. What this means is that one is unable to
write any of the vectors u1 , u2 , . . . , un as a linear combination of any of the other vectors.
For instance, if there was a solution to the above equation with c1 ̸= 0, then we could
solve that equation for u1 in terms of the other vectors with nonzero coefficients.
As an example consider whether the following three three-by-one column vectors are
linearly independent:
1 0 2
u = 0 , v = 1 , w = 3 .
0 0 0
Indeed, they are not linearly independent, that is, they are linearly dependent, because w
can be written in terms of u and v. In fact, w = 2u + 3v.
Now consider the three three-by-one column vectors given by
1 0 0
u = 0 , v = 1 , w = 0 .
0 0 1
These three vectors are linearly independent because you cannot write any one of these
vectors as a linear combination of the other two. If we go back to our definition of linear
independence, we can see that the equation
a 0
au + bv + cw = b = 0
c 0
46
WEEK III. VECTOR SPACES 47
−1
1 1
(b) 1 , −1 , 1
1 1 −1
0
1 1
(c) 1 , 0 , 1
0 1 1
Given a set of vectors, one can generate a vector space by forming all linear combina-
tions of that set of vectors. The span of the set of vectors {v1 , v2 , . . . , vn } is the vector
space consisting of all linear combinations of v1 , v2 , . . . , vn . We say that a set of vectors
spans a vector space.
For example, the set of vectors given by
1
0 2
0 , 1 , 3
0 0 0
spans the vector space of all three-by-one matrices with zero in the third row. This vector
space is a vector subspace of all three-by-one matrices.
One doesn’t need all three of these vectors to span this vector subspace because any
one of these vectors is linearly dependent on the other two. The smallest set of vectors
needed to span a vector space forms a basis for that vector space. Here, given the set of
vectors above, we can construct a basis for the vector subspace of all three-by-one matrices
with zero in the third row by simply choosing two out of three vectors from the above
spanning set. Three possible basis vectors are given by
1
0 1
2 0
2
0 1 ,
, 0 3 ,
, 1 3 .
,
0 0 0 0 0 0
Although all three combinations form a basis for the vector subspace, the first combination
is usually preferred because this is an orthonormal basis. The vectors in this basis are
mutually orthogonal and of unit norm.
The number of vectors in a basis gives the dimension of the vector space. Here, the
dimension of the vector space of all three-by-one matrices with zero in the third row is
two.
48
WEEK III. VECTOR SPACES 49
b) The set of three-by-one matrices with the sum of all rows equal to one.
c) The set of three-by-one matrices with the first row equal to the third row.
d) The set of three-by-one matrices with the first row equal to the sum of the second
and third rows.
2
1 4
b) 1 , −1 , 6
1 2 −2
1 0 1
c) 0 , 1 , −1
−1 −1 0
3
3 2
d) 2 , 1 , 1
1 2 0
50
WEEK III. VECTOR SPACES 51
3. Which one of the following is an orthonormal basis for the vector space of all three-by-
one matrices with the sum of all rows equal to zero?
1 1 −1
1
a) √ −1 , √ 1
2
2
0 0
1 1 1
1
b) √ −1 , √ 1
2
6
0 −2
1 1 1 0
1 1
c) √ −1 , √ 0 , √ 1
2
2 2
0 −1 −1
1 2 −1 −1
1 1
d) √ −1 , √ 2 , √ −1
6
6 6
−1 −1 2
Given any basis for a vector space, we can use an algorithm called the Gram-Schmidt
process to construct an orthonormal basis for that space. Let the vectors v1 , v2 , . . . , vn be
a basis for some n-dimensional vector space. We will assume here that these vectors are
column matrices, but this process also applies more generally.
We will construct an orthogonal basis u1 , u2 , . . . , un , and then normalize each vector
to obtain an orthonormal basis. First, define u1 = v1 . To find the next orthogonal basis
vector, define
(uT1 v2 )u1
u 2 = v2 − .
uT1 u1
Observe that u2 is equal to v2 minus the component of v2 that is parallel to u1 . By
multiplying both sides of this equation with uT1 , it is easy to see that uT1 u2 = 0 so that
these two vectors are orthogonal.
The next orthogonal vector in the new basis can be found from
Here, u3 is equal to v3 minus the components of v3 that are parallel to u1 and u2 . We can
continue in this fashion to construct n orthogonal basis vectors. These vectors can then be
normalized via
u1
b1 =
u , etc.
(uT1 u1 )1/2
Since uk is a linear combination of v1 , v2 , . . . , vk , the vector subspace spanned by the
first k basis vectors of the original vector space is the same as the subspace spanned by the
first k orthonormal vectors generated through the Gram-Schmidt process. We can write
this result as
span{u1 , u2 , . . . , uk } = span{v1 , v2 , . . . , vk }.
52
WEEK III. VECTOR SPACES 53
and construct an orthonormal basis for this subspace. Let u1 = v1 . Then u2 is found from
(uT1 v2 )u1
u 2 = v2 −
uT1 u1
0 1 −2
2 1
= 1 − 1 = 1 .
3 3
1 1 1
Notice that the initial two vectors v1 and v2 span the vector subspace of three-by-one
column matrices for which the second and third rows are equal. Clearly, the orthonormal
basis vectors constructed from the Gram-Schmidt process span the same subspace.
54
WEEK III. VECTOR SPACES 55
Use the Gram-Schmidt process to construct an orthonormal basis for this subspace.
2. Consider a subspace of all four-by-one column vectors with the following basis:
1
0 0
1 1 0
W=
,
, .
1
1 1
1
1 1
Use the Gram-Schmidt process to construct an orthonormal basis for this subspace.
is always perpendicular to
a) v1
b) v2
c) v3
d) v4
( ! !)
1 1
2. The Gram-Schmidt process applied to {v1 , v2 } = , results in
1 −1
( ! !)
1 1 1 1
a) {u b2 } = √
b1 , u , √
2 1 2 −1
( ! !)
1 1 0
b) {u b2 } = √
b1 , u ,
2 1 0
( ! !)
1 0
c) {u b2 } =
b1 , u ,
0 1
( ! !)
1 1 1 2
d) {u b2 } =
b1 , u √ , √
3 2 3 −1
56
WEEK III. VECTOR SPACES 57
1 0
3. The Gram-Schmidt process applied to {v1 , v2 } = 1 , 1 results in
−1 −1
1 1 0
1
a) {u b 2 } = √ 1 , √ 1
b1 , u
3
2
−1 1
1 1 −2
1
b) {u b 2 } = √ 1 , √ 1
b1 , u
3
6
−1 −1
1 1 1
1
c) {u b 2 } = √ 1 , √ −1
b1 , u
3 2
−1 0
1 1 1
1
d) {u b2 } =
b1 , u √ 1 , √ 0
3
2
−1 1
The null space of a matrix A, which we denote as Null(A), is the vector space spanned by
all column vectors x that satisfy the matrix equation
Ax = 0.
Clearly, if x and y are in the null space of A, then so is ax + by so that the null space is
closed under vector addition and scalar multiplication. If the matrix A is m-by-n, then
Null(A) is a vector subspace of all n-by-one column matrices. If A is a square invertible
matrix, then Null(A) consists of just the zero vector.
To find a basis for the null space of a noninvertible matrix, we bring A to reduced row
echelon form. We demonstrate by example. Consider the three-by-five matrix given by
−3 6 −1 1 −7
A = 1 −2 2 3 −1 .
2 −4 5 8 −4
By judiciously permuting rows to simplify the arithmetic, one pathway to construct rref(A)
is
−3 6 −1 −7
1 1 −2 2 3 −1
1 − 2 2 3 −1 → −3 6 −1 1 −7
2 −4 5 8 −4 2 −4 5 8 −4
1 −2 2 3 −1 1 −2 2 3 −1
→ 0 0 5 10 −10 → 0 0 1 2 −2
0 0 1 2 −2 0 0 5 10 −10
1 −2 0 −1 3
→ 0 0 1 2 −2 .
0 0 0 0 0
We call the variables associated with the pivot columns, x1 and x3 , basic variables, and the
variables associated with the non-pivot columns, x2 , x4 and x5 , free variables. Writing the
basic variables on the left-hand side of the Ax = 0 equations, we have from the first and
second rows
x1 = 2x2 + x4 − 3x5 ,
x3 = −2x4 + 2x5 .
58
WEEK III. VECTOR SPACES 59
Eliminating x1 and x3 , we can write the general solution for vectors in Null(A) as
2x2 + x4 − 3x5 2 1 −3
x2 1 0 0
−2x4 + 2x5 = x2 0 + x4 −2 + x5 2 ,
x4 0 1 0
x5 0 0 1
where the free variables x2 , x4 , and x5 can take any values. By writing the null space in
this form, a basis for Null(A) is made evident, and is given by
2 1 −3
1 0 0
0 , −2 , 2 .
0 1 0
0 0 1
1 0 1 1
2x1 + 2x2 + x3 = 0,
2x1 − 2x2 − x3 = 1,
The null space satisfying Au = 0 is determined from u1 = 0 and u2 = −u3 /2, and we can
write
0
Null(A) = span −1 .
2
61
WEEK III. VECTOR SPACES 62
The column space of a matrix is the vector space spanned by the columns of the ma-
trix. When a matrix is multiplied by a column vector, the resulting vector is in the column
space of the matrix, as can be seen from
! ! ! ! !
a b x ax + by a b
= =x +y .
c d y cx + dy c d
2 −4 5 8 −4 0 0 0 0 0
The matrix equation Ax = 0 expresses the linear dependence of the columns of A, and
row operations on A do not change the dependence relations. For example, the second
column of A above is −2 times the first column, and after several row operations, the
second column of rref(A) is still −2 times the first column.
It should be self-evident that only the pivot columns of rref(A) are linearly indepen-
dent, and the dimension of the column space of A is therefore equal to its number of
pivot columns; here it is two. A basis for the column space is given by the first and third
columns of A, (not rref(A)), and is
−3
−1
1 , 2 .
2 5
Recall that the dimension of the null space is the number of non-pivot columns—equal
to the number of free variables—so that the sum of the dimensions of the null space and
the column space is equal to the total number of columns. A statement of this theorem is
as follows. Let A be an m-by-n matrix. Then
dim(Col(A)) + dim(Null(A)) = n.
63
WEEK III. VECTOR SPACES 64
1 0 1 1
In addition to the column space and the null space, a matrix A has two more vector
spaces associated with it, namely the column space and null space of AT , which are called
the row space and the left null space.
If A is an m-by-n matrix, then the row space and the null space are subspaces of all
n-by-one column matrices, and the column space and the left null space are subspaces of
all m-by-one column matrices.
The null space consists of all vectors x such that Ax = 0, that is, the null space is the
set of all vectors that are orthogonal to the row space of A. We say that these two vector
spaces are orthogonal.
A basis for the row space of a matrix can be found from computing rref(A), and
is found to be rows of rref(A) (written as column vectors) with pivot columns. The
dimension of the row space of A is therefore equal to the number of pivot columns, while
the dimension of the null space of A is equal to the number of nonpivot columns. The
union of these two subspaces make up the vector space of all n-by-one matrices and we
say that these subspaces are orthogonal complements of each other.
Furthermore, the dimension of the column space of A is also equal to the number of
pivot columns, so that the dimensions of the column space and the row space of a matrix
are equal. We have
dim(Col(A)) = dim(Row(A)).
We call this dimension the rank of the matrix A. This is an amazing result since the
column space and row space are subspaces of two different vector spaces. In general, we
must have rank(A) ≤ min(m, n). When the equality holds, we say that the matrix is of
full rank. And when A is a square matrix and of full rank, then the dimension of the null
space is zero and A is invertible.
65
WEEK III. VECTOR SPACES 66
Check to see that null space is the orthogonal complement of the row space, and the
left null space is the orthogonal complement of the column space. Find rank(A). Is this
matrix of full rank?
3 6 1 1
−2 4
1 −2
a)
,
0
0
0 0
0
0
b)
0
0
0
0
c)
−3
2
−2
1
d)
0
0
67
WEEK III. VECTOR SPACES 68
x1 + 2x2 + x4 = 1,
2x1 + 4x2 + x3 + x4 = 1,
3x1 + 6x2 + x3 + x4 = 1,
is
0 −2
0 1
a) 0 + 0
a
1 0
−2 0
1 0
b) 0 + 0
a
0 1
0 0
0 0
c) 0 + −3
a
1 2
0 0
0 0
d) −3 + 0
a
2 1
1 2 0 1
3. What is the rank of the matrix 2 4 1 1?
3 6 1 1
a) 1
b) 2
c) 3
d) 4
Suppose that V is the n-dimensional vector space of all n-by-one matrices and W is a
p-dimensional subspace of V. Let {s1 , s2 , . . . , s p } be an orthonormal basis for W. Extend-
ing the basis for W, let {s1 , s2 , . . . , s p , t1 , t2 , . . . , tn− p } be an orthonormal basis for V.
Any vector v in V can be expanded using the basis for V as
v = a1 s1 + a2 s2 + · · · + a p s p + b1 t1 + b2 t2 + bn− p tn− p ,
where the a’s and b’s are scalar coefficients. The orthogonal projection of v onto W is then
defined as
vprojW = a1 s1 + a2 s2 + · · · + a p s p ,
w = c1 s1 + c2 s2 + · · · + c p s p .
The distance between v and w is given by the norm ||v − w||, and we have
or ||v − vprojW || ≤ ||v − w||, a result that will be used later in the problem of least squares.
69
WEEK III. VECTOR SPACES 70
c
1
0 1 0
span 1 , 1 . What are the projections when v = 0 and when v = 1?
1 1 0 0
Suppose there is some experimental data that you want to fit by a straight line. This
is called a linear regression problem and an illustrative example is shown below.
y1 = β 0 + β 1 x1 , y2 = β 0 + β 1 x2 , ..., yn = β 0 + β 1 xn .
These equations constitute a system of n equations in the two unknowns β 0 and β 1 . The
corresponding matrix equation is given by
1 x1 y1
!
1 x2 y2
β0
.
. ..
=
.. .
. . β1 .
1 xn yn
71
WEEK III. VECTOR SPACES 72
AT Ax = AT b.
A unique solution to this matrix equation exists when the columns of A are linearly
independent.
An interesting formula exists for the matrix which projects b onto the column space of
A. Multiplying the normal equations on the left by A(AT A)−1 , we obtain
Notice that the projection matrix P = A(AT A)−1 AT satisfies P2 = P, that is, two projec-
tions is the same as one. If A itself is a square invertible matrix, then P = I and b is
already in the column space of A.
As an example of the application of the normal equations, consider the toy least-
squares problem of fitting a line through the three data points (1, 1), (2, 3) and (3, 2).
With the line given by y = β 0 + β 1 x, the overdetermined system of equations is given by
1 1 ! 1
β0
1 2 = 3 .
β1
1 3 2
or ! ! !
3 6 β0 6
= .
6 14 β1 13
We can use Gaussian elimination to determine β 0 = 1 and β 1 = 1/2, and the least-squares
line is given by y = 1 + x/2. The graph of the data and the line is shown on the next page.
73
WEEK III. VECTOR SPACES 74
1 2 3
1
1
a) 1
3
−2
−1
1
b) −1
3
2
2
1
c) −1
3
−1
−2
1
d) 1
3
1
2. Suppose we have data points given by ( xn , yn ) = (1, 1), (2, 1) and (3, 3). If the data is
to be fit by the line y = β 0 + β 1 x, which is the overdetermined equation for β 0 and β 1 ?
1 1 ! 1
β0
a) 1 1 = 2
β1
3 1 3
1 1 ! 1
β0
b) 2 1 = 1
β1
3 1 3
1 1 ! 1
β0
c) 1 1 = 2
β1
1 3 3
1 1 ! 1
β0
d) 1 2 = 1
β1
1 3 3
76
WEEK III. VECTOR SPACES 77
3. Suppose we have data points given by ( xn , yn ) = (1, 1), (2, 1) and (3, 3). Which is the
best fit line to the data?
1
a) y = +x
3
1
b) y = − + x
3
1
c) y = 1 + x
3
1
d) y = 1 − x
3
Solutions to the Practice quiz
Week IV
In this week’s lectures, we will learn about determinants and the eigenvalue problem. We will learn
how to compute determinants using a Laplace expansion, the Leibniz formula, or by row or column
elimination. We will formulate the eigenvalue problem and learn how to find the eigenvalues and
eigenvectors of a matrix. We will learn how to diagonalize a matrix using its eigenvalues and
eigenvectors, and how this leads to an easy calculation of a matrix raised to a power.
78
Lecture 28 | Two-by-two and three-
by-three determinants
View this lecture on YouTube
If A is invertible, then the equation Ax = b has the unique solution x = A−1 b. But if A
is not invertible, then Ax = b may have no solution or an infinite number of solutions.
When det A = 0, we say that the matrix A is singular.
It is also straightforward to define the determinant for a three-by-three matrix. We
consider the system of equations Ax = 0 and determine the condition for which x = 0 is
the only solution. With
a b c x1
d e f x2 = 0,
g h i x3
one can do the messy algebra of elimination to solve for x1 , x2 , and x3 . One finds that
x1 = x2 = x3 = 0 is the only solution when det A ̸= 0, where the definition, apart from a
constant, is given by
An easy way to remember this result is to mentally draw the following picture:
a b c a b a b c a b
d e f d e
— d e f d e
g h i g h g h i g h
.
The matrix A is periodically extended two columns to the right, drawn explicitly here but
usually only imagined. Then the six terms comprising the determinant are made evident,
with the lines slanting down towards the right getting the plus signs and the lines slanting
down towards the left getting the minus signs. Unfortunately, this mnemonic only works
for three-by-three matrices.
79
WEEK IV. EIGENVALUES AND EIGENVECTORS 80
2. Show that the three-by-three determinant changes sign when the first two rows are
interchanged.
We first expand in minors down the second column. The only nonzero contribution comes
from the two in the third row, and we cross out the second column and third row (and
multiply by a minus sign) to obtain a three-by-three determinant:
81
WEEK IV. EIGENVALUES AND EIGENVECTORS 82
1 0 0 −1
1 0 −1
3 0 0 5
= − 2 3 0 5 .
2 2 4 −3
1
5 0
1 0 5 0
We then again expand in minors down the second column. The only nonzero contribution
comes from the five in the third row, and we cross out the second column and third
row (and mutiply by a minus sign) to obtain a two-by-two determinant, which we then
compute:
1
0 −1
1 −1
−2 3 0 5 = 10 = 80.
3 5
1 5 0
The trick here is to expand by minors across the row or column containing the most zeros.
WEEK IV. EIGENVALUES AND EIGENVECTORS 83
4 0 3 2 0
Another way to generalize the three-by-three determinant is called the Leibniz formula,
or more descriptively, the big formula. The three-by-three determinant can be written as
a b c
d e f = aei − a f h + b f g − bdi + cdh − ceg,
g h i
where each term in the formula contains a single element from each row and from each
column. For example, to obtain the third term b f g, b comes from the first row and second
column, f comes from the second row and third column, and g comes from the third row
and first column. As we can choose one of three elements from the first row, then one of
two elements from the second row, and only one element from the third row, there are
3! = 6 terms in the formula, and the general n-by-n matrix without any zero entries will
have n! terms.
The sign of each term depends on whether the choice of columns as we go down the
rows is an even or odd permutation of the columns ordered as {1, 2, 3, . . . , n}. An even
permutation is when columns are interchanged an even number of times, and an odd
permutation is when they are interchanged an odd number of times. Even permutations
get a plus sign and odd permutations get a minus sign.
For the determinant of the three-by-three matrix, the plus terms aei, b f g, and cdh
correspond to the column orderings {1, 2, 3}, {2, 3, 1} and {3, 1, 2}, which are even per-
mutations of {1, 2, 3}, and the minus terms a f h, bdi, and ceg correspond to the column
orderings {1, 3, 2}, {2, 1, 3} and {3, 2, 1}, which are odd permutations.
84
WEEK IV. EIGENVALUES AND EIGENVECTORS 85
The determinant is a function that maps a square matrix to a scalar. It is uniquely defined
by the following three properties:
Property 1: The determinant of the identity matrix is one;
Property 2: The determinant changes sign under row interchange;
Property 3: The determinant is a linear function of the first row, holding all other rows
fixed.
Using two-by-two matrices, the first two properties are illustrated by
1 0 a b c d
=1 and = − ;
0 1 c d a b
Both the Laplace expansion and Leibniz formula for the determinant can be proved from
these three properties. Other useful properties of the determinant can also be proved:
∙ The determinant is a linear function of any row, holding all other rows fixed;
∙ The determinant of an upper or lower triangular matrix is the product of the diago-
nal elements;
∙ The determinant of the product of two matrices is equal to the product of the deter-
minants;
∙ The determinant of the inverse matrix is equal to the reciprical of the determinant;
86
WEEK IV. EIGENVALUES AND EIGENVECTORS 87
Notably, these properties imply that Gaussian elimination, done on rows or columns
or both, can be used to simplify the computation of a determinant. Row interchanges
and multiplication of a row by a constant change the determinant and must be treated
correctly.
WEEK IV. EIGENVALUES AND EIGENVECTORS 88
2. Using the defining properties of a determinant, prove that the determinant is a linear
function of any row, holding all other rows fixed.
3. Using the results of the above problems, prove that if we add k times row-i to row-j,
the determinant doesn’t change.
0 −1 1
−3 3 3 0 −2
a) 48
b) 42
c) −42
d) −48
a e 0 0
b f g 0
2. The determinant of
c
is equal to
0 h i
d 0 0 j
c) agij − beij + ce f j − de f h
d) agij + beij − ce f j − de f h
3. Assume A and B are invertible n-by-n matrices. Which of the following identities is
false?
b) det AT = det A
89
Lecture 32 | The eigenvalue
problem
View this lecture on YouTube
Let A be a square matrix, x a column vector, and λ a scalar. The eigenvalue problem
for A solves
Ax = λx
for eigenvalues λi with corresponding eigenvectors xi . Making use of the identity matrix
I, the eigenvalue problem can be rewritten as
(A − λI)x = 0,
where the matrix (A − λI) is just the matrix A with λ subtracted from its diagonal. For
there to be nonzero eigenvectors, the matrix (A − λI) must be singular, that is,
det (A − λI) = 0.
This equation is called the characteristic equation of the matrix A. From the Leibniz formula,
the characteristic equation of an n-by-n matrix is an n-th order polynomial equation in λ.
For each found λi , a corresponding eigenvector xi can be determined directly by solving
(A − λi I)x = 0 for x.
For illustration, we compute the eigenvalues of a general two-by-two matrix. We have
a−λ b
0 = det (A − λI) = = ( a − λ)(d − λ) − bc = λ2 − ( a + d)λ + ( ad − bc);
c d−λ
λ2 − Tr A λ + det A = 0,
90
WEEK IV. EIGENVALUES AND EIGENVECTORS 91
We compute here the two real eigenvalues and eigenvectors ! of a two-by-two matrix.
0 1
Example: Find the eigenvalues and eigenvectors of A = .
1 0
The characteristic equation of A is given by
λ2 − 1 = 0,
with solutions λ1 = 1 and λ2 = −1. The first eigenvector is found by solving (A − λ1 I)x =
0, or ! !
−1 1 x1
= 0.
1 −1 x2
The equation from the second row is just a constant multiple of the equation from the first
row and this will always be the case for two-by-two matrices. From the first row, say, we
find x2 = x1 . The second eigenvector is found by solving (A − λ2 I)x = 0, or
! !
1 1 x1
= 0,
1 1 x2
92
WEEK IV. EIGENVALUES AND EIGENVECTORS 93
0 1 2
λ2 = 0,
so that there is a degenerate eigenvalue of zero. The eigenvector associated with the zero
eigenvalue is found from Bx = 0 and has zero second component. This matrix therefore
has only one eigenvalue and eigenvector, given by
!
1
λ = 0, x = .
0
!
0 −1
Example: Find the eigenvalues of C = .
1 0
The characteristic equation of C is given by
λ2 + 1 = 0,
which has the imaginary solutions λ = ±i. Matrices with complex eigenvalues play an
important role in the theory of linear differential equations.
94
WEEK IV. EIGENVALUES AND EIGENVECTORS 95
a) 1 ± 3i
√
b) 1 ± 3
√
c) 3 3 ± 1
d) 3 ± i
96
WEEK IV. EIGENVALUES AND EIGENVECTORS 97
2 1 0
3. Which of the following is an eigenvector of 1 2 1?
0 1 2
1
a) 0
1
1
√
b) 2
1
0
c) 1
0
√
2
d) 1
√
2
For concreteness, consider a two-by-two matrix A with eigenvalues and eigenvectors given
by ! !
x11 x12
λ 1 , x1 = ; λ 2 , x2 = .
x21 x22
Generalizing, we define S to be the matrix whose columns are the eigenvectors of A, and
Λ to be the diagonal matrix with eigenvalues down the diagonal. Then for any n-by-n
matrix with n linearly independent eigenvectors, we have
AS = SΛ,
where S is an invertible matrix. Multiplying both sides on the right or the left by S−1 , we
derive the relations
A = SΛS−1 or Λ = S−1 AS.
To remember the order of the S and S−1 matrices in these formulas, just remember that A
should be multiplied on the right by the eigenvectors placed in the columns of S.
98
WEEK IV. EIGENVALUES AND EIGENVECTORS 99
2. Prove that if the columns of an n-by-n matrix are linearly independent, then the matrix
is invertible. (An n-by-n matrix whose columns are eigenvectors corresponding to distinct
eigenvalues is therefore invertible.)
!
a b
Example: Diagonalize the matrix A = .
b a
The eigenvalues of A are determined from
a − λ b
det(A − λI) = = ( a − λ)2 − b2 = 0.
b a − λ
Solving for λ, the two eigenvalues are given by λ1 = a + b and λ2 = a − b. The corre-
sponding eigenvector for λ1 is found from (A − λ1 I)x1 = 0, or
! ! !
−b b x11 0
= ;
b −b x21 0
Solving for the eigenvectors and normalizing them, the eigenvalues and eigenvectors are
given by
! !
1 1 1 1
λ1 = a + b, x1 = √ ; λ2 = a − b, x2 = √ .
2 1 2 −1
The matrix S of eigenvectors can be seen to be orthogonal so that S−1 = ST . We then have
!
1 1 1
S= √ and S−1 = ST = S;
2 1 −1
100
WEEK IV. EIGENVALUES AND EIGENVECTORS 101
0 1 2
Diagonalizing a matrix facilitates finding powers of that matrix. Suppose that A is di-
agonalizable, and consider
In general, Λ p has the eigenvalues raised to the power of p down the diagonal, and
A p = SΛ p S−1 .
102
WEEK IV. EIGENVALUES AND EIGENVECTORS 103
1 2 1
ex = 1 + x + x + x3 + . . . .
2! 3!
1 2 1
eA = I + A + A + A3 + . . . .
2! 3!
where
e λ1 0 ... 0
0 e λ2 ... 0
Λ
e = .. .. .. .. .
.
. . .
0 0 ... eλn
104
WEEK IV. EIGENVALUES AND EIGENVECTORS 105
106
Problem and practice quiz
solutions
107
PROBLEM AND PRACTICE QUIZ SOLUTIONS 108
1.
1 0 0
1 0 0 1 0 0 0
0 1 0
a) 0 1 0 b) 0 1 0 0 c)
0 0 1
0 0 1 0 0 1 0
0 0 0
!
4 7
2. AB = AC = .
8 14
2 3 4 2 2 2
3. AD = 2 6 12 , DA = 3 6 9 .
2 9 16 4 12 16
n n p p n p
4. [A(BC)]ij = ∑ aik [BC]kj = ∑ ∑ aik bkl clj = ∑∑ aik bkl clj = ∑ [AB]il clj = [(AB)C)]ij .
k =1 k =1 l =1 l =1 k =1 l =1
2. Let A be an m-by-p diagonal matrix, B a p-by-n diagonal matrix, and let C = AB. The
ij element of C is given by
p
cij = ∑ aik bkj .
k =1
Since A is a diagonal matrix, the only nonzero term in the sum is k = i and we have
cij = aii bij . And since B is a diagonal matrix, the only nonzero elements of C are the
diagonal elements cii = aii bii .
3. Let A and B be n-by-n upper triangular matrices, and let C = AB. The ij element of C
is given by
n
cij = ∑ aik bkj .
k =1
Since A and B are upper triangular, we have aik = 0 when k < i and bkj = 0 when k > j.
Excluding the zero terms from the summation, we have
j
cij = ∑ aik bkj ,
k =i
which is equal to zero when i > j proving that C is upper triangular. Furthermore,
cii = aii bii .
1. d. With aij = i − j, we have a11 = a22 = 0, a12 = −1, and a21 = 1. Therefore
!
0 −1
A= .
1 0
! ! !
1−1 −1 1 −2 2
2. a. =
−1 1 1 −1 2 −2
3. b. For upper triangular matrices A and B, aik = 0 when k < i and bkj = 0 when k > j.
p p
cTij = c ji = ∑ a jk bki = ∑ bikT aTkj .
k =1 k =1
2. The square matrix A + AT is symmetric, and the square matrix A − AT is skew sym-
metric. Using these two matrices, we can write
1 1
A= A + AT + A − AT .
2 2
(AT A)T = AT A.
1.
! a d !
T a b c a2 + b2 + c2 ad + be + c f
A A= b e = .
d e f ad + be + c f d2 + e2 + f 2
c f
n n m m n
Tr(AT A) = ∑ (AT A) jj = ∑ ∑ aTji aij = ∑ ∑ a2ij ,
j =1 j =1 i =1 i =1 j =1
(AB)−1 (AB) = I.
Taking the transpose of both sides of these two equations, using both IT = I and (AB)T =
BT AT , we obtain
4. Let A be an invertible matrix, and suppose B and C are its inverse. To prove that B = C,
we write
B = BI = B(AC) = (BA)C = C.
We can use the Pythagorean theorem and the Law of Cosines to determine sin θ. We have
p
sin θ = 1 − cos2 θ,
and p p
( a − c )2 + ( b − d )2 = ( a2 + b2 ) + ( c2 + d2 ) − 2 a2 + b2 c2 + d2 cos θ.
( ac + bd)
cos θ = √ √ ,
a2 + b2 c2 + d2
so that p | ad − bc|
sin θ = 1 − cos2 θ = √ √ .
a2 + b2 c2 + d2
PROBLEM AND PRACTICE QUIZ SOLUTIONS 112
3. a. Exchange the diagonal elements, negate the off-diagonal elements, and divide by the
! −1 !
2 2 1 2 −2
determinant. We have = .
1 2 2 −1 2
2. The z-coordinate stays fixed, and the vector rotates an angle θ in the x-y plane. There-
fore,
cos θ − sin θ 0
Rz = sin θ cos θ 0 .
0 0 1
PROBLEM AND PRACTICE QUIZ SOLUTIONS 113
1.
1 0 0 1 0 0 0 1 0
P123 = 0 1 0 , P132 = 0 0 1 , P213 = 1 0 0 ,
0 0 1 0 1 0 0 0 1
0 1 0 0 0 1 0 0 1
P231 = 0 0 1 , P312 = 1 0 0 , P321 = 0 1 0 .
1 0 0 0 1 0 1 0 0
2.
−1 −1 −1 −1
P123 = P123 , P132 = P132 , P213 = P213 , P321 = P321 ,
−1 −1
P231 = P312 , P312 = P231 .
The matrices that are their own inverses correspond to either no permutation or a single
permutation of rows (or columns), e.g., {1, 3, 2}, which permutes row (column) two and
three. The matrices that are not their own inverses correspond to two permutations,
e.g., {2, 3, 1}, which permutes row (column) one and two, and then two and three. For
example, commuting rows by left multiplication, we have
−1 −1 −1
P231 = P213 P132 = P213 P132 .
−1
Because matrices in general do not commute, P231 ̸= P231 . Note also that the permutation
matrices are orthogonal, so that the inverse matrices are equal to the transpose matrices.
Therefore, only the symmetric permutation matrices can be their own inverses.
1. d. An orthogonal!matrix has orthonormal rows and columns. The rows and columns of
1 −1
the matrix are not orthonormal and therefore this matrix is not an orthogonal
0 0
matrix.
0 sin θ cos θ
PROBLEM AND PRACTICE QUIZ SOLUTIONS 114
1 0 0 0 0 1
3. b. Interchange the rows of the identity matrix: 0 1 0 → 1 0 0.
0 0 1 0 1 0
1.
6 −4 0 2 0 10 4 16 0 0 −1 6
x3 = −6,
1
x2 = − ( x3 − 2) = 4,
2
1
x1 = (7x2 + 2x3 − 7) = 3.
3
x3 −6
2 −5 5 1 0 −1 −1 −1 0 0 1 −1
x3 = −1,
x2 = −2x3 = 2,
x1 = 2x2 − 3x3 + 1 = 8.
x3 −1
PROBLEM AND PRACTICE QUIZ SOLUTIONS 115
1.
6 −4 0 2 0 10 4 16 0 0 −1 6
3 0 0 9 1 0 0 3
→ 0 −2 0 −8 → 0 1 0 4 .
0 0 −1 6 0 0 1 −6
3 6 2 0 0 −1 0 0 −1 0 0 0
1.
3−7 −2 1 0 0 3−7 −2 1 0 0
− 3 5 1 0 1 0 → 0 −2 −1 1 1 0 →
6 −4 0 0 0 1 0 10 4 −2 0 1
3 0 3/2 −5/2 −7/2 0 3 0 3/2 −5/2 −7/2 0
0 −2 −1 1 1 0 → 0 −2 −1 1 1 0
0 0 −1 3 5 1 0 0 −1 3 5 1
1 0 0 2/3 4/3 1/2
→ 0 1 0 1 2 1/2 .
0 0 1 −3 −5 −1
Therefore,
−1
3−7 −2 2/3 4/3 1/2
−3 5 1 = 1 2 1/2 .
6 −4 0 −3 −5 −1
PROBLEM AND PRACTICE QUIZ SOLUTIONS 116
4 −7 1 −2 0 1 −3 −2 0 1 −3 −2
1 −2 1 0
→ 0 1 −1 1.
0 0 −2 −3
2. c. A matrix in reduced row echelon form has all its pivots equal to one, and all the
entries above and below the pivots eliminated. The only matrix that is not in reduced row
1 0 1 0
echelon form is 0 1 0 0. The pivot in the third row, third column has a one above
0 0 1 1
it in the first row, third column.
3. d. There are many ways to do this computation by hand, and here is one way:
3 −7 −2 1 0 0 3 −7 −2 1 0 0 3 −7 −2 1 0 0
−3 5 1 0 1 0 → 0 −2 −1 1 1 0 → 0 −2 −1 1 1 0
6 −4 0 0 0 10
0 1 4 −2 0 1 0 0 −1 3 5 1
3 −7 0 −5 −10 −2 3 −7 0 −5 −10 −2
→ 0 −2 −1 1 1 0 → 0 −2 0 −2 −4 −1
0 0 1 −3 −5 −1 0 0 1 −3 −5 −1
3 0 0 2 4 3/2 1 0 0 2/3 4/3 1/2
→ 0 1 0 1 2 1/2 → 0 1 0 1 2 1/2 .
0 0 1 −3 −5 −1 0 0 1 −3 −5 −1
−1
3 −7 −2 2/3 4/3 1/2
Therefore, −3 5 1 = 1 2 1/2.
6 −4 0 −3 −5 −1
1.
1 0 0 0
0 1 0 0
M=
0
.
0 1 0
0 2 0 1
1.
3−7 −2 3 −7 −2 1 0 0 3 −7 −2
−3 5 1 → 0 −2 −1 = 1 1 0 −3 5 1
6 −4 0 6 −4 0 0 0 1 6 −4 0
PROBLEM AND PRACTICE QUIZ SOLUTIONS 117
3 −7 −2 3 −7 −2 1 0 0 3 −7 −2
0 −2 −1 → 0 −2 −1 = 0 1 0 0 −2 −1
6 −4 0 0 10 4 −2 0 1 6 −4 0
3 −7 −2 3 −7 −2 1 0 0 3 −7 −2
0 −2 −1 → 0 −2 −1 = 0 1 0 0 −2 −1
0 10 4 0 0 −1 0 5 1 0 10 4
Therefore,
3−7 −2 1 0 0 −7 −2
3
−3 5 1 = −1 1 0 0 −2 −1 .
6 −4 0 2 −5 1 0 0 −1
1. We know
3−7 −2 1 0 0 −7 −2
3
A = −3 5 1 −1
= 1 0 0 −2 −1 = LU.
6 −4 0 2 −5 1 0 0 −1
To solve LUx = b, we let y = Ux, solve Ly = b for y, and then solve Ux = y for x.
(a)
−3
b = 3
y1 = −3,
−y1 + y2 = 3,
2y1 − 5y2 + y3 = 2,
(b)
1
b = −1
1
PROBLEM AND PRACTICE QUIZ SOLUTIONS 118
y1 = 1,
−y1 + y2 = −1,
2y1 − 5y2 + y3 = 1,
1. c. Start with the identity matrix. In the third row (changed row) and second column
1 0 0 0
0 1 0 0
(the row which is multiplied by 2) , place a 2. The elementary matrix is
.
0 2 1 0
0 0 0 1
2. b.
$$A = \begin{pmatrix} 3 & -7 & -2 \\ -3 & 5 & 1 \\ 6 & -4 & 0 \end{pmatrix} \to \begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 6 & -4 & 0 \end{pmatrix} = M_1 A, \quad \text{where } M_1 = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix};$$
$$\begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 6 & -4 & 0 \end{pmatrix} \to \begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 10 & 4 \end{pmatrix} = M_2 M_1 A, \quad \text{where } M_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix};$$
$$\begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 10 & 4 \end{pmatrix} \to \begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{pmatrix} = M_3 M_2 M_1 A, \quad \text{where } M_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 5 & 1 \end{pmatrix}.$$
Therefore,
$$A = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -5 & 1 \end{pmatrix} \begin{pmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{pmatrix}.$$
3. b. To solve $LUx = b$, let $y = Ux$. Then solve $Ly = b$ for $y$ and $Ux = y$ for $x$. The
equations $Ly = b$ are
$$y_1 = 1, \quad -y_1 + y_2 = -1, \quad 2y_1 - 5y_2 + y_3 = 1.$$
Solution by forward substitution gives $y = (1,\, 0,\, -1)^{\mathrm{T}}$. The equations given by $Ux = y$ are
$$3x_1 - 7x_2 - 2x_3 = 1, \quad -2x_2 - x_3 = 0, \quad -x_3 = -1,$$
and back substitution gives $x = (-1/6,\, -1/2,\, 1)^{\mathrm{T}}$.
2. In all of the examples, the vector spaces are closed under scalar multiplication and
vector addition.
(a) We place the vectors as the rows of a matrix and compute the reduced row echelon form.
Because the reduced row echelon form is the identity matrix, the vectors are linearly
independent.
(b) We place the vectors as the rows of a matrix and compute the reduced row echelon
form:
$$\begin{pmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & -1 \\ 0 & 0 & 2 \\ 0 & 2 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Because the reduced row echelon form is the identity matrix, the vectors are linearly
independent.
(c) By inspection,
$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},$$
so the vectors are linearly dependent.
1. b. A vector space must be closed under vector addition and scalar multiplication. The
set of three-by-one matrices with the sum of all rows equal to one is not a closed set. For
example, k times a vector whose sum of all rows is equal to one results in a vector whose
sum of all rows is equal to k.
2. For the other sets, a nontrivial linear combination of the three matrices equal to the zero
matrix can be found, so that these sets of three matrices are linearly dependent. The remaining set of vectors
can be put in the rows of a matrix and the reduced row echelon form can be computed:
$$\begin{pmatrix} 3 & 2 & 1 \\ 3 & 1 & 2 \\ 2 & 1 & 0 \end{pmatrix} \to \begin{pmatrix} 3 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & -1/3 & -2/3 \end{pmatrix} \to \begin{pmatrix} 3 & 0 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Because the reduced row echelon form is the identity matrix, the vectors are linearly
independent.
3. b. Since a three-by-one matrix has three degrees of freedom, and the constraint that the
sum of all rows equals zero eliminates one degree of freedom, the basis should consist of
two vectors. We can arbitrarily take the first unnormalized vector to be $(1,\, -1,\, 0)^{\mathrm{T}}$. The
vector orthogonal to this first vector with sum of all rows equal to zero is $(1,\, 1,\, -2)^{\mathrm{T}}$.
Normalizing both of these vectors, we get
$$\left\{ \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}, \; \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix} \right\}.$$
1.
$$u_4 = v_4 - \frac{(u_1^{\mathrm{T}} v_4)\, u_1}{u_1^{\mathrm{T}} u_1} - \frac{(u_2^{\mathrm{T}} v_4)\, u_2}{u_2^{\mathrm{T}} u_2} - \frac{(u_3^{\mathrm{T}} v_4)\, u_3}{u_3^{\mathrm{T}} u_3}.$$
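The same project-and-subtract recipe is easy to express in code. A minimal sketch of the classical (unnormalized) Gram-Schmidt process in Python, assuming NumPy; the helper name gram_schmidt is chosen here for illustration:

import numpy as np

def gram_schmidt(vs):
    # u_k = v_k - sum over previous i of (u_i^T v_k) u_i / (u_i^T u_i)
    us = []
    for v in vs:
        v = np.asarray(v, dtype=float)
        u = v - sum(((ui @ v) / (ui @ ui)) * ui for ui in us)
        us.append(u)
    return us

# The first problem below: v1 = (0, 1, -1), v2 = (1, 1, -1).
u1, u2 = gram_schmidt([[0, 1, -1], [1, 1, -1]])
assert np.allclose(u2, [1, 0, 0])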
1. Define
$$\{v_1, v_2\} = \left\{ \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} \right\}.$$
With $u_1 = v_1$, the Gram-Schmidt process gives
$$u_2 = v_2 - \frac{(u_1^{\mathrm{T}} v_2)\, u_1}{u_1^{\mathrm{T}} u_1} = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$
2. Define
$$\{v_1, v_2, v_3\} = \left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix} \right\}.$$
With $u_1 = v_1$, the Gram-Schmidt process gives
$$u_2 = v_2 - \frac{(u_1^{\mathrm{T}} v_2)\, u_1}{u_1^{\mathrm{T}} u_1} = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \end{pmatrix} - \frac{3}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} -3 \\ 1 \\ 1 \\ 1 \end{pmatrix};$$
continuing the process,
$$u_3 = v_3 - \frac{(u_1^{\mathrm{T}} v_3)\, u_1}{u_1^{\mathrm{T}} u_1} - \frac{(u_2^{\mathrm{T}} v_3)\, u_2}{u_2^{\mathrm{T}} u_2} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} - \frac{1}{6} \begin{pmatrix} -3 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 0 \\ -2 \\ 1 \\ 1 \end{pmatrix}.$$
2. a. Since the vectors are already orthogonal, we need only normalize them to find
$$\{\hat{u}_1, \hat{u}_2\} = \left\{ \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.$$
3. b. Let $u_1 = v_1 = (1,\, 1,\, -1)^{\mathrm{T}}$. Then,
$$u_2 = v_2 - \frac{(u_1^{\mathrm{T}} v_2)\, u_1}{u_1^{\mathrm{T}} u_1} = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} - \frac{2}{3} \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} -2 \\ 1 \\ -1 \end{pmatrix}.$$
1. We bring the matrix to reduced row echelon form:
$$\begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{pmatrix}.$$
The equation $Ax = 0$ with the pivot variables on the left-hand sides is given by
$$x_1 = -2x_4, \quad x_2 = x_4, \quad x_3 = x_4,$$
and a general vector in the null space can be written as
$$x = x_4 \begin{pmatrix} -2 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$
A basis for the null space is therefore given by the single vector $(-2,\, 1,\, 1,\, 1)^{\mathrm{T}}$.
1.
We form the augmented matrix and bring the first four columns to reduced row echelon
form:
$$\begin{pmatrix} -3 & 6 & -1 & 1 & -7 \\ 1 & -2 & 2 & 3 & -1 \\ 2 & -4 & 5 & 8 & -4 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 0 & -1 & 3 \\ 0 & 0 & 1 & 2 & -2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
The null space is found from the first four columns solving $Au = 0$, and writing the basic
variables on the left-hand side, we have the system
$$u_1 = 2u_2 + u_4, \quad u_3 = -2u_4;$$
from which we can write the general form of the null space as
$$\begin{pmatrix} 2u_2 + u_4 \\ u_2 \\ -2u_4 \\ u_4 \end{pmatrix} = u_2 \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \end{pmatrix} + u_4 \begin{pmatrix} 1 \\ 0 \\ -2 \\ 1 \end{pmatrix}.$$
The free variables $x_2$ and $x_4$ can be set to zero, and the particular solution is determined to
be $x_1 = 3$ and $x_3 = -2$. The general solution to the underdetermined system of equations
is therefore given by
$$x = a \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \end{pmatrix} + b \begin{pmatrix} 1 \\ 0 \\ -2 \\ 1 \end{pmatrix} + \begin{pmatrix} 3 \\ 0 \\ -2 \\ 0 \end{pmatrix}.$$
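A numerical check that these vectors do solve the system (a sketch, assuming NumPy; the right-hand side $b = (-7,\, -1,\, -4)^{\mathrm{T}}$ is read off the augmented matrix above):

import numpy as np

A = np.array([[-3., 6., -1., 1.],
              [1., -2., 2., 3.],
              [2., -4., 5., 8.]])
b = np.array([-7., -1., -4.])

xp = np.array([3., 0., -2., 0.])   # particular solution
n1 = np.array([2., 1., 0., 0.])    # null space basis vectors
n2 = np.array([1., 0., -2., 1.])

assert np.allclose(A @ xp, b)
assert np.allclose(A @ n1, 0) and np.allclose(A @ n2, 0)
assert np.allclose(A @ (xp + 2 * n1 - 3 * n2), b)   # any a, b work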
1. We find
$$A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}, \quad \mathrm{rref}(A) = \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{pmatrix},$$
and $\dim(\mathrm{Col}(A)) = 3$, with a basis for the column space given by the first three columns
of $A$.
1. We bring $A^{\mathrm{T}}$ to reduced row echelon form; its third and fourth rows, $(1, -1, 1, -1)$ and
$(2, 1, 1, -3)$, reduce to rows of zeros.
Columns one, two, and four are pivot columns of $A$ and columns one, two, and three are
pivot columns of $A^{\mathrm{T}}$. Therefore, the column space of $A$ is given by
$$\mathrm{Col}(A) = \mathrm{span}\left\{ \begin{pmatrix} 2 \\ -1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 \\ -1 \\ 2 \\ -2 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} \right\}.$$
Solving $Ax = 0$ for the null space, the pivot variables are given in terms of the free variables by
$$x_1 = -x_3 + x_5, \quad x_2 = x_3 - 2x_5, \quad x_4 = 2x_5.$$
It can be checked that the null space is the orthogonal complement of the row space and
the left null space is the orthogonal complement of the column space. We have $\mathrm{rank}(A) = 3$,
and $A$ is not of full rank.
1. d. To find the null space of a matrix, bring it to reduced row echelon form. We have
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 4 & 1 & 1 \\ 3 & 6 & 1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
With $x_1 = -2x_2$, $x_3 = 0$, and $x_4 = 0$, a basis for the null space is $(-2,\, 1,\, 0,\, 0)^{\mathrm{T}}$.
2. We form the augmented matrix and bring it to reduced row echelon form:
$$\begin{pmatrix} 1 & 2 & 0 & 1 & 1 \\ 2 & 4 & 1 & 1 & 1 \\ 3 & 6 & 1 & 1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 & -1 \\ 0 & 0 & 1 & -2 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & -1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}.$$
A basis for the null space is $(-2,\, 1,\, 0,\, 0)^{\mathrm{T}}$, and a particular solution can be found by setting
the free variable $x_2 = 0$. Therefore, $x_1 = x_3 = 0$ and $x_4 = 1$, and the general solution is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = a \begin{pmatrix} -2 \\ 1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$
3. c. The matrix is brought to reduced row echelon form:
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 4 & 1 & 1 \\ 3 & 6 & 1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
The number of pivot columns is three, and this is the rank.
1. We first use the Gram-Schmidt process to find an orthonormal basis for $W$. We assign
$u_1 = (1,\, 1,\, 1)^{\mathrm{T}}$. Then,
$$u_2 = v_2 - \frac{(u_1^{\mathrm{T}} v_2)\, u_1}{u_1^{\mathrm{T}} u_1} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} - \frac{2}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}.$$
When $a = 1$, $b = c = 0$, we have
$$v_{\mathrm{proj}_W} = \frac{1}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \frac{1}{3} \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix};$$
1.
$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ 3 \\ 4 \end{pmatrix},$$
or
$$\begin{pmatrix} 4 & 6 \\ 6 & 14 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 11 \\ 21 \end{pmatrix}.$$
The solution is $\beta_0 = 7/5$ and $\beta_1 = 9/10$, and the least-squares line is given by
$y = 7/5 + 9x/10$.
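The least-squares coefficients can be confirmed numerically; a sketch, assuming NumPy:

import numpy as np

x = np.array([0., 1., 2., 3.])
y = np.array([1., 3., 3., 4.])
A = np.column_stack([np.ones_like(x), x])   # columns: 1 and x

beta, *_ = np.linalg.lstsq(A, y, rcond=None)
assert np.allclose(beta, [7/5, 9/10])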
$$v_{\mathrm{proj}_W} = (v^{\mathrm{T}} w_1) w_1 + (v^{\mathrm{T}} w_2) w_2 = -\frac{1}{2} \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} + \frac{1}{6} \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} -1 \\ -1 \\ 2 \end{pmatrix}.$$
$$\beta_0 + \beta_1 x_1 = y_1, \quad \beta_0 + \beta_1 x_2 = y_2, \quad \beta_0 + \beta_1 x_3 = y_3.$$
Substituting in the values for x and y, and writing in matrix form, we have
$$\begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}.$$
The normal equations are then
$$\begin{pmatrix} 3 & 6 \\ 6 & 14 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 5 \\ 12 \end{pmatrix},$$
with solution $\beta_0 = -1/3$ and $\beta_1 = 1$. The best fit line is therefore $y = -\frac{1}{3} + x$.
1.
$$\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1 \times 1 \times 1 = 1.$$
2.
$$\begin{vmatrix} d & e & f \\ a & b & c \\ g & h & i \end{vmatrix} = dbi + ecg + fah - fbg - eai - dch = -(aei + bfg + cdh - ceg - bdi - afh) = -\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}.$$
3. Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad B = \begin{pmatrix} e & f \\ g & h \end{pmatrix}.$$
Then
$$AB = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix},$$
and
$$\det(AB) = (ae + bg)(cf + dh) - (af + bh)(ce + dg) = adeh + bcfg - adfg - bceh = (ad - bc)(eh - fg) = \det A \det B.$$
1. For each element chosen from the first row, there is only a single way to choose nonzero
elements from all subsequent rows. Considering whether the columns chosen are even or
odd permutations of the ordered set $\{1, 2, 3, 4\}$, we obtain
$$\begin{vmatrix} a & b & c & d \\ e & f & 0 & 0 \\ 0 & g & h & 0 \\ 0 & 0 & i & j \end{vmatrix} = afhj - behj + cegj - degi.$$
1. Suppose the square matrix A has two equal rows. If we interchange these two rows, the
determinant of A changes sign according to Property 2, even though A doesn’t change.
Therefore, det A = − det A, or det A = 0.
2. To prove that the determinant is a linear function of row $i$, interchange row one and row
$i$ using Property 2, use Property 3, and then interchange row one and row $i$ again.
3. Consider a general n-by-n matrix. Using the linear property of the jth row, and that a
matrix with two equal rows has zero determinant, we have
$$\begin{vmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{j1} + k a_{i1} & \cdots & a_{jn} + k a_{in} \\ \vdots & & \vdots \end{vmatrix} = \begin{vmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{j1} & \cdots & a_{jn} \\ \vdots & & \vdots \end{vmatrix} + k \begin{vmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \end{vmatrix} = \begin{vmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{j1} & \cdots & a_{jn} \\ \vdots & & \vdots \end{vmatrix}.$$
4.
$$\begin{vmatrix} 2 & 0 & -1 \\ 3 & 1 & 1 \\ 0 & -1 & 1 \end{vmatrix} = \begin{vmatrix} 2 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & -1 & 1 \end{vmatrix} = \begin{vmatrix} 2 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & 0 & 7/2 \end{vmatrix} = 2 \times 1 \times \frac{7}{2} = 7.$$
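A one-line numerical check of this determinant (a sketch, assuming NumPy):

import numpy as np

A = np.array([[2., 0., -1.],
              [3., 1., 1.],
              [0., -1., 1.]])
assert np.isclose(np.linalg.det(A), 7.0)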
1. a. To find the determinant of a matrix with many zero elements, perform a Laplace
expansion across the row or down the column with the most zeros. Choose the correct
sign. We have for one expansion choice,
$$\begin{vmatrix} -3 & 0 & -2 & 0 & 0 \\ 2 & -2 & -2 & 0 & 0 \\ 0 & 0 & -2 & 0 & 0 \\ 3 & 0 & -3 & 2 & -3 \\ -3 & 3 & 3 & 0 & -2 \end{vmatrix} = 2 \begin{vmatrix} -3 & 0 & -2 & 0 \\ 2 & -2 & -2 & 0 \\ 0 & 0 & -2 & 0 \\ -3 & 3 & 3 & -2 \end{vmatrix} = -4 \begin{vmatrix} -3 & 0 & -2 \\ 2 & -2 & -2 \\ 0 & 0 & -2 \end{vmatrix} = 8 \begin{vmatrix} -3 & 0 \\ 2 & -2 \end{vmatrix} = 48.$$
2. b. We can apply the Leibniz formula by going down the first column. For each element
in the first column there is only one possible choice of elements from the other three
columns. We have
$$\begin{vmatrix} a & e & 0 & 0 \\ b & f & g & 0 \\ c & 0 & h & i \\ d & 0 & 0 & j \end{vmatrix} = afhj - behj + cegj - degi.$$
The signs are obtained by considering whether the following permutations of the rows
$\{1, 2, 3, 4\}$ are even or odd: $afhj = \{1, 2, 3, 4\}$ (even); $behj = \{2, 1, 3, 4\}$ (odd);
$cegj = \{3, 1, 2, 4\}$ (even); $degi = \{4, 1, 2, 3\}$ (odd).
1. We have
$$0 = \det(A - \lambda I) = \begin{vmatrix} a - \lambda & b & c \\ d & e - \lambda & f \\ g & h & i - \lambda \end{vmatrix}$$
$$= (a - \lambda)(e - \lambda)(i - \lambda) + bfg + cdh - c(e - \lambda)g - bd(i - \lambda) - (a - \lambda)fh$$
$$= -\lambda^3 + (a + e + i)\lambda^2 - (ae + ai + ei - bd - cg - fh)\lambda + aei + bfg + cdh - ceg - bdi - afh,$$
where $ae + ai + ei - bd - cg - fh$ is the sum of the minors obtained by crossing out the rows and columns of the
diagonal elements. The cubic equation in more memorable form is therefore
$$\lambda^3 - \mathrm{Tr}(A)\,\lambda^2 + (ae + ai + ei - bd - cg - fh)\,\lambda - \det A = 0.$$
2. For the matrix
$$A = \begin{pmatrix} 2 & 7 \\ 7 & 2 \end{pmatrix},$$
the characteristic equation is $(2 - \lambda)^2 - 49 = 0$. Therefore, $2 - \lambda = \pm 7$, and the
eigenvalues are $\lambda_1 = -5$, $\lambda_2 = 9$. The eigenvector for $\lambda_1 = -5$ is found from
$$\begin{pmatrix} 7 & 7 \\ 7 & 7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0,$$
so that $x_1 + x_2 = 0$ and an eigenvector is $(1,\, -1)^{\mathrm{T}}$; similarly, the eigenvector for
$\lambda_2 = 9$ is $(1,\, 1)^{\mathrm{T}}$.
3. For the matrix
$$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix},$$
the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 2 - \sqrt{2}$, and $\lambda_3 = 2 + \sqrt{2}$. The eigenvector
for $\lambda_1 = 2$ is found from
$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0,$$
or $x_2 = 0$ and $x_1 + x_3 = 0$, or
$$x_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}.$$
The eigenvector for $\lambda_2 = 2 - \sqrt{2}$ is found from
$$\begin{pmatrix} \sqrt{2} & 1 & 0 \\ 1 & \sqrt{2} & 1 \\ 0 & 1 & \sqrt{2} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.$$
Therefore, $x_1 = x_3$ and $x_2 = -\sqrt{2}\, x_3$, and an eigenvector is
$$x_2 = \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}.$$
Similarly, the third eigenvector is
$$x_3 = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}.$$
1. b. The characteristic equation $\det(A - \lambda I) = 0$ for a two-by-two matrix $A$ results in the
quadratic equation $\lambda^2 - \mathrm{Tr}A\, \lambda + \det A = 0$, which for the given matrix yields
$\lambda^2 - 3\lambda + 1 = 0$. Application of the quadratic formula results in
$$\lambda_\pm = \frac{3 \pm \sqrt{9 - 4}}{2} = \frac{3}{2} \pm \frac{\sqrt{5}}{2}.$$
2. d. The characteristic equation is
$$\det(A - \lambda I) = \begin{vmatrix} 3 - \lambda & -1 \\ 1 & 3 - \lambda \end{vmatrix} = (3 - \lambda)^2 + 1 = 0.$$
With $i = \sqrt{-1}$, the solution is $\lambda_\pm = 3 \pm i$.
3. b. One can either compute the eigenvalues and eigenvectors of the matrix, or test the
given possible answers. If we test the answers, then only one is an eigenvector, and we
have
$$\begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix} = \begin{pmatrix} 2 + \sqrt{2} \\ 2 + 2\sqrt{2} \\ 2 + \sqrt{2} \end{pmatrix} = (2 + \sqrt{2}) \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}.$$
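Testing the candidate eigenvector numerically (a sketch, assuming NumPy):

import numpy as np

A = np.array([[2., 1., 0.],
              [1., 2., 1.],
              [0., 1., 2.]])
v = np.array([1., np.sqrt(2.), 1.])
assert np.allclose(A @ v, (2 + np.sqrt(2.)) * v)

# The full spectrum of this symmetric matrix:
assert np.allclose(np.sort(np.linalg.eigvalsh(A)),
                   [2 - np.sqrt(2.), 2., 2 + np.sqrt(2.)])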
From the first equation, we write c2 x2 = −c1 x1 , and eliminating c2 x2 from the second
equation we obtain
(λ1 − λ2 )c1 x1 = 0.
From the first equation, we can also write c1 x1 = −c2 x2 , and eliminating c1 x1 from the
second equation we obtain
(λ2 − λ1 )c2 x2 = 0.
dim(Col(A)) + dim(Null(A)) = n.
If the only solution to Ax = 0 is the zero vector, then det A ̸= 0 and A is invertible.
1. The eigenvalues and eigenvectors are
$$\lambda_1 = 2, \; x_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}; \quad \lambda_2 = 2 - \sqrt{2}, \; x_2 = \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}; \quad \lambda_3 = 2 + \sqrt{2}, \; x_3 = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}.$$
Notice that the three eigenvectors are mutually orthogonal. This will happen when the
matrix is symmetric. If we normalize the eigenvectors, the matrix with eigenvectors as
columns will be an orthogonal matrix. Normalizing the orthogonal eigenvectors (so that
$S^{-1} = S^{\mathrm{T}}$), we have
$$S = \begin{pmatrix} 1/\sqrt{2} & 1/2 & 1/2 \\ 0 & -1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/2 & 1/2 \end{pmatrix}.$$
We therefore find
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 - \sqrt{2} & 0 \\ 0 & 0 & 2 + \sqrt{2} \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 1/2 & -1/\sqrt{2} & 1/2 \\ 1/2 & 1/\sqrt{2} & 1/2 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/2 & 1/2 \\ 0 & -1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/2 & 1/2 \end{pmatrix}.$$
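The orthogonal diagonalization can be verified numerically (a sketch, assuming NumPy):

import numpy as np

A = np.array([[2., 1., 0.],
              [1., 2., 1.],
              [0., 1., 2.]])
s = 1 / np.sqrt(2.)
S = np.array([[s, 0.5, 0.5],
              [0., -s, s],
              [-s, 0.5, 0.5]])

assert np.allclose(S.T @ S, np.eye(3))   # S is orthogonal
assert np.allclose(S.T @ A @ S,
                   np.diag([2., 2 - np.sqrt(2.), 2 + np.sqrt(2.)]))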
1. Using $(S \Lambda S^{-1})^n = S \Lambda^n S^{-1}$, we have
$$e^{A} = e^{S \Lambda S^{-1}} = I + S \Lambda S^{-1} + \frac{S \Lambda^2 S^{-1}}{2!} + \frac{S \Lambda^3 S^{-1}}{3!} + \cdots = S \left( I + \Lambda + \frac{\Lambda^2}{2!} + \frac{\Lambda^3}{3!} + \cdots \right) S^{-1} = S e^{\Lambda} S^{-1}.$$
Because $\Lambda$ is a diagonal matrix, the powers of $\Lambda$ are also diagonal matrices with the
diagonal elements raised to the specified power. Each diagonal element of $e^{\Lambda}$ contains a
power series of the form
$$1 + \lambda_i + \frac{\lambda_i^2}{2!} + \frac{\lambda_i^3}{3!} + \cdots,$$
which is the power series for $e^{\lambda_i}$.
1. We use the diagonalization of the matrix to find
$$\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}^n = \begin{pmatrix} 2^{n-1} & -2^{n-1} \\ -2^{n-1} & 2^{n-1} \end{pmatrix}.$$
Therefore,
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{100} = \left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^2 \right]^{50} = I^{50} = I.$$
Then,
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{100} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}^{100} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = I.$$
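Both computations are easy to confirm with NumPy's matrix_power (a sketch, assuming NumPy):

import numpy as np

A = np.array([[0, 1], [1, 0]])
assert (np.linalg.matrix_power(A, 100) == np.eye(2, dtype=int)).all()

# The earlier problem's formula: (1 -1; -1 1)^n equals 2^(n-1) times the matrix.
B = np.array([[1, -1], [-1, 1]])
assert (np.linalg.matrix_power(B, 8) == 2**7 * B).all()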
3. a. We have
$$e^{I} = I + I + \frac{I^2}{2!} + \frac{I^3}{3!} + \cdots = I \left( 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots \right) = I e^1 = \begin{pmatrix} e & 0 \\ 0 & e \end{pmatrix}.$$
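A final numerical check of $e^{I}$ (a sketch, assuming SciPy is available; scipy.linalg.expm computes the matrix exponential):

import numpy as np
from scipy.linalg import expm

assert np.allclose(expm(np.eye(2)), np.e * np.eye(2))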