3 Elementary
3 Elementary
3 Elementary
Definition 3.1. An elementary row operation is one of three transformations of the rows of a matrix:
Matrices are row equivalent if there exists a finite sequence of elementary row operations transforming
one to the other.
The elementary matrices come in the same three families, each is the result of performing the corre-
sponding row operation to the identity matrix:
By subtracting three times the first row from the second, we see that the following are row equivalent:
1 4 1 4
⇝
−2 3 −5 −9
The crucial observation, stated in general below, is that this transformation is the result of multiplying
by the corresponding elementary matrix:
(−3) 1 4 1 0 1 4 1 4
E21 = =
−2 3 −3 1 −2 3 −5 −9
1. T is an isomorphism of Mm×n (F) with itself. Its inverse is an operation of the same type.
1
Proof. Note first that row operations never mix columns and neither does matrix multiplication: if
a1 , . . . , an are the columns of A, then
T( A) = T(a1 ) · · · T(an ) and EA = Ea1 · · · Ean
1. Suppose that T is of type III, adding to row i a multiple λ of row j. Let vr , wr denote the rth
entries of v, w ∈ Fm and let k be a scalar. The rth entry of T(kv + w) is plainly
which is certainly the rth entry of kT(v) + T(w): thus T : Fm → Fm is linear. Its inverse is
plainly computed by subtracting from the ith row λ times the jth : an elementary operation of the
same type. Operations of types I and II are similar.
2. By part 1, T = LE ∈ L(Fm ) for some invertible matrix E ∈ Mm (F). To compute E, simply apply
T to the standard basis ϵ = {e1 , . . . , em }:
E = [T(e1 )]ϵ · · · [T(em )]ϵ = T(e1 ) · · · T(em ) = T(e1 · · · en ) = T( Im )
Column Operations Applying the above approach to columns yields the elementary column oper-
ations. Theorem 3.3 holds for column operations provided you multiply by matrices on the right
T( A) = AE and replace ‘row’ with ‘column.’
1 0 0
(3)
Example 3.4. Let E = E21 = 3 1 0 and compute:
001
1 0 0
a b c 3 1 0 = a + 3b b c
d e f d + 3e e f
0 0 1
E can be produced from the identity matrix by adding three times the second column to the first,
precisely the effect it has as a column operation when multiplying on the right.
2
3.2 The Rank of a Matrix and Matrix Inverses
Our goal is to use the row equivalence of matrices to provide systematic methods for computing
ranks and inverses of linear maps. First we translate the notions of rank and nullity to matrices.
Definition 3.5. The rank and nullity of a matrix A ∈ Mm×n (F) are the rank/nullity of the linear
map L A : Fn → Fm given by left-multiplication by A.
Our previous injectivity and surjectivity conditions immediately translate to this new language.
Lemma 3.6. Let A ∈ Mm×n (F) and L A : Fn → Fm be the corresponding linear map.
We now come to the crucial observations that permit easy calculations of ranks and inverses.
2. rank A is invariant under multiplication by invertible matrices: if P and Q are invertible, then
Proof. 1. For any vector v ∈ Fn , write v = λ1 e1 + · · · + λn en with respect to the standard basis
and observe that
λ1 a1 + · · · + λn an = Av = L A (v)
It is clear from this that the column space and R( L A ) are identical, whence rank A is the dimen-
sion of the column space.
By virtue of the theorem, we’ll denote the column space and nullspace of A by R( A) and N ( A)
respectively rather than the lengthier R(L A ) and N (L A ).
3
Computing the Rank of a Matrix Recall that elementary row/column operations act via multipli-
cation by invertible matrices: thus
Examples 3.8. 1. Recall Example 3.2, where we saw the row equivalence of −12 34 and −15 −49 .
Since the columns of these are linearly independent, the column spaces of both are R2 and both
matrices plainly have rank 2. Indeed we can perform a sequence of row operations that make
the rank even more obvious:
(2) ( 1 ) (−4)
1 4 E21 1 4 E2 11 1 4 E12 1 0
−−→ −−→ −−→
−2 3 0 11 0 1 0 1
Since all matrices have the same rank, the original clearly has rank 2.
2. Since 2 × 2 matrices are small, the row operation approach wasn’t required. For a larger matrix
however, it can be invaluable. For instance:
1 0 −2 3
1 1 0 3 1 1 0 3
2 0 1 1 (−2) (−2) 0 −2 1 −5 (−1) (2) (−2) (1) 0 0 5 −5
E21 E51 E13 E23 E43 E53
A= 0 1 2 0 −−−−−→ 0 1 2 0 −−−−−−−−−→ 0 1 2 0
0 2 5 −1 0 2 5 −1 0 0 1 −1
2 1 1 3 0 −1 1 −3 0 0 3 −3
1 0 0 1 1 0 0 1
0 0 0 0
(2) (−5) (−2) (−3)
E34 E23 0 1 0 2
E14 E24 E34 E54
−−−−−−−−−−→ 0 1 0 2 −−−→ 0 0 1 −1
0 0 1 −1 0 0 0 0
0 0 0 0 0 0 0 0
Since the fourth column is a linear combination of the first three (linearly independent)
columns, we conclude that rank A = 3. Alternatively, we could repeat using with column
operations:
1 1 0 3 1 0 0 0
2 0 1 1 0 0 1 0
0
1 2 0 −→ · · · −→ −4 5 −3 0
0 2 5 −1 −10 12 −7 0
2 1 1 3 0 1 0 0
The first three columns are linearly independent, so again the rank is three. If we used a mixture
of row and column operations, we could eventually transform A into the rank 3 matrix
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 0
0 0 0 0
4
Row or Column Operations: which to use? If your only purpose is to compute ranks, mixing row
and column operations is fine. If you want something else, you may wish to stick to only one type:
here’s why.
Row Operations preserve the span of the rows of a matrix (the row space). This is important when
matrices represent linear systems of equations. For example, below we transform a system of
equations and the corresponding augmented matrix using row operations:
(
x + 3y = 1 1 3 x 1 1 3 1
=
2 −1 y 16 2 −1 16
(2x − y = 16
x + 3y = 1 1 3 x 1 1 3 1
=
0 −7 y 14 0 −7 14
(−7y = 14
x + 3y = 1 1 3 x 1 1 3 −1
=
y = 2 0 1 y 2 0 1 2
(
x = −1
1 0 x −1 1 0 −1
=
y=2 0 1 y 2 0 1 2
This familiar method relies on the fact that row-equivalent linear systems have identical solu-
tions. When viewed as a matrix system (middle column) it should be clear that multiplication
by elementary matrices must occur on the left.
Column Operations preserve the column space of a matrix. For instance, the above example shows
that a simple basis of the column space of A is given by
( 1 ! 0
! 0 !)
0 0 1
−4 , 5 , −3
−10 12 −7
0 1 0
Row operations will change the column space and vice versa. If knowing these is important to you,
stick to one type of operation!
The example generalizes:
Theorem 3.9. A matrix A ∈ Mm×n (F) has rank A = r if and only if there exists a finite sequence of
row and column operations transforming A to the matrix
Ir Or×(n−r)
D=
O(m−r)×r O(m−r)×(n−r)
Here Ir is the r × r identity, with the remaining pieces being zero matrices of the given dimensions.
Otherwise said, there exist elementary m × m matrices R1 , . . . , Rk and elementary n × n matrices
C1 , . . . , Cℓ such that
Rk · · · R1 AC1 · · · Cℓ = D
The proof is too long to give in full, but can be proved by tedious induction on the number of rows
of A. We are far more interested in some corollaries, particularly involving the maximal rank case,
that is when A is invertible.
5
Computing the inverse of a matrix Everything follows from a simple corollary.
Corollary 3.10. Every invertible matrix A is a product of elementary matrices.
In light of the Corollary, the last line of Theorem 3.9 can be rewritten so say that
rank A = r ⇐⇒ ∃invertible P, Q such that PAQ = D
Proof. If A ∈ Mn (F) is invertible, then rank A = n whence D = In . It follows that there exist products
P, Q of elementary matrices such that
By Theorem 3.3, the right hand side and thus A is a product of elementary matrices.
1 2 3
Example 3.11. We compute the inverse of A = 020 by applying row operations to ( A | I3 ):
301
1 0 3 1 −1 0 E( 12 )
1 2 3 1 0 0 E(−1) 1 0 3 1 −1 0
12 2
0 2 0 0 1 0 −−→ 0 2 0 0 1 0 −−→ 0 1 0 0 21 0
3 0 1 0 0 1 3 0 1 0 0 1 3 0 1 0 0 1
1 −1 0 E(− 81 ) −1 0
(−3)
E31 1 0 3 1 0 3 1
3
1 1
−−→ 0 1 0 0 2 0 −−−→ 0 1 0 0
2 0
0 0 −8 −3 3 1 0 0 1 38 − 8 − 18
3
1 0 0 − 18 1 3
(−3)
E13 8 8 −1 1 3
1 −1 1
−−→ 0 1 0 0
2 0 =⇒ A = 0 4 0
8
0 0 1 38 − 38 − 18 3 −3 −1
It can easily be checked by multiplication that we have found the correct inverse matrix: the above
indeed shows how to write A as a product of elementary matrices
It
isalso acceptable, though non-standard, to perform column operations on the augmented matrix
A
In : just remember never to mix the two types of operation when computing the inverse!
6
Corollary 3.12. For any matrix A we have rank A = rank A T .
In particular, the dimension of the row space (span of the rows of A) is also rank A.
Example 3.13. If rank A = 3 and rank B = 2, we can appeal to block matrices such as in Theorem
3.9 to consider possible ranks of the product AB:
1 0 0 1 0
• A = 0 1 0 , B = 0 1 : AB = B =⇒ rank AB = 2
001 00
1 0 0 0 0 0 0 0 0 0
• A= 0100 , B= 000 : AB = 000 =⇒ rank AB = 1
0010 100 100
010
0 0 0
!
0 0 0
1 0 0 0 0 0 0 0
• A= 01000 , B= 0 0 0 : AB = 000 =⇒ rank AB = 0
00100 1 0 0 000
0 1 0
As the next result shows, these are essentially all the possibilities.
Theorem 3.14. Let S : U → V and T : V → W be linear maps, thena
rank S − null T ≤ rank TS ≤ min rank T, rank S
max null S, dim U − rank T ≤ null TS ≤ null T + null S
The same relationships hold for matrices A, B, provided the product is AB is defined.
a The upper bounds are easier to remember due to their symmetry: use the rank–nullity theorem to recover the lower
Proof. If w ∈ R(TS), then w = T(S(u)) for some u ∈ U, from which w ∈ R(T). We conclude that
R(TS) ≤ R(T) =⇒ rank TS ≤ rank T
If this were a claim about matrices, Corollary 3.12 could deal with the other part of the minimum:
rank AB = rank( AB)T = · · · . Instead we consider null spaces and apply the rank-nullity theorem:
u ∈ N (S) =⇒ S(u) = 0 =⇒ TS(u) = 0 =⇒ u ∈ N (TS)
from which
N (S) ≤ N (TS) =⇒ null S ≤ null TS =⇒ dim U − rank S ≤ dim U − rank TS
=⇒ rank TS ≤ rank S
The remaining inequalities are an exercise.
7
Example 3.15. Let A ∈ M6×5 (R) and B ∈ M5×4 (R), and suppose that rank A = 4 and rank B = 3.
We find the maximum and minimum possible ranks of the product AB and give examples in each
case.
First observe that since L A : R5 → R6 , the rank–nullity theorem says that null A = 5 − rank A = 1.
Similarly null B = 4 − 3 = 1.
By the Theorem,
It is easy to cook up explicit matrices satisfying rank AB = 3 as in the previous example: for instance
1 0 0 0 0 1 0 0 0
1 0 0 0
!
0 1 0 0 0 0100
I4 O4×1 0 0 1 0 0, 0 1 0 0
A= = 0 0 0 1 0 B= 0 0 1 0 =⇒ AB = 00 00 10 00
O2×4 O2×1 0 0 0 0 0 0 0 0 0 0000
0 0 0 0 0 0 0 0 0 0000
The idea for maximum ranks is to have the identity submatrices inside A and B overlap as much as
possible.
By trying to make the identities overlap as little as possible—essentially squeezing as much of the
range of A into the nullspace of A—we should can also create a minimal rank example: for instance,
with the same A as above,
0 0 0 0
0 0 0 0
!
0 0 0 0 0000
B= 0 1 0 0 =⇒ AB = 00 10 01 00
0 0 1 0 0000
0 0 0 1 0000
Exercises 3.2 1. For each of the following matrices, compute the rank and the inverse if it exists:
1 0 1 1
1 2 0 −2 4
(a) (b) 1 1 −1 (c) 12 10 −11 02
2 4 2 4 5 0 −1 1 −3
2. For each of the following linear transforms T, find the matrix of the linear map with respect to
the standard bases, determine whether T is invertible, and compute T−1 , if it exists.
3. (a) Find A ∈ M3×4 (R) and B ∈ M4×3 (R) such that rank A = rank B = 2 and rank AB = 1.
(b) Suppose that A ∈ M4×3 (R) and B ∈ M3×5 (R) have rank A = 2 and rank B = 3. What is
rank AB?
Now apply the Rank–Nullity Theorem to finish the proof of Theorem 3.14.
8
Ia O Ib O OO
(b) Let A = O O
, B = OO
and C = O Ic
where a, b, c are the ranks of A, B, C respec-
tively, and O indicates the zero matrix of the appropriate size. Suppose that
A ∈ Mm × n ( F ) , B, C ∈ Mn× p (F)
This shows that the maximal and minimal ranks indicated in the Theorem can actually be achieved;
the only caveat being that rank C − null A could be negative!
5. Prove that for any m × n matrix A, we have rank A = 0 ⇐⇒ A is the zero matrix.
6. (a) Prove that matrices A, B ∈ Mm×n (F) have the same rank if and only if B = PAQ for some
invertible P, Q.
(b) Suppose that T ∈ L(V, W ) is a linear map between finite-dimensional vector spaces. Show
γ
that rank[T] β is independent of the choice of bases β and γ.
(Of course, this value equals rank T itself!)
3 2
7. Write the invertible matrix A = as a product of elementary matrices.
10 6
For a challenge, see if you can do this for a general invertible matrix.
8. Let T ∈ L(V, W ) be given and suppose that P ∈ L(W ) and Q ∈ L(V ) are isomorphisms. Prove
that
(Your argument must work in infinite dimensions and thus without matrices)