3 Elementary

Download as pdf or txt
Download as pdf or txt
You are on page 1of 9

3 Elementary Matrix Operations and Systems of Linear Equations

3.1 Elementary Matrix Operations and Elementary Matrices


In this chapter we develop a (hopefully!) familiar method for comparing matrices.

Definition 3.1. An elementary row operation is one of three transformations of the rows of a matrix:

Type I: Swap two rows;


Type II: Multiply a row by a non-zero constant;
Type III: Add to one row a scalar multiple of another.

Matrices are row equivalent if there exists a finite sequence of elementary row operations transforming
one to the other.
The elementary matrices come in the same three families, each is the result of performing the corre-
sponding row operation to the identity matrix:

Type I: Eij is the identity matrix with rows i, j swapped;


(λ)
Type II: Ei is the identity with the ith diagonal entry replaced by λ ̸= 0;
(λ)
Type III: Eij is the identity matrix with an additional λ in the ijth entry.

Example 3.2. In M2 (R) the elementary matrices are as follows:


         
0 1 (λ) λ 0 (λ) 1 0 (λ) 1 λ (λ) 1 0
E12 = , E1 = , E2 = , E12 = , E21 =
1 0 0 1 0 λ 0 1 λ 1

By subtracting three times the first row from the second, we see that the following are row equivalent:
   
1 4 1 4

−2 3 −5 −9

The crucial observation, stated in general below, is that this transformation is the result of multiplying
by the corresponding elementary matrix:
      
(−3) 1 4 1 0 1 4 1 4
E21 = =
−2 3 −3 1 −2 3 −5 −9

Theorem 3.3. Let T be an elementary row operation acting on m × n matrices.

1. T is an isomorphism of Mm×n (F) with itself. Its inverse is an operation of the same type.

2. T( A) = EA where E is the elementary matrix T( Im ) obtained by applying T to the identity.

In particular, the inverses of the three types of elementary matrix are


( λ ) −1 ( λ −1 ) ( λ ) −1
   
(−λ)
Eij−1 = Eij , Ei = Ei , Eij = Eij

1
Proof. Note first that row operations never mix columns and neither does matrix multiplication: if
a1 , . . . , an are the columns of A, then
 
T( A) = T(a1 ) · · · T(an ) and EA = Ea1 · · · Ean

It is therefore enough to prove the result when n = 1.

1. Suppose that T is of type III, adding to row i a multiple λ of row j. Let vr , wr denote the rth
entries of v, w ∈ Fm and let k be a scalar. The rth entry of T(kv + w) is plainly

(kvi + wi ) + λ(kv j + w j ) = k(vi + λv j ) + (wi + λw j ) if r = i


kvr + wr if r ̸= i

which is certainly the rth entry of kT(v) + T(w): thus T : Fm → Fm is linear. Its inverse is
plainly computed by subtracting from the ith row λ times the jth : an elementary operation of the
same type. Operations of types I and II are similar.

2. By part 1, T = LE ∈ L(Fm ) for some invertible matrix E ∈ Mm (F). To compute E, simply apply
T to the standard basis ϵ = {e1 , . . . , em }:
 
E = [T(e1 )]ϵ · · · [T(em )]ϵ = T(e1 ) · · · T(em ) = T(e1 · · · en ) = T( Im )

Column Operations Applying the above approach to columns yields the elementary column oper-
ations. Theorem 3.3 holds for column operations provided you multiply by matrices on the right
T( A) = AE and replace ‘row’ with ‘column.’
1 0 0
(3)
Example 3.4. Let E = E21 = 3 1 0 and compute:
001
 
  1 0 0  
a b c 3 1 0 = a + 3b b c
d e f d + 3e e f
0 0 1

E can be produced from the identity matrix by adding three times the second column to the first,
precisely the effect it has as a column operation when multiplying on the right.

Exercises 3.1 1. Let A = 21 47 and B = 10 71 .


 

(a) Find a sequence of elementary matrices E I , E I I , E I I I , of the types indicated, so that


B = EI I EI EI I I A
(b) Hence find a matrix C such that B = CA. Is C the only matrix satisfying this equation?
(c) Find another sequence of elementary matrices such that B = Ek · · · E1 A.
2. Let A be an m × n matrix. Prove that if B can be obtained from A by an elementary row opera-
tion, then B T can be obtained from A T by the corresponding elementary column operation.
(This essentially proves Theorem 3.3 for column operations.)
3. For the matrices A, B in question 1, find a sequence of elementary matrices of any length/type
such that B = AE1 · · · Ek .

2
3.2 The Rank of a Matrix and Matrix Inverses
Our goal is to use the row equivalence of matrices to provide systematic methods for computing
ranks and inverses of linear maps. First we translate the notions of rank and nullity to matrices.

Definition 3.5. The rank and nullity of a matrix A ∈ Mm×n (F) are the rank/nullity of the linear
map L A : Fn → Fm given by left-multiplication by A.

Our previous injectivity and surjectivity conditions immediately translate to this new language.

Lemma 3.6. Let A ∈ Mm×n (F) and L A : Fn → Fm be the corresponding linear map.

1. L A is injective ⇐⇒ rank A = n ⇐⇒ null A = 0

2. L A is surjective ⇐⇒ rank A = m ⇐⇒ null A = n − m

3. (When m = n) L A is an isomorphism ⇐⇒ A is invertible ⇐⇒ rank A = n ⇐⇒ null A = 0

We now come to the crucial observations that permit easy calculations of ranks and inverses.

Theorem 3.7. Let A = (a1 , . . . , an ) ∈ Mm×n (F) have columns a j ∈ Fn .

1. The column space Span{a1 , . . . , an } of A has dimension rank A.

2. rank A is invariant under multiplication by invertible matrices: if P and Q are invertible, then

rank PA = rank AQ = rank A

Proof. 1. For any vector v ∈ Fn , write v = λ1 e1 + · · · + λn en with respect to the standard basis
and observe that

λ1 a1 + · · · + λn an = Av = L A (v)

It is clear from this that the column space and R( L A ) are identical, whence rank A is the dimen-
sion of the column space.

2. We work with the range of the linear map LPA = LP L A (= LP ◦ L A ):

R(LPA ) = R(LP L A ) = { PAv : v ∈ Fn } = LP (R(L A ))

Since P is invertible, LP : R(L A ) → LP (R(L A )) is an isomorphism, whence

dim R(L A ) = dim LP (R(L A )) = dim R(LPA ) =⇒ rank A = rank PA

The result for AQ is even easier: we leave it to an exercise.

By virtue of the theorem, we’ll denote the column space and nullspace of A by R( A) and N ( A)
respectively rather than the lengthier R(L A ) and N (L A ).

3
Computing the Rank of a Matrix Recall that elementary row/column operations act via multipli-
cation by invertible matrices: thus

Elementary row/column operations are rank-preserving

Examples 3.8. 1. Recall Example 3.2, where we saw the row equivalence of −12 34 and −15 −49 .
 

Since the columns of these are linearly independent, the column spaces of both are R2 and both
matrices plainly have rank 2. Indeed we can perform a sequence of row operations that make
the rank even more obvious:
  (2)   ( 1 )   (−4)  
1 4 E21 1 4 E2 11 1 4 E12 1 0
−−→ −−→ −−→
−2 3 0 11 0 1 0 1

Since all matrices have the same rank, the original clearly has rank 2.

2. Since 2 × 2 matrices are small, the row operation approach wasn’t required. For a larger matrix
however, it can be invaluable. For instance:

1 0 −2 3
     
1 1 0 3 1 1 0 3
2 0 1 1  (−2) (−2) 0 −2 1 −5 (−1) (2) (−2) (1) 0 0 5 −5
  E21 E51   E13 E23 E43 E53  
A= 0 1 2 0  −−−−−→ 0 1 2 0  −−−−−−−−−→ 0 1 2 0 
   

0 2 5 −1 0 2 5 −1 0 0 1 −1
2 1 1 3 0 −1 1 −3 0 0 3 −3
   
1 0 0 1 1 0 0 1
0 0 0 0 
(2) (−5) (−2) (−3)
 E34 E23 0 1 0 2 
 
E14 E24 E34 E54 
−−−−−−−−−−→ 0 1 0 2  −−−→ 0 0 1 −1
  

0 0 1 −1 0 0 0 0 
0 0 0 0 0 0 0 0

Since the fourth column is a linear combination of the first three (linearly independent)
columns, we conclude that rank A = 3. Alternatively, we could repeat using with column
operations:
   
1 1 0 3 1 0 0 0
2 0 1 1   0 0 1 0
   
0
 1 2 0  −→ · · · −→  −4 5 −3 0
 

0 2 5 −1 −10 12 −7 0
2 1 1 3 0 1 0 0

The first three columns are linearly independent, so again the rank is three. If we used a mixture
of row and column operations, we could eventually transform A into the rank 3 matrix
 
1 0 0 0
0 1 0 0
 
0 0 1 0
 
0 0 0 0
0 0 0 0

4
Row or Column Operations: which to use? If your only purpose is to compute ranks, mixing row
and column operations is fine. If you want something else, you may wish to stick to only one type:
here’s why.
Row Operations preserve the span of the rows of a matrix (the row space). This is important when
matrices represent linear systems of equations. For example, below we transform a system of
equations and the corresponding augmented matrix using row operations:
(       
x + 3y = 1 1 3 x 1 1 3 1
=
2 −1 y 16 2 −1 16
(2x − y = 16       
x + 3y = 1 1 3 x 1 1 3 1
=
0 −7 y 14 0 −7 14
(−7y = 14       
x + 3y = 1 1 3 x 1 1 3 −1
=
y = 2 0 1 y 2 0 1 2
(
x = −1
      
1 0 x −1 1 0 −1
=
y=2 0 1 y 2 0 1 2

This familiar method relies on the fact that row-equivalent linear systems have identical solu-
tions. When viewed as a matrix system (middle column) it should be clear that multiplication
by elementary matrices must occur on the left.

Column Operations preserve the column space of a matrix. For instance, the above example shows
that a simple basis of the column space of A is given by
( 1 ! 0
! 0 !)
0 0 1
−4 , 5 , −3
−10 12 −7
0 1 0

Row operations will change the column space and vice versa. If knowing these is important to you,
stick to one type of operation!
The example generalizes:

Theorem 3.9. A matrix A ∈ Mm×n (F) has rank A = r if and only if there exists a finite sequence of
row and column operations transforming A to the matrix
 
Ir Or×(n−r)
D=
O(m−r)×r O(m−r)×(n−r)

Here Ir is the r × r identity, with the remaining pieces being zero matrices of the given dimensions.
Otherwise said, there exist elementary m × m matrices R1 , . . . , Rk and elementary n × n matrices
C1 , . . . , Cℓ such that

Rk · · · R1 AC1 · · · Cℓ = D

The proof is too long to give in full, but can be proved by tedious induction on the number of rows
of A. We are far more interested in some corollaries, particularly involving the maximal rank case,
that is when A is invertible.

5
Computing the inverse of a matrix Everything follows from a simple corollary.
Corollary 3.10. Every invertible matrix A is a product of elementary matrices.

In light of the Corollary, the last line of Theorem 3.9 can be rewritten so say that
rank A = r ⇐⇒ ∃invertible P, Q such that PAQ = D

Proof. If A ∈ Mn (F) is invertible, then rank A = n whence D = In . It follows that there exist products
P, Q of elementary matrices such that

PAQ = In =⇒ A = P−1 Q−1

By Theorem 3.3, the right hand side and thus A is a product of elementary matrices.

The proof yields a systematic method for calculating inverses:


• The identity matrix In = QPA is the result of applying a sequence of row operations to A.
• A−1 = QP = QPIn is the result of applying the same sequence to In .
• Since row operations never mix up columns, we can find A−1 by applying row operations to
the augmented matrix ( A | I ) until the left side is the identity: the right side will then be A−1 , i.e.

( A | I ) is row equivalent to ( I | A−1 )

1 2 3
Example 3.11. We compute the inverse of A = 020 by applying row operations to ( A | I3 ):
301

1 0 3 1 −1 0 E( 12 )
     
1 2 3 1 0 0 E(−1) 1 0 3 1 −1 0
12 2
0 2 0 0 1 0 −−→ 0 2 0 0 1 0 −−→ 0 1 0 0 21 0
3 0 1 0 0 1 3 0 1 0 0 1 3 0 1 0 0 1
1 −1 0 E(− 81 ) −1 0
   
(−3)
E31 1 0 3 1 0 3 1
3
1 1
−−→ 0 1 0 0 2 0  −−−→ 0 1 0 0
2 0 
0 0 −8 −3 3 1 0 0 1 38 − 8 − 18
3

1 0 0 − 18 1 3
   
(−3)
E13 8 8 −1 1 3
1 −1 1
−−→ 0 1 0 0

2 0  =⇒ A = 0 4 0 
8
0 0 1 38 − 38 − 18 3 −3 −1

It can easily be checked by multiplication that we have found the correct inverse matrix: the above
indeed shows how to write A as a product of elementary matrices

(−3) (− 81 ) (−3) ( 21 ) (−1)


I3 = E13 E3 E31 E2 E12 A
     
1 1 0 1 0 0 1 0 0 1 0 0 1 0 3
(1) (2) (3) (−8) (3)
=⇒ A = E12 E2 E31 E3 E13 = 0 1 0 0 2 0 0 1 0 0 1 0  0 1 0
0 0 1 0 0 1 3 0 1 0 0 −8 0 0 1

It
 isalso acceptable, though non-standard, to perform column operations on the augmented matrix
A
In : just remember never to mix the two types of operation when computing the inverse!

6
Corollary 3.12. For any matrix A we have rank A = rank A T .

Proof. D = PAQ =⇒ D T = Q T A T P T . Since Q T and P T are invertible, we immediately see that


rank A T = rank D T . But rank D T clearly equals r = rank D = rank A.

In particular, the dimension of the row space (span of the rows of A) is also rank A.

Maximum and Minimum Ranks of Compositions


As a final application we consider the rank of a composition in terms of its factors.

Example 3.13. If rank A = 3 and rank B = 2, we can appeal to block matrices such as in Theorem
3.9 to consider possible ranks of the product AB:
1 0 0 1 0
• A = 0 1 0 , B = 0 1 : AB = B =⇒ rank AB = 2
001 00
1 0 0 0 0 0 0 0 0 0
• A= 0100 , B= 000 : AB = 000 =⇒ rank AB = 1
0010 100 100
010
0 0 0
!
0 0 0
1 0 0 0 0 0 0 0
• A= 01000 , B= 0 0 0 : AB = 000 =⇒ rank AB = 0
00100 1 0 0 000
0 1 0

As the next result shows, these are essentially all the possibilities.
Theorem 3.14. Let S : U → V and T : V → W be linear maps, thena

rank S − null T ≤ rank TS ≤ min rank T, rank S

max null S, dim U − rank T ≤ null TS ≤ null T + null S

The same relationships hold for matrices A, B, provided the product is AB is defined.
a The upper bounds are easier to remember due to their symmetry: use the rank–nullity theorem to recover the lower

bounds instead of memorizing them!

Proof. If w ∈ R(TS), then w = T(S(u)) for some u ∈ U, from which w ∈ R(T). We conclude that
R(TS) ≤ R(T) =⇒ rank TS ≤ rank T
If this were a claim about matrices, Corollary 3.12 could deal with the other part of the minimum:
rank AB = rank( AB)T = · · · . Instead we consider null spaces and apply the rank-nullity theorem:
u ∈ N (S) =⇒ S(u) = 0 =⇒ TS(u) = 0 =⇒ u ∈ N (TS)
from which
N (S) ≤ N (TS) =⇒ null S ≤ null TS =⇒ dim U − rank S ≤ dim U − rank TS
=⇒ rank TS ≤ rank S
The remaining inequalities are an exercise.

7
Example 3.15. Let A ∈ M6×5 (R) and B ∈ M5×4 (R), and suppose that rank A = 4 and rank B = 3.
We find the maximum and minimum possible ranks of the product AB and give examples in each
case.
First observe that since L A : R5 → R6 , the rank–nullity theorem says that null A = 5 − rank A = 1.
Similarly null B = 4 − 3 = 1.
By the Theorem,

2 = rank B − null A ≤ rank AB ≤ min(rank A, rank B) = 3

It is easy to cook up explicit matrices satisfying rank AB = 3 as in the previous example: for instance
1 0 0 0 0 1 0 0 0
1 0 0 0
!
0 1 0 0 0 0100
 
I4 O4×1 0 0 1 0 0, 0 1 0 0
A= = 0 0 0 1 0 B= 0 0 1 0 =⇒ AB =  00 00 10 00 
O2×4 O2×1 0 0 0 0 0 0 0 0 0 0000
0 0 0 0 0 0 0 0 0 0000

The idea for maximum ranks is to have the identity submatrices inside A and B overlap as much as
possible.
By trying to make the identities overlap as little as possible—essentially squeezing as much of the
range of A into the nullspace of A—we should can also create a minimal rank example: for instance,
with the same A as above,
0 0 0 0
0 0 0 0
!
0 0 0 0 0000
B= 0 1 0 0 =⇒ AB =  00 10 01 00 
0 0 1 0 0000
0 0 0 1 0000

Exercises 3.2 1. For each of the following matrices, compute the rank and the inverse if it exists:
  1 0 1 1 
1 2  0 −2 4 
(a) (b) 1 1 −1 (c) 12 10 −11 02
2 4 2 4 5 0 −1 1 −3

2. For each of the following linear transforms T, find the matrix of the linear map with respect to
the standard bases, determine whether T is invertible, and compute T−1 , if it exists.

(a) T : P2 (R) → P2 (R) defined by T( f )( x ) = ( x + 1) f ′ ( x )


a
(b) T : R3 → P2 (R) defined by T b = ( a + b + c) + ( a − b + c) x + ax2
c
tr A
!  
tr A T 0 1
(c) T : M2 (R) → R defined by T( A) = tr(EA) where E =
4
1 0
tr( AE)

3. (a) Find A ∈ M3×4 (R) and B ∈ M4×3 (R) such that rank A = rank B = 2 and rank AB = 1.
(b) Suppose that A ∈ M4×3 (R) and B ∈ M3×5 (R) have rank A = 2 and rank B = 3. What is
rank AB?

4. (a) Let S ∈ L(U, V ) and T ∈ L(V, W ). Prove that

null TS ≤ null T + null S

Now apply the Rank–Nullity Theorem to finish the proof of Theorem 3.14.

8
 
Ia O Ib O OO
 
(b) Let A = O O
, B = OO
and C = O Ic
where a, b, c are the ranks of A, B, C respec-
tively, and O indicates the zero matrix of the appropriate size. Suppose that

A ∈ Mm × n ( F ) , B, C ∈ Mn× p (F)

Compute AB and AC and check that

rank AB = min(rank A, rank B) and rank AC = max(0, rank C − null A)

This shows that the maximal and minimal ranks indicated in the Theorem can actually be achieved;
the only caveat being that rank C − null A could be negative!

5. Prove that for any m × n matrix A, we have rank A = 0 ⇐⇒ A is the zero matrix.

6. (a) Prove that matrices A, B ∈ Mm×n (F) have the same rank if and only if B = PAQ for some
invertible P, Q.
(b) Suppose that T ∈ L(V, W ) is a linear map between finite-dimensional vector spaces. Show
γ
that rank[T] β is independent of the choice of bases β and γ.
(Of course, this value equals rank T itself!)
 
3 2
7. Write the invertible matrix A = as a product of elementary matrices.
10 6
For a challenge, see if you can do this for a general invertible matrix.

8. Let T ∈ L(V, W ) be given and suppose that P ∈ L(W ) and Q ∈ L(V ) are isomorphisms. Prove
that

rank PT = rank TQ = rank T

(Your argument must work in infinite dimensions and thus without matrices)

You might also like