3.6 Iterative Methods For Solving Linear Systems
cannot be) fulfilled, the Gauss-Seidel iterative method may be used only after applying Theorem 1.27, that is, converting the given system to the form AᵀAx = Aᵀb, we get
Now using the initial solution x(0) = [0, 0, 0]ᵀ and the Gauss-Seidel iterative formula, we get x(4) = [5.58, 4.18, −1.58]ᵀ, which is close to the exact solution of the given system. •
Ax = λx. (1.58)
The relation (1.58) represents the eigenvalue problem and we will refer to (λ, x) as an eigenpair. •
where I is an n × n identity matrix. The system of equations (1.59) has a nontrivial solution x if and only if A − λI is singular or, equivalently,
The above relation (1.60) represents a polynomial equation in λ of degree n which in principle
could be used to obtain the eigenvalues of the matrix A. This equation is called the characteristic
equation of A. There are n roots of (1.60), which we will denote by λ1 , λ2 , . . . , λn . For a given
eigenvalue λi , the corresponding eigenvector xi is not uniquely determined. If x is an eigenvector
then so is αx where α is any nonzero scalar.
Example 1.65 Find the eigenvalues and eigenvectors of the following matrix
        [ −6   0   0 ]
    A = [ 11  −3   0 ] .
        [ −3   6   7 ]
Solution. To find the eigenvalues of the given matrix A using (1.60), we have

    | −6−λ     0      0  |
    |  11    −3−λ     0  |  =  (−6 − λ)(−3 − λ)(7 − λ) = 0,
    | −3       6    7−λ  |

since A − λI is lower triangular. The eigenvalues of A are therefore λ = −6, −3, and 7.
The MATLAB command eig is the basic eigenvalue and eigenvector routine. The command
>> D = eig(A);
returns a vector containing all the eigenvalues of the matrix A. If the eigenvectors are also wanted, then the command
>> [X, D] = eig(A);
will return a matrix X whose columns are eigenvectors of A corresponding to the eigenvalues in the diagonal matrix D. To get the results of Example 1.65, we use the MATLAB command window as follows:
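The session itself is not reproduced in the extracted text; a sketch of the following form gives the eigenpairs:

>> A = [-6 0 0; 11 -3 0; -3 6 7];
>> [X, D] = eig(A)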
The spectral radius of a matrix A may be found from the MATLAB command window as follows:
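One line suffices, since the spectral radius is the largest eigenvalue in absolute value (a sketch):

>> s = max(abs(eig(A)));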
For the 2 × 2 matrix A = [ a, b; c, d ] under discussion, solving det(TJ − λI) = 0 gives

    λ1 = −√(cb/(ad)),   λ2 = √(cb/(ad)),   and   λmax = √(cb/(ad)).
Similarly, we can find the Gauss-Seidel iteration matrix as follows:
    TG = −(L + D)⁻¹U,
and computing the right-hand side, we get
    TG = [ 0,  −b/a;  0,  cb/(ad) ].
To find the eigenvalues of the matrix TG, we proceed as follows:

    det(TG − µI) = det [ −µ,  −b/a;  0,  cb/(ad) − µ ] = 0,
it gives

    µ1 = 0,   µ2 = cb/(ad),   and   µmax = cb/(ad).
Thus

    µmax = cb/(ad) = ( √(cb/(ad)) )² = λ²max,
which is the required result. •
The necessary and sufficient condition for the convergence of the Jacobi iterative method and the Gauss-Seidel iterative method is given in the following theorem.
No general results exist to tell us which of the Jacobi and Gauss-Seidel methods to choose for an arbitrary linear system. However, the following theorem covers one special case.
Theorem 1.29 If aij ≤ 0 for each i ≠ j and aii > 0 for each i = 1, 2, . . . , n, then one and only one of the following statements holds:
1. 0 ≤ ρ(TG) < ρ(TJ) < 1.
2. 1 < ρ(TJ) < ρ(TG).
3. ρ(TJ) = ρ(TG) = 0.
4. ρ(TJ) = ρ(TG) = 1. •
Example 1.67 Find the spectral radii of the Jacobi and Gauss-Seidel iteration matrices for each of the following matrices:

         [  2   0  −1 ]            [  1  −1   1 ]
 (a) A = [ −1   3   0 ] ,  (b) A = [ −2   2  −1 ] ,
         [  0  −1   4 ]            [  0   1   5 ]

         [  1   0   0 ]            [ 1  0  −1 ]
 (c) A = [ −1   2   0 ] ,  (d) A = [ 1  1   0 ] .
         [  0  −1   3 ]            [ 0  1   1 ]
Solution. (a) The Jacobi iteration matrix TJ for the given matrix A can be obtained as
         [  0    0   1/2 ]
    TJ = [ 1/3   0    0  ] ,
         [  0   1/4   0  ]
with
ρ(TJ ) = 0.0000 and ρ(TG ) = 0.0000,
and

         [  0   0   1 ]          [ 0  0   1 ]
    TJ = [ −1   0   0 ] ,   TG = [ 0  0  −1 ] ,
         [  0  −1   0 ]          [ 0  0   1 ]

with ρ(TJ) = 1.0000 and ρ(TG) = 1.0000, respectively. •
Note that for many linear systems only one of the two methods, Jacobi or Gauss-Seidel, will converge.
Example 1.68 Show that for the following linear system the Jacobi method converges but the Gauss-Seidel method diverges:
         [ 1  2  −2 ]        [ 3 ]
 (a) A = [ 1  1   1 ] ,  b = [ 0 ] .
         [ 2  2   1 ]        [ 1 ]
Solution. The Jacobi iteration matrix TJ for the given matrix A can be obtained as
                        [  0  −2   2 ]
    TJ = −D⁻¹(L + U) =  [ −1   0  −1 ] ,
                        [ −2  −2   0 ]
which gives λ = 0, 0, 0, and so ρ(TJ) = 0. Thus the Jacobi method converges. To compute the Gauss-Seidel iteration matrix TG for the given matrix A, we proceed as follows:
                         [ 0  −2   2 ]
    TG = −(D + L)⁻¹U  =  [ 0   2  −3 ] .
                         [ 0   0   2 ]
Since TG is upper triangular, its eigenvalues are its diagonal entries, λ = 0, 2, 2, so ρ(TG) = 2 > 1, which means the Gauss-Seidel method diverges. •
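Both spectral radii can be checked numerically; the following sketch builds TJ and TG directly from the splitting A = L + D + U:

>> A = [1 2 -2; 1 1 1; 2 2 1];
>> D = diag(diag(A)); L = tril(A, -1); U = triu(A, 1);
>> max(abs(eig(-D\(L + U))))     % rho(TJ) = 0
>> max(abs(eig(-(D + L)\U)))     % rho(TG) = 2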
Example 1.69 Show that for the following linear system the Jacobi method diverges but the Gauss-Seidel method converges:
         [ 2  1  1 ]        [ 4 ]
 (a) A = [ 1  2  1 ] ,  b = [ 4 ] .
         [ 1  1  1 ]        [ 4 ]
Solution. The Jacobi iteration matrix TJ for the given matrix A can be obtained as
                        [   0   −1/2  −1/2 ]
    TJ = −D⁻¹(L + U) =  [ −1/2    0   −1/2 ] ,
                        [  −1    −1     0  ]
which gives λ = −1.2808, 0.5000, 0.7808, and so ρ(TJ) = 1.2808 > 1. Thus the Jacobi method does not converge. To compute the Gauss-Seidel iteration matrix TG for the given matrix A, we proceed as follows:
                        [ 0  −1/2  −1/2 ]
    TG = −(D + L)⁻¹U =  [ 0   1/4  −1/4 ] .
                        [ 0   1/4   3/4 ]
Solving the characteristic equation of TG, we obtain λ = 0, 0.5, 0.5, so ρ(TG) = 0.5 < 1, which means the Gauss-Seidel method converges. •
and it gives

    lim_{k→∞} (1/3)^k = 0   and   lim_{k→∞} k/3^(k+1) = 0.
Hence the given matrix A is convergent. •
Since the above matrix has the eigenvalue 1/3 of order two, its spectral radius is 1/3 < 1. This shows the important relation between the spectral radius of a matrix and the convergence of a matrix.
1. A is a convergent matrix.
2. lim_{n→∞} ‖Aⁿ‖ = 0, for some natural norm.
3. ρ(A) < 1.
which factorizes to

    (λ + 1)(λ − 3)(λ − 1)² = 0,

giving the eigenvalues 3, 1, 1, and −1 of the given matrix A. Hence the spectral radius of A is

    ρ(A) = max{|3|, |1|, |1|, |−1|} = 3 > 1. •
We will now discuss some very important results concerning eigenvalue problems. The proofs of these results are beyond the scope of this text and will be omitted; however, the results themselves are easily understood and applied.
Solving this cubic equation, the eigenvalues of A are −2, −1, and 1. Thus the spectral radius of A is ρ(A) = max{|−2|, |−1|, |1|} = 2.
Also
−2 1 0 −2 1 2 5 −2 −4
AT A = 1 0 1 1 0 0 = −2 2 2 ,
2 0 0 0 1 0 −4 2 4
which gives the eigenvalues 0.4174, 1, and 9.5826. Therefore, the spectral radius of AᵀA is 9.5826. Hence

    ‖A‖₂ = √(ρ(AᵀA)) = √9.5826 ≈ 3.0956.

From this we conclude that

    ρ(A) = 2 < 3.0956 ≈ ‖A‖₂.
One can also show that ρ(A) ≤ ‖A‖ for any natural matrix norm. The spectral norm of a matrix A may be found from the MATLAB command window as follows:
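In MATLAB, norm(A) returns the l2-norm (spectral norm) by default; a sketch for the matrix above:

>> A = [-2 1 2; 1 0 0; 0 1 0];
>> norm(A)     % equals sqrt(rho(A'*A)), approximately 3.0956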
    −λ³ + 4λ² + 9λ − 36 = 0.

Solving this cubic equation, we obtain the eigenvalues 4, −3, and 3 of the given matrix A. Therefore, the spectral radius of A is 4. Since A is symmetric, we have
                [ 10  0   6 ]
    AᵀA = A² =  [  0  9   0 ] .
                [  6  0  10 ]
Since we know that the eigenvalues of A² are the eigenvalues of A raised to the power 2, the eigenvalues of AᵀA are 16, 9, and 9, and its spectral radius is ρ(AᵀA) = ρ(A²) = [ρ(A)]² = 16. Hence

    ‖A‖₂ = √(ρ(AᵀA)) = √16 = 4 = ρ(A),

which satisfies Theorem 1.32. •
which gives the eigenvalues 17.96 and 0.04. The spectral radius of AᵀA is 17.96. Hence

    ‖A‖₂ = √(ρ(AᵀA)) = √17.96 ≈ 4.24.
which gives the eigenvalues 14.64 and 3.36 of (A⁻¹)ᵀ(A⁻¹), and therefore its spectral radius is 14.64. Hence

    ‖A⁻¹‖₂ = √(ρ((A⁻¹)ᵀ(A⁻¹))) = √14.64 ≈ 3.83.
Note that the eigenvalues of A are 3.73 and 0.27, and therefore its spectral radius is 3.73. Hence

    1/3.83 < |3.73| < 4.24,

which satisfies the above Theorem 1.33. •
which is equivalent to

    x(k+1) = Tω x(k) + c,    (1.65)

where

    Tω = (D + ωL)⁻¹[(1 − ω)D − ωU]   and   c = ω(D + ωL)⁻¹b,    (1.66)

are called the SOR iteration matrix and vector, respectively.
The quantity ω is called the relaxation factor. It can be formally proved that convergence can be obtained for values of ω in the range 0 < ω < 2. For ω = 1, the SOR method (1.62) is simply the Gauss-Seidel method. Methods of the form (1.62) are called relaxation methods. For choices of 0 < ω < 1, the procedures are called under-relaxation methods, and they can be used to obtain convergence for some systems that are not convergent by the Gauss-Seidel method. For choices 1 < ω < 2, the procedures are called over-relaxation methods, which can be used to accelerate the convergence for systems that are convergent by the Gauss-Seidel method. SOR methods are particularly useful for solving the linear systems that occur in the numerical solution of certain partial differential equations. A sketch of the iteration in code form is given below.
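The following MATLAB function is a minimal sketch of the SOR iteration (1.62); the function name, argument list, and stopping test are our assumptions (the book's own routine, SORM, appears later):

function [x, k] = sor(A, b, x0, w, tol, kmax)
% Minimal SOR sweep: each component is overwritten with the relaxed
% Gauss-Seidel update as soon as it is computed.
x = x0;
n = length(b);
for k = 1:kmax
    xold = x;
    for i = 1:n
        s = A(i, 1:i-1)*x(1:i-1) + A(i, i+1:n)*xold(i+1:n);
        x(i) = (1 - w)*xold(i) + w*(b(i) - s)/A(i, i);
    end
    if norm(x - xold, inf) < tol
        return
    end
end

For ω = 1 each update reduces to the Gauss-Seidel step, as noted above.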
Example 1.75 Find the l∞-norm of the SOR iteration matrix Tω when ω = 1.005, using the following matrix

    A = [ 5, −1; −1, 10 ].
Solution. The SOR iteration matrix is

    Tω = (D + ωL)⁻¹[(1 − ω)D − ωU],

where

    L = [ 0, 0; −1, 0 ],   U = [ 0, −1; 0, 0 ],   D = [ 5, 0; 0, 10 ].
Then

    Tω = ( [ 5, 0; 0, 10 ] + ω[ 0, 0; −1, 0 ] )⁻¹ ( (1 − ω)[ 5, 0; 0, 10 ] − ω[ 0, −1; 0, 0 ] ),

which, for ω = 1.005, is equal to

    Tω = [ 5, 0; −1.005, 10 ]⁻¹ [ −0.025, 1.005; 0, −0.05 ] = [ −0.005, 0.201; −0.0005, 0.0152 ].

Hence ‖Tω‖∞ = max{0.005 + 0.201, 0.0005 + 0.0152} = 0.206. •
Example 1.76 Solve the following system of linear equations taking the initial approximation x(0) = [0, 0, 0, 0]ᵀ and with ε = 10⁻⁴ in the l∞-norm:
2x1 + 8x2 = 1
5x1 − x2 + x3 = 2
−x1 + x2 + 4x3 + x4 = 12
x2 + x3 + 5x4 = 12
(a) using the Gauss-Seidel method, and (b) using the SOR method with ω = 0.33.
Solution. (a) Starting with the initial approximation x(0) = [0, 0, 0, 0]ᵀ, the Gauss-Seidel method for k = 0 gives

    x1^(1) = 0.5[1 − 8x2^(0)] = 0.5
    x2^(1) = −[2 − 5x1^(1) − x3^(0)] = 0.5
    x3^(1) = 0.25[12 + x1^(1) − x2^(1) − x4^(0)] = 3.0
    x4^(1) = 0.2[12 − x2^(1) − x3^(1)] = 1.7
Then the first and subsequent iterations are listed in Table 1.7.
(b) The SOR method with initial approximation x(0) = [0, 0, 0, 0]ᵀ, ω = 0.33, and k = 0 gives

    x1^(1) = (1 − 0.33)x1^(0) + 0.165[1 − 8x2^(0)] = 0.16500
    x2^(1) = (1 − 0.33)x2^(0) − 0.33[2 − 5x1^(1) − x3^(0)] = −0.387750
    x3^(1) = (1 − 0.33)x3^(0) + 0.0825[12 + x1^(1) − x2^(1) − x4^(0)] = 1.03560
    x4^(1) = (1 − 0.33)x4^(0) + 0.066[12 − x2^(1) − x3^(1)] = 0.74924
Then the first and subsequent iterations are listed in Table 1.8.
Notice that the Gauss-Seidel method diverged for the given system, while the SOR method converged, though very slowly. •
Table 1.9: Solution by the SOR method.

  k     x1^(k)      x2^(k)      x3^(k)      x4^(k)
  0    0.000000    0.000000    0.000000    0.000000
  1    2.540000    3.467100    5.418392    3.544321
  2   −0.347410    0.923809    3.319772    3.919978
  3    2.047182    1.422556    3.331152    3.811324
  ..      ..          ..          ..          ..
 15    0.999999    2.000000    3.000000    4.000000
 16    1.000000    2.000000    3.000000    4.000000

Table 1.10: Solution by the Gauss-Seidel method.

  k     x1^(k)      x2^(k)      x3^(k)      x4^(k)
  0    0.000000    0.000000    0.000000    0.000000
  1    2.000000    3.000000    4.500000    3.250000
  2    0.500000    1.500000    3.625000    3.687500
  3    1.250000    1.562500    3.375000    3.812500
  ..      ..          ..          ..          ..
 35    1.000000    1.999999    3.000000    4.000000
 36    1.000000    2.000000    3.000000    4.000000
Example 1.77 Solve the following system of linear equations using the SOR method, with ε = 0.5 × 10⁻⁶ in the l∞-norm:
2x1 + x2 = 4
x1 + 2x2 + x3 = 8
x2 + 2x3 + x4 = 12
x3 + 2x4 = 11
Start with initial approximation x(0) = [0, 0, 0, 0]T , and take ω = 1.27.
Solution. For the given system, the SOR method with ω = 1.27, starting with the initial approximation x(0) = [0, 0, 0, 0]ᵀ, gives for k = 0

    x1^(1) = (1 − 1.27)x1^(0) + 0.635[4 − x2^(0)] = 2.54
    x2^(1) = (1 − 1.27)x2^(0) + 0.635[8 − x1^(1) − x3^(0)] = 3.4671
    x3^(1) = (1 − 1.27)x3^(0) + 0.635[12 − x2^(1) − x4^(0)] = 5.418392
    x4^(1) = (1 − 1.27)x4^(0) + 0.635[11 − x3^(1)] = 3.544321.
Then the first and subsequent iterations are listed in Table 1.9.
We use the author-defined function SORM and the following MATLAB commands to get the above results:
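The commands are omitted in the extracted text; assuming SORM takes the matrix, right-hand side, starting vector, relaxation factor, and tolerance, the session would look roughly like this:

>> A = [2 1 0 0; 1 2 1 0; 0 1 2 1; 0 0 1 2];
>> b = [4; 8; 12; 11];
>> x = SORM(A, b, [0; 0; 0; 0], 1.27, 0.5e-6);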
We note that the SOR method converges, requiring 16 iterations to obtain what is obviously the correct solution of the given system. If we solve the same Example 1.77 using the Gauss-Seidel method, we find that it also converges, but very slowly: it needs 36 iterations to obtain the correct solution, as shown in Table 1.10, which is 20 iterations more than required by the SOR method. Also, if we solve the same example using the Jacobi method, we find that it needs 73 iterations to get the correct solution. Comparing the SOR method with the Gauss-Seidel method, a large reduction in the number of iterations can be achieved, given an efficient choice of ω.
In practice, ω should be chosen in the range 1 < ω < 2, but the precise choice of ω is a major problem. Finding the optimum value of ω depends on the particular problem (the size of the system of equations and the nature of the equations) and often requires careful experimentation. A detailed study of the
optimization of ω can be found in Isaacson and Keller (1966). The following theorems can be used
in certain situations for the convergence of the SOR method.
Theorem 1.34 If all the diagonal elements of a matrix A are nonzero, that is, aii ≠ 0 for each i = 1, 2, . . . , n, then

    ρ(Tω) ≥ |ω − 1|.

This implies that the SOR method can converge only if 0 < ω < 2. •
Theorem 1.35 If A is a positive definite matrix and 0 < ω < 2, then the SOR method converges for any choice of the initial approximation vector x(0) ∈ ℝⁿ. •
and the optimal choice of the relaxation factor ω for the SOR method is

    ω = 2/(1 + √(1 − [ρ(TJ)]²)),    (1.67)

where TG and TJ are the Gauss-Seidel and the Jacobi iteration matrices, respectively. With this choice of the relaxation factor ω, we have

    ρ(Tω) = ω − 1,
Example 1.78 Find the optimal choice for the relaxation factor ω by using the following matrix
        [  2  −1   0 ]
    A = [ −1   2  −1 ] .
        [  0  −1   2 ]
Solution. Since the given matrix A is positive definite and tridiagonal, we can use the above Theorem 1.36 to find the optimal choice of ω. Using the matrix A, we can find the Jacobi iteration matrix TJ as follows:

         [  0   1/2   0  ]
    TJ = [ 1/2   0   1/2 ] .
         [  0   1/2   0  ]
Now to find the spectral radius of the Jacobi iteration matrix TJ, we use the characteristic equation

    det(TJ − λI) = −λ³ + λ/2 = 0,

which gives the eigenvalues of the matrix TJ as λ = 0, ±1/√2. Thus ρ(TJ) = 1/√2 ≈ 0.7071, and the
optimal value of ω is
    ω = 2/(1 + √(1 − (0.7071)²)) = 1.1716.
Also, note that the Gauss-Seidel iteration matrix TG has the following form:

         [ 0   1/2   0  ]
    TG = [ 0   1/4  1/2 ] ,
         [ 0   1/8  1/4 ]
and its characteristic equation is

    det(TG − λI) = −λ³ + λ²/2 = 0.
Thus

    ρ(TG) = 1/2 = 0.5000 = (ρ(TJ))²,
which agrees with Theorem 1.36. •
Note that the optimal value of ω can also be found using (1.67) whenever the eigenvalues of the Jacobi iteration matrix TJ are real and 0 < ρ(TJ) < 1; a small helper routine is sketched below. •
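A helper in this spirit (a sketch; the function name is ours, and it presumes the hypotheses just stated):

function w = optimalOmega(A)
% Optimal SOR relaxation factor from (1.67). Assumes the eigenvalues of
% the Jacobi iteration matrix TJ are real with 0 < rho(TJ) < 1.
D = diag(diag(A));
rhoJ = max(abs(eig(-D\(A - D))));   % TJ = -D^(-1)(L + U), and A - D = L + U
w = 2/(1 + sqrt(1 - rhoJ^2));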
Example 1.79 Find the optimal choice for the relaxation factor ω by using the following matrix
        [  5  −1  −1  −1 ]
        [ −2   5  −1   0 ]
    A = [ −1  −1   5  −1 ] .
        [ −1  −1  −1   5 ]
Solution. Using the given matrix A, we can find the Jacobi iteration matrix TJ as follows
         [  0   1/5  1/5  1/5 ]
         [ 2/5   0   1/5   0  ]
    TJ = [ 1/5  1/5   0   1/5 ] .
         [ 1/5  1/5  1/5   0  ]
Now to find the spectral radius of the Jacobi iteration matrix TJ , we use the equation
det(TJ − λI) = 0,
and get the following polynomial equation:

    −λ⁴ + (6/25)λ² + (8/125)λ + 3/625 = 0,   or equivalently   (5λ − 3)(5λ + 1)³ = 0.
Solving the above polynomial equation, we obtain
    λ = 3/5, −1/5, −1/5, −1/5,
which are the eigenvalues of the matrix TJ . From this we get, ρ(TJ ) = 3/5 = 0.6, the spectral radius
of the matrix TJ .
Since the value of ρ(TJ) is less than 1, we can use formula (1.67) and get

    ω = 2/(1 + √(1 − (0.6)²)) = 2/1.8 = 1.1111,

the optimal value of ω. •
Since the rate of convergence of an iterative method depends on the spectral radius of the matrix associated with the method, one way to accelerate convergence is to choose the method whose associated matrix T has the minimal spectral radius.
Example 1.80 Compare the convergence of the Jacobi, Gauss-Seidel, and SOR iterative methods for the system of linear equations Ax = b, where the coefficient matrix A is given as

        [  4  −1   0   0 ]
        [ −1   4  −1   0 ]
    A = [  0  −1   4  −1 ] .
        [  0   0  −1   4 ]

Then use the fastest convergent method to find the second approximation x(2) for the solution of the system Ax = [1, 2, 2, 3]ᵀ, using the initial approximation x(0) = [0.5, 0.5, 0.5, 0.5]ᵀ.
Solution. The Jacobi iteration matrix is TJ = −D⁻¹(L + U).
Since

        [ 4  0  0  0 ]               [  0  −1   0   0 ]
    D = [ 0  4  0  0 ]   and  L+U =  [ −1   0  −1   0 ] ,
        [ 0  0  4  0 ]               [  0  −1   0  −1 ]
        [ 0  0  0  4 ]               [  0   0  −1   0 ]

therefore

                        [  0   1/4   0    0  ]
    TJ = −D⁻¹(L + U) =  [ 1/4   0   1/4   0  ] ,
                        [  0   1/4   0   1/4 ]
                        [  0    0   1/4   0  ]
which gives the characteristic equation λ⁴ − 0.1875λ² + 1/256 = 0. Solving this fourth-degree polynomial equation, we get the eigenvalues λ = −0.4045, −0.1545, 0.1545, 0.4045 of the matrix TJ. The spectral radius of TJ is ρ(TJ) = 0.4045 < 1, which shows that the Jacobi method will converge for the given linear system.
Since the given matrix is positive definite and tridiagonal, by Theorem 1.36 we can compute the spectral radius of the Gauss-Seidel iteration matrix from the spectral radius of the Jacobi iteration matrix, that is,

    ρ(TG) = [ρ(TJ)]² = (0.4045)² = 0.1636 < 1,

which shows that the Gauss-Seidel method will also converge, and faster than the Jacobi method. Also, from Theorem 1.36, we have

    ρ(Tω) = ω − 1.
Now to find the spectral radius of the SOR iteration matrix Tω, we first calculate the optimal value of ω using

    ω = 2/(1 + √(1 − [ρ(TJ)]²)) = 2/(1 + √(1 − (0.4045)²)) = 1.045.
Using this optimal value of ω, we can compute the spectral radius of the SOR iteration matrix Tω
as follows
ρ(Tω ) = ω − 1 = 1.045 − 1 = 0.045 < 1,
which shows that the SOR method will also converge for the given system. Since

    ρ(Tω) = 0.045 < ρ(TG) = 0.1636 < ρ(TJ) = 0.4045,

the SOR method will converge faster than the Jacobi and Gauss-Seidel methods.
For the given system, using the SOR method with ω = 1.045 and the initial approximation x(0) = [0.5, 0.5, 0.5, 0.5]ᵀ gives

    x(1) = [0.3694, 0.7271, 0.8208, 0.9756]ᵀ,   x(2) = [0.4346, 0.8177, 0.9541, 0.9891]ᵀ,

the required approximation to the exact solution x = [0.4641, 0.8565, 0.9617, 0.9904]ᵀ. •
r = b − Ax∗ , (1.68)
which is called the residual vector and can be easily calculated. The quantity

    ‖r‖/‖b‖ = ‖b − Ax∗‖/‖b‖

is called the relative residual.
The smallness of the residual then provides a measure of the goodness of the approximate solution
x∗ . If every component of vector r vanishes, then x∗ is the exact solution. If x∗ is a good
approximation then we would expect each component of r to be small, at least in a relative sense.
For example, the following linear system
x1 + 2x2 = 3
1.0001x1 + 2x2 = 3.0001
has the exact solution x = [1, 1]T but has a poor approximate solution x∗ = [3, 0]T . To see how
good this solution is, we compute the residual, r = [0, −0.0002]T , and so krk∞ = 0.0002. Although
the norm of the residual vector is small, the approximate solution x∗ = [3, 0]T is obviously quite
poor; in fact kx − x∗ k∞ = 2.
We use an author-defined function and the following MATLAB commands to get the above results:
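The function itself is not named in the extracted text; the residual can be checked directly (a sketch):

>> A = [1 2; 1.0001 2]; b = [3; 3.0001];
>> xs = [3; 0];          % the poor approximate solution
>> r = b - A*xs          % r = [0; -0.0002]
>> norm(r, inf)          % 2.0000e-04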
Judging from the residual alone, one might conclude that the approximate solution is accurate to about three decimal places, which is clearly not the case here.
Intuitively it would seem reasonable to assume that when krk is small for a given vector norm, then
the error kx − x∗ k would be small as well. In fact this is true for some systems. However, there are
systems of equations which do not satisfy this property. Such systems are said to be ill-conditioned.
x1 + 3x2 = 4
0.3x1 + x2 = 1.33
The solution of this system is x1 = 0.1, x2 = 1.3. If we take 0.333 as the approximate value of 1/3 (in place of the coefficient 0.3), then the system becomes

    x1 + 3x2 = 4
    0.333x1 + x2 = 1.33

whose solution is x1 = 10, x2 = −2. The system

    x1 + 3x2 = 4
    0.3333x1 + x2 = 1.33

gives the solution x1 = 100, x2 = −32.
So all these systems of equations "look" ill-conditioned (unstable, or ill-posed), because a small change in the coefficient matrix results in a large change in the solution vector.
Consider another linear system
x1 + 2x2 = 4
2x1 + 3x2 = 7
The exact solution is easily verified to be x1 = 2, x2 = 1. Now make a small change in the right
hand side vector of the equations
x1 + 2x2 = 4.001
2x1 + 3x2 = 7.001
gives the solution x1 = 1.999, x2 = 1.001. Make a small change in the coefficient matrix of the
equations
1.001x1 + 2.001x2 = 4
2.001x1 + 3.001x2 = 7
gives the solution x1 = 2.003, x2 = 0.997.
So this system of equations "looks" well-conditioned (stable, or well-posed), because small changes in the coefficient matrix or the right-hand side result in small changes in the solution vector.
In practical problems we can expect the coefficients in the system to be subject to small errors,
either because of round-off or because of physical measurement. If the system is ill-conditioned,
the resulting solution may be grossly in error. Errors of this type, unlike those caused by round-off error accumulation, cannot be avoided by careful programming.
Note that for ill-conditioned systems the residual is not necessarily a good measure of the accuracy of a solution. How then can we tell when a system is ill-conditioned? In the following we discuss some possible indicators of an ill-conditioned system.
The number kAkkA−1 k is called the condition number of a nonsingular matrix A and is denoted by
K(A), that is
cond(A) = K(A) = kAkkA−1 k. (1.69)
Note that the condition number K(A) for A depends on the matrix norm used and can, for some matrices, vary considerably as the matrix norm is changed. Since

    1 = ‖I‖ = ‖AA⁻¹‖ ≤ ‖A‖‖A⁻¹‖ = K(A),

the condition number is always in the range 1 ≤ K(A) ≤ ∞, regardless of the natural norm used. The lower limit is attained for identity matrices, and K(A) = ∞ if A is singular. So the
matrix A is well-behaved (well-conditioned) if K(A) is close to 1 and is increasingly ill-conditioned
when K(A) is significantly greater than 1, that is, K(A) → ∞. But, how large does K(A) have to
be before a system is regarded as ill-conditioned? There is no clear threshold. However, to assess
the effects of ill-conditioning, a rule of thumb can be used. For a system with condition number
K(A), expect a loss of roughly log₁₀ K(A) decimal places in the accuracy of the solution. For the first system above, the matrix and its inverse are

    A = [ 1, 2; 2, 3.999 ],   A⁻¹ = [ −3999, 2000; 2000, −1000 ],

with K(A) = ‖A‖∞‖A⁻¹‖∞ = 5.999 × 5999 = 35988, so we expect to lose about 4 decimal places (log₁₀(35988) ≈ 4.556) in accuracy. On the other hand, for the second system above, the matrix and its inverse are

    A = [ 1, 2; 2, 3 ],   A⁻¹ = [ −3, 2; 2, −1 ],

with K(A) = ‖A‖∞‖A⁻¹‖∞ = 5 × 5 = 25, so we expect to lose only about 1 decimal place (log₁₀ 25 ≈ 1.4).
1.7.1.1 Scaling
Large condition numbers can also arise from equations that are in need of scaling. Consider the
following coefficient matrix which corresponds to one ”regular” equation and one ”large” equation.
For example,
" # " #
1 −1 0.5 0.0005
A= , A−1 = , K(A) = kAk∞ kA−1 k∞ = 2000 × 0.5005 = 1001.
1000 1000 −0.5 0.0005
Scaling (called Equilibration) can be used to reduce the condition number for a system that is
poorly scaled. If each row of A is scaled by its largest element, then the new A and its inverse
become
" # " #
1 −1 −1 0.5 0.5
A= , A = , K(A) = kAk∞ kA−1 k∞ = 2 × 1 = 2,
1 1 −0.5 0.5
• When A and B are square matrices, the inequality cond(AB) ≤ cond(A)cond(B) is true for
every matrix norm.
Example 1.81 Compute the condition number of the following matrix using the l∞ -norm
        [  2  −1   0 ]
    A = [  2  −4  −1 ] .
        [ −1   0   2 ]
Then

    ‖A‖∞ = 7   and   ‖A⁻¹‖∞ = 11/13,   so that   K(A) = ‖A‖∞‖A⁻¹‖∞ = 7(11/13) ≈ 5.9231.
Depending on the application, we might consider this number to be reasonably small and conclude
that the given matrix A is reasonably well-conditioned. •
Example 1.82 If the condition number of the following matrix A is 8.8671, find the l∞-norm of its inverse matrix, that is, ‖A⁻¹‖∞:

        [ 10.2   2.4   4.5 ]
    A = [ −2.3   7.7  11.1 ] .
        [ −5.5  −3.2   0.9 ]
First we calculate the l∞-norm of the given matrix A, which is the maximum of the absolute row sums:

    ‖A‖∞ = max{17.1000, 21.1000, 9.6000} = 21.1000.

Since it is given that K(A) = 8.8671, we have

    8.8671 = (21.1000)‖A⁻¹‖∞,   so   ‖A⁻¹‖∞ = 8.8671/21.1000 ≈ 0.4202. •
We might think that if the determinant of a matrix is close to zero, then the matrix is ill-conditioned.
However, this is false. Consider the following matrix
    A = [ 10⁻⁷, 0; 0, 10⁻⁷ ],

for which det A = 10⁻¹⁴ ≈ 0. One can easily find the condition number of the given matrix:

    K(A) = ‖A‖∞‖A⁻¹‖∞ = 10⁻⁷ × 10⁷ = 1.

The matrix A is therefore perfectly conditioned. Thus a small determinant is not sufficient evidence that a matrix is ill-conditioned.
The condition number K(A) of a matrix in the l2-norm can be computed with the built-in MATLAB function cond as follows:
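For example (a sketch, using the ill-conditioned matrix above):

>> A = [1 2; 2 3.999];
>> K2 = cond(A)          % l2-norm condition number (the default)
>> Kinf = cond(A, inf)   % l-infinity version, for comparison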
Since

    Ax − Ax∗ = b − (b − r) = r,

we have x − x∗ = A⁻¹r, and taking norms gives

    ‖x − x∗‖ ≤ ‖A⁻¹‖‖r‖.    (1.70)

Also, from Ax = b we have

    ‖b‖ ≤ ‖A‖‖x‖,   or   ‖x‖ ≥ ‖b‖/‖A‖.

Hence

    ‖x − x∗‖/‖x‖ ≤ ‖A⁻¹‖‖r‖/(‖b‖/‖A‖) = K(A)‖r‖/‖b‖.    (1.71)
The inequalities (1.70) and (1.71) imply that the quantities kA−1 k and K(A) can be used to give
an indication of the connection between the residual vector and the accuracy of the approximation.
If the quantity K(A) ≈ 1, the relative error will be fairly close to the relative residual. But if K(A) ≫ 1, then the relative error could be many times larger than the relative residual. •
Example 1.83 Find the condition number of the following matrix (for n = 2, 3, . . .)
" #
1 1
An = .
1 1 − 1/n
If n = 2 and x∗ = [−1.99, 2.99]T be the approximate solution of the linear system Ax = [1, −0.5]T ,
then find the relative error.
Since ‖An‖∞ = 2 and ‖An⁻¹‖∞ = 2n, we get K(An) = 4n, which shows that the matrix An is ill-conditioned for large n. Here we expect that the relative error in the calculated solution of a linear system of the form An x = b could be as much as 4n times the relative residual. The residual vector (taking n = 2) can be calculated as
    r = b − A₂x∗ = [ 1; −0.5 ] − [ 1, 1; 1, 0.5 ][ −1.99; 2.99 ] = [ 0.000; −0.005 ],
and so

    ‖x − x∗‖/‖x‖ ≤ K(A)‖r‖/‖b‖ = (8)(0.005/1) = 0.0400,
x1 + x2 − x3 = 1
x1 + 2x2 − 2x3 = 0
−2x1 + x2 + x3 = −1
Then
which shows that the matrix is ill-conditioned. Thus the given system is ill-conditioned.
    ‖x − x∗‖/‖x‖ ≤ (22.5)(0.07/1) = 1.575.
Case 1.1 Suppose that the right-hand side term b is replaced by b + δb, where δb is an error in
b. If x + δx is the solution corresponding to the right-hand side b + δb, then we have
A(x + δx) = (b + δb), (1.74)
which implies that
Ax + Aδx = b + δb, or Aδx = δb.
Multiplying by A⁻¹ and taking norms gives

    ‖δx‖ = ‖A⁻¹δb‖ ≤ ‖A⁻¹‖‖δb‖.    (1.75)

The change ‖δx‖ in the solution is bounded by ‖A⁻¹‖ times the change ‖δb‖ in the right-hand side.
The conditioning of the linear system is connected with the ratio between the relative error ‖δx‖/‖x‖ and the relative change ‖δb‖/‖b‖ in the right-hand side:

    (‖δx‖/‖x‖)/(‖δb‖/‖b‖) = (‖A⁻¹δb‖/‖x‖)/(‖δb‖/‖Ax‖) = ‖Ax‖‖A⁻¹δb‖/(‖x‖‖δb‖) ≤ ‖A‖‖A⁻¹‖,
Case 1.2 Suppose that the matrix A is replaced by A + δA, where δA is an error in A, while the right-hand side term b is unchanged. If x + δx is the solution corresponding to the matrix A + δA, then we have
If the product ‖A⁻¹‖‖δA‖ is much smaller than 1, the denominator in (1.78) is near 1. Consequently, when ‖A⁻¹‖‖δA‖ is much smaller than 1, (1.78) implies that the relative change in the solution is bounded by the condition number of the matrix times the relative change in the coefficient matrix.
Case 1.3 Suppose that there are changes in the coefficient matrix A and the right-hand side term b together, and if x + δx is the solution corresponding to the coefficient matrix A + δA and the right-hand side b + δb, then we have
then it follows (see Fröberg 1969) that the matrix (I + A⁻¹δA) is nonsingular and

    ‖(I + A⁻¹δA)⁻¹‖ ≤ 1/(1 − ‖A⁻¹δA‖) ≤ 1/(1 − ‖A⁻¹‖‖δA‖).    (1.82)
    ‖δx‖ ≤ [ ‖A⁻¹‖/(1 − ‖A⁻¹‖‖δA‖) ][ ‖δb‖ + ‖x‖‖δA‖ ],
The estimate (1.85) shows that small relative changes in A and b cause small relative changes in the solution x of the linear system (1.73) provided that the quantity

    K(A)/(1 − K(A)‖δA‖/‖A‖)    (1.86)

is not too large.
Using the values of both matrix norms, we can find the value of the condition number of A as follows:

    K(A) = ‖A‖∞‖A⁻¹‖∞ = 5 × 5 = 25.
(b) Since the change from b to b1 is an error δb, that is, δb = b1 − b = [−0.01, 0, 0]ᵀ, the l∞-norm of this column vector is ‖δb‖∞ = 0.01. From equation (1.76), we get

    ‖δx‖/‖x‖ ≤ 25(0.01)/2 = 0.1250,

the possible relative change in the solution of the given linear system.
(c) Since the change from A to A1 is an error δA, that is, A1 = A + δA, we have

         [ −0.01  0  0 ]
    δA = [   0    0  0 ] ,
         [   0    0  0 ]

and the l∞-norm of this matrix is ‖δA‖∞ = 0.01. From equation (1.78), we get

    ‖δx‖/‖x‖ ≤ 25(0.01/5)/(1 − 5(0.01)) = 0.0526,

the possible relative change in the solution of the given linear system.
(d) Since we know from parts (b) and (c) that the changes in b and A are

         [ −0.01 ]              [ −0.01  0  0 ]
    δb = [   0   ]   and   δA = [   0    0  0 ] ,
         [   0   ]              [   0    0  0 ]
Ax = b, (1.87)
x = x(1) + y. The correction y satisfies

    Ay = r = b − Ax(1),    (1.88)
where r is the residual. The system (1.88) can now be solved to give the correction y to the approximation x(1). Thus the new approximation x(2) = x(1) + y will be closer to the solution than x(1). If necessary, we compute the new residual r = b − Ax(2) and solve the system (1.88) again to get a new correction. Normally two or three iterations are enough to obtain an accurate solution. This iterative method can be used to obtain an improved solution whenever an approximate solution has been obtained by any means; a code sketch is given below.
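A sketch of one refinement step in MATLAB, factoring A once with lu and reusing the factors for the correction:

>> [Lf, Uf, P] = lu(A);        % factor A once
>> x1 = Uf\(Lf\(P*b));         % approximate solution
>> r  = b - A*x1;              % residual
>> y  = Uf\(Lf\(P*r));         % correction from A*y = r
>> x2 = x1 + y;                % improved approximation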
1.01x1 + 0.99x2 = 2
0.99x1 + 1.01x2 = 2
has the exact solution x = [1, 1]T . The approximate solution due to the Gaussian elimination
method is x(1) = [1.01, 1.01]T , and residual r(1) = [−0.02, −0.02]T . Then the solution to the system
Ay = r(1) ,
using the simple Gaussian elimination method is y(1) = [−0.01, −0.01]ᵀ. So the new approximation is x(2) = x(1) + y(1) = [1, 1]ᵀ, which equals the exact solution after just one iteration. •
1.8 Applications
In this section we discuss applications of linear systems, tackling a variety of real-life problems from several areas of science.
1. Junctions: All the current flowing into a junction must flow out of it.
2. Paths: The sum of the IR terms (where I denotes current and R resistance) in any direction
around a closed path is equal to the total voltage in the path in that direction. •
Figure 1.3: (a) An electric network with 8-volt and 16-volt batteries, 1-ohm, 2-ohm, and 4-ohm resistances, and branch currents I1, I2, I3 through junctions A, B, C, D. (b) A network of one-way streets (Monroe, Adams, Hogan, and Laura Streets) with traffic flows x1, . . . , x7 and external flows of 100 to 1000 vph.
Example 1.88 Consider the electric network of Figure 1.3(a). Let us determine the currents
through each branch of this network.
Solution. The batteries are 8 volts and 16 volts. The resistances are one 1-ohm, one 4-ohm, and
two 2-ohm. The current entering each battery will be the same as that leaving it.
Let the currents in the various branches of the given circuit be I1, I2, and I3. Kirchhoff's laws refer to junctions and closed paths. There are two junctions in this circuit, namely the points B and D, and three closed paths, namely ABDA, CBDC, and ABCDA. We apply the laws to the junctions and paths.
Junctions
Junction B : I1 + I2 = I3
Junction D : I3 = I1 + I2
Path ABCDA in fact leads to an equation that is a combination of the other two path equations and gives no new information, so we need not consider it further. We now have a system of three linear equations in three unknowns. The problem thus reduces to solving the following system in the three variables I1, I2, and I3:
I1 + I2 − I3 = 0
4I1 + I3 = 8
4I2 + I3 = 16
Solve this system for I1 , I2 and I3 using the Gauss elimination method:
    [ 1  1  −1 |  0 ]     [ 1   1  −1 |  0 ]     [ 1   1  −1 |  0 ]
    [ 4  0   1 |  8 ]  ≡  [ 0  −4   5 |  8 ]  ≡  [ 0  −4   5 |  8 ] .
    [ 0  4   1 | 16 ]     [ 0   4   1 | 16 ]     [ 0   0   6 | 24 ]
Now using backward substitution to get the solution of the system, I1 = 1, I2 = 3, I3 = 4. The
units are amps. The solution is unique, as is to be expected in this physical situation. •
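The currents can be confirmed with MATLAB's backslash operator (a sketch):

>> A = [1 1 -1; 4 0 1; 0 4 1];
>> b = [0; 8; 16];
>> I = A\b     % gives I1 = 1, I2 = 3, I3 = 4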
Traffic Flow
Network analysis, as we saw in the previous discussion, plays an important role in electrical engi-
neering. In recent years, the concepts and tools of network analysis have been found to be useful
in many other fields, such as information theory and the study of transportation systems. The
following analysis of traffic flow through a road network during peak period illustrates how systems
of linear equations with many solutions can arise in practice.
Consider the typical road network of Figure 1.3(b). It represents an area of downtown Jacksonville,
Florida. The streets are all one-way with the arrows indicating the direction of traffic flow. The
flow of traffic in and out of the network is measured in vehicles per hour (vph). The figures given
here are based on midweek peak traffic hours, 7 A.M. to 9 A.M. and 4 P.M. to 6 P.M. An increase
of 2 percent in the overall flow should be allowed for during the Friday evening traffic flow. Let us construct a mathematical model that can be used to analyze this network. Let the traffic flows along the various branches be x1, . . . , x7, as shown in Figure 1.3(b).
This conservation-of-flow constraint (compare it to the first of Kirchhoff's laws for electrical networks) leads to a system of linear equations.
The Gauss-Jordan elimination method is used to solve this system of equations. Observe that the
augmented matrix contains many zeros. These zeros greatly reduce the amount of computation
136 1.5 Applications
involved. In practice, networks are much larger than the one we have illustrated here, and the systems of linear equations that describe them are thus much larger. Such systems are solved on a computer; however, the augmented matrices of all such systems contain many zeros.
Solve this system for x1 , x2 , . . . , x7 using the Gauss-Jordan elimination method:
    [ 1   0   0   0   1   0   0 | 600 ]     [ 1  0  0  0  0   1  −1 |  600 ]
    [ 1  −1   0   0   0   1   0 | 100 ]     [ 0  1  0  0  0   0  −1 |  500 ]
    [ 0   1   0   0   0   0  −1 | 500 ]  ≡  [ 0  0  1  0  0   0  −1 | −200 ] .
    [ 0   0  −1   0   0   0   1 | 200 ]     [ 0  0  0  1  0   1  −1 |  600 ]
    [ 0  −1   0   0   0   0   1 | 800 ]     [ 0  0  0  0  1  −1   1 |    0 ]
    [ 0   0   0   1   1   0   0 | 600 ]     [ 0  0  0  0  0   0   0 |    0 ]
x1 = −x6 + x7 + 600
x2 = x7 + 500
x3 = x7 − 200
x4 = −x6 + x7 + 600
x5 = x6 − x7
As was perhaps to be expected, the system of equations has many solutions: there are many possible traffic flows. One does have a certain amount of choice at intersections.
Let us now use this mathematical model to arrive at information. Suppose it becomes necessary to
perform road work on the stretch of Adams Street between Laura and Hogan. It is desirable to have
as small a flow of traffic as possible along this stretch of road. The flows can be controlled along
various branches by means of traffic lights at the junctions. What is the minimum flow possible along Adams that would not lead to traffic congestion? What are the flows along the other branches when this minimum is attained? Our model will enable us to answer these questions.
Minimizing the flow along Adams corresponds to minimizing x7. Since all traffic flows must be greater than or equal to zero, the third equation implies that the minimum value of x7 is 200, for otherwise x3 would become negative. (A negative flow would be interpreted as traffic moving in the direction opposite to the one permitted on a one-way street.) Thus the road work must allow for a flow of at least 200 cars per hour on the branch CD in the peak period.
Let us now examine what the flows in the other branches will be when this minimum flow along Adams is attained. Setting x7 = 200 gives

    x1 = −x6 + 800
    x2 = 700
    x3 = 0
    x4 = −x6 + 800
    x5 = x6 − 200.
Since x7 = 200 implies that x3 = 0 and vice versa, we see that the minimum flow in the branch x7 can be attained by making x3 = 0, that is, by closing branch DE to traffic. •
Figure 1.4: A plate with four interior points x1, x2 (upper row) and x3, x4 (lower row); the boundary temperatures are 1 along the top and bottom edges, 0 along the left edge, and 2 along the right edge.
The temperature at any interior point is the average of the temperatures of its neighboring points.
Suppose, for simplicity, we have only four interior points with unknown temperatures x1 , x2 , x3 , x4
and 12 boundary points (not named) with temperatures indicated in Figure 1.4.
Example 1.89 Compute the unknown temperatures x1, x2, x3, x4 using Figure 1.4.
The problem thus reduces to solving the following system of four linear equations in four variables
x1 , x2 , x3 and x4 .
4x1 − x2 − x3 = 1
−x1 + 4x2 − x4 = 3
−x1 + 4x3 − x4 = 1
− x2 − x3 + 4x4 = 3
Solve this system for x1 , x2 , x3 and x4 using the Gauss elimination method:
    [  4  −1  −1   0 | 1 ]            [ 4  −1     −1       0    |   1   ]
    [ −1   4   0  −1 | 3 ]            [ 0  15/4   −1/4    −1    |  13/4 ]
    [ −1   0   4  −1 | 1 ]  ≡ ··· ≡   [ 0   0     56/15  −16/15 | 22/15 ] .
    [  0  −1  −1   4 | 3 ]            [ 0   0      0      24/7  |  30/7 ]

Now using backward substitution, we obtain the solution x1 = 0.75, x2 = 1.25, x3 = 0.75, x4 = 1.25. •
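As a check, the same temperatures come from MATLAB's backslash operator (a sketch):

>> A = [4 -1 -1 0; -1 4 0 -1; -1 0 4 -1; 0 -1 -1 4];
>> b = [1; 3; 1; 3];
>> x = A\b     % gives [0.75; 1.25; 0.75; 1.25]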
It takes three different ingredients, A, B and C, to produce a certain chemical substance. A, B and
C have to be dissolved in water separately before they interact to form the chemical. The solu-
tion containing A at 2.5g per cubic centimeter (g/cm3 ) combined with the solution containing B at
4.2g/cm3 combined with the solution containing C at 5.6g/cm3 makes 26.50g of the chemical. If the
proportions for A, B, C in these solutions are changed to 3.4, 4.7, and 2.8 g/cm³, respectively (while the volumes remain the same), then 22.86 g of the chemical is produced. Finally, if the proportions are changed to 3.6, 6.1, and 3.7 g/cm³, respectively, then 29.12 g of the chemical is produced. What are the volumes in cubic centimeters of the solutions containing A, B, and C?
Solution. Let x, y, z be the volumes in cubic centimeters of the solutions containing A, B, and C, respectively. Then 2.5x is the mass of A in the first case, 4.2y is the mass of B, and 5.6z is the mass of C. Added together, the three masses should be 26.50, so 2.5x + 4.2y + 5.6z = 26.50. The same reasoning applies to the other two cases, and we get the system
Solve this system for x, y and z using the Gauss elimination method:
    [ 2.5  4.2  5.6 | 26.50 ]     [ 2.5    4.2     5.6    |  26.50 ]
    [ 3.4  4.7  2.8 | 22.86 ]  ≡  [  0   −1.012  −4.816   | −13.18 ] .
    [ 3.6  6.1  3.7 | 29.12 ]     [  0     0     −4.612   | −9.717 ]
Using backward substitution, we get the solution x = 0.847, y = 2.996, z = 2.107. The volumes of the solutions containing A, B, and C are 0.847 cm³, 2.996 cm³, and 2.107 cm³, respectively. •
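The same volumes can be checked with backslash (a sketch):

>> A = [2.5 4.2 5.6; 3.4 4.7 2.8; 3.6 6.1 3.7];
>> b = [26.50; 22.86; 29.12];
>> v = A\b     % gives approximately [0.847; 2.996; 2.107]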