1 Review of Least Squares Solutions To Overdetermined Systems

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

cs412: introduction to numerical analysis 11/9/05

Lecture 17: Rectangular Systems and Numerical Integration

Instructor: Professor Amos Ron Scribes: Mark Cowlishaw

1 Review of Least Squares Solutions to Overdetermined Systems


Recall that in the last lecture we discussed the solution of overdetermined linear systems using the
least squares method. Recall that an overdetermined system is a linear system of equations

Am×n ~x = ~b (1)

where A is a matrix with m rows and n columns with m > n. As we discussed, there is likely no
exact solution to an overdetermined system like equation 1, so we look for the solution ~x with the
smallest error vector A~x − ~b, using some vector norm to determine the size of the error vector. In
the least squares method, we look for the error vector with the smallest 2-norm. More formally, we
find a vector x~∗ such that
° ° ° °
° ~∗ ~ ° ° °
°Ax − b° ≤ °A~x − ~b° for all vectors ~x (2)
2 2

Recall that there is a simple method for solving overdetermined systems using least squares, we
multiply on the left by the transpose of the matrix A:

A~x = ~b
(A0 A)~x = A0~b

This is called the normal equation. Note that this is not precisely how we solve least-squares
problems numerically, since

cond(A0 A) ∼ cond(A2 )

so that this new linear system as written may be ill-conditioned. However, it is sufficient to know
that there is an efficient method for solving overdetermined systems using least squares.

1.1 Least Squares Fitting of Functions


One important application of least squares solutions to overdetermined systems is in fitting a
function to a data set. We can informally state the problem as follows:

Problem 1.1 (Function Fitting). Given an interval [a, b] a function f : [a, b], and a parameter n,
find a polynomial p ∈ Πn such that p ≈ f .

We have seen how to solve this problem using interpolation, however, there are some important
reasons why interpolation might not be a good choice for fitting a function f :

1
• If we want to fit f at many points, interpolation may be inefficient
• There may be noise in the data, so, rather than fitting each data point exactly, we might get
better results by merely following the major trends in the data.

Returning to problem 1.1, given m+1 data points [t0 , t1 , . . . , tm ] and the corresponding function
values [f (t0 ), f (t1 ), . . . , f (tm )], we would like to fit this data using a polynomial pn ∈ Πn = a(1)tn +
a(2)tn−1 + · · · + a(n + 1). We can model this problem as an overdetermined linear system using
the Vandermonde approach:

   
tn0 tn−1
0 ... t00   f (t0 )
  a(1)  
 tn1 tn−1
1 ... t01  ..   f (t1 ) 
 .. .. .. ..  .  =  .. 
 . . . .   . 
a(n)
tnm tn−1
m ... t0m f (tm )
We can solve this system using the least squares method we just outlined. Note that, unlike
polynomial interpolation, we have two parameters to help us control the quality of the fit: the
number of points m + 1 and the degree of the polynomial n.
Finally, note that, if we have very noisy data, we will need a large number of points m to get
a good fit to the function. The error bound is roughly related to the square root of the number of
points, so that increasing the number of points by a factor of 4 would result in a reduction in the
error by a factor of 1/2.

2 Underdetermined Systems
An underdetermined system is a system of linear equations in which there are more unknowns than
constraints. That is, an underdetermined system has the form

Am×n ~xn×1 = ~bm×1

with m < n. For example, consider the system:


· ¸
x1
[2 3] = 5 (3)
x2
We can easily see that this system has infinitely many solutions defined by the line in 2-space:

2 · x1 + 3 · x2 = 5

The question is, how do we select one solution from this infinite set of solutions? Intuitively,
we would like the “smallest” solution, that is, the solution nearest the origin. For example, if we
are asked to solve the equation:

2 · x1 + 3 · x2 = 0

we have some intuition that [0 0] is the correct choice. Since the solution is a vector, we will
measure its “smallness” using some norm. As before, the least squares solution will select the
solution with the smallest 2-norm.

2
2.1 Least Squares Solution to Underdetermined Systems
To reiterate, we would like to solve the linear system

Am×n ~xn×1 = ~bm×1

with m < n, more formally, we would like to find a solution x~∗ with

° °
°x~∗ ° ≤ k~xk for all vectors ~x
2 2

As before, the least squares solution can be found using a simple algorithm. To find a solution,
we assume that the solution ~x can be written in the form

~x = A0~t

For some m × 1 vector ~t, we can then rewrite the system to form a new linear system with m
constraints and m unknowns:

Am×n ~xn×1 = ~bm×1


Am×n (A0n×m~tm×1 ) = ~bm×1
(AA0 )m×m~tm×1 ) = ~bm×1

For example, we can solve equation 3 as follows

· ¸
x1
[2 3] = [5]
x2
µ· ¸ ¶
2
[2 3] · [t] = [5]
3
µ · ¸¶
2
[2 3] · [t] = [5]
3
[13] [t] = [5]
5
t =
13
This ends our discussion of linear algebra.

3 Numerical Integration
Recall that the definition of the definite integral of a continuous (and we will assume positive)
Rb
function f (t) over the interval [a, b], a f (t)dt is the area under the curve, as shown in Figure 1.
What is the best way to calculate the definite integral? Our experience in math classes suggests
that we might use the anti-derivative, that is a function F with F 0 = f :

3
a b

Figure 1: The Definite Integral of f (t) over [a, b]

Z b
f (t)dt = F (b) − F (a) (4)
a
For example, we can calculate the integral of 1/t between 2 and 3 using anti-derivatives:

Z 3
1
dt = log 3 − log 2
2 t

However, how do we actually find the numerical value of log 3 − log 2? This brings up two
important reasons why anti-derivatives are not used (with a few exceptions) in the numerical
calculation of definite integrals:

1. There exist many functions that have no anti-derivative, or, the anti-derivative may exist for
a function, but we may not have the notation to express it, for example
Z
2
et dt

the key point is that this is the common case. We can only find anti-derivatives for a very small
number of functions, such as those functions that are popular as problems in mathematics
textbooks. Outside of math classes, these functions are rare.

2. Even if we can find an anti-derivative, calculating its value may be more difficult than calcu-
lating the value of the definite integral directly.

In the next few lectures, we shall discuss methods for directly calculating direct integrals by
calculating the area under the curve.

You might also like