Least squares

Václav Hlaváč
Czech Technical University in Prague
hlavac@fel.cvut.cz
http://cmp.felk.cvut.cz/~hlavac

Courtesy: Fred Pighin and J.P. Lewis, SIGGRAPH 2007 Course

Outline

• Linear regression
• Geometry of least squares
• Discussion of the Gauss-Markov theorem

One-dimensional regression

Find the line that represents the "best" linear relationship between the
measurements a and b:

    b ≈ a x

• Problem: the data (a_i, b_i) do not lie exactly on one line; each point
  leaves a residual

    e_i = b_i - a_i x,    with  b_i ≈ a_i x

• Find the line that minimizes the sum of squared residuals

    Σ_i (b_i - a_i x)^2

• We are looking for the x̂ that minimizes

    e(x) = Σ_i (b_i - a_i x)^2
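
A minimal NumPy sketch of this one-parameter fit, with made-up data values
purely for illustration: setting de/dx = 0 gives the closed form
x̂ = (Σ_i a_i b_i) / (Σ_i a_i^2).

```python
import numpy as np

# Made-up 1-D data (a_i, b_i), roughly following b ≈ 1·a.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 3.2, 3.9])

# Closed-form minimizer of e(x) = sum_i (b_i - a_i x)^2.
x_hat = (a @ b) / (a @ a)
print(x_hat, np.sum((b - a * x_hat) ** 2))  # fitted slope and minimal error e(x_hat)
```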

Least squares example

There are 3 mountains u, y, z that from one site have been measured as
2474 m, 3882 m, and 4834 m. But from u, y looks 1422 m taller and z looks
2354 m taller, and from y, z looks 950 m taller. Set up the overdetermined
system:

         [  1  0  0 ]           [ 2474 ]
         [  0  1  0 ]   [ u ]   [ 3882 ]
    Ax = [  0  0  1 ] · [ y ] ≈ [ 4834 ] = b
         [ -1  1  0 ]   [ z ]   [ 1422 ]
         [ -1  0  1 ]           [ 2354 ]
         [  0 -1  1 ]           [  950 ]

We want to minimize ||Ax - b||^2.
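
A sketch of this system in NumPy, assuming np.linalg.lstsq as the solver
(it returns the minimizer of ||Ax - b||^2):

```python
import numpy as np

# Overdetermined system from the mountain example.
A = np.array([[ 1,  0, 0],
              [ 0,  1, 0],
              [ 0,  0, 1],
              [-1,  1, 0],
              [-1,  0, 1],
              [ 0, -1, 1]], dtype=float)
b = np.array([2474, 3882, 4834, 1422, 2354, 950], dtype=float)

x_hat, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(x_hat)  # heights (u, y, z), approximately (2472, 3886, 4832)
```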


Approaches to solve Ax ≈ b

• Normal equations - quick and dirty
• QR - the standard in libraries, uses an orthogonal decomposition
• SVD - a decomposition which also indicates how linearly independent the
  columns are
• Conjugate gradient - no decompositions, good for large sparse problems
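
A sketch comparing three of these approaches on the mountain system
(conjugate gradient is omitted; it pays off only for large sparse problems).
All three should agree here.

```python
import numpy as np

A = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [-1, 1, 0], [-1, 0, 1], [0, -1, 1]], dtype=float)
b = np.array([2474, 3882, 4834, 1422, 2354, 950], dtype=float)

# 1) Normal equations: solve A^T A x = A^T b directly ("quick and dirty").
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# 2) QR: A = QR, then solve the triangular system R x = Q^T b.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

# 3) SVD: small singular values in s reveal nearly dependent columns.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((U.T @ b) / s)

print(x_ne, x_qr, x_svd)
```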

Matrix notation

Using the notation

    a = [a_1, ..., a_n]^T   and   b = [b_1, ..., b_n]^T

we can rewrite the error function using linear algebra as:

    e(x) = Σ_i (b_i - a_i x)^2
         = (b - x a)^T (b - x a)
         = ||b - x a||^2

Multidimensional linear regression

Using a model with m parameters

    b ≈ a_1 x_1 + ... + a_m x_m = Σ_j a_j x_j

and n measurements, the error becomes

    e(x) = Σ_{i=1..n} ( b_i - Σ_{j=1..m} a_{i,j} x_j )^2
         = ||b - Ax||^2

Matrix representation

             [ b_1 ]   [ a_{1,1} .. a_{1,m} ]   [ x_1 ]
    b - Ax = [  :  ] - [    :    .     :    ] · [  :  ]
             [ b_n ]   [ a_{n,1} .. a_{n,m} ]   [ x_m ]

             [ b_1 - (a_{1,1} x_1 + ... + a_{1,m} x_m) ]
           = [                   :                     ]
             [ b_n - (a_{n,1} x_1 + ... + a_{n,m} x_m) ]

    (row i corresponds to measurement i, column j of A to parameter j)
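
A small sketch, with random data purely for illustration, checking that the
matrix form ||b - Ax||^2 equals the explicit double sum:

```python
import numpy as np

def e(A, b, x):
    r = b - A @ x          # residual vector b - Ax
    return float(r @ r)    # ||b - Ax||^2

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))   # n = 5 measurements, m = 3 parameters
b = rng.normal(size=5)
x = rng.normal(size=3)

explicit = sum((b[i] - sum(A[i, j] * x[j] for j in range(3))) ** 2
               for i in range(5))
print(np.isclose(e(A, b, x), explicit))  # True
```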

Minimizing e(x)

• e(x) is flat at x_min:

    ∇e(x_min) = 0

• x_min minimizes e(x) if e(x) does not go down around x_min:

    H_e(x_min) is positive semi-definite

Positive semi-definite

A is positive semi-definite  ⇔  x^T A x ≥ 0 for all x

(illustrated by examples in 1-D and in 2-D)

Minimizing e(x) = ||b - Ax||^2

Since e is quadratic, around a stationary point x̂ it behaves like

    e(x) ≈ e(x̂) + (1/2) (x - x̂)^T H_e(x̂) (x - x̂)

Setting the gradient to zero gives

    A^T A x̂ = A^T b        (the normal equation)

x̂ minimizes e(x) if the Hessian 2 A^T A is positive semi-definite,
which is always true.
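
A sketch, on random data, of why the normal equation delivers a minimizer:
the Hessian 2 A^T A has no negative eigenvalues, and solving A^T A x = A^T b
matches np.linalg.lstsq.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)

H = 2 * A.T @ A
print(np.all(np.linalg.eigvalsh(H) >= -1e-12))   # positive semi-definite

x_hat = np.linalg.solve(A.T @ A, A.T @ b)        # normal equation
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```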

Geometric interpretation

• b is a vector in R^n
• The columns of A define a vector space range(A)
• Ax is an arbitrary vector in range(A):

    x_1 a_1 + x_2 a_2 + ... = Ax

  and the error vector is b - Ax

• Ax̂ is the orthogonal projection of b onto range(A):

    A^T (b - Ax̂) = 0   ⇒   A^T A x̂ = A^T b
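
A sketch of the projection picture on random data: the residual b - Ax̂ is
orthogonal to every column of A.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 2))
b = rng.normal(size=6)

x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
residual = b - A @ x_hat
print(np.allclose(A.T @ residual, 0))  # True: A^T (b - A x_hat) = 0
```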

The normal equation: A^T A x̂ = A^T b

• Existence: A^T A x̂ = A^T b always has a solution
• Uniqueness: the solution is unique if the columns of A are linearly
  independent
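
A small sketch of the uniqueness condition: a rank check distinguishes a
matrix with independent columns from one whose third column is the sum of
the first two.

```python
import numpy as np

A_full = np.array([[1., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 1.],
                   [1., 1., 1.]])
A_def = np.column_stack([A_full[:, 0], A_full[:, 1],
                         A_full[:, 0] + A_full[:, 1]])

print(np.linalg.matrix_rank(A_full))  # 3 -> unique least-squares solution
print(np.linalg.matrix_rank(A_def))   # 2 -> infinitely many minimizers
```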

Under-constrained problem

• Poorly selected data
• One or more of the parameters are redundant

Remedy: add a constraint and take the minimum-norm solution

    A^T A x = A^T b   with   min_x ||x||
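
A sketch of the minimum-norm remedy, assuming the pseudoinverse
(np.linalg.pinv) as the tool: among all minimizers of ||Ax - b||^2 it
returns the one with the smallest ||x||.

```python
import numpy as np

A = np.array([[1., 1., 2.],
              [0., 1., 1.],
              [1., 0., 1.]])       # third column = first + second (rank 2)
b = np.array([3., 2., 1.])

x_min_norm = np.linalg.pinv(A) @ b
print(x_min_norm, np.linalg.norm(A @ x_min_norm - b))  # smallest-norm minimizer
```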

How good is the least-squares estimate?

• Optimality: the Gauss-Markov theorem

Let {b_i} and {x_j} be two sets of random variables and define

    e_i = b_i - (a_{i,1} x_1 + ... + a_{i,m} x_m)

If
    A1: the a_{i,j} are not random variables,
    A2: E(e_i) = 0 for all i,
    A3: var(e_i) = σ^2 for all i,
    A4: cov(e_i, e_j) = 0 for all i ≠ j,

then x̂ = arg min_x Σ_i e_i^2 is the best linear unbiased estimator.
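
A Monte Carlo sketch of the Gauss-Markov setting, with made-up A, x_true and
noise level: under assumptions A1-A4 the average of many least-squares
estimates stays close to the true parameters (unbiasedness).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(20, 3))             # A1: fixed, non-random design
x_true = np.array([1.0, -2.0, 0.5])
sigma = 0.3

estimates = []
for _ in range(2000):
    e = sigma * rng.normal(size=20)      # A2-A4: zero-mean, equal variance, uncorrelated
    b = A @ x_true + e
    estimates.append(np.linalg.lstsq(A, b, rcond=None)[0])

print(np.mean(estimates, axis=0))        # close to x_true
```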


When do these assumptions hold? (figures)

• no errors in a_i  vs.  errors in a_i
• homogeneous errors  vs.  non-homogeneous errors
• no outliers  vs.  outliers
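
A small sketch of the last panel, with made-up data: one gross outlier is
enough to drag the least-squares slope, because squaring amplifies large
residuals.

```python
import numpy as np

rng = np.random.default_rng(4)
a = np.arange(1.0, 9.0)
b = 2.0 * a + 0.05 * rng.normal(size=8)   # clean data, slope about 2

x_clean = (a @ b) / (a @ a)               # 1-D least-squares slope

b_out = b.copy()
b_out[-1] += 30.0                         # a single gross outlier
x_out = (a @ b_out) / (a @ a)             # slope pulled noticeably upward

print(x_clean, x_out)
```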
