Lecture 3
Linear Algebra
CSE 803– Fall 2024, MSU
Xiaoming Liu
Slides credit: D Fouhey
This Week – Math
Two goals for the next two classes:
• Math with computers ≠ Math
• Practical math you need to know but
may not have been taught
This Week – Goal
Adding Numbers
• 1 + 1 = ?
• Suppose $x_i$ is normally distributed with mean $\mu$ and standard deviation $\sigma$ for $1 \le i \le N$
• How is the average $\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i$ distributed (qualitatively), in terms of variance?
• $\hat{\mu}$ has mean $\mu$ and standard deviation $\sigma/\sqrt{N}$
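A quick NumPy check of the $\sigma/\sqrt{N}$ claim (a sketch; the batch size and sample counts are arbitrary choices of mine):

```python
import numpy as np

# Empirical check: the mean of N samples has standard deviation sigma/sqrt(N).
rng = np.random.default_rng(0)
mu, sigma, N = 0.0, 1.0, 100
# Draw many batches of N samples and look at the spread of their means
means = rng.normal(mu, sigma, size=(10000, N)).mean(axis=1)
print(means.std())   # close to sigma / sqrt(N) = 0.1
```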
Let’s Make It Big
Trying it Out
What’s a Number?
$2^7$ $2^6$ $2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$
 1    0    1    1    1    0    0    1    = 185
128 + 32 + 16 + 8 + 1 = 185
Adding Two Numbers
$2^8$ $2^7$ $2^6$ $2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$
      1    0    1    1    1    0    0    1    = 185
      0    1    1    0    1    0    0    1    = 105
 1    0    0    1    0    0    0    1    0    = 34 (carry flag set)
“Integers” on a computer are integers modulo $2^k$
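The wraparound above can be reproduced with NumPy's fixed-width integers (a minimal sketch; the variable names are my own):

```python
import numpy as np

# "Integers" on a computer are integers modulo 2^k: uint8 addition wraps mod 256.
a = np.array([185], dtype=np.uint8)
b = np.array([105], dtype=np.uint8)
result = a + b   # (185 + 105) % 256 = 34
```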
Some Gotchas
32 + (3 / 4) × 40 = 32 — Why?
32 + (3 × 40) / 4 = 62

With integer division, 3/4 truncates (underflows) to 0; 3 × 40 does not:
32 + 3 / 4 × 40 = 32 + 0 × 40 = 32 + 0 = 32
32 + 3 × 40 / 4 = 32 + 120 / 4 = 32 + 30 = 62

OK – you have to multiply before dividing
Some Gotchas
Should be: 9 × 4 = 36
math:  32 + (9 × 40) / 10 = 68
uint8: 32 + (9 × 40) / 10 = 42 (overflow)

Why 42? In uint8, 9 × 40 overflows:
9 × 40 = 360, and 360 % 256 = 104
32 + 104 / 10 = 32 + 10 = 42
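The same gotcha, checked in NumPy (a sketch; names are my own):

```python
import numpy as np

# Exact math: 32 + (9 * 40) // 10 = 32 + 36 = 68
exact = 32 + (9 * 40) // 10

# uint8 math: 9 * 40 = 360 wraps to 360 % 256 = 104
a = np.array([9], dtype=np.uint8)
b = np.array([40], dtype=np.uint8)
prod = a * b                   # wraps to 104
wrapped = 32 + prod // 10      # 32 + 10 = 42
```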
What’s a Number?
$2^7$ $2^6$ $2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$
 1    0    1    1    1    0    0    1    = 185

How can we do fractions? Move the binary point:

$2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$ . $2^{-1}$ $2^{-2}$
 1    0    1    1    1    0    .    0     1    = 46.25
(integer part 101110 = 46, fractional part .01 = 0.25)
Fixed-Point Arithmetic
$2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$ . $2^{-1}$ $2^{-2}$
 1    0    1    1    1    0    .    0     1    = 46.25

What's the largest number we can represent? 63.75 – Why?
How precisely can we measure at 63? 0.25
How precisely can we measure at 0? 0.25
Fine for many purposes, but for science it seems silly
Floating Point
Sign (S) | Exponent (E) | Fraction (F)
   1     |   0 1 1 1    |   0 0 1
S = 1 → −1;  E = 7 → $2^{7-7} = 2^0 = 1$;  F = 1 → 1 + 1/8 = 1.125

Value = $(-1)^{S} \cdot 2^{E-\text{bias}} \cdot \left(1 + \frac{F}{2^3}\right)$

Bias: allows the exponent to be negative. Note: fraction = significand = mantissa;
exponents of all ones or all zeros are special numbers
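The formula can be sketched as a decoder for this 8-bit minifloat (1 sign, 4 exponent, 3 fraction bits, bias 7; a hypothetical helper of mine that ignores the special all-zero/all-one exponents):

```python
# Decode an 8-bit minifloat: value = (-1)^S * 2^(E - 7) * (1 + F/8)
def decode_minifloat(bits):
    s = (bits >> 7) & 0x1    # 1 sign bit
    e = (bits >> 3) & 0xF    # 4 exponent bits
    f = bits & 0x7           # 3 fraction bits
    return (-1) ** s * 2.0 ** (e - 7) * (1 + f / 8)

value = decode_minifloat(0b10111001)   # S=1, E=7, F=1 -> -1.125
```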
Floating Point
Sign = 1, Exponent = 0111 (7 − 7 = 0 after subtracting the bias):
Fraction 000 (0/8): $-2^0 \times 1.000 = -1$
Fraction 001 (1/8): $-2^0 \times 1.125 = -1.125$
Fraction 010 (2/8): $-2^0 \times 1.25 = -1.25$
…
Fraction 110 (6/8): $-2^0 \times 1.75 = -1.75$
Fraction 111 (7/8): $-2^0 \times 1.875 = -1.875$
Floating Point
Sign = 1, Exponent = 1001 (9 − 7 = 2):
Fraction 000 (0/8): $-2^2 \times 1.000 = -4$
Fraction 001 (1/8): $-2^2 \times 1.125 = -4.5$
Fraction 010 (2/8): $-2^2 \times 1.25 = -5$
…
Fraction 110 (6/8): $-2^2 \times 1.75 = -7$
Fraction 111 (7/8): $-2^2 \times 1.875 = -7.5$
Floating Point
Sign = 1, Exponent = 0111 ($2^0$):
Fraction 000: $-2^0 \times 1.000 = -1$
Fraction 001: $-2^0 \times 1.125 = -1.125$
Revisiting Adding Numbers
Sign | Exponent | Fraction
 1   |  0100    |  000   → $-2^{-3} \times 1.00 = -0.125$
 1   |  1001    |  000   → $-2^{2} \times 1.00 = -4$
Revisiting Adding Numbers
Sign | Exponent | Fraction
 1   |  0100    |  000   → $-2^{-3} \times 1.00 = -0.125$
 1   |  1001    |  000   → $-2^{2} \times 1.00 = -4$
The true sum is −4.125, but the nearest representable value is −4: the small addend is lost.
https://en.wikipedia.org/wiki/Minifloat
Revisiting Adding Numbers
IEEE 754 Single Precision / Single
S (1 bit) | Exponent (8 bits) | Fraction (23 bits)
Largest values ≈ $2^{127} \approx 10^{38}$; precision ≈ 7 decimal digits
Trying it Out
Roundoff error occurs: once a + b = a, the numerator is stuck, but the denominator isn't.
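The "a + b = a" absorption can be seen directly in float32 (a minimal sketch, assuming NumPy):

```python
import numpy as np

# float32 has a 24-bit significand, so 2^24 + 1 is not representable:
# adding 1 to 2^24 rounds right back to 2^24, and a running sum gets stuck.
a = np.float32(2**24)
b = np.float32(1.0)
absorbed = (a + b == a)   # True
```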
Take-homes
Vectors
Vectors
$\mathbf{x} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, so $x_1 = 2$, $x_2 = 3$
Scaling Vectors
Adding Vectors
• Can add vectors
• Dimensions changed independently
• Order irrelevant
• Can change direction and magnitude
x = [2,3], y = [3,1] → x + y = [5,4]
Scaling and Adding
x = [2,3], y = [3,1] → 2x + y = [7,7]
Measuring Length
Magnitude / length / (L2) norm of a vector:

$\|\mathbf{x}\| = \|\mathbf{x}\|_2 = \left(\sum_i x_i^2\right)^{1/2}$

There are other norms; assume L2 unless told otherwise.

x = [2,3]: $\|\mathbf{x}\|_2 = \sqrt{13}$
y = [3,1]: $\|\mathbf{y}\|_2 = \sqrt{10}$
Why?
Normalizing a Vector
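Normalizing just divides a vector by its L2 norm (a minimal sketch, assuming NumPy):

```python
import numpy as np

# Divide a vector by its L2 norm to get a unit vector in the same direction.
x = np.array([2.0, 3.0])
x_hat = x / np.linalg.norm(x)   # now ||x_hat||_2 = 1
```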
Dot Products
$\mathbf{x} \cdot \mathbf{y} = \sum_i x_i y_i = \mathbf{x}^T \mathbf{y}$

$\mathbf{x} \cdot \mathbf{y} = \cos(\theta)\, \|\mathbf{x}\|\, \|\mathbf{y}\|$

What happens with normalized / unit vectors?
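Both identities in code, using the running example vectors (a sketch; names are my own):

```python
import numpy as np

# Dot product two ways, and the angle it encodes: x.y = cos(theta) ||x|| ||y||
x = np.array([2.0, 3.0])
y = np.array([3.0, 1.0])
dot = np.dot(x, y)                                    # 2*3 + 3*1 = 9
cos_theta = dot / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(cos_theta)                          # angle in radians
```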
Special Angles
$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 1 \cdot 0 + 0 \cdot 1 = 0$

Perpendicular / orthogonal vectors have dot product 0 irrespective of their magnitude.
Special Angles
$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \cdot \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = x_1 y_1 + x_2 y_2 = 0$

Perpendicular / orthogonal vectors have dot product 0 irrespective of their magnitude.
Orthogonal Vectors
𝒙 = [2,3]
• Geometrically, what's the set of vectors that are orthogonal to x?
• A line: multiples of [3,−2]
Orthogonal Vectors
• What's the set of vectors that are orthogonal to x = [5,0,0]?
• A plane / 2D space of vectors: any vector $[0, a, b]$
Cross Product
• The set $\{\mathbf{z} : \mathbf{z}\cdot\mathbf{x} = 0,\ \mathbf{z}\cdot\mathbf{y} = 0\}$ has an ambiguity in sign and magnitude
• The cross product $\mathbf{x}\times\mathbf{y}$ is: (1) orthogonal to x and y, (2) has sign given by the right-hand rule, and (3) has magnitude given by the area of the parallelogram of x and y
• Important: if x and y point in the same direction, or either is 0, then $\mathbf{x}\times\mathbf{y} = \mathbf{0}$
• Only in 3D!
Image credit: Wikipedia.org
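The three properties above can be checked with `np.cross` (a sketch; names are my own):

```python
import numpy as np

# Cross product (3D only): orthogonal to both inputs, sign from the right-hand rule
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
z = np.cross(x, y)          # [0, 0, 1]
# Parallel (or zero) inputs give the zero vector
parallel = np.cross(x, 2 * x)
```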
Operations You Should Know
Matrices
Horizontally concatenate n m-dimensional column vectors and you get an m×n matrix A (here 2×3)
Matrices
Transpose: flip rows / columns:
$\begin{bmatrix} a \\ b \\ c \end{bmatrix}^T = \begin{bmatrix} a & b & c \end{bmatrix}$, i.e. $(3\times1)^T = 1\times3$

Vertically concatenate m n-dimensional row vectors and you get an m×n matrix A (here 2×3):

$A = \begin{bmatrix} -\ \mathbf{u}_1^T\ - \\ \vdots \\ -\ \mathbf{u}_m^T\ - \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ u_{21} & u_{22} & u_{23} \end{bmatrix}$
Matrix-Vector Product
$\mathbf{y}_{2\times1} = \mathbf{A}_{2\times3}\, \mathbf{x}_{3\times1}$

$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} \mathbf{v_1} & \mathbf{v_2} & \mathbf{v_3} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

$\mathbf{y} = x_1 \mathbf{v_1} + x_2 \mathbf{v_2} + x_3 \mathbf{v_3}$

Linear combination of columns of A
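The column view can be verified numerically (a sketch; the matrix entries are arbitrary choices of mine):

```python
import numpy as np

# A @ x is a linear combination of the columns of A, weighted by the entries of x
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # 2x3: columns v1, v2, v3
x = np.array([1.0, 0.5, -1.0])
lhs = A @ x
rhs = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]
```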
Matrix-Vector Product
$\mathbf{y}_{2\times1} = \mathbf{A}_{2\times3}\, \mathbf{x}_{3\times1}$

$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} -\ \mathbf{u_1}^T\ - \\ -\ \mathbf{u_2}^T\ - \end{bmatrix} \mathbf{x}, \qquad y_1 = \mathbf{u_1}^T\mathbf{x}, \quad y_2 = \mathbf{u_2}^T\mathbf{x}$

Dot product between rows of A and x
Matrix Multiplication
Generally: $A_{m\times n}$ and $B_{n\times p}$ yield product $(AB)_{m\times p}$

$\mathbf{AB} = \begin{bmatrix} -\ \mathbf{a}_1^T\ - \\ \vdots \\ -\ \mathbf{a}_m^T\ - \end{bmatrix} \begin{bmatrix} | & & | \\ \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ | & & | \end{bmatrix}$

Yes – in A, I'm referring to the rows, and in B, I'm referring to the columns
Matrix Multiplication
Generally: $A_{m\times n}$ and $B_{n\times p}$ yield product $(AB)_{m\times p}$

$\mathbf{AB} = \begin{bmatrix} -\ \mathbf{a}_1^T\ - \\ \vdots \\ -\ \mathbf{a}_m^T\ - \end{bmatrix} \begin{bmatrix} | & & | \\ \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ | & & | \end{bmatrix} = \begin{bmatrix} \mathbf{a}_1^T\mathbf{b}_1 & \cdots & \mathbf{a}_1^T\mathbf{b}_p \\ \vdots & \ddots & \vdots \\ \mathbf{a}_m^T\mathbf{b}_1 & \cdots & \mathbf{a}_m^T\mathbf{b}_p \end{bmatrix}, \qquad (\mathbf{AB})_{ij} = \mathbf{a}_i^T \mathbf{b}_j$
Operations They Don’t Teach
You Probably Saw Matrix Addition
$\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}$

What is this? FYI: e is a scalar

$\begin{bmatrix} a & b \\ c & d \end{bmatrix} + e = \begin{bmatrix} a+e & b+e \\ c+e & d+e \end{bmatrix}$
Broadcasting
If you want to be pedantic and proper, you expand
e by multiplying a matrix of 1s (denoted 1)
$\begin{bmatrix} a & b \\ c & d \end{bmatrix} + e = \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \mathbf{1}_{2\times2}\, e = \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & e \\ e & e \end{bmatrix}$
Many smart matrix libraries do this automatically.
This is the source of many bugs.
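The "pedantic and proper" expansion and NumPy's automatic broadcasting agree (a sketch; the entries are arbitrary):

```python
import numpy as np

# Adding a scalar to a matrix broadcasts: e is implicitly expanded to a matrix of e's
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
e = 10.0
auto = M + e                       # automatic broadcasting
explicit = M + e * np.ones((2, 2)) # the pedantic expansion
```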
Broadcasting Example
Given: an n×2 matrix P and a 2D column vector v,
Want: the n×2 difference matrix D
$\mathbf{P} = \begin{bmatrix} x_1 & y_1 \\ \vdots & \vdots \\ x_n & y_n \end{bmatrix} \quad \mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix} \quad \mathbf{D} = \begin{bmatrix} x_1 - a & y_1 - b \\ \vdots & \vdots \\ x_n - a & y_n - b \end{bmatrix}$

$\mathbf{P} - \mathbf{v}^T = \begin{bmatrix} x_1 & y_1 \\ \vdots & \vdots \\ x_n & y_n \end{bmatrix} - \begin{bmatrix} a & b \\ \vdots & \vdots \\ a & b \end{bmatrix}$ (the replicated rows are assumed / broadcast)
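In NumPy this is one line (a sketch; the point values are arbitrary choices of mine):

```python
import numpy as np

# n x 2 difference matrix D = P - v^T via broadcasting (v.T is 1x2, stretched to n x 2)
P = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])      # n x 2 points
v = np.array([[0.5], [1.5]])    # 2D column vector [a, b]^T
D = P - v.T
```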
Two Uses for Matrices
Images as Matrices
Suppose someone hands you this matrix.
What’s wrong with it?
Contrast – Gamma curve
Typical way to change the contrast is to apply a nonlinear correction: $\text{pixelvalue}^{\gamma}$.
The quantity $\gamma$ controls how much contrast gets added.
Contrast – Gamma curve
Now the darkest regions (10th percentile) are much darker than the moderately dark regions (50th percentile).
(Figure: old vs. new 10th / 50th / 90th percentiles on the gamma curve)
Implementation
Python+Numpy (right way):
imNew = im**expFactor
Python+Numpy (slow way – why?):
imNew = np.zeros(im.shape)
for y in range(im.shape[0]):
    for x in range(im.shape[1]):
        imNew[y,x] = im[y,x]**expFactor
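The two versions compute the same thing; the loop is slow because every element passes through the Python interpreter instead of one vectorized C call (a runnable sketch with a small random image):

```python
import numpy as np

# Vectorized power vs. explicit double loop: identical results
expFactor = 4
im = np.random.default_rng(0).random((8, 8))
imFast = im**expFactor
imSlow = np.zeros(im.shape)
for y in range(im.shape[0]):
    for x in range(im.shape[1]):
        imSlow[y, x] = im[y, x]**expFactor
```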
Numerical
Linear Algebra
CSE 803– Fall 2024, MSU
Xiaoming Liu
Images as Matrices
Suppose someone hands you this matrix.
The contrast is wrong!
Results
Phew! Much Better.
Element-wise Operations
Element-wise power – beware notation
$\left(A^{\circ 2}\right)_{ij} = A_{ij}^{2}$

Note – libraries distinguish between an N-dimensional column vector and an N×1 matrix.
Vectorizing Example
Vectorizing Example
$\mathbf{X} = \begin{bmatrix} -\ \mathbf{x}_1^T\ - \\ \vdots \\ -\ \mathbf{x}_N^T\ - \end{bmatrix} \quad \mathbf{Y} = \begin{bmatrix} -\ \mathbf{y}_1^T\ - \\ \vdots \\ -\ \mathbf{y}_M^T\ - \end{bmatrix} \quad \mathbf{Y}^T = \begin{bmatrix} | & & | \\ \mathbf{y}_1 & \cdots & \mathbf{y}_M \\ | & & | \end{bmatrix}$

Compute an N×1 vector of squared norms (can also do M×1):

$\Sigma(\mathbf{X}^2, 1) = \begin{bmatrix} \|\mathbf{x}_1\|^2 \\ \vdots \\ \|\mathbf{x}_N\|^2 \end{bmatrix}$
Vectorizing Example
$\mathbf{D} = \left( \Sigma(\mathbf{X}^2, 1) + \Sigma(\mathbf{Y}^2, 1)^T - 2\mathbf{X}\mathbf{Y}^T \right)^{1/2}$

$\begin{bmatrix} \|\mathbf{x}_1\|^2 \\ \vdots \\ \|\mathbf{x}_N\|^2 \end{bmatrix} + \begin{bmatrix} \|\mathbf{y}_1\|^2 & \cdots & \|\mathbf{y}_M\|^2 \end{bmatrix} = \begin{bmatrix} \|\mathbf{x}_1\|^2 + \|\mathbf{y}_1\|^2 & \cdots & \|\mathbf{x}_1\|^2 + \|\mathbf{y}_M\|^2 \\ \vdots & \ddots & \vdots \\ \|\mathbf{x}_N\|^2 + \|\mathbf{y}_1\|^2 & \cdots & \|\mathbf{x}_N\|^2 + \|\mathbf{y}_M\|^2 \end{bmatrix}$ Why?

$\left( \Sigma(\mathbf{X}^2, 1) + \Sigma(\mathbf{Y}^2, 1)^T \right)_{ij} = \|\mathbf{x}_i\|^2 + \|\mathbf{y}_j\|^2$
Vectorizing Example
$\mathbf{D} = \left( \Sigma(\mathbf{X}^2, 1) + \Sigma(\mathbf{Y}^2, 1)^T - 2\mathbf{X}\mathbf{Y}^T \right)^{1/2}$

$\mathbf{D}_{ij}^2 = \|\mathbf{x}_i\|^2 + \|\mathbf{y}_j\|^2 - 2\mathbf{x}_i^T\mathbf{y}_j = \|\mathbf{x}_i - \mathbf{y}_j\|^2$

Numpy code:
XNorm = np.sum(X**2,axis=1,keepdims=True)
YNorm = np.sum(Y**2,axis=1,keepdims=True)
D = (XNorm+YNorm.T-2*np.dot(X,Y.T))**0.5
*May have to make sure this is at least 0 (sometimes roundoff issues happen)
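A self-contained version with the roundoff clamp, checked against the direct pairwise computation (a sketch; the random data is mine):

```python
import numpy as np

# Vectorized pairwise Euclidean distances between rows of X (N x d) and Y (M x d)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
Y = rng.standard_normal((4, 3))
XNorm = np.sum(X**2, axis=1, keepdims=True)          # N x 1
YNorm = np.sum(Y**2, axis=1, keepdims=True)          # M x 1
# Clamp at 0: roundoff can make tiny distances slightly negative before the sqrt
D = np.sqrt(np.maximum(XNorm + YNorm.T - 2 * X @ Y.T, 0))

# Direct (slower) computation for comparison
Dref = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
```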
Does it Make a Difference?
Linear Independence
A set of vectors is linearly independent if you can’t
write one as a linear combination of the others.
Suppose: $\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} \quad \mathbf{b} = \begin{bmatrix} 0 \\ 6 \\ 0 \end{bmatrix} \quad \mathbf{c} = \begin{bmatrix} 5 \\ 0 \\ 0 \end{bmatrix}$

$\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ 4 \end{bmatrix} = 2\mathbf{a} \qquad \mathbf{y} = \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix} = \frac{1}{2}\mathbf{a} - \frac{1}{3}\mathbf{b}$

• Is the set {a,b,c} linearly independent?
• Is the set {a,b,x} linearly independent?
• Max # of independent 3D vectors?
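One way to answer these questions numerically is matrix rank (a sketch; stacking the vectors as columns):

```python
import numpy as np

# Linear independence via rank: n vectors are independent iff the rank equals n
a = np.array([0.0, 0.0, 2.0])
b = np.array([0.0, 6.0, 0.0])
c = np.array([5.0, 0.0, 0.0])
x = 2 * a
rank_abc = np.linalg.matrix_rank(np.column_stack([a, b, c]))  # 3: independent
rank_abx = np.linalg.matrix_rank(np.column_stack([a, b, x]))  # 2: x = 2a is redundant
```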
Span
Span: all linear combinations of a set of vectors

Span({[0,2]}) = ? The vertical line through the origin: $\{\lambda[0,1] : \lambda \in \mathbb{R}\}$

Is blue in {red}'s span? (vectors shown in figure)
Matrix-Vector Product
$\mathbf{A}\mathbf{x} = \begin{bmatrix} | & & | \\ \mathbf{c_1} & \cdots & \mathbf{c_n} \\ | & & | \end{bmatrix} \mathbf{x}$

Right-multiplying A by x mixes the columns of A according to the entries of x
An Intuition
$\mathbf{y} = \mathbf{A}\mathbf{x} = \begin{bmatrix} | & | & | \\ \mathbf{c_1} & \mathbf{c_2} & \mathbf{c_3} \\ | & | & | \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

(Diagram: a machine turns knob settings $x_1, x_2, x_3$ into outputs $y_1, y_2, y_3$)

x – knobs on machine (e.g., fuel, brakes)
y – state of the world (e.g., where you are)
A – machine (e.g., your car)
Linear Independence
Suppose the columns of a 3×3 matrix A are not linearly independent ($\mathbf{c_1}, \alpha\mathbf{c_1}, \mathbf{c_2}$ for instance):

$\mathbf{y} = \mathbf{A}\mathbf{x} = \begin{bmatrix} | & | & | \\ \mathbf{c_1} & \alpha\mathbf{c_1} & \mathbf{c_2} \\ | & | & | \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

$\mathbf{y} = x_1\mathbf{c_1} + \alpha x_2 \mathbf{c_1} + x_3\mathbf{c_2} = (x_1 + \alpha x_2)\mathbf{c_1} + x_3\mathbf{c_2}$
Linear Independence Intuition
The knobs of x are redundant. Even if y has 3 outputs, you can only control it in two directions:

$\mathbf{y} = (x_1 + \alpha x_2)\mathbf{c_1} + x_3\mathbf{c_2}$
Linear Independence
Recall: $\mathbf{A}\mathbf{x} = (x_1 + \alpha x_2)\mathbf{c_1} + x_3\mathbf{c_2}$

$\mathbf{y} = \mathbf{A}\begin{bmatrix} x_1 + \beta \\ x_2 - \beta/\alpha \\ x_3 \end{bmatrix} = \left( x_1 + \beta + \alpha x_2 - \alpha\frac{\beta}{\alpha} \right)\mathbf{c_1} + x_3\mathbf{c_2}$

$\mathbf{y} = \mathbf{A}\begin{bmatrix} \beta \\ -\beta/\alpha \\ 0 \end{bmatrix} = \left( \beta - \alpha\frac{\beta}{\alpha} \right)\mathbf{c_1} + 0\,\mathbf{c_2} = \mathbf{0}$
Rank
Inverses
Symmetric Matrices
Suppose I have points in a grid
Now I apply f(x) = Ax to these points
Pointy end: Ax. Non-pointy end: x.
$\mathbf{A} = \begin{bmatrix} 1.1 & 0 \\ 0 & 1.1 \end{bmatrix}$
(Figure: building up the decomposition) $A = U\Sigma\cdots$: U is a rotation (the eigenvectors of $AA^T$); $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$ is a scale (the square roots of the eigenvalues of $A^TA$).
The Singular Value Decomposition
Can always write an m×n matrix A as: $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^T$
• U: rotation (the eigenvectors of $AA^T$)
• Σ: scale (the square roots of the eigenvalues of $A^TA$)
• $V^T$: rotation (the eigenvectors of $A^TA$)
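NumPy computes this directly; note it returns the singular values as a vector, so reconstruction needs `np.diag` (a sketch using the scaling matrix from the earlier slide):

```python
import numpy as np

# Any matrix factors as A = U diag(S) V^T
A = np.array([[1.1, 0.0],
              [0.0, 1.1]])
U, S, Vt = np.linalg.svd(A)
recon = U @ np.diag(S) @ Vt   # should reproduce A
```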
Solving Least-Squares
Start with two points $(x_i, y_i)$:
$\mathbf{y} = \mathbf{A}\mathbf{v}$

$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} m x_1 + b \\ m x_2 + b \end{bmatrix}$

We know how to solve this – invert A and find v (i.e., the (m,b) that fits the points)
Solving Least-Squares
Start with two points $(x_i, y_i)$:
$\mathbf{y} = \mathbf{A}\mathbf{v}$

$\|\mathbf{y} - \mathbf{A}\mathbf{v}\|^2 = \left\| \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} - \begin{bmatrix} m x_1 + b \\ m x_2 + b \end{bmatrix} \right\|^2 = \left( y_1 - (m x_1 + b) \right)^2 + \left( y_2 - (m x_2 + b) \right)^2$

The sum of squared differences between the actual value of y and what the model says y should be.
Solving Least-Squares
Suppose there are n > 2 points:
$\mathbf{y} = \mathbf{A}\mathbf{v}$

$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix}$

Compute $\|\mathbf{y} - \mathbf{A}\mathbf{v}\|^2$ again:

$\|\mathbf{y} - \mathbf{A}\mathbf{v}\|^2 = \sum_{i=1}^{n} \left( y_i - (m x_i + b) \right)^2$
Solving Least-Squares
Given y, A, and v with y = Av overdetermined (A tall / more equations than unknowns),
we want to minimize $\|\mathbf{y} - \mathbf{A}\mathbf{v}\|^2$, or find:

$\arg\min_{\mathbf{v}} \|\mathbf{y} - \mathbf{A}\mathbf{v}\|^2$
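This line-fitting setup can be solved with `np.linalg.lstsq` (a sketch; the true slope/intercept and noise level are arbitrary choices of mine):

```python
import numpy as np

# Fit y = m*x + b to noisy points by least squares: minimize ||y - A v||^2
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(50)   # true m=2, b=1
A = np.column_stack([x, np.ones_like(x)])            # tall n x 2 matrix
(m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
```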
Homogeneous Least-Squares
Given a set of unit vectors (aka directions) $\mathbf{x_1}, \ldots, \mathbf{x_n}$, I want the vector $\mathbf{v}$ that is as orthogonal to all the $\mathbf{x_i}$ as possible (for some definition of orthogonal).

Stack the $\mathbf{x_i}$ into A, compute Av:

$\mathbf{A}\mathbf{v} = \begin{bmatrix} -\ \mathbf{x_1}^T\ - \\ \vdots \\ -\ \mathbf{x_n}^T\ - \end{bmatrix} \mathbf{v} = \begin{bmatrix} \mathbf{x_1}^T\mathbf{v} \\ \vdots \\ \mathbf{x_n}^T\mathbf{v} \end{bmatrix}$ (each entry is 0 if orthogonal)

Compute $\|\mathbf{A}\mathbf{v}\|^2 = \sum_i \left( \mathbf{x_i}^T\mathbf{v} \right)^2$: the sum of how orthogonal v is to each x.
Homogeneous Least-Squares
Homogeneous Least-Squares
Let's look at $\|\mathbf{A}\mathbf{v}\|^2$:

$\|\mathbf{A}\mathbf{v}\|^2 = (\mathbf{A}\mathbf{v})^T(\mathbf{A}\mathbf{v})$  (rewrite as a dot product)
$\|\mathbf{A}\mathbf{v}\|^2 = \mathbf{v}^T\mathbf{A}^T\mathbf{A}\mathbf{v} = \mathbf{v}^T(\mathbf{A}^T\mathbf{A})\mathbf{v}$  (distribute the transpose)
Homogeneous Least-Squares
Ubiquitous tool in vision:

$\arg\min_{\|\mathbf{v}\| = 1} \|\mathbf{A}\mathbf{v}\|^2$

Remember derivatives?
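A standard way to solve this (not derived on these slides, so take it as a sketch): the minimizer over unit vectors is the right singular vector of A with the smallest singular value, i.e. the last row of `Vt` from the SVD. Random data here is my own:

```python
import numpy as np

# argmin over unit v of ||A v||^2: last right singular vector of A
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # rows are unit directions x_i
_, _, Vt = np.linalg.svd(X)
v = Vt[-1]                                      # unit vector minimizing ||X v||
```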
Given quadratic function f(x)
$f(x) = (x - 2)^2 + 5$

$f(x)$ is the function; $g(x) = f'(x)$, aka $g(x) = \frac{d}{dx} f(x)$
Given quadratic function f(x)
$f(x) = (x - 2)^2 + 5$

What's special about x = 2?
$f(x)$ is minimized at 2; $g(x) = 0$ at 2
a = minimum of f → $g(a) = 0$
Rates of change
$f(x) = (x - 2)^2 + 5$

Suppose I want to increase f(x) by changing x:
What Calculus Should I Know
Partial Derivatives
Taking a slice of
$f_2(x, y) = (x - 2)^2 + 5 + (y + 1)^2$

The slice at y = −1 is the function from before:
$f(x) = (x - 2)^2 + 5$
$f'(x) = 2(x - 2)$
Taking a slice of
$f_2(x, y) = (x - 2)^2 + 5 + (y + 1)^2$

$\frac{\partial}{\partial x} f_2(x, y)$ is the rate of change & direction in the x dimension
Zooming Out
$f_2(x, y) = (x - 2)^2 + 5 + (y + 1)^2$

$\frac{\partial}{\partial y} f_2(x, y) = 2(y + 1)$, and is the rate of change & direction in the y dimension
Zooming Out
$f_2(x, y) = (x - 2)^2 + 5 + (y + 1)^2$

Gradient/Jacobian: making a vector of the partials

$\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]$

gives the rate and direction of change.
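The partials above can be sanity-checked against finite differences (a sketch; the test point and step size are arbitrary choices of mine):

```python
import numpy as np

# Check the analytic gradient of f2(x,y) = (x-2)^2 + 5 + (y+1)^2
# against a central finite-difference approximation.
def f2(x, y):
    return (x - 2)**2 + 5 + (y + 1)**2

def grad_f2(x, y):
    return np.array([2 * (x - 2), 2 * (y + 1)])

x0, y0, h = 0.5, 3.0, 1e-6
num = np.array([(f2(x0 + h, y0) - f2(x0 - h, y0)) / (2 * h),
                (f2(x0, y0 + h) - f2(x0, y0 - h)) / (2 * h)])
```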
What Should I Know?