An Introduction to Sparse Coding and
Dictionary Learning
Kai Cao
January 14, 2014
Outline
Introduction
Mathematical foundation
Sparse coding
Dictionary learning
Summary
Introduction
What is sparsity?
Sparsity implies many zeros in a vector or a matrix
[Figure: a fingerprint patch is FFT-transformed; the FFT response is a sparse representation, and an inverse FFT reconstructs the patch.]

Usage:
Compression
Analysis
Denoising
Sparse Representation

A signal $x$ is represented as $x \approx D\alpha$, where $D$ is a dictionary and $\alpha$ is a sparse code.
Dictionary learning problem: learn $D$ from data.
Sparse coding problem: given $D$, find the sparse code $\alpha$.
Application---Denoising

[Figure: noisy image (PSNR = 22.1 dB), the learned dictionary, and the denoised result (PSNR = 30.829 dB), shown next to the source image. M. Elad, Springer 2010.]
Application---Compression

[Figure: a face image compressed to 550 bytes per image with JPEG, JPEG 2000, PCA, and a dictionary-based coder. Bottom: RMSE values.]

RMSE at 550 bytes per image, for three test images:
JPEG: 15.81, 14.67, 15.30
JPEG 2000: 13.89, 12.41, 12.57
PCA: 10.66, 9.44, 10.27
Dictionary-based: 6.60, 5.49, 6.36

[O. Bryt, M. Elad, 2008]
Mathematical foundation

Derivatives of vectors

First order:
$$\frac{\partial\, a^T x}{\partial x} = \frac{\partial\, x^T a}{\partial x} = a$$

Second order:
$$\frac{\partial\, x^T B x}{\partial x} = (B + B^T)\, x$$

Exercise:
$$\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x - D\alpha\|_2^2 + \frac{\lambda}{2}\|\alpha\|_2^2, \qquad x \in \mathbb{R}^n,\ D \in \mathbb{R}^{n \times m}$$
Solution:
$$\alpha = (D^T D + \lambda I)^{-1} D^T x$$
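This closed-form solution is easy to sanity-check numerically. A minimal NumPy sketch (the sizes and λ here are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam = 20, 10, 0.5
D = rng.standard_normal((n, m))
x = rng.standard_normal(n)

# Closed-form minimizer of (1/2)||x - D a||_2^2 + (lam/2)||a||_2^2
alpha = np.linalg.solve(D.T @ D + lam * np.eye(m), D.T @ x)

# The gradient D^T(D a - x) + lam * a vanishes at the minimizer
grad = D.T @ (D @ alpha - x) + lam * alpha
print(np.abs(grad).max())  # ~1e-14
```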
Trace of a Matrix

Definition: for $A = (a_{ij}) \in \mathbb{R}^{n \times n}$,
$$\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{ii}$$

Properties:
$$\|A\|_F^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^2 = \mathrm{Tr}(A^T A)$$
$$\mathrm{Tr}(A) = \mathrm{Tr}(A^T)$$
$$\mathrm{Tr}(A + B) = \mathrm{Tr}(A) + \mathrm{Tr}(B), \qquad B \in \mathbb{R}^{n \times n}$$
$$\mathrm{Tr}(aA) = a\,\mathrm{Tr}(A), \qquad a \in \mathbb{R}$$
$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA), \qquad B \in \mathbb{R}^{n \times n}$$
$$\mathrm{Tr}(ABC) = \mathrm{Tr}(BCA) = \mathrm{Tr}(CAB), \qquad B, C \in \mathbb{R}^{n \times n}$$
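These identities are cheap to verify numerically; a small NumPy check (matrix sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((4, 4)) for _ in range(3))

# ||A||_F^2 = Tr(A^T A)
assert np.isclose(np.linalg.norm(A, "fro") ** 2, np.trace(A.T @ A))
# Invariance under transposition and the cyclic property
assert np.isclose(np.trace(A), np.trace(A.T))
assert np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A))
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```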
Derivatives of traces

First order:
$$\frac{\partial\, \mathrm{Tr}(XA)}{\partial X} = A^T, \qquad \frac{\partial\, \mathrm{Tr}(X^T A)}{\partial X} = A$$

Exercise:
$$\frac{\partial\, \mathrm{Tr}(X^T X A)}{\partial X} = X A^T + X A, \qquad \frac{\partial\, \mathrm{Tr}(X^T B X)}{\partial X} = B^T X + B X$$
$$\min_{A \in \mathbb{R}^{k \times m}} \|X - DA\|_F^2 + \lambda\|A\|_F^2, \qquad X \in \mathbb{R}^{n \times m},\ D \in \mathbb{R}^{n \times k}$$
Solution:
$$A = (D^T D + \lambda I)^{-1} D^T X$$
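The matrix-valued exercise can be checked the same way; a minimal NumPy sketch (again with arbitrary sizes), confirming the gradient vanishes at the closed-form solution:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m, lam = 20, 8, 12, 0.3
D = rng.standard_normal((n, k))
X = rng.standard_normal((n, m))

# Closed-form minimizer of ||X - D A||_F^2 + lam ||A||_F^2
A = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ X)

# Gradient 2 D^T (D A - X) + 2 lam A vanishes at the minimizer
grad = 2 * D.T @ (D @ A - X) + 2 * lam * A
print(np.abs(grad).max())  # ~1e-14
```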
Sparse coding
Sparse linear model

Let $x \in \mathbb{R}^n$ be a signal.
Let $D = [d_1, d_2, \ldots, d_m] \in \mathbb{R}^{n \times m}$ be a set of normalized ($d_i^T d_i = 1$) basis vectors, the dictionary.
Sparse representation seeks a sparse vector $\alpha \in \mathbb{R}^m$ such that $x \approx D\alpha$, where $\alpha$ is regarded as the sparse code.
The sparse coding model

Objective function:
$$\min_{\alpha \in \mathbb{R}^m} \underbrace{\frac{1}{2}\|x - D\alpha\|_2^2}_{\text{data fitting term}} + \underbrace{\lambda\,\psi(\alpha)}_{\text{regularization term}}$$

The regularization term $\psi(\alpha)$ can be
the $\ell_2$ norm: $\|\alpha\|_2^2 = \sum_{i=1}^{m} \alpha_i^2$;
the $\ell_0$ pseudo-norm: $\|\alpha\|_0 = \#\{i \mid \alpha_i \neq 0\}$ (sparsity inducing);
the $\ell_1$ norm: $\|\alpha\|_1 = \sum_{i=1}^{m} |\alpha_i|$ (sparsity inducing).
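For concreteness, the three regularizers evaluated on a small vector (NumPy):

```python
import numpy as np

alpha = np.array([0.0, 0.7, 0.0, -0.2, 0.0])

l2_sq = np.sum(alpha ** 2)    # l2 norm squared: 0.53
l0 = np.count_nonzero(alpha)  # l0 "norm": number of nonzeros, here 2
l1 = np.sum(np.abs(alpha))    # l1 norm: 0.9
print(l2_sq, l0, l1)
```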
Matching pursuit

$$\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x - D\alpha\|_2^2 \quad \text{s.t.} \quad \|\alpha\|_0 \leq L$$

1. Initialization: $\alpha = 0$, residual $r = x$
2. While $\|\alpha\|_0 < L$:
3. Select the element with maximum correlation with the residual: $\hat{\imath} = \arg\max_{i=1,\ldots,m} |d_i^T r|$
4. Update the coefficient and the residual: $\alpha_{\hat{\imath}} \leftarrow \alpha_{\hat{\imath}} + d_{\hat{\imath}}^T r$, then $r \leftarrow r - (d_{\hat{\imath}}^T r)\, d_{\hat{\imath}}$
5. End while
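A minimal NumPy sketch of this loop, assuming D already has unit-norm columns (the early-exit tolerance is an addition for robustness):

```python
import numpy as np

def matching_pursuit(x, D, L):
    """Greedy MP: repeatedly add the atom most correlated with the residual.

    Assumes D has unit-norm columns. An atom may be selected more than once;
    its coefficient is then refined rather than replaced."""
    alpha = np.zeros(D.shape[1])
    r = x.astype(float).copy()
    while np.count_nonzero(alpha) < L:
        c = D.T @ r                    # correlations with the residual
        i = int(np.argmax(np.abs(c)))  # best-matching atom
        if abs(c[i]) < 1e-12:          # residual (numerically) fully explained
            break
        alpha[i] += c[i]               # accumulate the coefficient
        r -= c[i] * D[:, i]            # remove the atom's contribution
    return alpha
```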
An example for matching pursuit

[Figure: a patch from a latent fingerprint is coded with five dictionary elements d1, ..., d5.]

First iteration: the correlations $c_i = d_i^T x$ are c1 = -0.039, c2 = 0.577, c3 = 0.054, c4 = -0.031, c5 = -0.437; atom d2 is selected with coefficient 0.577, and the residual is updated.
Second iteration: the correlations $c_i = d_i^T r$ are c1 = -0.035, c2 = 0, c3 = 0.037, c4 = -0.046, c5 = -0.289; atom d5 is selected with coefficient -0.289. Note that the coefficient of d2 does not update!
Reconstructed patch: $\hat{x} = 0.577\, d_2 + (-0.289)\, d_5$, with error $\|x - \hat{x}\|_2 = 0.763$.
Orthogonal matching pursuit

$$\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x - D\alpha\|_2^2 \quad \text{s.t.} \quad \|\alpha\|_0 \leq L$$

1. Initialization: $\alpha = 0$, residual $r = x$, active set $\Gamma = \emptyset$
2. While $\|\alpha\|_0 < L$:
3. Select the element with maximum correlation with the residual: $\hat{\imath} = \arg\max_{i=1,\ldots,m} |d_i^T r|$
4. Update the active set, coefficients, and residual: $\Gamma \leftarrow \Gamma \cup \{\hat{\imath}\}$, $\alpha_\Gamma = (D_\Gamma^T D_\Gamma)^{-1} D_\Gamma^T x$, $r = x - D_\Gamma \alpha_\Gamma$
5. End while
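A minimal NumPy sketch of OMP; the least-squares refit over the active set is what distinguishes it from plain MP:

```python
import numpy as np

def orthogonal_matching_pursuit(x, D, L):
    """OMP: after each atom selection, refit all active coefficients by
    least squares so the residual stays orthogonal to the active atoms."""
    alpha = np.zeros(D.shape[1])
    active = []
    r = x.astype(float).copy()
    for _ in range(L):
        c = D.T @ r
        i = int(np.argmax(np.abs(c)))
        if abs(c[i]) < 1e-12:          # residual fully explained
            break
        active.append(i)
        Dg = D[:, active]
        # Least-squares refit on the active set
        coef, *_ = np.linalg.lstsq(Dg, x, rcond=None)
        alpha[:] = 0.0
        alpha[active] = coef
        r = x - Dg @ coef
    return alpha
```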
An example for orthogonal matching pursuit

[Figure: the same latent fingerprint patch and dictionary elements d1, ..., d5.]

First iteration: the correlations $c_i = d_i^T x$ are c1 = -0.039, c2 = 0.577, c3 = 0.054, c4 = -0.031, c5 = -0.437; atom d2 is selected.
Second iteration: the correlations $c_i = d_i^T r$ are c1 = -0.035, c2 = 0, c3 = 0.037, c4 = -0.046, c5 = -0.289; atom d5 is selected, and the least-squares refit on the active set updates both coefficients.
Reconstructed patch: $\hat{x} = 0.499\, d_2 + (-0.309)\, d_5$, with error $\|x - \hat{x}\|_2 = 0.759$, slightly lower than matching pursuit's 0.763.
Why does the l1-norm induce sparsity?

Analysis in 1D (comparison with the $\ell_2$ norm):
$$\min_{\alpha} \frac{1}{2}(x - \alpha)^2 + \lambda|\alpha|$$
If $x \geq \lambda$, $\alpha^* = x - \lambda$; if $x \leq -\lambda$, $\alpha^* = x + \lambda$; else $\alpha^* = 0$. This is soft thresholding: small inputs map to exactly zero.
$$\min_{\alpha} \frac{1}{2}(x - \alpha)^2 + \lambda\alpha^2$$
$\alpha^* = x/(1 + 2\lambda)$, a linear shrinkage with slope $1/(1 + 2\lambda)$ that is zero only at $x = 0$.
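The two shrinkage rules side by side (NumPy; λ = 0.5 is an arbitrary choice):

```python
import numpy as np

lam = 0.5
x = np.linspace(-2, 2, 9)

# l1 solution: soft thresholding, exactly zero on [-lam, lam]
alpha_l1 = np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# l2 solution: linear shrinkage, zero only at x = 0
alpha_l2 = x / (1 + 2 * lam)

print(alpha_l1)  # contains exact zeros around the origin
print(alpha_l2)  # shrunk but nonzero
```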
Why does the l1-norm induce sparsity?

Analysis in 2D (comparison with the $\ell_2$ norm):
$$\min_{\alpha} \frac{1}{2}\|x - \alpha\|_2^2 + \lambda\|\alpha\|_1 \quad \Longleftrightarrow \quad \min_{\alpha} \frac{1}{2}\|x - \alpha\|_2^2 \ \ \text{s.t.}\ \|\alpha\|_1 \leq \mu$$
$$\min_{\alpha} \frac{1}{2}\|x - \alpha\|_2^2 + \lambda\|\alpha\|_2^2 \quad \Longleftrightarrow \quad \min_{\alpha} \frac{1}{2}\|x - \alpha\|_2^2 \ \ \text{s.t.}\ \|\alpha\|_2 \leq \mu$$

[Figure: level sets of the quadratic loss against the two constraint balls. The $\ell_1$ ball is a diamond with corners on the coordinate axes, so the constrained minimizer often lands on a corner where a coordinate is exactly zero; the $\ell_2$ ball is round and gives no exact zeros.]
Optimality condition for l1-norm regularization

$$\min_{\alpha \in \mathbb{R}^m} J(\alpha) = \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1$$

Directional derivative in the direction $u$ at $\alpha$:
$$\nabla J(\alpha, u) = \lim_{t \to 0^+} \frac{J(\alpha + tu) - J(\alpha)}{t}$$

$g$ is a subgradient of $J$ at $\alpha$ if and only if
$$\forall t \in \mathbb{R}^m, \quad J(t) \geq J(\alpha) + g^T(t - \alpha)$$

Proposition 1: $g$ is a subgradient $\iff \forall u \in \mathbb{R}^m,\ g^T u \leq \nabla J(\alpha, u)$.
Proposition 2: if $J$ is differentiable at $\alpha$, then $\nabla J(\alpha, u) = \nabla J(\alpha)^T u$.
Proposition 3: $\alpha$ is optimal if and only if $\nabla J(\alpha, u) \geq 0$ for all $u$.
Subgradient for l1-norm regularization

Example: $f(x) = |x|$.
$$\nabla f(x, u) = \begin{cases} |u| & x = 0 \\ \mathrm{sign}(x)\, u & x \neq 0 \end{cases}$$

[Figure: the graph of $f(x) = |x|$ and its subgradient, which is $\mathrm{sign}(x)$ away from the origin and the whole interval $[-1, 1]$ at $x = 0$.]
Subgradient for l1-norm regularization

$$\min_{\alpha \in \mathbb{R}^m} J(\alpha) = \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1$$
$$\nabla J(\alpha, u) = -u^T D^T(x - D\alpha) + \lambda \sum_{i:\,\alpha_i \neq 0} \mathrm{sign}(\alpha_i)\, u_i + \lambda \sum_{i:\,\alpha_i = 0} |u_i|$$

$g$ is a subgradient at $\alpha$ if and only if, for all $i$,
$$|g_i + d_i^T(x - D\alpha)| \leq \lambda \quad \text{if } \alpha_i = 0,$$
$$g_i = -d_i^T(x - D\alpha) + \lambda\,\mathrm{sign}(\alpha_i) \quad \text{if } \alpha_i \neq 0.$$

In particular, $\alpha$ is optimal ($0$ is a subgradient) if and only if $|d_i^T(x - D\alpha)| \leq \lambda$ whenever $\alpha_i = 0$, and $d_i^T(x - D\alpha) = \lambda\,\mathrm{sign}(\alpha_i)$ whenever $\alpha_i \neq 0$.
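These conditions can be verified numerically. A minimal sketch: with an orthonormal dictionary (a random orthogonal basis, chosen here purely for illustration) the exact lasso solution is soft thresholding of $D^T x$, and the optimality conditions hold as stated:

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam = 8, 0.4
D, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthonormal dictionary
x = rng.standard_normal(n)

c = D.T @ x
alpha = np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)  # soft thresholding

corr = D.T @ (x - D @ alpha)
zero = alpha == 0
assert np.all(np.abs(corr[zero]) <= lam + 1e-12)        # zero coefficients
assert np.allclose(corr[~zero], lam * np.sign(alpha[~zero]))  # nonzeros
```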
First-order methods for convex optimization

Differentiable objective:
Gradient descent: $\alpha_{t+1} = \alpha_t - \eta_t \nabla J(\alpha_t)$,
with line search for a decent step size $\eta_t$, or a diminishing step size, e.g., $\eta_t = (t + t_0)^{-1}$.

Non-differentiable objective:
Subgradient descent: $\alpha_{t+1} = \alpha_t - \eta_t g_t$, where $g_t$ is a subgradient,
again with line search or a diminishing step size.
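A minimal sketch of subgradient descent on the l1-regularized objective (the step-size schedule and iteration count are arbitrary; the best iterate is tracked because subgradient steps need not decrease the objective):

```python
import numpy as np

def lasso_subgradient_descent(x, D, lam, T=5000, t0=10.0):
    """Minimize (1/2)||x - D a||_2^2 + lam ||a||_1 by subgradient descent
    with a diminishing step size 1/(t + t0)."""
    alpha = np.zeros(D.shape[1])
    best, best_obj = alpha.copy(), np.inf
    for t in range(T):
        # np.sign(0) = 0 is a valid subgradient of |.| at 0
        g = D.T @ (D @ alpha - x) + lam * np.sign(alpha)
        alpha = alpha - g / (t + t0)
        obj = 0.5 * np.sum((x - D @ alpha) ** 2) + lam * np.sum(np.abs(alpha))
        if obj < best_obj:
            best, best_obj = alpha.copy(), obj
    return best
```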
Reformulation as a quadratic program

$$\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1$$
Splitting $\alpha = \alpha^+ - \alpha^-$ with $\alpha^+, \alpha^- \geq 0$ gives a smooth problem with nonnegativity constraints:
$$\min_{\alpha^+ \geq 0,\ \alpha^- \geq 0} \frac{1}{2}\|x - D\alpha^+ + D\alpha^-\|_2^2 + \lambda(\mathbf{1}^T \alpha^+ + \mathbf{1}^T \alpha^-)$$
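Because the split objective is smooth and the constraints are simple bounds, an off-the-shelf bounded solver applies. A sketch using SciPy's L-BFGS-B (the problem data are random placeholders):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, m, lam = 30, 15, 0.3
D = rng.standard_normal((n, m))
x = rng.standard_normal(n)

def fun(z):
    ap, am = z[:m], z[m:]            # alpha = ap - am, both constrained >= 0
    r = x - D @ (ap - am)
    f = 0.5 * r @ r + lam * z.sum()  # smooth objective in (ap, am)
    g = -D.T @ r                     # gradient w.r.t. alpha
    return f, np.concatenate([g + lam, -g + lam])

res = minimize(fun, np.zeros(2 * m), jac=True,
               method="L-BFGS-B", bounds=[(0, None)] * (2 * m))
alpha = res.x[:m] - res.x[m:]
print(np.count_nonzero(np.abs(alpha) > 1e-8), "nonzeros")
```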
Dictionary Learning
Dictionary selection

Which D to use?
A fixed set of bases:
Steerable wavelets
Contourlets
DCT basis
A data-adaptive dictionary, learned from the data:
K-SVD (l0-norm)
Online dictionary learning (l1-norm)
The objective function for K-SVD

$$\min_{D, A} \|X - DA\|_F^2 \quad \text{s.t.} \quad \forall j,\ \|\alpha_j\|_0 \leq L$$

The examples (columns of $X$) are linear combinations of atoms from $D$, and each example has a sparse representation with no more than $L$ atoms.

www.cs.technion.ac.il/~ronrubin/Talks/K-SVD.ppt
K-SVD: An Overview

Initialize $D$, then alternate two stages:
Sparse coding: use MP or OMP.
Dictionary update: column by column, by SVD computation.

www.cs.technion.ac.il/~ronrubin/Talks/K-SVD.ppt
K-SVD: Sparse Coding Stage

$$\min_{A} \|X - DA\|_F^2 \quad \text{s.t.} \quad \forall j,\ \|\alpha_j\|_0 \leq L$$
With $D$ fixed, the problem separates over the examples. For the $j$th example we solve
$$\min_{\alpha} \|x_j - D\alpha\|_2^2 \quad \text{s.t.} \quad \|\alpha\|_0 \leq L$$
Ordinary sparse coding!

www.cs.technion.ac.il/~ronrubin/Talks/K-SVD.ppt
K-SVD: Dictionary Update Stage

$$\min_{D} \|X - DA\|_F^2 \quad \text{s.t.} \quad \forall j,\ \|\alpha_j\|_0 \leq L$$
For the $k$th atom we solve
$$\min_{d_k} \|E_k - d_k \alpha_T^k\|_F^2, \quad \text{where } E_k = X - \sum_{i \neq k} d_i \alpha_T^i \ \text{(the residual)}$$
and $\alpha_T^i$ denotes the $i$th row of $A$. Solve with the SVD: $E_k = U \Sigma V^T$, $d_k = u_1$.

www.cs.technion.ac.il/~ronrubin/Talks/K-SVD.ppt
K-SVD: Dictionary Update Stage (continued)

We want to solve $\min_{d_k} \|E_k - d_k \alpha_T^k\|_F^2$, but only some of the examples use the column $d_k$. Restrict $E_k$ and $\alpha_T^k$ to those examples; when updating $\alpha_T^k$, only the coefficients corresponding to those examples are recomputed. Solve with the SVD, as sketched below.

www.cs.technion.ac.il/~ronrubin/Talks/K-SVD.ppt
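A sketch of one dictionary-update sweep in NumPy, following the restricted-SVD recipe above (the handling of unused atoms is my own choice):

```python
import numpy as np

def ksvd_dictionary_update(X, D, A):
    """One K-SVD sweep: update each atom (and the coefficients that use it)
    from the rank-1 SVD of the restricted residual."""
    for k in range(D.shape[1]):
        omega = np.flatnonzero(A[k, :])     # examples that use atom k
        if omega.size == 0:
            continue                        # atom unused; leave it as-is
        A[k, omega] = 0.0
        Ek = X[:, omega] - D @ A[:, omega]  # residual without atom k
        U, s, Vt = np.linalg.svd(Ek, full_matrices=False)
        D[:, k] = U[:, 0]                   # new atom: leading left sing. vec.
        A[k, omega] = s[0] * Vt[0, :]       # matching restricted coefficients
    return D, A
```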
Compare K-SVD with K-means

K-SVD: initialize the dictionary; sparse coding (use MP or OMP); dictionary update, column by column, by SVD computation.
K-means: initialize the cluster centers; assignment of each vector to a center; cluster-center update, cluster by cluster.
Dictionary learning with l1-norm regularization

Objective function:
$$\min_{D} \frac{1}{t} \sum_{i=1}^{t} \left( \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 \right)$$
where
$$\alpha_i = \arg\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x_i - D\alpha\|_2^2 + \lambda\|\alpha\|_1$$

Advantages of online learning:
It handles large and dynamic datasets,
and can be much faster than batch algorithms.
Dictionary learning with l1-norm regularization

$$F_t(D) = \frac{1}{t} \sum_{i=1}^{t} \left( \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 \right) = \frac{1}{t}\left( \frac{1}{2}\mathrm{Tr}(D^T D A_t) - \mathrm{Tr}(D^T B_t) \right) + \frac{1}{t}\sum_{i=1}^{t} \lambda\|\alpha_i\|_1 + \text{const}$$
where
$$A_t = \sum_{i=1}^{t} \alpha_i \alpha_i^T, \qquad B_t = \sum_{i=1}^{t} x_i \alpha_i^T$$
$$\frac{\partial F_t(D)}{\partial D} = \frac{1}{t}(D A_t - B_t)$$
For a new sample $x_{t+1}$:
$$A_{t+1} = A_t + \alpha_{t+1}\alpha_{t+1}^T, \qquad B_{t+1} = B_t + x_{t+1}\alpha_{t+1}^T$$
Online dictionary learning

1) Initialization: $D_0 \in \mathbb{R}^{n \times m}$; $A_0 = 0$; $B_0 = 0$
2) For $t = 1, \ldots, T$:
3) Draw $x_t$ from the training data set.
4) Get the sparse code: $\alpha_t = \arg\min_{\alpha \in \mathbb{R}^m} \frac{1}{2}\|x_t - D_{t-1}\alpha\|_2^2 + \lambda\|\alpha\|_1$
5) Aggregate the sufficient statistics: $A_t = A_{t-1} + \alpha_t \alpha_t^T$, $B_t = B_{t-1} + x_t \alpha_t^T$
6) Dictionary update: $D_t = D_{t-1} - \eta_t \frac{\partial F_t(D)}{\partial D}\Big|_{D = D_{t-1}}$
7) End for
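A compact sketch of this loop in NumPy; the ISTA sub-solver for step 4, the fixed step size η, and the atom renormalization are my choices rather than part of the algorithm as stated:

```python
import numpy as np

def lasso_ista(x, D, lam, iters=200):
    """Small ISTA solver for the sparse-coding step (my choice of solver)."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz const. of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        z = a - (D.T @ (D @ a - x)) / L      # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def online_dictionary_learning(X, m, lam=0.1, eta=0.05, seed=0):
    """Sketch of the online loop: sparse code, accumulate A and B,
    take a gradient step on the surrogate F_t."""
    rng = np.random.default_rng(seed)
    n, T = X.shape
    D = rng.standard_normal((n, m))
    D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
    A, B = np.zeros((m, m)), np.zeros((n, m))
    for t in range(1, T + 1):
        x = X[:, t - 1]
        a = lasso_ista(x, D, lam)            # step 4: sparse code
        A += np.outer(a, a)                  # step 5: sufficient statistics
        B += np.outer(x, a)
        D -= eta * (D @ A - B) / t           # step 6: gradient step on F_t
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)  # renormalize atoms
    return D
```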
Toolbox - SPAMS

SPArse Modeling Software:
Sparse coding
l0-norm regularization
l1-norm regularization
Dictionary learning
K-SVD
Online dictionary learning
Implemented in C++ with a Matlab interface.
http://spams-devel.gforge.inria.fr/
Summary

Sparsity and sparse representation
Sparse coding with l0- and l1-norm regularization
Matching pursuit / orthogonal matching pursuit
Subgradients and the optimality condition
Dictionary learning with l0- and l1-norm regularization
K-SVD
Online dictionary learning
Try them out!
References

T. T. Cai and L. Wang. Orthogonal matching pursuit for sparse signal recovery with noise. IEEE Transactions on Information Theory, 57(7):4680-4688, 2011.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32(2):407-499, 2004.
M. Aharon, M. Elad, and A. M. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311-4322, November 2006.
J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In Proceedings of the International Conference on Machine Learning (ICML), 2009.
Thank you for listening