
An Introduction to Mathematical Image Processing

IAS, Park City Mathematics Institute, Utah


Undergraduate Summer School 2010
Instructor: Luminita Vese
Teaching Assistant: Todd Wittman
Department of Mathematics, UCLA
lvese@math.ucla.edu wittman@math.ucla.edu

Lecture Meeting Time: Mon, Tue, Thu, Fri 1-2pm (location: Coalition 1&2)
Problem Session: 4.30-5.30, 5.30-6.30 (location: computer lab or tent)
Working Version

Abstract
Image processing is an essential field in many applications, including medical imaging, astronomy, astrophysics, surveillance, video, and image compression and transmission, just to name a few. In one dimension, images are called signals. In two dimensions we work with planar images, while in three dimensions we have volumetric images (such as MR images). These can be gray-scale images (single-valued functions) or color images (vector-valued functions). Noise, blur and other types of imperfections often degrade acquired images. Thus such images have to be pre-processed before any further analysis and feature extraction.
In this course we will formulate several image processing tasks in mathematical terms: image denoising, image deblurring, image enhancement, image segmentation, and edge detection. We will learn techniques for image filtering in the spatial domain (using first- and second-order partial derivatives, the gradient, the Laplacian and their discrete approximations by finite differences, averaging filters, order statistics filters, convolution) and in the frequency domain (the Fourier transform, low-pass and high-pass filters), as well as zero-crossings of the Laplacian. If time permits, we will learn more advanced methods that can be formulated as a weighted Laplace equation for image restoration, and curve evolution techniques called snakes for image segmentation (including total variation minimization, active contours without edges, the anisotropic diffusion equation, and NL Means).

• Prerequisites for the course include the basic curriculum in calculus and linear algebra. Most concepts will be defined as they are introduced. Some exposure, e.g. through a basic introductory course, to numerical analysis, Fourier analysis, partial differential equations, statistics, or probability will definitely be useful, though we will try to cover the necessary facts as they arise.

• For computational purposes, some familiarity with the computer language Matlab is useful. You may wish to explore in advance the introductory tutorial on Image Processing Using Matlab, prepared by Pascal Getreuer, covering how to read and write image files, basic image operations, and filtering, found online at: http://www.math.ucla.edu/~getreuer/matlabimaging.html
• The Discussion Sections will be devoted to problem solving, image processing with Matlab, summaries of the current lecture, or the exposition of additional topics.
• For an introduction to image processing, a useful reading textbook is:
[7] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 3rd edition, Prentice-Hall.
See also
[8] R.C. Gonzalez, R.E. Woods and S.L. Eddins, Digital Image Processing Using Matlab, 2nd
edition, Prentice-Hall.
• Please also visit the webpage of these two textbooks for review material, computational projects, problem solutions, images used in the textbook, an imaging toolbox, and much other useful information:
http://www.imageprocessingplace.com/index.htm
• The website for the course can be found at: http://www.math.ucla.edu/~lvese/155.1.09w/PCMI_USS_2010
where lecture notes, problems, computer projects and other links are posted.
• An introduction and overview of the course can be found on the course webpage; click on "Lecture1.pdf" for the slide presentation.

Topics to be covered
Fundamental steps in image processing
A simple image formation model
Image sampling and quantization
Intensity transformations and spatial filtering
- Histogram equalization
- Linear spatial filters, correlation, convolution
- Smoothing (linear) spatial filters
- Sharpening linear spatial filters using the Laplacian
Filtering in the frequency domain
- 1D and 2D continuous and discrete Fourier transforms
- convolution theorem
- properties of the Fourier transform
- filtering in the frequency domain (smoothing and sharpening, low-pass and high-pass filtering)
- the Laplacian in the frequency domain, enhancement
- homomorphic filtering
- band-reject and band-pass filters

Image restoration and reconstruction
- noise models
- mean filters
- order statistics filters
- adaptive median filter
- periodic noise reduction
- NL Means filter
- linear, position-invariant degradations
- examples of degradation (PSF) functions
- inverse filtering (Wiener filter, constrained least squares filtering, total variation minimization)
- image reconstruction from projections (Radon transform, computed tomography, the Fourier slice
Thm., filtered backprojections using parallel beam)
Image segmentation
- image gradient, gradient operators, gradient-based edge detection
- the Marr-Hildreth edge detector, Canny edge detector
- active contours
- global processing using the Hough transform

Contents

1 Fundamental Steps in Digital Image Processing
2 A simple image formation model
   2.1 Image sampling and quantization
3 Intensity transformations and spatial filtering
   3.1 Histogram equalization
   3.2 Spatial Linear Filters
4 The Fourier Transform and Filtering in the Frequency Domain
   4.1 Principles of Filtering in the Frequency Domain
5 Image Restoration
   5.1 Image Denoising
   5.2 Image Deblurring
   5.3 Energy minimization methods for image reconstruction
       5.3.1 Computation of the first order optimality condition in the continuous case
6 Image Segmentation
   6.1 The gradient edge detector
   6.2 Edge detection by zero-crossings of the Laplacian (the Marr-Hildreth edge detector)
   6.3 Boundary detection by curve evolution and active contours
       6.3.1 Curve Representation
1 Fundamental Steps in Digital Image Processing
This field can be divided into several areas without clear boundaries:
- image processing
- image analysis
- computer vision
or into
- low-level vision
- mid-level vision
- high-level vision
In more detail, the fundamental steps are as follows.
Image acquisition (output: digital image)
• Low-level vision: input = image, output = image
Image enhancement: subjective process (e.g. image sharpening)
Image Restoration: objective process, denoising, deblurring, etc (depends on the degradation)
Color Image Processing: there are several color modes. Color (R,G,B) images are represented
by vector-valued functions with three components; natural extensions from gray-scale to color
images (most of the time)
Wavelets: an advanced topic; a mathematical tool allowing the representation of images at various degrees of resolution, used in many image processing tasks.
Compression: reducing the storage required to save an image (jpeg 2000)
• Mid-level vision: input = image, output = image attributes
Mathematical morphology: tools for extracting image components useful in the representation
and description of shape.
Segmentation: to partition an image into its constituent parts or objects.
• High-level vision: input = boundaries and regions, output = image attributes
Representation and description: following segmentation, gives a representation of boundaries and regions; the description is given by a set of object attributes or features.
Recognition: assign a label (e.g., “vehicle”) to an object based on its descriptors
This course will deal with low-level and mid-level tasks.

2 A simple image formation model


In the continuous case, a planar image is represented by a two-dimensional function $(x, y) \mapsto f(x, y)$. The value of f at the spatial coordinates (x, y) is positive and is determined by the source of the image. If the image is generated from a physical process, its intensity values are proportional to the energy radiated by the physical source. Therefore, f(x, y) must be positive and finite:
\[ 0 < f(x, y) < \infty. \]
The image-function f may be characterized by two components:
(1) the amount of source illumination incident on the scene (illumination) i(x, y)
(2) the amount of illumination reflected by the objects (reflectance) r(x, y)

We have
f (x, y) = i(x, y)r(x, y)
where

0 < i(x, y) < ∞ and 0 < r(x, y) < 1.


Reflectance is bounded below by 0 (total absorption) and above by 1 (total reflectance).
i(x, y) depends on the illumination source
r(x, y) depends on the characteristics of the imaged objects.
The same expressions are applicable to images formed via transmission of the illumination
through a medium (chest X-ray, etc). Then we deal with transmissivity instead of reflectivity.
From the above construction, we have

Lmin ≤ l = f (x, y) ≤ Lmax ,

where l = f (x, y) is the gray-level at coordinates (x, y). It is common to shift the gray-scale (or
intensity scale) from the interval [Lmin , Lmax ] to the interval [0, L − 1]. Then l = 0 is considered
black and l = L − 1 is considered white on the gray scale. The intermediate values are shades
varying from black to white.

2.1 Image sampling and quantization


Need to convert the continuous sensed data into digital form via two processes: sampling and
quantization.
Sampling = digitizing the coordinate values
Quantization = digitizing the amplitude values
A digital image is represented as an $M \times N$ matrix:
\[ f = \begin{pmatrix} f_{0,0} & f_{0,1} & \cdots & f_{0,N-1} \\ f_{1,0} & f_{1,1} & \cdots & f_{1,N-1} \\ \vdots & \vdots & & \vdots \\ f_{M-1,0} & f_{M-1,1} & \cdots & f_{M-1,N-1} \end{pmatrix}. \]
The number of gray levels L is taken to be a power of 2: $L = 2^k$ for some integer k > 0. Often, digital images take values 0, 1, 2, ..., 255, thus 256 distinct gray levels (k = 8).

3 Intensity transformations and spatial filtering


3.1 Histogram equalization
The histogram equalization technique is a statistical tool for improving the contrast of images. The input image f(x, y) may have low contrast (a dark image or a light image); the output is a higher-contrast image g(x, y). Assume that the gray-level range consists of L gray levels.
In the discrete case, let $r_k = f(x, y)$ be a gray level of f, $k = 0, 1, \dots, L-1$, and let $s_k = g(x, y)$ be the desired gray level of the output image g. We seek a transformation $T : [0, L-1] \to [0, L-1]$ such that $s_k = T(r_k)$ for all $k = 0, 1, \dots, L-1$.
Define $h(r_k) = n_k$, where $r_k$ is the kth gray level and $n_k$ is the number of pixels in the image f taking the value $r_k$.

Visualizing the discrete function $r_k \mapsto h(r_k)$ for all $k = 0, 1, \dots, L-1$ gives the histogram. We can also define the normalized histogram: let $p(r_k) = n_k / n$, where n is the total number of pixels in the image (thus $n = MN$).
The visualization of the discrete mapping $r_k \mapsto p(r_k)$ for all $k = 0, 1, \dots, L-1$ gives the normalized histogram. We note that $0 \le p(r_k) \le 1$ and
\[ \sum_{k=0}^{L-1} p(r_k) = 1. \]

See "Lecture1.pdf" for a slide showing several images and their corresponding histograms. We notice that a high-contrast image has a more uniform, almost flat histogram. Thus, the idea is to change the intensity values of f so that the output image g has a more uniform, almost flat histogram. We will define $T : [0, L-1] \to [0, L-1]$ as an increasing transformation using the cumulative histogram.
In the discrete case, let $r_k = f(x, y)$. Then we define the histogram-equalized image g at (x, y) by
\[ g(x, y) = s_k = (L-1) \sum_{j=0}^{k} p(r_j). \tag{1} \]

We refer to ”Lecture1.pdf” for the slide showing the effect of histogram equalization applied to
poorly contrasted images.
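As an illustration, here is a minimal Matlab sketch of (1) for an 8-bit image (L = 256); the test image name is only an example, and any grayscale image will do:

% Histogram equalization following (1), for an 8-bit grayscale image.
f = imread('cameraman.tif');            % example input image
L = 256;  n = numel(f);
p = accumarray(double(f(:)) + 1, 1, [L 1]) / n;  % normalized histogram p(r_k)
T = round((L - 1) * cumsum(p));          % s_k = (L-1) * sum_{j<=k} p(r_j)
g = uint8(T(double(f) + 1));             % apply T to every pixel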

Theoretical interpretation of histogram equalization (continuous case). We view the gray levels r and s as random variables with associated probability density functions $p_r(r)$ and $p_s(s)$. The continuous version of (1) uses the cumulative distribution function, and we define
\[ s = T(r) = (L-1) \int_0^r p_r(w)\, dw. \]
If $p_r(r) > 0$ on $[0, L-1]$, then T is strictly increasing from $[0, L-1]$ to $[0, L-1]$, thus T is invertible. Moreover, if T is differentiable, then we can use a formula from probability: if $s = T(r)$, then
\[ p_s(s) = p_r(r) \left| \frac{\partial r}{\partial s} \right|, \]
where we view $s = s(r) = T(r)$ as a function of r, and $r = r(s) = T^{-1}(s)$ as a function of s. From the definition of s, we have
\[ \frac{\partial s}{\partial r} = T'(r) = (L-1)\, p_r(r), \]
thus
\[ \frac{\partial r}{\partial s} = \frac{\partial}{\partial s} T^{-1}(s) = \frac{1}{T'(r(s))} = \frac{1}{(L-1)\, p_r(r(s))}. \]
Therefore
\[ p_s(s) = p_r(r) \left| \frac{\partial r}{\partial s} \right| = p_r(r) \frac{1}{(L-1)\, p_r(r)} = \frac{1}{L-1}. \]
In other words, $p_s$ is the uniform probability density function on the interval $[0, L-1]$, corresponding to a flat histogram in the discrete case.

3.2 Spatial Linear Filters
A spatial filter consists of
(1) a neighborhood (typically a small rectangle)
(2) a predefined operation that is performed on the image pixels encompassed by the neighbor-
hood.
The output of the operation in (2) will provide the value of the output image g(x, y).
Let f(x, y) be the input image and g(x, y) the output image in the discrete case; thus (x, y) are viewed as integer coordinates, with $0 \le x \le M-1$ and $0 \le y \le N-1$, and both images f and g are of size $M \times N$. We define a neighborhood centered at the point (x, y) by
\[ S_{(x,y)} = \{ (x+s, y+t) : -a \le s \le a, \ -b \le t \le b \}, \]
where $a, b \ge 0$ are integers. The size of the patch $S_{(x,y)}$ is $(2a+1)(2b+1)$; we write $m = 2a+1$ and $n = 2b+1$ (odd integers), so the size of the patch becomes $m \cdot n$, with $a = \frac{m-1}{2}$, $b = \frac{n-1}{2}$.
For example, the restriction $f|_{S_{(x,y)}}$ for a $3 \times 3$ neighborhood $S_{(x,y)}$ is represented by
\[ f|_{S_{(x,y)}} = \begin{pmatrix} f(x-1, y-1) & f(x-1, y) & f(x-1, y+1) \\ f(x, y-1) & f(x, y) & f(x, y+1) \\ f(x+1, y-1) & f(x+1, y) & f(x+1, y+1) \end{pmatrix}. \]

We also define a window mask $w = w(s, t)$, for $-a \le s \le a$, $-b \le t \le b$, of size $m \times n$. For the $3 \times 3$ case, this window is
\[ w = \begin{pmatrix} w(-1,-1) & w(-1,0) & w(-1,+1) \\ w(0,-1) & w(0,0) & w(0,+1) \\ w(+1,-1) & w(+1,0) & w(+1,+1) \end{pmatrix}. \]

We are now ready to define a linear spatial filter. The output image g is defined by
\[ g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x+s, y+t), \tag{2} \]
with the condition that care must be taken at the boundary of the image f, where $f(x+s, y+t)$ may not be well defined. To solve this problem, we can extend the image f by zero in a band outside of its original domain, or we can extend it by periodicity, or by reflection (mirroring). In all these cases, all values $f(x+s, y+t)$ are well defined.
It is left as an exercise to show that the operation f 7→ g defined by (2) is a linear operation
(with respect to real numbers, satisfying the additivity and scalar multiplication).
For a $3 \times 3$ window mask and neighborhood, the filter in (2) becomes
\begin{align*}
g(x, y) &= w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + w(-1,1) f(x-1, y+1) \\
&+ w(0,-1) f(x, y-1) + w(0,0) f(x, y) + w(0,1) f(x, y+1) \\
&+ w(1,-1) f(x+1, y-1) + w(1,0) f(x+1, y) + w(1,1) f(x+1, y+1).
\end{align*}
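The following Matlab sketch implements (2) with reflected boundaries; it is one possible implementation, written for clarity rather than speed (padarray is from the Image Processing Toolbox):

% Linear spatial filtering (2); w is an m-by-n mask with m, n odd.
function g = spatial_filter(f, w)
    f = double(f);
    [m, n] = size(w);
    a = (m - 1) / 2;  b = (n - 1) / 2;
    fp = padarray(f, [a b], 'symmetric');  % extend f by reflection
    g = zeros(size(f));
    for s = -a:a
        for t = -b:b
            % add w(s,t) * f(x+s, y+t), realized as a shift of the padded image
            g = g + w(s + a + 1, t + b + 1) ...
                  * fp(1 + a + s : end - a + s, 1 + b + t : end - b + t);
        end
    end
end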

We will consider in this section two classes of spatial linear filters:

(i) smoothing linear spatial filters
(ii) sharpening linear spatial filters
Smoothing can be understood as local averaging (or blurring), which is similar to spatial summation or spatial integration. Small details and noise will be lost in the smoothing process, but sharp edges will also become blurry.
Sharpening has the opposite effect to smoothing: a blurry image f can be made sharper through this process, and details such as edges are enhanced.

Smoothing linear spatial filters. Let us give two examples of smoothing spatial masks w of size $3 \times 3$. The first one gives the average filter or box filter,
\[ w = \frac{1}{9} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \]
and the second one can be seen as a discrete version of a two-dimensional Gaussian function $e^{-\frac{x^2+y^2}{2\sigma^2}}$:
\[ w = \frac{1}{16} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} \]
(the coefficients of w decrease as we move away from the center). Note that, for both these filter masks w, the sum of the coefficients is $\sum_{s=-1}^{1} \sum_{t=-1}^{1} w(s, t) = 1$. Thus the input image f and the output image g have the same range of intensity values.


As we can see from the above two examples, it is convenient to normalize the filter directly, by considering the general weighted average filter of size $m \times n$:
\[ g(x, y) = \frac{\sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x+s, y+t)}{\sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)}. \]
The above equation is computed for all (x, y) with $x = 0, 1, \dots, M-1$ and $y = 0, 1, \dots, N-1$. Note that the denominator is a constant, so it has to be computed only once.
Additional mean filters (linear and nonlinear) will be discussed later in the section on image denoising.
Additional mean filters (linear and nonlinear) will be discussed later in the section on image
denoising.

Sharpening linear filters. In this section we present a method for image enhancement using the Laplacian. Let us assume for a moment that the image function f has second-order partial derivatives. We define the Laplacian of f in the continuous case by
\[ \triangle f(x, y) = \frac{\partial^2 f}{\partial x^2}(x, y) + \frac{\partial^2 f}{\partial y^2}(x, y). \]
You will prove as an exercise that the mapping $f \mapsto \triangle f$ is linear and rotationally invariant.
We recall from numerical analysis that, in one dimension, the second-order derivative $f''$ at the point x can be approximated by finite differences of values of f at x, x+h, x-h:
\[ f''(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}, \tag{3} \]
where $h > 0$ is a small discretization parameter.
First we approximate $\frac{\partial^2 f}{\partial x^2}(x, y)$ using (3) keeping y fixed, then $\frac{\partial^2 f}{\partial y^2}(x, y)$ using (3) keeping x fixed:
\[ \frac{\partial^2 f}{\partial x^2}(x, y) \approx \frac{f(x+h, y) - 2f(x, y) + f(x-h, y)}{h^2}, \]

[Figure 1: Illustration of enhancement using the Laplacian. Left: a blurry 1D edge of the image f, with the sign of f''. Right: the resulting g = f − f'', an ideal sharp edge.]

\[ \frac{\partial^2 f}{\partial y^2}(x, y) \approx \frac{f(x, y+h) - 2f(x, y) + f(x, y-h)}{h^2}. \]
For images we take h = 1. Summing these equations, we obtain a discretization of the Laplacian, naturally called the 5-point Laplacian:
\[ \triangle f(x, y) \approx f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y), \tag{4} \]
which can be applied to discrete images f. We notice that the 5-point Laplacian (4) can be obtained using a linear spatial filter with the "Laplacian mask"
\[ w = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}. \]
It is possible to show that the Laplacian can also be discretized by a 9-point Laplacian, with formula and corresponding Laplacian mask
\[ \triangle f(x, y) \approx f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) + f(x+1, y+1) + f(x-1, y-1) + f(x+1, y-1) + f(x-1, y+1) - 8 f(x, y), \]
\[ w = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \]
We notice that both Laplacian masks w have coefficients summing to 0, which is related to the sharpening property of the filter. We will now show how the Laplacian can be used to enhance images.
Consider a smoothed 1D edge profile f as shown in Figure 1 left, where the sign of f 00 is also
given. As we can see, applying the operation g = f − f 00 as shown in Figure 1 right, produces an
ideal sharp edge.
Similarly, for two-dimensional images, if the input f is blurry, it can be enhanced by the operation $g(x, y) = f(x, y) - \triangle f(x, y)$. In the discrete case, if we use the 5-point Laplacian (4), this corresponds to the following spatial linear filter and mask:

\[ g(x, y) = f(x, y) - \triangle f(x, y) = 5 f(x, y) - f(x+1, y) - f(x-1, y) - f(x, y+1) - f(x, y-1), \tag{5} \]
\[ w = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{pmatrix}. \]
Similarly, the enhancement using the 9-point Laplacian has the mask
\[ w = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 9 & -1 \\ -1 & -1 & -1 \end{pmatrix}. \]
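In Matlab, the whole enhancement (5) amounts to one filtering pass; a minimal sketch follows (the mask is symmetric, so conv2 computes the same result as correlation, and the zero padding used by 'same' only affects a one-pixel border):

% Sharpening g = f - laplacian(f), using the 5-point mask from (5).
f  = double(imread('cameraman.tif'));   % example blurry input
w5 = [0 -1 0; -1 5 -1; 0 -1 0];
g  = conv2(f, w5, 'same');
g  = min(max(g, 0), 255);               % clip back to the gray-level range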

Unsharp Masking and Highboost Filtering. Image enhancement can also be obtained using smoothing spatial filters, instead of the Laplacian. Assume that f is the input image (possibly blurry). We first compute a smoother image $f_{smooth}$ by applying a smoothing linear filter to f, as discussed before. Then the sharper output image g is defined by
\[ g(x, y) = f(x, y) + k \big( f(x, y) - f_{smooth}(x, y) \big), \tag{6} \]
where $k \ge 0$ is a coefficient. When k = 1 the method is called unsharp masking. When k > 1, it is called highboost filtering, since the edges and details are even more strongly highlighted.
Consider the case k = 1 in equation (6). Near an edge, $f - f_{smooth}$ takes negative values on the left of the edge, positive values on the right of the edge, and values close to 0 elsewhere. Thus computing $f + (f - f_{smooth})$ provides the sharper, corrected edge in g.
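A minimal Matlab sketch of (6), using the 3 × 3 box filter from the previous subsection as the smoother (k and the input image are illustrative choices):

% Unsharp masking (k = 1) / highboost filtering (k > 1), equation (6).
f = double(imread('cameraman.tif'));    % example input
fsmooth = conv2(f, ones(3)/9, 'same');  % box-filter smoothing
k = 1;
g = f + k * (f - fsmooth);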

4 The Fourier Transform and Filtering in the Frequency Domain
Illustration of Fourier series. Consider the inner product vector space of functions
\[ V = \{ f : [0, 2\pi] \to \mathbb{C}, \ f \text{ continuous} \} \]
with corresponding inner product
\[ \langle f, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(x) \overline{g(x)}\, dx. \]
Let $S = \{ f_n(x) = e^{inx} = \cos nx + i \sin nx, \ n \text{ integer} \}$. Then S is an orthonormal basis of V, based on the following. If $m \neq n$, both integers, we have
\[ \langle f_m, f_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} e^{imx} \overline{e^{inx}}\, dx = \frac{1}{2\pi} \int_0^{2\pi} e^{i(m-n)x}\, dx = \frac{1}{2\pi i (m-n)} e^{i(m-n)x} \Big|_0^{2\pi} = 0. \]
On the other hand,
\[ \langle f_n, f_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} f_n(x) \overline{f_n(x)}\, dx = \frac{1}{2\pi} \int_0^{2\pi} e^{inx} e^{-inx}\, dx = \frac{1}{2\pi} \int_0^{2\pi} 1\, dx = 1. \]
It is also easy to verify that, for any $f \in V$,
\[ f(x) = \sum_{n=-\infty}^{+\infty} c_n e^{inx}, \]
where $c_n = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-inx}\, dx$. Thus the function f can be expressed as a sum of sines and cosines with appropriate coefficients.
The continuous and discrete Fourier transforms. Recall that a linear discrete filter applied to an image f(x, y) was defined by
\[ g(x, y) = \sum_{m=-a}^{a} \sum_{n=-b}^{b} h(m, n)\, f(x+m, y+n), \]
where h is a window mask. In the continuous case, if h, f are functions in the plane, we can define the correlation of the two functions (as a continuous version of the spatial linear filter)
\[ h \circ f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(m, n)\, f(x+m, y+n)\, dm\, dn. \tag{7} \]
If in formula (7) we change the + signs into − signs inside f, we obtain the convolution of the two functions, defined by
\[ h * f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(m, n)\, f(x-m, y-n)\, dm\, dn. \tag{8} \]

Thus the convolution h * f can also be seen as linear filtering in the spatial domain: filtering the image f with convolution kernel h. The discrete convolution h * f is very expensive to compute, especially if the support of h is not small. It turns out that using the Fourier transform $\mathcal{F}$, defined below, we have: if $H = \mathcal{F}(h)$ and $F = \mathcal{F}(f)$, then $h * f \Leftrightarrow HF$ (pointwise multiplication), thus a fast computation.
We recall Euler's formulas: if $\theta$ is a real number, then $e^{i\theta} = \cos\theta + i \sin\theta$. If $\alpha, \beta$ are real numbers, then $e^{\alpha + i\beta} = e^{\alpha}(\cos\beta + i \sin\beta)$, where i is the imaginary unit, $i^2 = -1$.
In one dimension first, we consider a function f(x) satisfying $\int_{-\infty}^{\infty} |f(x)|\, dx < \infty$. Then we can define $F : \mathbb{R} \to \mathbb{C}$, the Fourier transform of f, by
\[ F(u) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i u x}\, dx, \quad \mathcal{F}(f) = F, \]
and its inverse is obtained by
\[ f(x) = \int_{-\infty}^{\infty} F(u) e^{2\pi i u x}\, du, \quad \mathcal{F}^{-1}(F) = f \]
(x is called the spatial variable, while u is called the frequency variable).


In two dimensions, the corresponding direct and inverse Fourier transforms for a function f(x, y) are
\[ F(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) e^{-2\pi i (ux + vy)}\, dx\, dy, \quad \mathcal{F}(f) = F, \]
\[ f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v) e^{2\pi i (ux + vy)}\, du\, dv, \quad \mathcal{F}^{-1}(F) = f. \]

We say that [f, F] forms a transform pair. It is easy to verify that the mapping $f \mapsto \mathcal{F}(f)$ is a linear transformation.
Since the transform F takes complex values, we have $F(u, v) = R(u, v) + i I(u, v)$ (the real and imaginary parts of F). We call $|F(u, v)| = \sqrt{R(u, v)^2 + I(u, v)^2}$ the Fourier spectrum.
We have the convolution theorem, which we prove in one dimension:
Convolution Theorem. If $H = \mathcal{F}(h)$ and $F = \mathcal{F}(f)$, then $\mathcal{F}(h * f) = HF$, i.e., $\mathcal{F}(h * f) = \mathcal{F}(h)\mathcal{F}(f)$.
Proof: We present the details in one dimension; the two-dimensional case is similar. Recall the 1D convolution
\[ h * f(x) = \int_{-\infty}^{\infty} h(m)\, f(x-m)\, dm. \]
Then
\[ \mathcal{F}(h * f) = \int_{-\infty}^{+\infty} h * f(x)\, e^{-2\pi i u x}\, dx = \int_{-\infty}^{\infty} \Big[ \int_{-\infty}^{\infty} h(m) f(x-m)\, dm \Big] e^{-2\pi i u x}\, dx = \int_{-\infty}^{\infty} h(m) \Big[ \int_{-\infty}^{\infty} f(x-m) e^{-2\pi i u x}\, dx \Big] dm. \]
By the change of variable $x - m = X$, we obtain
\[ \mathcal{F}(h * f) = \int_{-\infty}^{\infty} h(m) \Big[ \int_{-\infty}^{\infty} f(X) e^{-2\pi i u (m+X)}\, dX \Big] dm = \Big[ \int_{-\infty}^{\infty} h(m) e^{-2\pi i u m}\, dm \Big] \Big[ \int_{-\infty}^{\infty} f(X) e^{-2\pi i u X}\, dX \Big] = H(u) F(u). \]
In the discrete case, we substitute the real numbers x, y, u, v by $x\Delta x$, $y\Delta y$, $u\Delta u$, $v\Delta v$, where now x, y, u, v are integers and $\Delta x, \Delta y, \Delta u, \Delta v$ denote discretization steps. We make the convention that $\Delta x\, \Delta u = \frac{1}{M}$, $\Delta y\, \Delta v = \frac{1}{N}$. Then for an image f(x, y), with $x = 0, 1, \dots, M-1$, $y = 0, 1, \dots, N-1$, the two-dimensional discrete Fourier transform (2D DFT) and its inverse (2D IDFT) are defined by
\[ F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)}, \quad u = 0, 1, \dots, M-1, \ v = 0, 1, \dots, N-1, \]
and
\[ f(x, y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)}, \quad x = 0, 1, \dots, M-1, \ y = 0, 1, \dots, N-1 \]
(the one-dimensional versions of the discrete transforms are defined similarly).
Note that
\[ F(0, 0) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{0} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) = (MN) \cdot \mathrm{average}(f), \]
thus the average of f can be recovered from F(0, 0):
\[ \mathrm{average}(f) = \frac{F(0, 0)}{MN}. \]

Shifting the center of the transform: We have the formula $\mathcal{F}\big( f(x, y)(-1)^{x+y} \big) = F\big( u - \frac{M}{2}, v - \frac{N}{2} \big)$. Indeed, note that
\[ (-1)^{x+y} = e^{i\pi(x+y)} = \cos(\pi(x+y)) + i \sin(\pi(x+y)). \]
Then
\[ \mathcal{F}\big( f(x, y)(-1)^{x+y} \big) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \big[ f(x, y)\, e^{i\pi(x+y)} \big] e^{-2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \big[ f(x, y)\, e^{-2\pi i \left( \frac{-x M/2}{M} + \frac{-y N/2}{N} \right)} \big] e^{-2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-2\pi i \left( x \frac{u - M/2}{M} + y \frac{v - N/2}{N} \right)} = F\Big( u - \frac{M}{2}, v - \frac{N}{2} \Big). \]
From now on, we will assume that we work with the centered spectrum. Please see the handout posted on the course webpage for an illustration of shifting the origin of the transform from 0 to M/2, in one dimension.

4.1 Principles of Filtering in the Frequency Domain
We assume that we work with the shifted transform based on the above property, thus the origin of
the transform (0, 0) has been moved to the center of the domain (M/2, N/2).
Usually it is impossible to make direct associations between the components of an image f and
its transform F . However, we can make connections between the frequency components of the
transform and spatial features of the image. We have
• frequency is directly related to spatial rates of change
• we have seen that the slowest varying frequency component ((u = 0, v = 0) before centering
or (u = M/2, v = N/2) after centering) is proportional to the average of the image intensity f (no
variation)
• as we move away from the origin (or from the center (u = M/2, v = N/2) after shifting),
the low frequency components correspond to slowly varying intensity components in the image
• as we move farther away from the origin (or from the center (u = M/2, v = N/2) after
shifting), high frequency components correspond to faster gray-level changes

Basics of filtering in the frequency domain:
1. Multiply f(x, y) by $(-1)^{x+y}$.
2. Compute $F(u, v) = \mathrm{DFT}\big( f(x, y)(-1)^{x+y} \big)$ (here F(u, v) is in fact $F(u - M/2, v - N/2)$, but we keep the simpler notation F(u, v)).
3. Multiply F by a real "filter" function H: $G(u, v) = H(u, v) F(u, v)$ (pointwise multiplication, not matrix multiplication).
4. Compute the discrete IFT of G (of the result in 3).
5. Take the real part of the result in 4 (to discard parasitic imaginary components due to computational inaccuracies).
6. Multiply the result in 5 by $(-1)^{x+y}$ to obtain the filtered image g.
H will suppress certain frequencies in the transform, while leaving other frequencies unchanged.
Let h be such that $\mathcal{F}(h) = H$. By the Convolution Theorem, linear filtering in the spatial domain with h is equivalent to linear filtering in the frequency domain with the filter H: $h * f \Leftrightarrow HF$.
As we have mentioned, the low frequencies of the transform correspond to smooth regions of the image (homogeneous parts), while high frequencies correspond to edges, noise, and details.
A low-pass filter (LPF) H leaves low frequencies unchanged while attenuating the high frequencies. This is a smoothing filter.
A high-pass filter (HPF) H leaves high frequencies unchanged while attenuating the low frequencies. This is a sharpening filter.
Let $D(u, v) = \mathrm{dist}\big( (u, v), (M/2, N/2) \big) = \sqrt{(u - M/2)^2 + (v - N/2)^2}$ be the distance from the frequency (u, v) to the center of the domain, and let $D_0 \ge 0$ be a parameter.
Examples of low-pass filters:
• Ideal low-pass filter
\[ H(u, v) = \begin{cases} 1 & \text{if } D(u, v) \le D_0 \\ 0 & \text{if } D(u, v) > D_0 \end{cases}. \]
This filter produces the so-called ringing effect or Gibbs effect (oscillations near boundaries and edges), due to the sharp transition between 0 and 1.

The following two filters produce less ringing.
• Let n be a positive parameter. The Butterworth low-pass filter of order n is $H(u, v) = \frac{1}{1 + (D(u, v)/D_0)^{2n}}$.
• The Gaussian low-pass filter is $H(u, v) = e^{-D(u, v)^2 / 2D_0^2}$.
Examples of high-pass filters:
• Ideal high-pass filter
\[ H(u, v) = \begin{cases} 0 & \text{if } D(u, v) \le D_0 \\ 1 & \text{if } D(u, v) > D_0 \end{cases}. \]
This filter also produces the ringing (Gibbs) effect, due to the sharp transition between 0 and 1. The following two filters produce less ringing.
• Let n be a positive parameter. The Butterworth high-pass filter of order n is $H(u, v) = \frac{1}{1 + (D_0/D(u, v))^{2n}}$.
• The Gaussian high-pass filter is $H(u, v) = 1 - e^{-D(u, v)^2 / 2D_0^2}$.
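A minimal Matlab sketch of steps 1-6 above with a Gaussian low-pass filter (the cutoff D0 and the test image are illustrative choices):

% Gaussian low-pass filtering in the frequency domain (steps 1-6).
f = double(imread('cameraman.tif'));
[M, N] = size(f);
[c, r] = meshgrid(0:N-1, 0:M-1);        % column and row coordinates
F = fft2(f .* (-1).^(r + c));           % steps 1-2: centered DFT
D2 = (r - M/2).^2 + (c - N/2).^2;       % D(u,v)^2, distance to the center
D0 = 30;
H = exp(-D2 / (2 * D0^2));              % Gaussian low-pass filter
g = real(ifft2(H .* F)) .* (-1).^(r + c);   % steps 3-6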

The Laplacian in the frequency domain. Assuming that f(x) is a one-dimensional function defined on the real line in the continuous case, and that f has derivatives up to order n, we have the following formula expressing the Fourier transform of the derivative, $\mathcal{F}\big( \frac{\partial^n f}{\partial x^n} \big)$, as a function of the Fourier transform of f:
\[ \mathcal{F}\Big( \frac{\partial^n f}{\partial x^n} \Big) = (2\pi i u)^n F(u). \tag{9} \]
To prove this formula, consider the IFT
\[ f(x) = \int_{-\infty}^{\infty} F(u) e^{2\pi i u x}\, du; \]
differentiating both sides with respect to x (a parameter under the integral sign), we obtain
\[ \frac{\partial f}{\partial x} = \int_{-\infty}^{\infty} \big[ F(u)(2\pi i u) \big] e^{2\pi i u x}\, du, \]
so by the uniqueness of the inverse Fourier transform, we must have
\[ \mathcal{F}\Big( \frac{\partial f}{\partial x} \Big) = (2\pi i u) F(u), \]
which is the above formula for n = 1. Repeating the process, we deduce formula (9) for any derivative of order n.
Now we apply formula (9) with n = 2 in each variable x and y to the Laplacian in two dimensions, using also the linearity of the Fourier transform:
\[ \mathcal{F}(\triangle f) = \mathcal{F}\Big( \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \Big) = \mathcal{F}\Big( \frac{\partial^2 f}{\partial x^2} \Big) + \mathcal{F}\Big( \frac{\partial^2 f}{\partial y^2} \Big) = (2\pi i u)^2 F(u, v) + (2\pi i v)^2 F(u, v) = -4\pi^2 (u^2 + v^2) F(u, v), \]
thus the computation of the Laplacian can be done with the filter
\[ H_{lapl}(u, v) = -4\pi^2 (u^2 + v^2). \]

Due to the shifting of the center, we define $H_{lapl}$ as
\[ H_{lapl}(u, v) = -4\pi^2 \Big[ \Big( u - \frac{M}{2} \Big)^2 + \Big( v - \frac{N}{2} \Big)^2 \Big]. \]
Enhancement using the Laplacian in the frequency domain.
Let f be the input image and g the output enhanced image obtained by $g = f - \triangle f$. This operation can also be performed in the frequency domain, using the linearity of the transform and the Laplacian filter $H_{lapl}$:
\[ \mathcal{F}(f - \triangle f) = \mathcal{F}(f) - \mathcal{F}(\triangle f) = F(u, v) + 4\pi^2 (u^2 + v^2) F(u, v) = \big( 1 + 4\pi^2 (u^2 + v^2) \big) F(u, v) = \big( 1 - H_{lapl}(u, v) \big) F(u, v). \]
Using the shifting of the center, we obtain enhancement with the Laplacian in the frequency domain by the following steps:
1. $F(u, v) = \mathrm{2D\,DFT}\big( f(x, y)(-1)^{x+y} \big)$
2. $H_{enhance}(u, v) = 1 + 4\pi^2 \big[ (u - M/2)^2 + (v - N/2)^2 \big]$
3. $G(u, v) = H_{enhance}(u, v)\, F(u, v)$
4. $g(x, y) = \mathrm{Re}\big[ \mathrm{2D\,IDFT}(G) \big] (-1)^{x+y}$
Remark: In the spatial domain, the images f and $\triangle f$ have comparable values (no additional rescaling was necessary to apply $g = f - \triangle f$). However, in the frequency domain, $\mathcal{F}(\triangle f)$ introduces DFT scaling factors that can be several orders of magnitude larger than the maximum of f. Thus rescaling of f or of $\triangle f$ has to be introduced appropriately (e.g., normalizing f and $\triangle f$ to [0, 1]).
Unsharp masking and highboost filtering in the frequency domain. Let f be the input image, to be made sharper. The equivalent of the unsharp masking and highboost filtering technique $g = f + k(f - f_{smooth})$ in the frequency domain is as follows. The component $f_{smooth}$ can be obtained by applying a low-pass filter $H_{LP}$ to f. Then, using the linearity of the Fourier transform, we have
\[ g = \mathcal{F}^{-1}\Big[ F(u, v) + k \big( F(u, v) - H_{LP}(u, v) F(u, v) \big) \Big] = \mathcal{F}^{-1}\Big[ \big( 1 + k (1 - H_{LP}(u, v)) \big) F(u, v) \Big] = \mathcal{F}^{-1}\Big[ \big( 1 + k H_{HP}(u, v) \big) F(u, v) \Big]. \]
Recall that k = 1 corresponds to unsharp masking and k > 1 to highboost filtering. The frequency domain filter $1 + k H_{HP}(u, v)$ is called a high-frequency emphasis filter.
A more general filter can be obtained by
\[ g = \mathcal{F}^{-1}\Big[ \big( k_1 + k_2 H_{HP}(u, v) \big) F(u, v) \Big], \]
with $k_1 \ge 0$ and $k_2 \ge 0$.

Final remarks:
- The direct and inverse discrete Fourier transforms are computed using the Fast Fourier Transform (FFT) algorithm.
- Additional properties of the continuous and discrete Fourier transforms are given in the assigned exercises.
- Additional applications of filtering in the Fourier domain will be seen in the sections on Image Reconstruction and Image Segmentation.

5 Image Restoration
Image restoration is the process of recovering an image that has been degraded, using a priori knowledge of the degradation process.
Degradation model: Let f(x, y) be the true ideal image that we wish to recover, and let g(x, y) be its degraded version. One relation linking f to g is the degradation model
\[ g(x, y) = H[f(x, y)] + n(x, y), \]
where H denotes a degradation operator (e.g., blur) and n is additive noise.
Inverse Problem: Knowing the degradation operator H and the statistics of the noise n, find a good estimate $\hat{f}$ of f.

5.1 Image Denoising


We assume that the degradation operator is the identity, thus we deal only with denoising in this subsection. The linear degradation model becomes g(x, y) = f(x, y) + n(x, y). Note that not all types of noise are additive.

Random noise We assume here that the noise intensity levels can be seen as a random variable,
with associated histogram or probability distribution function (PDF) denoted by p(r). We also
assume that the noise n is independent of the image f and independent of spatial coordinates
(x, y).
Most common types of noise:
• Gaussian noise (additive), with associated PDF $p(r) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(r-\mu)^2 / 2\sigma^2}$, where $\mu$ is the mean and $\sigma$ denotes the standard deviation.
• Uniform noise (additive), with associated PDF
\[ p(r) = \begin{cases} \frac{1}{B-A} & \text{if } A \le r \le B \\ 0 & \text{otherwise} \end{cases}. \]
• Impulse noise (salt-and-pepper, or bipolar) (not additive), with associated PDF
\[ p(r) = \begin{cases} p_A & \text{if } r = A \\ p_B & \text{if } r = B \\ 0 & \text{otherwise} \end{cases}. \]
Mean filters for random noise removal.
Let g be the input noisy image and $\hat{f}$ the output denoised image. Let $S_{(x,y)}$ be a neighborhood of the pixel (x, y) defined by
\[ S_{(x,y)} = \{ (x+s, y+t) : -a \le s \le a, \ -b \le t \le b \}, \]
of size mn, where m = 2a + 1 and n = 2b + 1 are positive odd integers.
• Arithmetic Mean Filter
\[ \hat{f}(x, y) = \frac{1}{mn} \sum_{(s,t) \in S_{(x,y)}} g(s, t). \]
This filter is useful for removing Gaussian noise or uniform noise, but it introduces blurring.
• Geometric Mean Filter
\[ \hat{f}(x, y) = \Big( \prod_{(s,t) \in S_{(x,y)}} g(s, t) \Big)^{1/mn}. \]
This filter introduces blurring comparable to the arithmetic mean filter, but tends to lose less image detail in the process.
• Contraharmonic Mean Filter of order Q
\[ \hat{f}(x, y) = \frac{\sum_{(s,t) \in S_{(x,y)}} g(s, t)^{Q+1}}{\sum_{(s,t) \in S_{(x,y)}} g(s, t)^{Q}}. \]
Here, the parameter Q gives the order of the filter.


Q = 0: reduces to the arithmetic mean filter.
Q = −1: called the harmonic mean filter (works well for salt noise and Gaussian noise; fails for pepper noise).
Q > 0: useful for removing pepper noise.
Q < 0: useful for removing salt noise.
One needs to know which sign to use; this filter cannot remove salt noise and pepper noise simultaneously. We therefore still need a filter that can remove both salt and pepper noise simultaneously.
• The Median Filter is an "order statistics filter", where $\hat{f}(x, y)$ depends on the ordering of the pixel values of g in the window $S_{(x,y)}$. The median filter output is the 50% ranking of the ordered values:
\[ \hat{f}(x, y) = \mathrm{median}\{ g(s, t) : (s, t) \in S_{(x,y)} \}. \]
For example, for a $3 \times 3$ median filter, if
\[ g|_{S_{(x,y)}} = \begin{pmatrix} 1 & 5 & 20 \\ 200 & 5 & 25 \\ 25 & 9 & 100 \end{pmatrix}, \]
we first order the values in the window: 1, 5, 5, 9, 20, 25, 25, 100, 200. The 50% ranking (the 5th value here) is 20, thus $\hat{f}(x, y) = 20$.
The median filter introduces much less blurring than the other filters of the same window size.
It can be used for salt noise, pepper noise, or salt-and-pepper noise. Note that the median operation
is not a linear operation.
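A minimal Matlab sketch of an m × m median filter with reflected boundaries (the toolbox function medfilt2 does the same job faster):

% m-by-m median filtering of a noisy image g, m odd.
function fhat = median_filter(g, m)
    a = (m - 1) / 2;
    gp = padarray(double(g), [a a], 'symmetric');
    [M, N] = size(g);
    fhat = zeros(M, N);
    for x = 1:M
        for y = 1:N
            win = gp(x:x+2*a, y:y+2*a);   % window S_(x,y)
            fhat(x, y) = median(win(:));  % 50% ranking of the ordered values
        end
    end
end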
The following two filters can be seen as combinations of order statistics filters and averaging
filters, and can be used for several types of noise.
• Midpoint Filter
\[ \hat{f}(x, y) = \frac{1}{2} \Big[ \max_{(s,t) \in S_{(x,y)}} g(s, t) + \min_{(s,t) \in S_{(x,y)}} g(s, t) \Big]. \]
This filter works for randomly distributed noise, such as Gaussian noise or uniform noise.

• Alpha-trimmed mean filter
Let $d \ge 0$ be an even integer with $0 \le d \le mn - 1$. We first order the mn pixel values of the input image g in the window $S_{(x,y)}$, and then remove the lowest d/2 and the largest d/2. Denoting the remaining mn − d values by $g_r$, we set
\[ \hat{f}(x, y) = \frac{1}{mn - d} \sum_{(s,t) \in S_{(x,y)}} g_r(s, t), \]
where only the remaining values are used in the sum.
d = 0 reduces to the arithmetic mean filter.
d = mn − 1 reduces to the median filter.
This filter is useful for multiple types of noise (Gaussian noise, uniform noise, salt-and-pepper noise).
Note that, when the noise is stronger, the above methods require a larger window $S_{(x,y)}$; however, a larger window introduces more blurring. There are adaptive filters based on the previous methods that are slightly more complex in design but produce improved results with less blurring (for example, see the "adaptive, local noise reduction filter" and the "adaptive median filter" in [7]).
• The Nonlocal Means Filter
Let g be the input image.
- For two different pixels X = (x, y) and X′ = (x′, y′) of the image g, we consider their corresponding neighborhoods
\[ S_X = \{ (x+s, y+t) : -a \le s, t \le a \}, \quad S_{X'} = \{ (x'+s, y'+t) : -a \le s, t \le a \}; \]
thus these two neighborhoods have the same size $m^2$ and the same shape, with m = 2a + 1.
- We consider two small images, the patches $g_X = g|_{S_X}$ and $g_{X'} = g|_{S_{X'}}$. Thus each of $g_X$ and $g_{X'}$ is an image of size $m \times m$ (in other words, $g_X$ is the patch of g around X and $g_{X'}$ is the patch of g around X′).
- We apply a smoothing Gaussian filter to $g_X$ and $g_{X'}$, obtaining smoothed versions $\tilde{g}_X$, $\tilde{g}_{X'}$, again each of size $m \times m$.
- We compute the distance between the two image patches:
\[ \| \tilde{g}_X - \tilde{g}_{X'} \|^2 = \text{sum of squares of the entries of the matrix } \tilde{g}_X - \tilde{g}_{X'}. \]
- For each pair of pixels X and X′, define the weight $w(X, X') = e^{-\| \tilde{g}_X - \tilde{g}_{X'} \|^2 / h^2}$, where h is the noise level parameter.
- Define the output filtered and denoised image $\hat{f}$ at pixel X by
\[ \hat{f}(X) = \frac{\sum_{X' \in \text{image domain}} w(X, X')\, g(X')}{\sum_{X' \in \text{image domain}} w(X, X')}. \]
Recall that the normalized weighted linear filter with window mask w computes a weighted average over a spatial neighborhood of (x, y). Here, the NL Means filter computes a weighted average over an intensity neighborhood of g(X): in the above formula, $\hat{f}(X)$ is a weighted average of all values g(X′) for which the patch around X is similar to the patch around X′.
This method can be tested at: http://www.ipol.im/pub/algo/bcm_non_local_means_denoising/
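A minimal brute-force Matlab sketch of the filter; to keep the cost manageable it restricts the average to a search window around each pixel, and it omits the preliminary Gaussian smoothing of the patches (both are common simplifications, not part of the definition above). The parameters a, s, h are illustrative:

% NL Means (simplified): patch half-width a, search half-width s, noise level h.
function fhat = nl_means(g, a, s, h)
    g = double(g);
    [M, N] = size(g);
    gp = padarray(g, [a a], 'symmetric');
    fhat = zeros(M, N);
    for x = 1:M
        for y = 1:N
            P = gp(x:x+2*a, y:y+2*a);             % patch g_X around X = (x,y)
            wsum = 0;  acc = 0;
            for xp = max(1, x-s) : min(M, x+s)    % candidate pixels X'
                for yp = max(1, y-s) : min(N, y+s)
                    Q = gp(xp:xp+2*a, yp:yp+2*a); % patch g_X'
                    w = exp(-sum((P(:) - Q(:)).^2) / h^2);  % weight w(X, X')
                    wsum = wsum + w;
                    acc  = acc + w * g(xp, yp);
                end
            end
            fhat(x, y) = acc / wsum;              % weighted average
        end
    end
end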

Periodic (deterministic) noise. We assume here that the noise n(x, y) depends on the spatial coordinates (x, y); it may be due to electrical or electromechanical interference. An example of periodic noise is $n(x, y) = \alpha \cos(x+y) + \beta \sin(x+y)$.
We define a unit impulse (or delta function) by
\[ \delta(u, v) = \begin{cases} 1 & \text{if } (u, v) = (0, 0) \\ 0 & \text{otherwise} \end{cases}. \]
If we visualize $\delta$ as an image, it is a bright white dot at (0, 0), black elsewhere.
Let $u_0$ and $v_0$ be two parameters. It is possible to prove the following formula in the discrete case:
\[ \mathcal{F}\big( \sin(2\pi u_0 x + 2\pi v_0 y) \big) = \frac{i}{2} \big[ \delta(u + M u_0, v + N v_0) - \delta(u - M u_0, v - N v_0) \big]. \]
Thus, if we visualize the Fourier spectrum of the "noisy" image $g(x, y) = f(x, y) + \alpha n(x, y)$, with $n(x, y) = \sin(2\pi u_0 x + 2\pi v_0 y)$ as periodic noise, we notice two bright regions, symmetrically located, corresponding to the two impulses in the above formula.
Thus, to remove such periodic noise, we define two small rectangles or disks A and B, each containing one of the two bright regions, and a filter H(u, v) in the frequency domain such that
\[ H(u, v) = \begin{cases} 0 & \text{if } (u, v) \in A \cup B \\ 1 & \text{otherwise} \end{cases}. \]

This type of filter is called a notch filter. Note that Gaussian-type or Butterworth-type filters, with a smoother transition between 0 and 1, can also be adapted to this application. The filtered image $\hat{f}$ should have the periodic pattern removed, while keeping most of the image details of f.
Note that there is an improved version of the notch filter for removing periodic noise, called the "optimum notch filter" [7].

5.2 Image Deblurring
Linear and position-invariant degradation functions. Recall the degradation model g = H[f] + n, where H is a degradation operator and n is additive noise. Assume first that n(x, y) = 0 (no noise). We need the following definitions and properties of H:
- we assume that H is linear: for any two functions $f_1$ and $f_2$,
\[ H[f_1 + f_2] = H[f_1] + H[f_2] \ \text{(additivity)}, \quad H[\alpha f] = \alpha H[f] \ \text{(scalar multiplication)}; \]
- we assume that the additivity property extends to integrals;
- we assume that H is position invariant: if g(x, y) = H[f(x, y)], then $g(x - \alpha, y - \beta) = H[f(x - \alpha, y - \beta)]$ for all $x, y, \alpha, \beta$ and all functions f and g;
- we define a continuous impulse function $\delta$ such that (by definition) $f * \delta = f$ for any function f. In other words, $\delta$ is defined by
\[ \delta(x, y) = \begin{cases} \infty & \text{if } (x, y) = (0, 0) \\ 0 & \text{otherwise} \end{cases}, \]
and the property $f * \delta = f$ becomes
\[ f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, \delta(x - \alpha, y - \beta)\, d\alpha\, d\beta. \tag{10} \]

We prove that, under the above assumptions and notations, the operator H is a convolution.
Proof. We have g(x, y) = H[f(x, y)], thus according to (10),
\[ g(x, y) = H\Big[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, \delta(x - \alpha, y - \beta)\, d\alpha\, d\beta \Big]; \]
extending the additivity property to integrals, we obtain
\[ g(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} H\big[ f(\alpha, \beta)\, \delta(x - \alpha, y - \beta) \big]\, d\alpha\, d\beta. \]
Now $f(\alpha, \beta)$ is constant with respect to (x, y), thus by the scalar multiplication property,
\[ g(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, H\big[ \delta(x - \alpha, y - \beta) \big]\, d\alpha\, d\beta. \]
Denote $h(x, y) = H[\delta(x, y)]$; by the position invariance of H we have $h(x - \alpha, y - \beta) = H[\delta(x - \alpha, y - \beta)]$, and g(x, y) becomes
\[ g(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, h(x - \alpha, y - \beta)\, d\alpha\, d\beta; \]
in other words, g(x, y) = f * h(x, y) by the definition of the convolution. Since f * h = h * f, we finally obtain $g(x, y) = H[f(x, y)] = h * f(x, y)$.
In the presence of additive noise, the (linear) degradation model becomes
\[ g(x, y) = h * f(x, y) + n(x, y), \quad \text{or, in the frequency domain,} \quad G(u, v) = H(u, v) F(u, v) + N(u, v). \]
The convolution kernel h is called a point-spread-function (PSF), since if it is applied to a
(sharp) impulse (white dot over black background), the result is a spreading out of the white dot
(instead of seeing a bright dot over a black background, we see a larger and diffuse white disk over
the black background).
The process of recovering f from g is now called denoising-deblurring or denoising-deconvolution.
The deconvolution process is difficult, especially in the presence of noise, since this leads to a
highly ill-posed problem.

Examples of degradation functions.
We give below a few examples of linear degradation functions h or H, obtained by modeling the physical image acquisition process.
• Atmospheric Turbulence Blur
In the frequency domain, this is defined by $H(u, v) = e^{-k(u^2+v^2)^{5/6}}$, where k is a constant depending on the degree of turbulence:
k = 0.0025 corresponds to severe turbulence,
k = 0.001 corresponds to mild turbulence,
k = 0.00025 corresponds to low turbulence.
A Gaussian low-pass filter can also be used to model atmospheric turbulence blur.
• Out-of-Focus Blur
In the spatial domain, this is defined by
\[ h(x, y) = \begin{cases} 1 & \text{if } x^2 + y^2 \le D_0^2 \\ 0 & \text{otherwise} \end{cases} \]
(a larger parameter $D_0$ gives more severe blur).
• Uniform Linear Motion Blur
Assume that the image f(x, y) undergoes planar motion during the acquisition. Let $(x_0(t), y_0(t))$ be the motion components in the x- and y-directions, where t denotes time and T is the duration of the exposure. The motion blur degradation is
\[ g(x, y) = \int_0^T f(x - x_0(t), y - y_0(t))\, dt. \]

We want to express the motion blur in the frequency domain. Let $G = \mathcal{F}(g)$; by definition,
\[ G(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) e^{-2\pi i (ux+vy)}\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Big[ \int_0^T f(x - x_0(t), y - y_0(t))\, dt \Big] e^{-2\pi i (ux+vy)}\, dx\, dy = \int_0^T \Big[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x - x_0(t), y - y_0(t)) e^{-2\pi i (ux+vy)}\, dx\, dy \Big] dt. \]
Using the property $\mathcal{F}\big( f(x - x_0, y - y_0) \big) = F(u, v) e^{-2\pi i (u x_0 + v y_0)}$, we obtain
\[ G(u, v) = \int_0^T F(u, v) e^{-2\pi i (u x_0(t) + v y_0(t))}\, dt = F(u, v) \int_0^T e^{-2\pi i (u x_0(t) + v y_0(t))}\, dt, \]
thus G(u, v) = H(u, v) F(u, v), with the motion degradation function in the frequency domain
\[ H(u, v) = \int_0^T e^{-2\pi i (u x_0(t) + v y_0(t))}\, dt. \]

• Radon Transform Degradation in Computerized Tomography
In computerized tomography, parallel beams of X-rays are sent through the body. Energy is absorbed, and a sensor measures the attenuated values along each beam. Thus the "image" f(x, y) that we have to recover is an absorption function of the tissue (each tissue absorbs the energy or radiation in a different way). The data g is given in the Radon domain, and the relation between f and g can be defined by
\[ g(\rho, \theta) = R(f) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, \delta(x \cos\theta + y \sin\theta - \rho)\, dx\, dy, \]
or, equivalently,
\[ g(\rho, \theta) = \int_{\text{line } x \cos\theta + y \sin\theta = \rho} f, \]
where $(\rho, \theta)$ are the parameters describing a line. The image $g(\rho, \theta)$ for $0 \le \rho \le r$ and $0 \le \theta \le \pi$ is called a sinogram. The problem is to recover f from g. In reality, we only have a finite (limited) number of "projections" (lines); R(f) is called the Radon transform of f.

Image reconstruction from blurry and noisy data. We will discuss several models for image deconvolution (with or without noise).
The simplest approach is called direct inverse filtering. In the ideal case without noise, G(u, v) = H(u, v) F(u, v), so to recover F we could define $\hat{F}(u, v) = \frac{G(u, v)}{H(u, v)}$. However, this approach has at least two problems:
(i) In reality there is unknown noise N, so the actual relation is
\[ \hat{F}(u, v) = \frac{G(u, v)}{H(u, v)} = F(u, v) + \frac{N(u, v)}{H(u, v)}. \]
So even if we know the degradation function H(u, v), we cannot recover F exactly, because N(u, v) is unknown.
(ii) The blurring filter H(u, v) is usually zero or very small (close to zero) away from the origin (or away from the center of the domain). Thus the division by H(u, v) is not defined at many values (u, v), or the term N(u, v)/H(u, v) dominates and leads to an incorrect reconstruction.
The Wiener Filter is given by
\[ \hat{F}(u, v) = \Big[ \frac{1}{H(u, v)} \cdot \frac{|H(u, v)|^2}{K + |H(u, v)|^2} \Big] G(u, v), \]
where K > 0 is a parameter, chosen so that visually good results are obtained. This filter can be interpreted in the following way:
(i) First, the factor $\frac{|H(u, v)|^2}{K + |H(u, v)|^2}$ acts like a low-pass filter to remove the noise.
(ii) Then, the factor $\frac{1}{H(u, v)}$ acts as in direct inverse filtering (direct deconvolution). But we no longer divide by very small values, since the filter is equivalent to
\[ \hat{F}(u, v) = \Big[ \frac{\overline{H(u, v)}}{K + |H(u, v)|^2} \Big] G(u, v). \]
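A minimal Matlab sketch of Wiener filtering with a known Gaussian blur H, built directly on the centered frequency grid; the blur width, K, and the input image are illustrative choices, and for real H the factor conj(H) simply equals H:

% Wiener deconvolution of a blurry-noisy image g with known H.
g = double(imread('cameraman.tif'));    % stands in for the degraded data
[M, N] = size(g);
[c, r] = meshgrid(0:N-1, 0:M-1);
D2 = (r - M/2).^2 + (c - N/2).^2;
H = exp(-D2 / (2 * 20^2));              % assumed Gaussian blur, centered
G = fftshift(fft2(g));                  % centered spectrum of the data
K = 0.01;                               % tuned so results look good visually
Fhat = conj(H) ./ (K + abs(H).^2) .* G; % the Wiener filter
fhat = real(ifft2(ifftshift(Fhat)));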

Error measures. Assume that we perform an artificial experiment, so we know the true image f that we wish to recover. Once we have a denoised or deblurred image $\hat{f}$, we can measure in several ways how good the restored result $\hat{f}$ is. We give two examples of measures:
• Root Mean Square Error (RMSE):
\[ RMSE = \sqrt{ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \big[ \hat{f}(x, y) - f(x, y) \big]^2 }. \]
• Signal-to-Noise Ratio (SNR):
\[ SNR = \frac{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x, y)^2 }{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \big[ \hat{f}(x, y) - f(x, y) \big]^2 }. \]
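In Matlab, with f the true image and fhat the restored one (both double arrays), these two measures read:

% Error measures between the true image f and the restored image fhat.
rmse = sqrt(mean((f(:) - fhat(:)).^2));
snr  = sum(fhat(:).^2) / sum((f(:) - fhat(:)).^2);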

5.3 Energy minimization methods for image reconstruction
Consider first the denoising problem, for simplicity. Let g be the given noisy image and f the image to be restored, linked through the relation g(x, y) = f(x, y) + n(x, y).
Let us recall for a moment the enhancement method using the Laplacian: $g = f - \triangle f$, where f was the input blurry image and g was the sharper output image. To remove noise, however, we need the opposite process, smoothing. So let us invert the relation $g = f - \triangle f$ to obtain
\[ f = g + \triangle f, \tag{11} \]
where g is the input noisy image and f is the denoised processed image. This is a linear partial differential equation with unknown f. There are many ways to solve (11).
• One way is via the Fourier transform: (11) is equivalent to
\[ \mathcal{F}(f) = \mathcal{F}(g + \triangle f), \]
thus by the linearity of the Fourier transform,
\[ \mathcal{F}(f) = \mathcal{F}(g) + \mathcal{F}(\triangle f), \]
or
\[ F(u, v) = G(u, v) + \big[ -4\pi^2 (u^2 + v^2) \big] F(u, v), \]
therefore
\[ F(u, v) = \frac{1}{1 + 4\pi^2 (u^2 + v^2)}\, G(u, v), \]
and we notice that $\frac{1}{1 + 4\pi^2 (u^2 + v^2)}$ acts as a low-pass filter.
• Another way is by discretization in the spatial domain (assuming that f is extended by reflection outside of its domain): a discrete version of equation (11) can be obtained using the 5-point Laplacian, with (x, y) integer coordinates:
\[ f(x, y) = g(x, y) + \big[ f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y) \big], \quad \forall (x, y). \tag{12} \]
Equation (12) can be solved by direct or iterative methods for linear systems. As an example, the simplest iterative method is as follows: let g(x, y) be the given noisy data, defined for x = 1, ..., M and y = 1, ..., N.
(i) Start with an initial guess $f^0(x, y)$ defined for all (x, y) in the original domain x = 1, ..., M, y = 1, ..., N (often we choose $f^0 = g$). Then extend $f^0$ by reflection to the domain x = 0, ..., M+1, y = 0, ..., N+1.
(ii) For integer $n \ge 0$, and for all x = 1, ..., M, y = 1, ..., N:
\[ f^{n+1}(x, y) = g(x, y) + \big[ f^n(x+1, y) + f^n(x-1, y) + f^n(x, y+1) + f^n(x, y-1) - 4 f^n(x, y) \big]. \]
(iii) Impose boundary conditions by reflection:
$f^{n+1}(0, y) = f^{n+1}(2, y)$ and $f^{n+1}(M+1, y) = f^{n+1}(M-1, y)$ for all y = 1, 2, ..., N;
$f^{n+1}(x, 0) = f^{n+1}(x, 2)$ and $f^{n+1}(x, N+1) = f^{n+1}(x, N-1)$ for all x = 1, 2, ..., M;
$f^{n+1}(0, 0) = f^{n+1}(2, 2)$, $f^{n+1}(0, N+1) = f^{n+1}(2, N-1)$, $f^{n+1}(M+1, 0) = f^{n+1}(M-1, 2)$, $f^{n+1}(M+1, N+1) = f^{n+1}(M-1, N-1)$.
(iv) Let n = n + 1 and repeat steps (ii)-(iii) until convergence.
(v) If the matrix norm $\| f^{n+1} - f^n \| \le$ tolerance, stop and output $f = f^{n+1}$ on the domain x = 1, ..., M, y = 1, ..., N.
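A minimal Matlab sketch of this computation. Note that the update in step (ii), taken literally, can amplify high-frequency errors; rearranging (12) as $f(x, y) = \big( g(x, y) + f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) \big)/5$ gives the classical Jacobi iteration for the same linear system, which converges since the system is diagonally dominant. That convergent variant is the one sketched here:

% Jacobi-type iteration for (12), boundary conditions by reflection.
g = double(imread('cameraman.tif'));    % the given noisy image
f = g;                                  % initial guess f^0 = g
for n = 1:500
    fp = padarray(f, [1 1], 'symmetric');
    nb = fp(1:end-2, 2:end-1) + fp(3:end, 2:end-1) ...
       + fp(2:end-1, 1:end-2) + fp(2:end-1, 3:end);  % sum of the 4 neighbors
    fnew = (g + nb) / 5;
    if norm(fnew - f, 'fro') <= 1e-4 * norm(f, 'fro'), f = fnew; break; end
    f = fnew;
end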

We now want to show that the solution f of (11) can also be obtained as a minimizer of the following discrete energy (taking into account the extension by reflection outside of the domain):
\[ E(f) = \sum_{x=1}^{M} \sum_{y=1}^{N} [f(x, y) - g(x, y)]^2 + \sum_{x=1}^{M} \sum_{y=1}^{N} \big[ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 \big]. \tag{13} \]
Indeed, assume that some discrete image function f is a minimizer of E. Then, since E is differentiable, we must have $\frac{\partial E}{\partial f(x, y)} = 0$ at all points (x, y). We compute
\[ \frac{\partial E}{\partial f(x, y)} = 2(f(x, y) - g(x, y)) - 2[f(x+1, y) - f(x, y)] - 2[f(x, y+1) - f(x, y)] + 2[f(x, y) - f(x-1, y)] + 2[f(x, y) - f(x, y-1)] = 0, \]
and rearranging the terms we obtain
\[ f(x, y) = g(x, y) + \big[ f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y) \big], \]
which is exactly equation (12).


We notice that a continuous version of (13) is
\[ E(f) = \int\!\!\int (f(x, y) - g(x, y))^2\, dx\, dy + \int\!\!\int \Big[ \Big( \frac{\partial f}{\partial x} \Big)^2 + \Big( \frac{\partial f}{\partial y} \Big)^2 \Big]\, dx\, dy, \]
or
\[ E(f) = \int\!\!\int (f(x, y) - g(x, y))^2\, dx\, dy + \int\!\!\int |\nabla f|^2\, dx\, dy, \]
where $\nabla f(x, y) = \big( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \big)$ is the gradient of f and $|\nabla f|$ is the gradient magnitude.
In the above energy E(f), the first term is called a data fidelity term, while the second term is an (isotropic) regularization (or isotropic smoother).
In this approach, if we use one of the equations (11) or (12) to denoise images, the noise is smoothed out but the edges also become blurry, because the Laplacian is an isotropic diffusion operator, acting in all directions. To overcome this problem, we use the gradient $\nabla f(x, y) = \big( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \big)$ as an edge indicator. Near an edge of the image f, where there are strong variations, the gradient magnitude $|\nabla f|$ is large; on the contrary, away from edges, where the variations are slow, $|\nabla f|$ is small. Therefore, we do not want the image f to diffuse where $|\nabla f|$ is large. We modify the PDE $f = g + \triangle f \Leftrightarrow f = g + \frac{\partial}{\partial x}\big( \frac{\partial f}{\partial x} \big) + \frac{\partial}{\partial y}\big( \frac{\partial f}{\partial y} \big)$ into
\[ f = g + \frac{\partial}{\partial x}\Big( \frac{1}{|\nabla f|(x, y)} \frac{\partial f}{\partial x} \Big) + \frac{\partial}{\partial y}\Big( \frac{1}{|\nabla f|(x, y)} \frac{\partial f}{\partial y} \Big). \tag{14} \]
This is an anisotropic diffusion equation, which also comes from the following energy minimization:
\[ \min_f \Big\{ J(f) = \frac{1}{2} \int\!\!\int (f(x, y) - g(x, y))^2\, dx\, dy + \int\!\!\int |\nabla f|\, dx\, dy \Big\}. \tag{15} \]
When we deal with both denoising and deblurring, we modify (15) into
\[ J(f) = \frac{1}{2} \int\!\!\int (h * f(x, y) - g(x, y))^2\, dx\, dy + \lambda \int\!\!\int |\nabla f|\, dx\, dy, \tag{16} \]
introduced by Rudin, Osher and Fatemi in [11, 12]. The first term in (16) is a data fidelity term, while the second term is called the total variation regularization. The parameter $\lambda > 0$ is a weight between the fidelity term and the regularization term.
The fidelity term $\frac{1}{2} \int\!\!\int (h * f(x, y) - g(x, y))^2\, dx\, dy$ is appropriate for additive Gaussian noise. If the image g is noisy due to additive Laplacian noise or to impulse noise (salt-and-pepper noise), the fidelity term is modified into the 1-norm $\int\!\!\int |h * f(x, y) - g(x, y)|\, dx\, dy$.
We now want to show that the following anisotropic PDE
\[ \tilde{h} * h * f = \tilde{h} * g + \lambda \frac{\partial}{\partial x}\Big( \frac{1}{|\nabla f|(x, y)} \frac{\partial f}{\partial x} \Big) + \lambda \frac{\partial}{\partial y}\Big( \frac{1}{|\nabla f|(x, y)} \frac{\partial f}{\partial y} \Big) \tag{17} \]
must be formally satisfied by a minimizer of the convex energy J in (16), where $\tilde{h}(x, y) = h(-x, -y)$. To do this, we formally impose $\frac{\partial J}{\partial f} = 0$, which is an equivalent relation to (17).
Consider the discrete case, for the purpose of illustration. Assume that the given noisy, blurry image g is a matrix of size $M \times N$; we want to recover a denoised, deblurred image f as a matrix of size $M \times N$. We write H[f] = h * f for simplicity, where H is a linear operator (since $h * (f_1 + f_2) = h * f_1 + h * f_2$, which is easy to verify). A discrete version of J is
\[ J(f) = \frac{1}{2} \sum_{x=1}^{M} \sum_{y=1}^{N} \big( H[f](x, y) - g(x, y) \big)^2 + \lambda \sum_{x=1}^{M} \sum_{y=1}^{N} \sqrt{ \big( f(x+1, y) - f(x, y) \big)^2 + \big( f(x, y+1) - f(x, y) \big)^2 }, \tag{18} \]
assuming the space discretization steps $\Delta x = \Delta y = 1$ and x, y integers. Differentiating with respect to f(x, y), for (x, y) fixed, we obtain
\[ \frac{\partial J}{\partial f(x, y)} = H^T\big( H[f] - g \big)(x, y) + \lambda \frac{ -2\big( f(x+1, y) - f(x, y) \big) - 2\big( f(x, y+1) - f(x, y) \big) }{ 2\sqrt{ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 } } + \lambda \frac{ 2\big( f(x, y) - f(x-1, y) \big) }{ 2\sqrt{ (f(x, y) - f(x-1, y))^2 + (f(x-1, y+1) - f(x-1, y))^2 } } + \lambda \frac{ 2\big( f(x, y) - f(x, y-1) \big) }{ 2\sqrt{ (f(x+1, y-1) - f(x, y-1))^2 + (f(x, y) - f(x, y-1))^2 } } = 0. \]
After simplifications, we obtain
\[ \frac{\partial J}{\partial f(x, y)} = H^T\big( H[f] - g \big)(x, y) - \lambda \frac{ f(x+1, y) - f(x, y) }{ \sqrt{ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 } } - \lambda \frac{ f(x, y+1) - f(x, y) }{ \sqrt{ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 } } - \lambda \frac{ f(x-1, y) - f(x, y) }{ \sqrt{ (f(x, y) - f(x-1, y))^2 + (f(x-1, y+1) - f(x-1, y))^2 } } - \lambda \frac{ f(x, y-1) - f(x, y) }{ \sqrt{ (f(x+1, y-1) - f(x, y-1))^2 + (f(x, y) - f(x, y-1))^2 } } = 0, \]
or
\[ H^T\big( H[f] - g \big)(x, y) = \lambda \Big\{ \frac{ f(x+1, y) - f(x, y) }{ \sqrt{ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 } } - \frac{ f(x, y) - f(x-1, y) }{ \sqrt{ (f(x, y) - f(x-1, y))^2 + (f(x-1, y+1) - f(x-1, y))^2 } } \Big\} + \lambda \Big\{ \frac{ f(x, y+1) - f(x, y) }{ \sqrt{ (f(x+1, y) - f(x, y))^2 + (f(x, y+1) - f(x, y))^2 } } - \frac{ f(x, y) - f(x, y-1) }{ \sqrt{ (f(x+1, y-1) - f(x, y-1))^2 + (f(x, y) - f(x, y-1))^2 } } \Big\}, \]
which can be seen as a discrete version of equation (17). But this is still a nonlinear difference equation in f, for which we have no explicit solution. We apply an iterative explicit method based on time-dependent gradient descent to solve it.

General gradient descent method In order to solve the above equation for all (x, y), with asso-
ciated boundary conditions, we apply the gradient descent method, based on the following general
idea: assume that we need to solve
min J(f ),
f

where f could be a number, a vector, or a function (in our case, f is a discrete image function,
(x, y) 7→ f (x, y)). We also define a time-dependent equation for t ≥ 0. Let f (0) = f0 , and assume
that f (t) solves
f 0 (t) = −J 0 (f (t)), t > 0.
(in our case for image functions f , we would have (x, y, t) 7→ f (x, y, t), t ≥ 0, and the differential
equation becomes ∂f ∂t
= −J 0 (f (·, ·, t))).
We show that J(f (t)) is decreasing as t increases. Indeed,

d  2
J(f (t)) = J 0 (f (t))f 0 (t) = J 0 (f (t))(−J 0 (f (t)) = − J 0 (f (t)) ≤ 0,
dt
thus J(f (t)) must be deacreasing as t increases. So solving f 0 (t) = −J 0 (f (t)), t > 0 for f (t),
decreases the original energy J(f ), with f (t) → f , as t → ∞, with f a minimizer of J.
We apply this idea to our main problem, and we discretize the time t by nΔt, n integer. Let
f^0(x, y) be an initial guess, defined for all (x, y) with 1 ≤ x ≤ M, 1 ≤ y ≤ N (and then extended
by reflection). We want to have, as above,
\[
\frac{f^{n+1} - f^n}{\Delta t} = -J'(f^n),
\]
which becomes, for pixels (x, y) with x = 1, ..., M, y = 1, ..., N, and for n ≥ 0,

\[
\begin{aligned}
\frac{f^{n+1}(x,y) - f^n(x,y)}{\Delta t} = {} & -\big(\tilde{h} * (h * f^n - g)\big)(x,y) \\
& + \lambda \, \frac{f^n(x+1,y)-f^n(x,y)}{\sqrt{\big(f^n(x+1,y)-f^n(x,y)\big)^2 + \big(f^n(x,y+1)-f^n(x,y)\big)^2}} \\
& + \lambda \, \frac{f^n(x,y+1)-f^n(x,y)}{\sqrt{\big(f^n(x+1,y)-f^n(x,y)\big)^2 + \big(f^n(x,y+1)-f^n(x,y)\big)^2}} \\
& + \lambda \, \frac{f^n(x-1,y)-f^n(x,y)}{\sqrt{\big(f^n(x,y)-f^n(x-1,y)\big)^2 + \big(f^n(x-1,y+1)-f^n(x-1,y)\big)^2}} \\
& + \lambda \, \frac{f^n(x,y-1)-f^n(x,y)}{\sqrt{\big(f^n(x+1,y-1)-f^n(x,y-1)\big)^2 + \big(f^n(x,y)-f^n(x,y-1)\big)^2}},
\end{aligned}
\]
where h̃ is defined by h̃(x, y) = h(−x, −y).


The boundary conditions are the same as for the linear case (by reflection):
f^{n+1}(0, y) = f^{n+1}(2, y) and f^{n+1}(M+1, y) = f^{n+1}(M−1, y) for all y = 1, 2, ..., N;
f^{n+1}(x, 0) = f^{n+1}(x, 2) and f^{n+1}(x, N+1) = f^{n+1}(x, N−1) for all x = 1, 2, ..., M;
f^{n+1}(0, 0) = f^{n+1}(2, 2), f^{n+1}(0, N+1) = f^{n+1}(2, N−1),
f^{n+1}(M+1, 0) = f^{n+1}(M−1, 2), f^{n+1}(M+1, N+1) = f^{n+1}(M−1, N−1).
The above relations are repeated for several steps n ≥ 0, until the matrix norm satisfies
‖f^{n+1} − f^n‖ ≤ tolerance.
In the implementation, we only work with two matrices, f_old = f^n and f_new = f^{n+1}; after
each main step n, we reset f_old = f_new.
The above discretization of equation (17) is the slowest one, but it is the simplest to imple-
ment; faster implementations can be made. The parameter Δt must be chosen sufficiently small,
so that J(f_new) ≤ J(f_old). Also, a small parameter ε > 0 can be added inside the square roots to
avoid division by zero where the gradient magnitude is zero (in constant areas of the image). The
parameter λ is selected for best restoration results. A sketch of this iteration is given below.
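For concreteness, here is a minimal MATLAB sketch of the iteration above in the pure denoising
case (h = δ, so that H = H^T = identity and the data term reduces to f − g); the values of λ, Δt, ε
and the iteration count are illustrative choices, not prescribed by the text.

    % Explicit gradient descent for TV denoising (h = identity); g is the
    % given noisy image stored as a double M-by-N matrix.
    [M, N] = size(g);
    f = g;                                    % initial guess f^0 = g
    lambda = 0.1; dt = 0.2; ep = 1e-6;        % illustrative parameters
    for n = 1:300
        p = f([2 1:M M-1], [2 1:N N-1]);      % reflection: f(0,y)=f(2,y), etc.
        fxp = p(3:M+2, 2:N+1);                % f(x+1, y)
        fxm = p(1:M,   2:N+1);                % f(x-1, y)
        fyp = p(2:M+1, 3:N+2);                % f(x, y+1)
        fym = p(2:M+1, 1:N);                  % f(x, y-1)
        fmp = p(1:M,   3:N+2);                % f(x-1, y+1)
        fpm = p(3:M+2, 1:N);                  % f(x+1, y-1)
        t1 = (fxp - f) ./ sqrt(ep + (fxp-f).^2  + (fyp-f).^2);
        t2 = (fyp - f) ./ sqrt(ep + (fxp-f).^2  + (fyp-f).^2);
        t3 = (fxm - f) ./ sqrt(ep + (f-fxm).^2  + (fmp-fxm).^2);
        t4 = (fym - f) ./ sqrt(ep + (fpm-fym).^2 + (f-fym).^2);
        f = f + dt*( -(f - g) + lambda*(t1 + t2 + t3 + t4) );
    end

For a genuine blurring kernel h, the data term −(f − g) would be replaced by
−conv2(conv2(f, h, 'same') − g, rot90(h, 2), 'same'), using the fact, shown below, that
H^T g = h̃ ∗ g with h̃ the 180-degree rotation of h.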

5.3.1 Computation of the first order optimality condition in the continuous case
We explain here the computation of the so-called "Euler-Lagrange equation" associated with the
minimization of the energies that we have considered above, in the continuous case. Assume that
f, g : Ω → R, where Ω is the image domain (a rectangle in the plane) and ∂Ω denotes its boundary.
We will use the notations f_x = ∂f/∂x, f_y = ∂f/∂y.
Consider a general energy of the form
\[
J(f) = \int_\Omega e(x, y, f, f_x, f_y) \, dx\,dy, \tag{19}
\]
to be minimized over a subspace V ⊂ L²(Ω) = {f : Ω → R, ∫_Ω f²(x, y) dx dy < ∞}. We do not
enter into the details of defining V; V will be a normed vector space over R (a subspace of L²(Ω)),
and we will have f ∈ V iff J(f) < ∞. The minimization problem is
\[
\min_{f \in V} J(f).
\]

We assume that e is differentiable in all variables (or that e can be approximated by a differ-
entiable function). The corresponding first order optimality condition (Euler-Lagrange equation)
that must be satisfied by a minimizer f of J can be expressed as
\[
\frac{\partial e}{\partial f} = \frac{\partial}{\partial x}\Big(\frac{\partial e}{\partial (f_x)}\Big) + \frac{\partial}{\partial y}\Big(\frac{\partial e}{\partial (f_y)}\Big) \iff \partial J(f) = 0
\]
inside Ω, where \(\partial J(f) = \frac{\partial e}{\partial f} - \frac{\partial}{\partial x}\big(\frac{\partial e}{\partial (f_x)}\big) - \frac{\partial}{\partial y}\big(\frac{\partial e}{\partial (f_y)}\big)\), together with boundary conditions.
We show next the derivation of the Euler-Lagrange equation in two cases.
• Quadratic Regularization (denoising only)
Let
\[
E(f) = \int_\Omega (f(x,y) - g(x,y))^2 \, dx\,dy + \int_\Omega |\nabla f|^2 \, dx\,dy = \int_\Omega (f-g)^2 \, dx\,dy + \int_\Omega \big[(f_x)^2 + (f_y)^2\big] dx\,dy.
\]

Define
\[
V = \{v \in L^2(\Omega) : E(v) < \infty\}.
\]
If f is a minimizer of E, then we must have E(f) ≤ E(f + εv) for every real number ε and every
other function v ∈ V. Denote s(ε) = E(f + εv); therefore we must have s(0) ≤ s(ε) for all ε, and
thus s'(0) = 0 for every test function v ∈ V.
We first compute, using (f + εv)_x = f_x + εv_x and (f + εv)_y = f_y + εv_y,
\[
s(\varepsilon) = E(f + \varepsilon v) = \int_\Omega (f(x,y) + \varepsilon v(x,y) - g(x,y))^2 \, dx\,dy + \int_\Omega \big[(f_x + \varepsilon v_x)^2 + (f_y + \varepsilon v_y)^2\big] dx\,dy.
\]
We compute s'(ε):
\[
s'(\varepsilon) = 2\int_\Omega (f(x,y) + \varepsilon v(x,y) - g(x,y)) v(x,y) \, dx\,dy + \int_\Omega \big[2(f_x + \varepsilon v_x) v_x + 2(f_y + \varepsilon v_y) v_y\big] dx\,dy.
\]

We take ε = 0:
\[
s'(0) = 2\int_\Omega (f(x,y) - g(x,y)) v(x,y) \, dx\,dy + \int_\Omega \big[2 f_x v_x + 2 f_y v_y\big] dx\,dy = 0 \quad \text{for all } v \in V.
\]
We apply integration by parts in the second term:
\[
2\int_\Omega (f(x,y)-g(x,y)) v(x,y) \, dx\,dy - \int_\Omega \big[2 (f_x)_x v + 2 (f_y)_y v\big] dx\,dy + \int_{\partial\Omega} 2 (f_x v n_1 + f_y v n_2) \, dS = 0 \quad \text{for all } v,
\]
where ⃗n = (n₁, n₂) is the exterior unit normal to the boundary ∂Ω.


We can rewrite the last relation as
\[
\int_\Omega \big[f - g - \Delta f\big] v(x,y) \, dx\,dy + \int_{\partial\Omega} (\nabla f \cdot \vec{n})\, v \, dS = 0 \quad \text{for all } v \in V,
\]
from where we obtain (by the fundamental lemma of the calculus of variations)
\[
\begin{cases}
f - g - \Delta f = 0 & \text{in } \Omega, \\
\nabla f \cdot \vec{n} = 0 & \text{on } \partial\Omega,
\end{cases}
\]
which is equation (11) with its associated (free) boundary conditions.
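As a quick illustration, the linear equation f − g − Δf = 0 with the reflection boundary condition
can be solved by a simple fixed-point (Jacobi) iteration; the sketch below assumes grid steps
Δx = Δy = 1, the 5-point Laplacian, and an illustrative iteration count.

    % Jacobi iteration for f - Laplacian(f) = g, i.e. 5*f(x,y) equals
    % g(x,y) plus the four neighbors; g is the noisy image (double matrix).
    [M, N] = size(g);
    f = g;                                    % initial guess
    for n = 1:300
        p  = f([2 1:M M-1], [2 1:N N-1]);     % reflection padding
        nb = p(1:M,2:N+1) + p(3:M+2,2:N+1) + p(2:M+1,1:N) + p(2:M+1,3:N+2);
        f  = (g + nb) / 5;                    % Jacobi update
    end

The iteration converges because the system matrix of f − Δf = g is strictly diagonally dominant.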


• Total Variation Regularization (denoising-deblurring)
We denote by H the linear operator f ↦ h ∗ f = Hf (where h is the blurring
convolution kernel). Note that here, Hf is not a product.
We want to show, in continuous variables, that if f is a minimizer of (16), recalled here,
\[
J(f) = \frac{1}{2}\int\int (h * f(x,y) - g(x,y))^2 \, dx\,dy + \lambda \int\int |\nabla f| \, dx\,dy, \tag{20}
\]
rewritten as
\[
J(f) = \frac{1}{2}\int\int (Hf - g)^2 \, dx\,dy + \lambda \int\int |\nabla f| \, dx\,dy, \tag{21}
\]
2
then f formally satisfies the anisotropic diffusion PDE (17), recalled here:
\[
\tilde{h} * h * f = \tilde{h} * g + \lambda \frac{\partial}{\partial x}\Big(\frac{1}{|\nabla f|(x,y)} \frac{\partial f}{\partial x}\Big) + \lambda \frac{\partial}{\partial y}\Big(\frac{1}{|\nabla f|(x,y)} \frac{\partial f}{\partial y}\Big), \tag{22}
\]
or
\[
H^T[Hf] = H^T g + \lambda \frac{\partial}{\partial x}\Big(\frac{1}{|\nabla f|(x,y)} \frac{\partial f}{\partial x}\Big) + \lambda \frac{\partial}{\partial y}\Big(\frac{1}{|\nabla f|(x,y)} \frac{\partial f}{\partial y}\Big) \tag{23}
\]
(where H^T is the adjoint, or transpose, operator of H).
To obtain equation (23), we apply the same steps as for E: we define s(ε) = J(f + εv), where
ε is real and v is another function like f. Then we compute s'(ε), take ε = 0, and impose
s'(0) = 0 for all functions v. Similarly, we apply integration by parts and arrive at (23) in Ω,
together with the boundary condition (∇f/|∇f|) · ⃗n = 0 on the boundary ∂Ω. The remaining details are
left as an exercise. It remains to find H^T, which is presented next.

Computation of the adjoint H^T of H
For the convolution term, we assume that all image functions f, g, h are defined on the entire
plane (by extension) and belong to L². The space L² is an inner product vector space associated
with the product
\[
\langle f, g\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, g(x,y) \, dx\,dy,
\]
for any functions f and g in L². Also, H : L² → L² is the linear convolution operator defined by
\[
Hf(x,y) = h * f(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha, \beta)\, f(x-\alpha, y-\beta) \, d\alpha\, d\beta.
\]

By the definition of the adjoint, we must find another linear operator H^T such that
\[
\langle Hf, g\rangle = \langle f, H^T g\rangle
\]
for all functions f, g ∈ L². We have
\[
\begin{aligned}
\langle Hf, g\rangle &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (h * f)(x,y)\, g(x,y) \, dx\,dy \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha,\beta)\, f(x-\alpha, y-\beta) \, d\alpha\, d\beta\Big] g(x,y) \, dx\,dy \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x-\alpha, y-\beta)\, g(x,y) \, dx\,dy\Big] h(\alpha,\beta) \, d\alpha\, d\beta.
\end{aligned}
\]

We make the change of variables x − α = X, y − β = Y in x, y and obtain
\[
\begin{aligned}
\langle Hf, g\rangle &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y)\, g(\alpha+X, \beta+Y) \, dX\, dY\Big] h(\alpha,\beta) \, d\alpha\, d\beta \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y) \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha,\beta)\, g(\alpha+X, \beta+Y) \, d\alpha\, d\beta\Big] dX\, dY.
\end{aligned}
\]
Now making the change of variables α = −a, β = −b, we obtain
\[
\langle Hf, g\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y) \Big[\int_{+\infty}^{-\infty}\int_{+\infty}^{-\infty} h(-a,-b)\, g(X-a, Y-b)(-1)(-1) \, da\, db\Big] dX\, dY.
\]
Since \(\int_{+\infty}^{-\infty} = -\int_{-\infty}^{+\infty}\), we finally obtain
\[
\begin{aligned}
\langle Hf, g\rangle &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y) \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(-a,-b)\, g(X-a, Y-b) \, da\, db\Big] dX\, dY \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y) \Big[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tilde{h}(a,b)\, g(X-a, Y-b) \, da\, db\Big] dX\, dY \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(X,Y)\, \tilde{h} * g(X,Y) \, dX\, dY = \langle f, H^T g\rangle,
\end{aligned}
\]
where we introduced h̃(a, b) = h(−a, −b). Therefore H^T g(x, y) = h̃ ∗ g(x, y).
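This adjoint relation is easy to check numerically in the discrete setting; the MATLAB sketch
below uses zero-padded 'same'-size convolution with an odd-sized random kernel, for which
convolution with the 180-degree flip of h is exactly the adjoint. The sizes are illustrative.

    % Numerical check of <Hf, g> = <f, h~*g> for discrete convolution,
    % where h~ is h rotated by 180 degrees, i.e. h~(x,y) = h(-x,-y).
    f = rand(64, 64); g = rand(64, 64); h = rand(5, 5);   % random test data
    lhs = sum(sum(conv2(f, h, 'same') .* g));             % <Hf, g>
    rhs = sum(sum(f .* conv2(g, rot90(h, 2), 'same')));   % <f, H^T g>
    disp(abs(lhs - rhs))                                  % ~1e-12 (round-off)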

6 Image Segmentation
Segmentation is a mid-level vision task, where the input is a pre-processed image f (sharp, with-
out noise). The output could be an image g, or it could no longer be an image but rather image
attributes: a set of points representing the edges of f, boundaries of objects defined by implicit
curves or by sets of points, etc. Segmentation is the essential step before image analysis and
recognition. Segmentation can also be defined as the process of partitioning the input image f
into its constituent objects or regions.
The segmentation can be based on several different criteria:
- It can be based on discontinuity: partition an image based on abrupt changes in intensity
(edges).
- It can be based on similarity: partition an image into regions that are similar according to a
set of predefined rules (based on color, texture, etc.).

6.1 The gradient edge detector
Consider a differentiable function (x, y) ↦ f(x, y) in two dimensions. We define its gradient
operator as being the vector of first-order partial derivatives,
\[
\nabla f(x,y) = \Big[\frac{\partial f}{\partial x}(x,y), \ \frac{\partial f}{\partial y}(x,y)\Big],
\]
and its gradient magnitude as the Euclidean norm of the vector ∇f,
\[
|\nabla f|(x,y) = \sqrt{\Big(\frac{\partial f}{\partial x}\Big)^2 + \Big(\frac{\partial f}{\partial y}\Big)^2}.
\]
The usual central finite difference approximations of the gradient are (assuming Δx = Δy = 1)
\[
\frac{\partial f}{\partial x}(x,y) \approx \frac{f(x+1,y) - f(x-1,y)}{2}, \qquad \frac{\partial f}{\partial y}(x,y) \approx \frac{f(x,y+1) - f(x,y-1)}{2}.
\]
Neglecting the constant 1/2, these approximations can be implemented using the following masks w
centered at (x, y) and spatial linear filtering,

\[
\begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \text{ in } x, \qquad
\begin{bmatrix} 0 & 0 & 0 \\ -1 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \text{ in } y.
\]

In image processing, since the input image f may still be noisy, a smoothed version of these
approximations is used, called the Sobel gradient operators, computed by the following two masks
w in x and y respectively:

\[
\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \text{ in } x, \qquad
\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \text{ in } y.
\]

Note that all four of these masks are sharpening masks, since the sum of their coefficients is zero.

The gradient magnitude can be used as an edge detector. Indeed, in the areas where the image
f does not vary much, the gradient magnitude |∇f| is small (close to zero); on the contrary,
in the areas where there are strong variations (at edges), the gradient magnitude |∇f| is large. We
define the output image g(x, y) = |∇f|(x, y) (it will show white edges over a black background), or
a thresholded version of |∇f|.
Assuming that |∇f| is approximated by one of the above discrete operators, we define
g(x, y) = |∇f|(x, y) (discrete version, followed by rescaling), or a thresholded gradient map:
\[
g(x,y) = \begin{cases} 255 & \text{if } |\nabla f|(x,y) \ge T, \\ 0 & \text{if } |\nabla f|(x,y) < T, \end{cases}
\]
where T is a chosen threshold. Note that the operation f ↦ g = |∇f| is nonlinear.
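A minimal MATLAB sketch of the thresholded Sobel edge map, assuming f is a grayscale image
stored as a double matrix with rows indexed by x; the threshold value T = 100 is an illustrative
choice.

    % Thresholded gradient-magnitude edge detector with the Sobel masks.
    sx = [-1 -2 -1; 0 0 0; 1 2 1];        % Sobel mask in x
    sy = sx';                             % Sobel mask in y
    % conv2 flips the mask, which only changes the sign of fx, fy and
    % therefore does not affect the gradient magnitude below.
    fx = conv2(f, sx, 'same');            % approximate df/dx
    fy = conv2(f, sy, 'same');            % approximate df/dy
    gmag = sqrt(fx.^2 + fy.^2);           % gradient magnitude |grad f|
    T = 100;
    g = 255 * (gmag >= T);                % white edges on black background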

6.2 Edge detection by zero-crossings of the Laplacian (the Marr-Hildreth edge detector)
A more advanced technique for edge detection uses the zero-crossings of the Laplacian. The pixels
where the Laplacian Δf changes sign can be seen as centers of thick edges. The main steps for this
method are as follows (these steps can be done in the spatial domain or in the frequency domain):
(1) Filter the input image f by a Gaussian low-pass filter (equivalent to computing G_σ ∗ f, where
G_σ(x, y) = e^{−(x²+y²)/(2σ²)}).
(2) Compute the Laplacian Δ(G_σ ∗ f) of the result of step (1), using for example the 3 × 3
Laplacian masks (5-point Laplacian or 9-point Laplacian).
(3) Find the zero-crossings of the result from step (2).
Remarks
(i) Since we have Δ(G_σ ∗ f) = (ΔG_σ) ∗ f, steps (1)-(2) can be combined into a single step
as follows: define the Laplacian of a Gaussian,
\[
LoG(x,y) = \Delta G_\sigma = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \, e^{-\frac{x^2+y^2}{2\sigma^2}}
\]
(an "inverted Mexican hat"), and its discrete version. Then compute LoG ∗ f; this can be done
using the mask
\[
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 2 & 1 & 0 \\
1 & 2 & -16 & 2 & 1 \\
0 & 1 & 2 & 1 & 0 \\
0 & 0 & 1 & 0 & 0
\end{bmatrix}.
\]
(ii) In order to find the zero-crossings in step (3) above, consider a pixel (x, y) and its 3 × 3
neighborhood centered at (x, y). We have a zero-crossing at (x, y) if the signs of at least two of
its opposing neighboring pixels differ (we have to check four cases: left/right, up/down, and
the two diagonals). Another way is to first partition the result of step (2), assigning white = 255
to positive pixels and black = 0 to negative pixels, and then compute the edge map of the resulting
black-and-white image. (A sketch of this detector is given after these remarks.)
(iii) An improved version of this detector is the Canny edge detector, where the Laplacian is
substituted by the second-order derivative in the gradient direction: f_xx(f_x)² + 2f_x f_y f_xy + (f_y)² f_yy.
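Following remarks (i) and (ii), here is a minimal MATLAB sketch of the Marr-Hildreth detector
using the 5 × 5 LoG mask above and the opposing-neighbor sign test; it assumes f is a grayscale
image stored as a double matrix.

    % Marr-Hildreth edge detection: LoG filtering + zero-crossing test.
    LoG = [0 0 1 0 0; 0 1 2 1 0; 1 2 -16 2 1; 0 1 2 1 0; 0 0 1 0 0];
    L = conv2(f, LoG, 'same');            % steps (1)-(2) combined: LoG * f
    [M, N] = size(L);
    g = zeros(M, N);
    for x = 2:M-1
        for y = 2:N-1
            % zero-crossing if any pair of opposing neighbors differs in sign
            if L(x-1,y)*L(x+1,y) < 0 || L(x,y-1)*L(x,y+1) < 0 || ...
               L(x-1,y-1)*L(x+1,y+1) < 0 || L(x-1,y+1)*L(x+1,y-1) < 0
                g(x,y) = 255;             % mark edge pixel in white
            end
        end
    end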

6.3 Boundary detection by curve evolution and active contours
We assume that we are given an image f : Ω → R that contains several objects forming the
foreground, together with a background. Curve evolution can be used to detect boundaries of
objects in images, as follows:
(i) Start the process with an initial curve;
(ii) Move or deform the curve according to some criterion;
(iii) The curve has to stop on the boundaries of objects.

6.3.1 Curve Representation
Assume that we have a closed curve C in the plane, with parameterization s ∈ [0, 1] ↦ C(s) =
(x(s), y(s)), with C(0) = C(1).

Explicit Representation Numerically, we can represent a moving curve C by a sequence of
points in the plane (discrete points belonging to the curve). However, with this approach it is
hard to handle changes of topology (merging or breaking); also, re-parameterization and
re-discretization are often necessary. On the other hand, such a representation is not
computationally expensive, and it easily allows the representation of open curves.

Implicit Representation Another way to represent a curve C, which is the boundary of an open
domain, is as an isoline of a Lipschitz continuous function φ. Such a representation, called im-
plicit representation or level set representation [10], allows for automatic changes of topology, and
discretization on a fixed regular (rectangular) grid. On the other hand, the method is computation-
ally more expensive, and it does not directly allow the representation of open curves. This is the
approach that we will consider.
Let φ : Ω → R be a Lipschitz continuous function, and define C = {(x, y) ∈ Ω : φ(x, y) = 0}.
In other words, C is the zero isoline of the function φ. Note that here C can be made of several
connected components.
As an example, consider a circle C centered at (x₀, y₀) and of radius r; we can define
\[
\phi(x,y) = r - \sqrt{(x-x_0)^2 + (y-y_0)^2}.
\]
Then
φ(x, y) = 0 if (x, y) is on the circle C;
φ(x, y) < 0 if (x, y) is outside the circle C;
φ(x, y) > 0 if (x, y) is inside the circle C.
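A short MATLAB sketch of this implicit representation; the grid size, center and radius below
are illustrative choices.

    % Implicit representation of a circle as the zero isoline of phi.
    [X, Y] = ndgrid(1:100, 1:100);              % pixel coordinates (x, y)
    phi = 30 - sqrt((X-50).^2 + (Y-50).^2);     % circle: center (50,50), r = 30
    contour(phi, [0 0], 'r')                    % draws the zero level line C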
We notice that for a general curve C which is the boundary of an open domain, we can define
φ as the signed distance function to C:
\[
\phi(x,y) = \begin{cases}
\mathrm{dist}\big((x,y), C\big) & \text{if } (x,y) \text{ is inside } C, \\
-\mathrm{dist}\big((x,y), C\big) & \text{if } (x,y) \text{ is outside } C, \\
0 & \text{if } (x,y) \in C.
\end{cases}
\]
Note that, under some assumptions, the (signed) distance function to C satisfies the eikonal equa-
tion |∇φ| = 1 over Ω in the sense of viscosity solutions.

Using the differentiable level set function φ, we can define several intrinsic geometric quantities
of the curve C:
• Unit normal to the curve: \(\vec{N} = (N_1, N_2) = -\frac{\nabla\phi}{|\nabla\phi|}\).
• Curvature at (x, y) ∈ C:
\[
K(x,y) = \frac{\partial}{\partial x}\Big(\frac{\phi_x}{|\nabla\phi|}\Big) + \frac{\partial}{\partial y}\Big(\frac{\phi_y}{|\nabla\phi|}\Big) = \frac{\phi_{xx}\phi_y^2 - 2\phi_x\phi_y\phi_{xy} + \phi_{yy}\phi_x^2}{(\phi_x^2 + \phi_y^2)^{3/2}}.
\]
• Heaviside function:
\[
H(\phi) = \begin{cases} 1 & \text{if } \phi \ge 0, \\ 0 & \text{if } \phi < 0, \end{cases}
\]
and the delta function δ(φ) = H'(φ) in the weak sense.
• Area enclosed by the curve: \(A\{\phi \ge 0\} = \int_\Omega H(\phi(x,y)) \, dx\,dy\).
• Length of the curve: \(L\{\phi(x,y) = 0\} = \int_\Omega |\nabla H(\phi(x,y))|\).
• If f : Ω → R is the given image over Ω, then we can define the mean of f over the regions
{φ ≥ 0} and {φ ≤ 0}:
\[
\mathrm{mean}(f)_{\phi\ge 0} = \frac{\int_\Omega f(x,y)\, H(\phi) \, dx\,dy}{\int_\Omega H(\phi) \, dx\,dy}, \qquad
\mathrm{mean}(f)_{\phi\le 0} = \frac{\int_\Omega f(x,y)(1 - H(\phi)) \, dx\,dy}{\int_\Omega (1 - H(\phi)) \, dx\,dy}.
\]
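In the discrete setting, the area and the two region means are straightforward to approximate; a
minimal MATLAB sketch, assuming φ and f are matrices of the same size and using the pixelwise
approximation H(φ) ≈ (φ ≥ 0):

    % Discrete approximations of area and region means from a level set phi.
    H = double(phi >= 0);                        % Heaviside of phi, pixelwise
    area = sum(H(:));                            % area enclosed by the curve
    c1 = sum(f(:).*H(:)) / sum(H(:));            % mean of f where phi >= 0
    c2 = sum(f(:).*(1-H(:))) / sum(1-H(:));      % mean of f where phi < 0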

For curve evolution, we need to work with a family of curves C(t), over time t ≥ 0. This
family of curves is represented implicitly, as isolines of the function φ(x, y, t), (x, y) ∈ Ω, t ≥ 0,
such that
C(t) = {(x, y) ∈ Ω : φ(x, y, t) = 0}.
Assume that C(t) = (x(t, s), y(t, s)) moves in the normal direction, according to the equation
\[
C'(t) = F \vec{N},
\]
where F is a scalar speed function. Assume also that C(t) is implicitly represented by φ: thus,
φ(x(t, s), y(t, s), t) = 0. Differentiating with respect to t, we obtain
\[
\frac{d}{dt}\,\phi(x(t,s), y(t,s), t) = 0,
\]
or
\[
\frac{\partial\phi}{\partial t} + \frac{\partial\phi}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial\phi}{\partial y}\frac{\partial y}{\partial t} = 0,
\]
but since C'(t) = (∂x/∂t, ∂y/∂t) = F ⃗N, we obtain
\[
\frac{\partial\phi}{\partial t} + N_1 F \frac{\partial\phi}{\partial x} + N_2 F \frac{\partial\phi}{\partial y} = 0,
\]
or
\[
\frac{\partial\phi}{\partial t} + F\, \nabla\phi \cdot \vec{N} = 0.
\]
Substituting ⃗N by −∇φ/|∇φ|, we finally obtain
\[
\frac{\partial\phi}{\partial t} = F|\nabla\phi|,
\]
which is the main level set equation. The speed F can depend on (x, y), on the curvature K(x, y),
could be a constant, or could depend on the image f to be segmented.

Active contours for boundary detection Usually, models are based on two concepts: (i) regu-
larization of the curve (such as motion by curvature) and (ii) a stopping criterion.
Edge-based models Let g(x, y) = 1/(1 + |∇(G_σ ∗ f)(x, y)|²). This is an "edge-function": it is
close to zero (small) near edges (where the gradient magnitude is high) and large away from edges.
Thus such models segment objects from the background based on discontinuities in the image f.
• Model proposed by Caselles et al. in [3]:
\[
\phi(x,y,0) = \phi_0(x,y), \qquad \frac{\partial\phi}{\partial t} = g(x,y)\, K(x,y)\, |\nabla\phi|. \tag{24}
\]
Here the speed function is F = g(x, y)K(x, y). If g(x, y) > 0 (away from edges), the curve moves
with a speed proportional to the mean curvature. If g(x, y) ≈ 0 (on an edge), F ≈ 0, and thus
the curve stops on the boundaries of objects (edges). This nonlinear partial differential equation does
not come from an energy minimization.
• Geodesic active contours (Caselles et al. [4], Kichenassamy et al. [9]): find a curve C that
minimizes the weighted length
\[
\min_C \int_C g(C) \, dS.
\]
By this method, a minimizer C should overlap with the edges in the image f, since the energy is
minimal when g(C) ≈ 0. In the level set implicit representation, we define C(t) = {(x, y) :
φ(x, y, t) = 0}, and the model becomes
\[
\min_\phi J(\phi) = \int_\Omega g(x,y)\, |\nabla H(\phi(x,y))| \, dx\,dy.
\]
The first order extremality condition associated with the minimization, combined with gradient
descent, ∂φ/∂t = −∂J(φ), gives
\[
\phi(x,y,0) = \phi_0(x,y), \qquad \frac{\partial\phi}{\partial t} = \delta(\phi)\Big[\frac{\partial}{\partial x}\Big(\frac{g(x,y)}{|\nabla\phi|}\frac{\partial\phi}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(\frac{g(x,y)}{|\nabla\phi|}\frac{\partial\phi}{\partial y}\Big)\Big].
\]
It is possible to extend the motion to all level lines of φ (even if we are interested only in the
zero level line), substituting δ(φ) by |∇φ|, to obtain the more geometric motion
\[
\phi(x,y,0) = \phi_0(x,y), \qquad \frac{\partial\phi}{\partial t} = |\nabla\phi|\Big[\frac{\partial}{\partial x}\Big(\frac{g(x,y)}{|\nabla\phi|}\frac{\partial\phi}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(\frac{g(x,y)}{|\nabla\phi|}\frac{\partial\phi}{\partial y}\Big)\Big]. \tag{25}
\]
We notice that the difference between models (24) and (25) is that the stopping edge-function
g is outside the curvature term in (24), while it is inside the curvature term in (25). Model (25)
attracts the curve C towards edges more effectively.
For both of these models, the initial curve defined by φ₀(x, y) = 0 has to enclose the objects to
be detected. Also, these models have more difficulty segmenting strongly concave shapes.
Region-based models This model [5, 6] is based on partitioning the input image f into
"objects" and "background" based on pixel intensity similarity, and no longer on discontinuities.
The following energy is minimized:
\[
\min_{C, c_1, c_2} E(C, c_1, c_2) = \int_{\mathrm{inside}(C)} (f - c_1)^2 \, dx\,dy + \int_{\mathrm{outside}(C)} (f - c_2)^2 \, dx\,dy + \lambda\,\mathrm{Length}(C). \tag{26}
\]
We assume that the foreground can be described by the region where f(x, y) ≈ constant c₁, and
the background can be described by the region where f(x, y) ≈ constant c₂, with c₁ ≠ c₂.
Thus the first two terms form the stopping (segmentation) criterion. The last term is a regular-
ization (a penalty on the length of the curve).
In the implicit representation, model (26) can be expressed as
\[
\min_{\phi, c_1, c_2} E(\phi, c_1, c_2) = \int_\Omega (f - c_1)^2 H(\phi) \, dx\,dy + \int_\Omega (f - c_2)^2 (1 - H(\phi)) \, dx\,dy + \lambda \int_\Omega |\nabla H(\phi)|. \tag{27}
\]

Minimizing the energy E(φ, c₁, c₂) with respect to its variables, we obtain the associated ex-
tremality conditions
\[
c_1 = \frac{\int_\Omega f(x,y)\, H(\phi) \, dx\,dy}{\int_\Omega H(\phi) \, dx\,dy}, \qquad
c_2 = \frac{\int_\Omega f(x,y)(1 - H(\phi)) \, dx\,dy}{\int_\Omega (1 - H(\phi)) \, dx\,dy},
\]
and for the minimization in φ we use gradient descent, ∂φ/∂t = −∂E/∂φ, or after simplifications,
\[
\phi(x,y,0) = \phi_0(x,y), \qquad \frac{\partial\phi}{\partial t} = \delta(\phi)\Big[-(f - c_1)^2 + (f - c_2)^2 + \lambda K(x,y)\Big].
\]
This model is less dependent on the choice of the initial curve (which does not have to enclose
the objects). Also, interior contours and strongly concave shapes are automatically detected.
For more details we refer to [5, 6].
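To make the descent concrete, here is a minimal MATLAB sketch of this region-based model,
using a regularized Heaviside H_ε(φ) = (1/2)(1 + (2/π) arctan(φ/ε)), the corresponding delta
δ_ε(φ) = (ε/π)/(ε² + φ²), and central differences for the curvature K. It assumes f has been
scaled to [0, 1]; all parameter values are illustrative choices, not prescribed ones.

    % Region-based segmentation sketch; f is a grayscale image in [0,1].
    [M, N] = size(f);
    [X, Y] = ndgrid(1:M, 1:N);
    phi = min(M,N)/4 - sqrt((X-M/2).^2 + (Y-N/2).^2);    % initial circle
    lambda = 0.1; dt = 0.5; ep = 1;                      % illustrative values
    for n = 1:300
        Heps = 0.5*(1 + (2/pi)*atan(phi/ep));            % regularized Heaviside
        c1 = sum(f(:).*Heps(:)) / sum(Heps(:));          % mean inside
        c2 = sum(f(:).*(1-Heps(:))) / sum(1-Heps(:));    % mean outside
        p   = phi([2 1:M M-1], [2 1:N N-1]);             % reflection padding
        px  = (p(3:M+2,2:N+1) - p(1:M,2:N+1))/2;         % central differences
        py  = (p(2:M+1,3:N+2) - p(2:M+1,1:N))/2;
        pxx = p(3:M+2,2:N+1) - 2*phi + p(1:M,2:N+1);
        pyy = p(2:M+1,3:N+2) - 2*phi + p(2:M+1,1:N);
        pxy = (p(3:M+2,3:N+2) + p(1:M,1:N) - p(3:M+2,1:N) - p(1:M,3:N+2))/4;
        K = (pxx.*py.^2 - 2*px.*py.*pxy + pyy.*px.^2) ...
            ./ ((px.^2 + py.^2 + 1e-6).^1.5);            % curvature of level lines
        deps = (ep/pi) ./ (ep^2 + phi.^2);               % regularized delta
        phi = phi + dt*deps.*( -(f - c1).^2 + (f - c2).^2 + lambda*K );
    end
    contour(phi, [0 0], 'r')                             % final segmenting curve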

References
[1] A. Buades, B. Coll, J.M. Morel, A non-local algorithm for image denoising, Proc. IEEE
Computer Vision and Pattern Recognition (CVPR) 2005, Vol. 2, pp. 60-65, 2005.

[2] A. Buades, B. Coll, J.M. Morel, A review of image denoising methods, with a new one, Multi-
scale Modeling and Simulation, Vol. 4 (2), pp. 490-530, 2006.

[3] V. Caselles, F. Catte, T. Coll, F. Dibos, A geometric model for active contours, Numerische
Mathematik 66, pp. 1-31, 1993.

[4] V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, International Journal of Computer Vision, 1997.

[5] T. Chan and L. Vese, An active contour model without edges, Scale-Space Theories in Com-
puter Vision, Lecture Notes in Computer Science, Vol. 1682, pp. 141-151, 1999.

[6] T.F. Chan and L.A. Vese, Active contours without edges, IEEE Transactions on Image Process-
ing, 10 (2), pp. 266-277, Feb. 2001.

[7] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 3rd edition, Prentice-Hall, 2008.

[8] R.C. Gonzalez, R.E. Woods and S.L. Eddins, Digital Image Processing Using Matlab, 2nd
edition, Prentice-Hall, 2010.

[9] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, A. Yezzi, Conformal curvature flows:
From phase transitions to active vision, Arch. Rational Mech. Anal. 134 (3), pp. 275-301, 1996.

[10] S. Osher and J.A. Sethian, Fronts propagating with curvature-dependent speed: Algorithms
based on Hamilton-Jacobi formulations, J. Comput. Phys. 79, pp. 12-49, 1988.

[11] L. Rudin, S. Osher, E. Fatemi, Nonlinear Total Variation Based Noise Removal Algorithms,
Physica D: Nonlinear Phenomena, Vol. 60, Issues 1-4, 1992.

[12] L. Rudin and S. Osher, Total Variation Based Image Restoration with Free Local Constraints,
ICIP (1) 1994, pp. 31-35.
