Mathematical Tool
Lecture Meeting Time: Mon, Tue, Thu, Fri 1-2pm (location: Coalition 1&2)
Problem Session: 4:30-5:30 and 5:30-6:30 (location: computer lab or tent)
Working Version
Abstract
Image processing is an essential field in many applications, including medical imaging, astronomy, astrophysics, surveillance, video, and image compression and transmission, just to name a few. In one dimension, images are called signals. In two dimensions we work with planar images, while in three dimensions we have volumetric images (such as MR images). These can be gray-scale images (single-valued functions) or color images (vector-valued functions). Noise, blur and other types of imperfections often degrade acquired images, so such images must first be pre-processed before any further analysis and feature extraction.
In this course we will formulate in mathematical terms several image processing tasks: image denoising, image deblurring, image enhancement, image segmentation, and edge detection. We will learn techniques for image filtering in the spatial domain (using first- and second-order partial derivatives, the gradient, the Laplacian and their discrete approximations by finite differences, averaging filters, order statistics filters, convolution), in the frequency domain (the Fourier transform, low-pass and high-pass filters), and edge detection by zero-crossings of the Laplacian. If time permits, we will learn more advanced methods that can be formulated as a weighted Laplace equation for image restoration, and curve evolution techniques called snakes for image segmentation (including total variation minimization, active contours without edges, the anisotropic diffusion equation, NL Means).
• Prerequisites for the course include the basic curriculum in calculus and linear algebra. Most concepts will be defined as they are introduced. Some exposure, e.g. through a basic introductory course, to numerical analysis, Fourier analysis, partial differential equations, statistics, or probability will definitely be useful, though we will try to cover the necessary facts as they arise.
• For computational purposes, some familiarity with the computer language Matlab is useful. You may wish to explore in advance the introductory tutorial on Image Processing Using Matlab, prepared by Pascal Getreuer, on how to read and write image files, basic image operations, and filtering, found at the online link: http://www.math.ucla.edu/~getreuer/matlabimaging.html
• The Discussion Sections will be devoted to problem solving, image processing with Matlab,
summary of current lecture, or to exposition of additional topics.
• For an introduction to image processing, a useful reading textbook is:
[7] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 3rd edition, Prentice-Hall.
See also
[8] R.C. Gonzalez, R.E. Woods and S.L. Eddins, Digital Image Processing Using Matlab, 2nd
edition, Prentice-Hall.
• Please visit also the webpage of these two textbooks for review material, computational projects, problem solutions, images used in the textbook, an imaging toolbox, and much other useful information:
http://www.imageprocessingplace.com/index.htm
• The website for the course can be found at: http://www.math.ucla.edu/~lvese/155.1.09w/PCMI_USS_2010
where lecture notes, problems, computer projects and other links are posted.
• An introduction and overview of the course can be found on the course webpage. Click on "Lecture1.pdf" for the slides presentation.
Topics to be covered
Fundamental steps in image processing
A simple image formation model
Image sampling and quantization
Intensity transformations and spatial filtering
- Histogram equalization
- Linear spatial filters, correlation, convolution
- Smoothing (linear) spatial filters
- Sharpening linear spatial filters using the Laplacian
Filtering in the frequency domain
- 1D and 2D continuous and discrete Fourier transforms
- convolution theorem
- properties of the Fourier transform
- filtering in the frequency domain (smoothing and sharpening, low-pass and high-pass filtering)
- the Laplacian in the frequency domain, enhancement
- homomorphic filtering
- band-reject and band-pass filters
Image restoration and reconstruction
- noise models
- mean filters
- order statistics filters
- adaptive median filter
- periodic noise reduction
- NL Means filter
- linear, position-invariant degradations
- examples of degradation (PSF) functions
- inverse filtering (Wiener filter, constrained least squares filtering, total variation minimization)
- image reconstruction from projections (Radon transform, computed tomography, the Fourier slice theorem, filtered backprojection using parallel beams)
Image segmentation
- image gradient, gradient operators, gradient-based edge detection
- the Marr-Hildreth edge detector, Canny edge detector
- active contours
- global processing using the Hough transform
Contents
1 Fundamental Steps in Digital Image Processing
5 Image Restoration
5.1 Image Denoising
5.2 Image Deblurring
5.3 Energy minimization methods for image reconstruction
5.3.1 Computation of the first order optimality condition in the continuous case
6 Image Segmentation
6.1 The gradient edge detector
6.2 Edge detection by zero-crossings of the Laplacian (the Marr-Hildreth edge detector)
6.3 Boundary detection by curve evolution and active contours
6.3.1 Curve Representation
1 Fundamental Steps in Digital Image Processing
This field can be divided into several areas without clear boundaries:
- image processing
- image analysis
- computer vision
or into
- low-level vision
- mid-level vision
- high-level vision
In more detail, the fundamental steps are as follows.
Image acquisition (output: digital image)
• Low-level vision: input = image, output = image
Image enhancement: subjective process (e.g. image sharpening)
Image restoration: objective process: denoising, deblurring, etc. (depends on the degradation)
Color image processing: there are several color modes. Color (R,G,B) images are represented by vector-valued functions with three components; most techniques extend naturally from gray-scale to color images.
Wavelets: an advanced topic; a mathematical tool that allows the representation of images at various degrees of resolution, used in many image processing tasks.
Compression: reducing the storage required to save an image (JPEG 2000)
• Mid-level vision: input = image, output = image attributes
Mathematical morphology: tools for extracting image components useful in the representation
and description of shape.
Segmentation: to partition an image into its constituent parts or objects.
• High-level vision: input = boundaries and regions, output = image attributes
Representation and description: following segmentation, gives a representation of boundaries and regions; the description is given by a set of object attributes or features.
Recognition: assign a label (e.g., “vehicle”) to an object based on its descriptors
This course will deal with low-level and mid-level tasks.
We have the simple image formation model
f(x, y) = i(x, y) r(x, y),
where i(x, y) is the illumination component, r(x, y) is the reflectance component, and l = f(x, y) is the gray-level at coordinates (x, y). It is common to shift the gray-scale (or intensity scale) from the interval [L_min, L_max] to the interval [0, L − 1]. Then l = 0 is considered black and l = L − 1 is considered white on the gray scale. The intermediate values are shades varying from black to white.
Visualizing the discrete function r_k ↦ h(r_k) = n_k, for all k = 0, 1, 2, ..., L − 1, where n_k is the number of pixels with gray level r_k, gives the histogram. We can also define the normalized histogram: let p(r_k) = n_k / n, where n is the total number of pixels in the image (thus n = MN).
The visualization of the discrete mapping r_k ↦ p(r_k) for all k = 0, 1, 2, ..., L − 1 gives the normalized histogram. We note that 0 ≤ p(r_k) ≤ 1 and
Σ_{k=0}^{L−1} p(r_k) = 1.
See "Lecture1.pdf" for a slide showing several images and their corresponding histograms. We notice that a high-contrast image has a more uniform, almost flat histogram. Thus, the idea is to change the intensity values of f so that the output image g has a more uniform, almost flat histogram. We will define T : [0, L − 1] → [0, L − 1] as an increasing transformation using the cumulative histogram.
In the discrete case, let r_k = f(x, y). Then we define the histogram-equalized image g at (x, y) by
g(x, y) = s_k = (L − 1) Σ_{j=0}^{k} p(r_j).   (1)
We refer to "Lecture1.pdf" for the slide showing the effect of histogram equalization applied to poorly contrasted images.
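As an illustration, here is a minimal Matlab sketch of the discrete transformation (1); the test image and variable names are our own choices, not part of the notes:

% Histogram equalization of a gray-scale image with values in 0..L-1.
L = 256;
f = double(imread('cameraman.tif'));   % any 8-bit gray-scale test image
[M, N] = size(f);
n = M * N;                             % total number of pixels
p = histc(f(:), 0:L-1) / n;            % normalized histogram p(r_k) = n_k / n
s = (L - 1) * cumsum(p);               % s_k = (L-1)(p(r_0) + ... + p(r_k)), equation (1)
g = s(f + 1);                          % look up s_k at each pixel (Matlab indices start at 1)

Since s is non-decreasing, dark and bright values are spread apart so that the histogram of g is approximately flat.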
Theoretical interpretation of histogram equalization (continuous case). We view the gray levels r and s as random variables with associated probability distribution functions p_r(r) and p_s(s).
The continuous version of (1) uses the cumulative distribution function, and we define
s = T(r) = (L − 1) ∫_0^r p_r(w) dw.
If pr (r) > 0 on [0, L − 1], then T is strictly increasing from [0, L − 1] to [0, L − 1], thus T
is invertible. Moreover, if T is differentiable, then we can use a formula from probabilities: if
s = T (r), then
p_s(s) = p_r(r) ∂r/∂s,
where we view s = s(r) = T(r) as a function of r, and r = r(s) = T^{−1}(s) as a function of s. From the definition of s, we have
∂s/∂r = T′(r) = (L − 1) p_r(r),
thus
∂r/∂s = ∂/∂s T^{−1}(s) = 1/T′(r) = 1/((L − 1) p_r(r(s))).
Therefore
p_s(s) = p_r(r) ∂r/∂s = p_r(r) · 1/((L − 1) p_r(r)) = 1/(L − 1).
In other words, ps is the uniform probability distribution function on the interval [0, L − 1], thus
corresponding to a flat histogram in the discrete case.
3.2 Spatial Linear Filters
A spatial filter consists of
(1) a neighborhood (typically a small rectangle)
(2) a predefined operation that is performed on the image pixels encompassed by the neighborhood.
The output of the operation in (2) will provide the value of the output image g(x, y).
Let f (x, y) be the input image, and g(x, y) the output image in the discrete case, thus (x, y) are
viewed as integer coordinates, with 0 ≤ x ≤ M − 1 and 0 ≤ y ≤ N − 1. Therefore, both images
f and g are of size M × N. We define a neighborhood centered at the point (x, y) by
S_(x,y) = {(x + s, y + t) : −a ≤ s ≤ a, −b ≤ t ≤ b},
where a, b ≥ 0 are integers. The size of the patch S_(x,y) is (2a + 1)(2b + 1), and we denote m = 2a + 1 and n = 2b + 1 (odd integers); thus the size of the patch becomes m · n, with a = (m − 1)/2 and b = (n − 1)/2.
For example, the restriction f|_{S_(x,y)} for a 3 × 3 neighborhood S_(x,y) is represented by
f|_{S_(x,y)} =
  f(x − 1, y − 1)   f(x − 1, y)   f(x − 1, y + 1)
  f(x, y − 1)       f(x, y)       f(x, y + 1)
  f(x + 1, y − 1)   f(x + 1, y)   f(x + 1, y + 1).
We are now ready to define a linear spatial filter. The output image g is defined by
g(x, y) = Σ_{s=−a}^{a} Σ_{t=−b}^{b} w(s, t) f(x + s, y + t),   (2)
with the condition that care must be taken at the boundary of the image f , in the case when
f (x + s, y + t) is not well defined. To solve this problem, we can extend the image f by zero in a
band outside of its original domain, or we can extend it by periodicity, or by reflection (mirroring).
In all these cases, all values f (x + s, y + t) will be well defined.
It is left as an exercise to show that the operation f ↦ g defined by (2) is a linear operation (with respect to real scalars, satisfying additivity and scalar multiplication).
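The following Matlab sketch implements (2) directly, with the reflection (mirror) extension mentioned above; the mask, the test image and the loop structure are only one possible choice:

% Linear spatial filtering, equation (2), with reflection at the boundary.
f = double(imread('cameraman.tif'));
w = ones(3) / 9;                        % example mask: 3x3 box filter
[m, n] = size(w);  a = (m - 1) / 2;  b = (n - 1) / 2;
[M, N] = size(f);
% extend f by reflection so that f(x+s, y+t) is always defined
fe = f([a+1:-1:2, 1:M, M-1:-1:M-a], [b+1:-1:2, 1:N, N-1:-1:N-b]);
g = zeros(M, N);
for s = -a:a
    for t = -b:b
        g = g + w(s+a+1, t+b+1) * fe((1:M)+s+a, (1:N)+t+b);
    end
end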
For a 3 × 3 window mask and neighborhood, the filter in (2) becomes
g(x, y) = Σ_{s=−1}^{1} Σ_{t=−1}^{1} w(s, t) f(x + s, y + t).
We distinguish two classes of linear spatial filters:
(i) smoothing linear spatial filters
(ii) sharpening linear spatial filters
Smoothing can be understood as local averaging (or blurring), which is similar to spatial summation or spatial integration. Small details and noise will be lost in the smoothing process, but sharp edges will become blurry.
Sharpening has the opposite effect to smoothing. A blurry image f can be made sharper through this process; details like edges will be enhanced.
Smoothing linear spatial filters. Let us give two examples of smoothing spatial masks w of size 3 × 3. The first one gives the average filter or the box filter,
w = (1/9) [1 1 1; 1 1 1; 1 1 1],
and the second one can be seen as a discrete version of a two-dimensional Gaussian function e^{−(x² + y²)/(2σ²)}:
w = (1/16) [1 2 1; 2 4 2; 1 2 1]
(the coefficients of w decrease as we move away from the center). Note that, for both these filter masks w, the sum of the coefficients Σ_{s=−1}^{1} Σ_{t=−1}^{1} w(s, t) = 1, so the input image f and the output image g have the same average intensity. In general, the normalized weighted-average smoothing filter is
g(x, y) = Σ_{s=−a}^{a} Σ_{t=−b}^{b} w(s, t) f(x + s, y + t) / Σ_{s=−a}^{a} Σ_{t=−b}^{b} w(s, t).
The above equation is computed for all (x, y) with x = 0, 1, 2, ..., M − 1 and y = 0, 1, 2, ..., N − 1.
Note that the denominator is a constant, thus it has to be computed only once for all (x, y).
Additional mean filters (linear and nonlinear) will be discussed later in the section on image
denoising.
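A short Matlab sketch comparing the two smoothing masks above (conv2 uses zero padding at the border; since both masks are symmetric, convolution and correlation coincide here):

% Box filter versus the Gaussian-like 3x3 mask.
f = double(imread('cameraman.tif'));
w_box   = ones(3) / 9;                     % average (box) filter
w_gauss = [1 2 1; 2 4 2; 1 2 1] / 16;      % discrete Gaussian-like mask
g_box   = conv2(f, w_box,   'same');
g_gauss = conv2(f, w_gauss, 'same');
% both masks sum to 1, so constant regions of f are left unchanged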
Sharpening linear filters In this section, we will present a method for image enhancement using
the Laplacian. Let’s assume for a moment that the image function f has second order partial
derivatives. We define the Laplacian of f in the continuous case by
∆f(x, y) = ∂²f/∂x²(x, y) + ∂²f/∂y²(x, y).
You will prove as an exercise that the mapping f ↦ ∆f is linear and rotationally invariant.
We remember from numerical analysis that, in one dimension, the second order derivative f″ at the point x can be approximated by finite differences of values of f at x, x + h, x − h:
f″(x) ≈ (f(x + h) − 2f(x) + f(x − h)) / h²,   (3)
where h > 0 is a small discretization parameter.
First we approximate ∂²f/∂x²(x, y) using (3) keeping y fixed, then ∂²f/∂y²(x, y) using (3) keeping x fixed:
∂²f/∂x²(x, y) ≈ (f(x + h, y) − 2f(x, y) + f(x − h, y)) / h²,
∂²f/∂y²(x, y) ≈ (f(x, y + h) − 2f(x, y) + f(x, y − h)) / h².
Taking h = 1 (pixels at integer coordinates), we obtain the 5-point discrete Laplacian
∆f(x, y) ≈ f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y),   (4)
Figure 1: Illustration of the enhancement using the Laplacian. Left: a blurry 1D edge of the image f, and the sign of f″. Right: the resulting g = f − f″ as an ideal sharp edge.
which can be applied to discrete images f. We notice that the 5-point Laplacian from (4) can be obtained using a linear spatial filter with "Laplacian mask" w = [0 1 0; 1 −4 1; 0 1 0].
It is possible to show that the Laplacian can also be discretized by a 9-point Laplacian, with formula
∆f(x, y) ≈ f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) + f(x + 1, y + 1) + f(x − 1, y − 1) + f(x + 1, y − 1) + f(x − 1, y + 1) − 8f(x, y),
and corresponding Laplacian mask w = [1 1 1; 1 −8 1; 1 1 1].
We notice that both Laplacian masks w have the sum of their coefficients equal to 0, which is related to the sharpening property of the filter. We will now show how the Laplacian can be used to enhance images.
Consider a smoothed 1D edge profile f as shown in Figure 1 (left), where the sign of f″ is also given. As we can see, applying the operation g = f − f″, as shown in Figure 1 (right), produces an ideal sharp edge.
Similarly, for two-dimensional images, if the input f is blurry, it can be enhanced by the operation g(x, y) = f(x, y) − ∆f(x, y). In the discrete case, if we use the 5-point Laplacian (4), this corresponds to the following spatial linear filter and mask:
g(x, y) = f(x, y) − ∆f(x, y) = 5f(x, y) − f(x + 1, y) − f(x − 1, y) − f(x, y + 1) − f(x, y − 1),   (5)
w = [0 −1 0; −1 5 −1; 0 −1 0].
Similarly, the enhancement using the 9-point Laplacian has the mask w = [−1 −1 −1; −1 9 −1; −1 −1 −1].
Unsharp Masking and Highboost Filtering Image enhancement can also be obtained using
smoothing spatial filters, instead of the Laplacian. Assume that f is the input image (that may be
blurry). We first compute a smoother image f_smooth by applying a smoothing linear filter to f, as discussed before. Then the sharper output image g is defined by
g(x, y) = f(x, y) + k · (f(x, y) − f_smooth(x, y)),   (6)
where k ≥ 0 is a coefficient. When k = 1 the method is called unsharp masking. When k > 1, it
is called highboost filtering, since the edges and details are even more highlighted.
Consider the case k = 1 in equation (6). When we compute f − f_smooth, this gives negative values on the left of the edge, positive values on the right of the edge, and values close to 0 elsewhere. Thus, when we compute f + (f − f_smooth), this provides the sharper corrected edge in g.
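Both sharpening methods of this section can be tried in a few lines of Matlab; the mask and the coefficient k below follow equations (5) and (6), while the test image is an arbitrary choice:

% Laplacian sharpening (5) and unsharp masking / highboost filtering (6).
f = double(imread('cameraman.tif'));
w_lap = [0 -1 0; -1 5 -1; 0 -1 0];         % mask for g = f - Laplacian(f)
g_lap = conv2(f, w_lap, 'same');
f_smooth  = conv2(f, ones(3)/9, 'same');   % smoothed version of f
k = 1;                                     % k = 1: unsharp masking; k > 1: highboost
g_unsharp = f + k * (f - f_smooth);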
4 The Fourier Transform and Filtering in the Frequency Domain
Illustration of Fourier series. Consider the inner product vector space of (2π-periodic, square-integrable) functions; such a function f can be expanded in a Fourier series
f(x) = Σ_{n=−∞}^{∞} c_n e^{inx}, where c_n = (1/2π) ∫_0^{2π} f(x) e^{−inx} dx.
Thus the function f can be expressed as a sum of sines and cosines with appropriate coefficients.
The continuous and discrete Fourier transforms Remember that a linear discrete filter applied
to an image f (x, y) was defined by
g(x, y) = Σ_{m=−a}^{a} Σ_{n=−b}^{b} h(m, n) f(x + m, y + n),
where h was a window mask. In the continuous case, if h, f are functions in the plane, we can
define the correlation of two functions (as a continuous version of the spatial linear filter)
(h ◦ f)(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(m, n) f(x + m, y + n) dm dn.   (7)
If we change the + signs into − signs inside f in formula (7), we obtain the convolution of the two functions, defined by
(h ∗ f)(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(m, n) f(x − m, y − n) dm dn.   (8)
Thus the convolution h ∗ f can also be seen as a linear filtering in the spatial domain: filtering the image f with the convolution kernel h. The direct computation of the discrete convolution h ∗ f is very expensive, especially if the support of h is not small. It turns out that, using the Fourier transform F that we will define, we have:
if H = F(h) and F = F(f), then h ∗ f corresponds to HF (pointwise multiplication),
which gives a fast computation.
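A small Matlab check of this equivalence for the discrete (periodic) convolution; fft2 and ifft2 implement the DFT and IDFT defined below, and placing the kernel at the corner of the array introduces a small circular shift:

% Convolution theorem: the DFT turns (circular) convolution into multiplication.
f = double(imread('cameraman.tif'));
[M, N] = size(f);
h = zeros(M, N);
h(1:3, 1:3) = [1 2 1; 2 4 2; 1 2 1] / 16;  % small kernel embedded in an MxN array
g = real(ifft2(fft2(h) .* fft2(f)));       % h * f computed in the frequency domain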
We recall Euler's formula: if θ is a real number, then e^{iθ} = cos θ + i sin θ. If α, β are real numbers, then e^{α+iβ} = e^α (cos β + i sin β), where i is the imaginary unit, i² = −1.
In one dimension first, we consider a function f(x) satisfying ∫_{−∞}^{∞} |f(x)| dx < ∞. Then we can define F : R → C, the Fourier transform of f, by
F(u) = ∫_{−∞}^{∞} f(x) e^{−2πiux} dx,   F(f) = F.
We say that [f, F] forms a transform pair. It is easy to verify that the mapping f ↦ F(f) is a linear transformation.
In two dimensions, similarly, F(u, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) e^{−2πi(ux+vy)} dx dy. Since the transform F takes complex values, we have F(u, v) = R(u, v) + iI(u, v) (using the real and imaginary parts of F). We call |F(u, v)| = √(R(u, v)² + I(u, v)²) the Fourier spectrum.
We have the convolution theorem, that we prove in one dimension:
Convolution Theorem If H = F(h) and F = F(f ), then F(h ∗ f ) = HF , or F(h ∗ f ) =
F(h)F(f ).
Proof: We present the details of the proof in one dimension. The two-dimensional case is similar.
Recall the 1D convolution
(h ∗ f)(x) = ∫_{−∞}^{∞} h(m) f(x − m) dm.
Then
F(h ∗ f) = ∫_{−∞}^{∞} (h ∗ f)(x) e^{−2πiux} dx = ∫_{−∞}^{∞} [∫_{−∞}^{∞} h(m) f(x − m) dm] e^{−2πiux} dx
= ∫_{−∞}^{∞} h(m) [∫_{−∞}^{∞} f(x − m) e^{−2πiux} dx] dm.
By the change of variable x − m = X, we obtain
F(h ∗ f) = ∫_{−∞}^{∞} h(m) [∫_{−∞}^{∞} f(X) e^{−2πiu(m+X)} dX] dm
= [∫_{−∞}^{∞} h(m) e^{−2πium} dm] [∫_{−∞}^{∞} f(X) e^{−2πiuX} dX] = H(u) F(u).
In the discrete case, we substitute the real numbers x, y, u, v by xΔx, yΔy, uΔu, vΔv, where now x, y, u, v are integers and Δx, Δy, Δu, Δv denote discretization steps. We make the convention that ΔxΔu = 1/M and ΔyΔv = 1/N. Then for an image f(x, y), with x = 0, 1, ..., M − 1, y = 0, 1, ..., N − 1, the two-dimensional discrete Fourier transform (2D DFT) and its inverse (2D IDFT) are defined by
F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−2πi(ux/M + vy/N)},   u = 0, 1, ..., M − 1, v = 0, 1, ..., N − 1,
and
f(x, y) = (1/MN) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} F(u, v) e^{2πi(ux/M + vy/N)},   x = 0, 1, ..., M − 1, y = 0, 1, ..., N − 1
(the one-dimensional versions of the discrete transforms are defined similarly).
Note that
F(0, 0) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^0 = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) = (MN) · average(f),
thus the value of the transform at the origin is proportional to the average of f.
Shifting the center of the transform: We have the formula F(f(x, y)(−1)^{x+y}) = F(u − M/2, v − N/2). Indeed, using (−1)^{x+y} = e^{−2πi(−xM/(2M) − yN/(2N))}, we have
F(f(x, y)(−1)^{x+y}) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f(x, y) e^{−2πi(−xM/(2M) − yN/(2N))}] e^{−2πi(ux/M + vy/N)}
= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−2πi(x(u − M/2)/M + y(v − N/2)/N)} = F(u − M/2, v − N/2).
From now on, we will assume that we work with the centered spectrum. Please see the handout
posted on the course webpage as an illustration of the shifting of the origin of the transform 0 to
M/2, in one dimension.
4.1 Principles of Filtering in the Frequency Domain
We assume that we work with the shifted transform based on the above property, thus the origin of
the transform (0, 0) has been moved to the center of the domain (M/2, N/2).
Usually it is impossible to make direct associations between the components of an image f and
its transform F . However, we can make connections between the frequency components of the
transform and spatial features of the image. We have
• frequency is directly related to spatial rates of change
• we have seen that the slowest varying frequency component ((u = 0, v = 0) before centering
or (u = M/2, v = N/2) after centering) is proportional to the average of the image intensity f (no
variation)
• as we move away from the origin (or from the center (u = M/2, v = N/2) after shifting),
the low frequency components correspond to slowly varying intensity components in the image
• as we move farther away from the origin (or from the center (u = M/2, v = N/2) after
shifting), high frequency components correspond to faster gray-level changes
Basics of filtering in the frequency domain:
1. Multiply f(x, y) by (−1)^{x+y}.
2. Compute F(u, v) = DFT[f(x, y)(−1)^{x+y}] (here F(u, v) is in fact F(u − M/2, v − N/2), but we keep the simpler notation F(u, v)).
3. Multiply F by a real "filter" function H: G(u, v) = H(u, v)F(u, v) (pointwise multiplication, not matrix multiplication).
4. Compute the discrete IDFT of G (of the result in 3).
5. Take the real part of the result in 4 (in order to ignore parasitic imaginary components due to computational inaccuracies).
6. Multiply the result in 5 by (−1)^{x+y} to obtain the filtered image g.
H will suppress certain frequencies in the transform, while leaving other frequencies unchanged.
Let h be such that F(h) = H. Due to the Convolution Theorem, linear filtering in the spatial domain with h is equivalent to linear filtering in the frequency domain with the filter H: h ∗ f corresponds to HF.
As we have mentioned, the low frequencies of the transform correspond to smooth regions in
the image (homogeneous parts). High frequencies of the transform correspond to edges, noise,
details.
A low-pass filter (LPF) H leaves low frequencies unchanged, while attenuating the high fre-
quencies. This is a smoothing filter.
A high-pass filter (HPF) H leaves high frequencies unchanged, while attenuating the low fre-
quencies. This is a sharpening filter.
Let D(u, v) = dist((u, v), (M/2, N/2)) = √((u − M/2)² + (v − N/2)²) be the distance from the frequency (u, v) to the center of the domain, and let D_0 ≥ 0 be a parameter.
Examples of low-pass filters:
• Ideal low-pass filter: H(u, v) = 1 if D(u, v) ≤ D_0, and H(u, v) = 0 if D(u, v) > D_0. This filter produces the so-called ringing effect or Gibbs effect (oscillations near boundaries and edges), due to the sharp transition between 0 and 1.
The following two filters produce less ringing effect.
• Let n be a positive parameter. The Butterworth low-pass filter of order n: H(u, v) = 1 / (1 + (D(u, v)/D_0)^{2n}).
• The Gaussian low-pass filter: H(u, v) = e^{−D(u, v)²/(2D_0²)}.
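Putting the six filtering steps together with the Gaussian low-pass filter, a Matlab sketch (the cutoff D_0 is an arbitrary choice):

% Gaussian low-pass filtering in the frequency domain (steps 1-6 above).
f = double(imread('cameraman.tif'));
[M, N] = size(f);
[c, r] = meshgrid(0:N-1, 0:M-1);           % column and row indices
F = fft2(f .* (-1).^(r + c));              % steps 1-2: centered DFT
D2 = (r - M/2).^2 + (c - N/2).^2;          % D(u,v)^2, squared distance to the center
D0 = 40;                                   % cutoff parameter D_0
H = exp(-D2 / (2 * D0^2));                 % Gaussian low-pass filter
g = real(ifft2(H .* F)) .* (-1).^(r + c);  % steps 3-6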
The Laplacian in the frequency domain. Assuming that f(x) is a one-dimensional function defined on the real line in the continuous case, and that f has derivatives up to order n, we have the following formula expressing the Fourier transform of the derivative, F(∂ⁿf/∂xⁿ), in terms of the Fourier transform of f:
F(∂ⁿf/∂xⁿ) = (2πiu)ⁿ F(u).   (9)
To prove this formula, consider the IFT
f(x) = ∫_{−∞}^{∞} F(u) e^{2πiux} du;
then, differentiating both sides with respect to x (a parameter under the integral sign), we obtain
∂f/∂x = ∫_{−∞}^{∞} [F(u)(2πiu)] e^{2πiux} du,
thus ∂f/∂x has Fourier transform (2πiu)F(u); iterating n times gives (9). In the same way, in two dimensions, F(∆f)(u, v) = −4π²(u² + v²) F(u, v), thus the Laplacian corresponds to the filter H_lapl(u, v) = −4π²(u² + v²).
Due to the shifting property of the center, we define H_lapl as
H_lapl(u, v) = −4π²[(u − M/2)² + (v − N/2)²].
Enhancement using the Laplacian in the frequency domain
Let f be the input image, and g be the output enhanced image obtained by g = f − ∆f. This operation can also be performed in the frequency domain, using the linearity of the transform and the above Laplacian filter H_lapl:
F(f − ∆f) = F(f) − F(∆f) = F(u, v) + 4π²(u² + v²)F(u, v) = (1 + 4π²(u² + v²))F(u, v) = (1 − H_lapl(u, v))F(u, v).
Using the shifting property of the center, we obtain enhancement using the Laplacian in the frequency domain by the following steps:
1. F(u, v) = 2D DFT(f(x, y)(−1)^{x+y})
2. H_enhance(u, v) = 1 + 4π²((u − M/2)² + (v − N/2)²)
3. G(u, v) = H_enhance(u, v) F(u, v)
4. compute the 2D IDFT of G and take its real part
5. multiply the result by (−1)^{x+y} to obtain the enhanced image g.
Final remarks:
- The direct and inverse discrete Fourier transforms are computed using the Fast Fourier Trans-
form (FFT) algorithm.
- Additional properties of the continuous and discrete Fourier transforms are given in the assigned exercises.
- Additional applications of filtering in the Fourier domain will be seen in the sections on Image
Reconstruction and Image Segmentation.
5 Image Restoration
Image restoration is the process of recovering an image that has been degraded, using a priori knowledge of the degradation process.
Degradation model: Let f (x, y) be the true ideal image that we wish to recover, and let g(x, y) be
its degraded version. One relation that links f to g is the degradation model
g(x, y) = H[f(x, y)] + n(x, y),
where H denotes a degradation operator (e.g. blur) and n is additive noise.
Inverse Problem: Knowing the degradation operator H and statistics of the noise n, find a good
estimate fˆ of f .
Random noise We assume here that the noise intensity levels can be seen as a random variable,
with associated histogram or probability distribution function (PDF) denoted by p(r). We also
assume that the noise n is independent of the image f and independent of spatial coordinates
(x, y).
Most common types of noise:
• Gaussian noise (additive), with associated PDF given by p(r) = (1/(√(2π) σ)) e^{−(r−µ)²/(2σ²)}, where µ is the mean and σ denotes the standard deviation.
• Uniform noise (additive), with associated PDF given by p(r) = 1/(B − A) if A ≤ r ≤ B, and p(r) = 0 otherwise.
• Impulse noise (salt-and-pepper, or bipolar; not additive), with associated PDF given by p(r) = p_A if r = A, p(r) = p_B if r = B, and p(r) = 0 otherwise.
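The three noise models are easy to simulate in Matlab; the intensity range [0, 255] and all parameter values below are our own assumptions:

% Simulating Gaussian, uniform and salt-and-pepper noise.
f = double(imread('cameraman.tif'));
g_gauss = f + 20 * randn(size(f));         % Gaussian noise, mu = 0, sigma = 20
g_unif  = f + 40 * rand(size(f));          % uniform noise on [A,B] = [0,40]
u = rand(size(f));                         % impulse (salt-and-pepper) noise:
g_sp = f;
g_sp(u < 0.05) = 0;                        % "pepper" with probability p_A = 0.05
g_sp(u > 0.95) = 255;                      % "salt" with probability p_B = 0.05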
Mean filters for random noise removal
Let g be the input noisy image, and f̂ be the output denoised image. Let S_(x,y) be a neighborhood of the pixel (x, y) defined by
S_(x,y) = {(x + s, y + t) : −a ≤ s ≤ a, −b ≤ t ≤ b},
of size mn, where m = 2a + 1 and n = 2b + 1 are positive integers.
• Arithmetic Mean Filter
f̂(x, y) = (1/mn) Σ_{(s,t)∈S_(x,y)} g(s, t).
This filter is useful for removing Gaussian noise or uniform noise, but blurring is introduced.
• Geometric Mean Filter
f̂(x, y) = [Π_{(s,t)∈S_(x,y)} g(s, t)]^{1/mn}.
This filter introduces blurring comparable with the arithmetic mean filter, but tends to lose less image detail in the process.
• Contraharmonic Mean Filter of order Q
f̂(x, y) = Σ_{(s,t)∈S_(x,y)} g(s, t)^{Q+1} / Σ_{(s,t)∈S_(x,y)} g(s, t)^{Q}.
• Median Filter
f̂(x, y) = median_{(s,t)∈S_(x,y)} g(s, t),
the 50% ranking of the ordered values of g in the window. For example, for a 3 × 3 median filter, if g|_{S_(x,y)} = [1 5 20; 200 5 25; 25 9 100], then we first order the values in the window as follows: 1, 5, 5, 9, 20, 25, 25, 100, 200. The 50% ranking (5th value here) is 20, thus f̂(x, y) = 20.
The median filter introduces much less blurring than the other filters of the same window size.
It can be used for salt noise, pepper noise, or salt-and-pepper noise. Note that the median operation
is not a linear operation.
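A direct (unoptimized) Matlab sketch of the 3 × 3 median filter, processing interior pixels only for brevity:

% 3x3 median filtering of a salt-and-pepper corrupted image.
g = double(imread('cameraman.tif'));
g(rand(size(g)) < 0.05) = 0;               % add some "pepper" noise
[M, N] = size(g);
f_hat = g;
for x = 2:M-1
    for y = 2:N-1
        win = g(x-1:x+1, y-1:y+1);         % 3x3 neighborhood S_(x,y)
        f_hat(x, y) = median(win(:));      % 50% ranking of the 9 ordered values
    end
end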
The following two filters can be seen as combinations of order statistics filters and averaging
filters, and can be used for several types of noise.
• Midpoint Filter
f̂(x, y) = (1/2) [ max_{(s,t)∈S_(x,y)} g(s, t) + min_{(s,t)∈S_(x,y)} g(s, t) ].
This filter works for randomly distributed noise, such as Gaussian noise or uniform noise.
• Alpha-trimmed mean filter
Let d ≥ 0 be an even integer such that 0 ≤ d ≤ mn − 1. We first order again the mn pixel values of the input image g in the window S_(x,y), and then we remove the lowest d/2 and the largest d/2. We denote the remaining mn − d values by g_r. Then
f̂(x, y) = (1/(mn − d)) Σ_{(s,t)∈S_(x,y)} g_r(s, t).
Remember that the normalized weighted linear filter with window mask w computes a weighted average over a spatial neighborhood of (x, y). The NL Means filter [1, 2], instead, computes a weighted average over an intensity neighborhood of g(X): denoting by patch(X) a small window of g centered at the pixel X, one defines
f̂(X) = (1/C(X)) Σ_{X′} e^{−‖patch(X) − patch(X′)‖²/h²} g(X′),   C(X) = Σ_{X′} e^{−‖patch(X) − patch(X′)‖²/h²},
where h > 0 is a filtering parameter. In the above formula, f̂(X) is a weighted average of all values g(X′) for which patch(X) is similar to patch(X′).
This method can be tested from the website:
http://www.ipol.im/pub/algo/bcm_non_local_means_denoising/
Periodic (deterministic) noise We assume here that the noise n(x, y) depends on the spatial
coordinates (x, y) and it may be due to electrical or electromechanical interference. An example
of periodic noise is n(x, y) = α cos(x + y) + β sin(x + y).
We define a unit impulse (or delta function) by
δ(u, v) = 1 if (u, v) = (0, 0), and δ(u, v) = 0 otherwise.
If we visualize δ as an image, it will be a bright white dot at (0, 0) and black otherwise.
Let u_0 and v_0 be two parameters. It is possible to prove the following formula in the discrete case:
F[sin(2πu_0 x + 2πv_0 y)] = (i/2) [δ(u + M u_0, v + N v_0) − δ(u − M u_0, v − N v_0)].
Thus, if we visualize the Fourier spectrum of the "noisy" image g(x, y) = f(x, y) + αn(x, y), with n(x, y) = sin(2πu_0 x + 2πv_0 y) as periodic noise, we would notice two bright regions, symmetrically located, corresponding to the two impulse functions appearing in the above formula.
Thus, in order to remove such periodic noise, we define two small rectangles or disks A and B, each containing one of the two bright regions, and a filter H(u, v) in the frequency domain such that
H(u, v) = 0 if (u, v) ∈ A ∪ B, and H(u, v) = 1 otherwise.
This type of filter is called a notch filter. Note that Gaussian-type filters, or Butterworth-type filters with a smoother transition between 0 and 1, can also be adapted to this application. The filtered image f̂ should have the periodic pattern removed, while keeping most of the details of f.
Note that there is an improved version of the notch filter for removing periodic noise, called the "optimum notch filter" [7].
5.2 Image Deblurring
Linear and position invariant degradation functions Recall the degradation model g = H[f ]+
n, where H is a degradation operator and n is additive noise. Assume first that n(x, y) = 0 (thus
there is no noise). We need the following definitions and properties on H:
- we assume that H is linear: let f1 and f2 be two functions, then
H[f_1 + f_2] = H[f_1] + H[f_2] (additivity),   H[αf] = αH[f] (scalar multiplication).
- we assume that the additivity property is extended to integrals
- we assume that H is position invariant: if g(x, y) = H[f (x, y)], then g(x − α, y − β) =
H[f (x − α, y − β)] for all x, y, α, β and all functions f and g.
- we define a continuous impulse function δ such that (by definition) f ∗ δ = f for any other function f. In other words, δ is defined by δ(x, y) = ∞ if (x, y) = (0, 0) and δ(x, y) = 0 otherwise, and the property f ∗ δ = f becomes
f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(α, β) δ(x − α, y − β) dα dβ.   (10)
We prove that, under the above assumptions and notations, the operator H is a convolution.
Proof. We have g(x, y) = H[f(x, y)], thus according to (10) we have
g(x, y) = H[ ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(α, β) δ(x − α, y − β) dα dβ ].
Now f(α, β) is constant with respect to (x, y), thus by the scalar multiplication property (extended to integrals) we obtain
g(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(α, β) H[δ(x − α, y − β)] dα dβ.
Denote h(x, y) = H[δ(x, y)] (the point spread function, or PSF); by the position invariance property of H, we have h(x − α, y − β) = H[δ(x − α, y − β)], and g(x, y) becomes
g(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(α, β) h(x − α, y − β) dα dβ,
that is, g = h ∗ f: the degradation operator is the convolution with the kernel h.
Examples of the degradation functions
We give below a few examples of linear degradation functions h or H, obtained by modeling
the physical phenomenon for the image acquisition process.
• Atmospheric Turbulence Blur
In the frequency domain, this is defined by H(u, v) = e^{−k(u² + v²)^{5/6}}, where k is a constant depending on the degree of turbulence:
k = 0.0025 corresponds to severe turbulence,
k = 0.001 corresponds to mild turbulence,
k = 0.00025 corresponds to low turbulence.
A Gaussian low-pass filter can also be used to model atmospheric turbulence blur.
• Out-of-Focus Blur
In the spatial domain, this is defined by h(x, y) = 1 if x² + y² ≤ D_0², and h(x, y) = 0 otherwise (a larger parameter D_0 gives more severe blur).
• Uniform Linear Motion Blur
Assume that the image f (x, y) undergoes planar motion during the acquisition. Let (x0 (t), y0 (t))
be the motion components in the x and y-directions. Here, t denotes time and T is the duration of
the exposure. The motion blur degradation is
g(x, y) = ∫_0^T f(x − x_0(t), y − y_0(t)) dt.
We want to express the motion blur in the frequency domain. Let G = F(g); thus, by the definition,
G(u, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) e^{−2πi(ux+vy)} dx dy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} [∫_0^T f(x − x_0(t), y − y_0(t)) dt] e^{−2πi(ux+vy)} dx dy
= ∫_0^T [∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x − x_0(t), y − y_0(t)) e^{−2πi(ux+vy)} dx dy] dt
= ∫_0^T F(u, v) e^{−2πi(u x_0(t) + v y_0(t))} dt   (by the translation property of the transform),
thus G(u, v) = H(u, v)F(u, v), with the motion degradation function in the frequency domain
H(u, v) = ∫_0^T e^{−2πi(u x_0(t) + v y_0(t))} dt.
Image reconstruction from projections. In computed tomography, X-rays traverse an object whose different parts attenuate the energy or the radiation in a different way. The data g is given in the Radon domain, and the relation between f and g can be defined by
g(ρ, θ) = R(f) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) δ(x cos θ + y sin θ − ρ) dx dy,
or
g(ρ, θ) = ∫_{line x cos θ + y sin θ = ρ} f,
where (ρ, θ) are the parameters describing a line. The image g(ρ, θ) for 0 ≤ ρ ≤ r and 0 ≤ θ ≤ π is called a sinogram. The problem is to recover f from g. In reality, we only have a finite (limited) number of "projections" (lines). R(f) is called the Radon transform of f.
Image reconstruction from blurry-noisy data. We will discuss several models for image deconvolution (with or without noise).
The simplest approach is called direct inverse filtering. In the ideal case without noise, G(u, v) = H(u, v)F(u, v), thus to recover F we could define F̂(u, v) = G(u, v)/H(u, v). However, this approach has at least two problems:
(i) In reality there is unknown noise N, thus the real relation would be F̂(u, v) = G(u, v)/H(u, v) = F(u, v) + N(u, v)/H(u, v). So even if we know the degradation function H(u, v), we cannot recover F exactly, because N(u, v) is unknown.
(ii) The blurring filter H(u, v) usually is zero or has very small values (close to zero) away from the origin (or away from the center of the domain). Thus the division by H(u, v) would not be defined at many values (u, v), or the term N(u, v)/H(u, v) would dominate and lead to an incorrect reconstruction.
The Wiener Filter is given by
F̂(u, v) = [1/H(u, v)] · [|H(u, v)|² / (K + |H(u, v)|²)] · G(u, v),
where K > 0 is a positive parameter, chosen so that visually good results are obtained. This filter can be interpreted in the following way:
(i) First, the factor |H(u, v)|²/(K + |H(u, v)|²) acts like a low-pass filter to remove the noise.
(ii) Then, the factor 1/H(u, v) acts like in direct inverse filtering (direct deconvolution). But we no longer have division by very small values, since the filter is equivalent to
F̂(u, v) = [H̄(u, v) / (K + |H(u, v)|²)] G(u, v),
where H̄ denotes the complex conjugate of H (using |H|²/H = H̄).
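A Matlab sketch of Wiener deconvolution under a periodic (circular) blur model; the kernel, noise level and the value of K are illustrative assumptions:

% Wiener filtering in the frequency domain.
f = double(imread('cameraman.tif'));
[M, N] = size(f);
h = zeros(M, N);  h(1:5, 1:5) = ones(5) / 25;     % a simple 5x5 blur kernel
H = fft2(h);
g = real(ifft2(H .* fft2(f))) + 2 * randn(M, N);  % blurred and noisy data
K = 0.01;                                         % chosen by visual inspection
F_hat = conj(H) ./ (K + abs(H).^2) .* fft2(g);    % conj(H)/(K + |H|^2) * G
f_hat = real(ifft2(F_hat));                       % (shifted by the kernel placement)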
Error measures. Assume that we perform an artificial experiment, so that we know the true image f that we wish to recover. Once we have a denoised or deblurred image f̂, we can measure in several ways how good the restored result f̂ is. We give two examples of measures:
• Root-Mean-Square Error (RMSE): RMSE = √( (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f(x, y) − f̂(x, y)]² ).
• Signal-to-Noise Ratio (SNR): SNR = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f̂(x, y)² / Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f(x, y) − f̂(x, y)]².
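In Matlab, given the true image f and a restored image f_hat as double arrays of the same size, these two measures are one-liners:

% Error measures between f and f_hat.
rmse = sqrt(mean((f(:) - f_hat(:)).^2));              % Root-Mean-Square Error
snr  = sum(f_hat(:).^2) / sum((f(:) - f_hat(:)).^2);  % Signal-to-Noise Ratio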
5.3 Energy minimization methods for image reconstruction
Consider first the denoising problem for simplicity. Let g be the given noisy image, f the image to
be restored, linked through the relation g(x, y) = f (x, y) + n(x, y).
Let's recall for a moment the enhancement method using the Laplacian: g = f − ∆f, where f was the input blurry image and g was the sharper output image. However, to remove noise, we need the opposite process of smoothing. So let's invert the relation g = f − ∆f to obtain
f = g + ∆f,   (11)
where g is the input noisy image and f is the denoised processed image. This is a linear partial
differential equation, with unknown f . There are many ways to solve (11).
• One way is using the Fourier transform: (11) is equivalent to
F(f) = F(g + ∆f),
thus, by the linearity of the Fourier transform, we first have
F(f) = F(g) + F(∆f),
or
F(u, v) = G(u, v) + [−4π²(u² + v²)] F(u, v),
therefore
F(u, v) = G(u, v) / (1 + 4π²(u² + v²)),
and we notice that 1/(1 + 4π²(u² + v²)) acts as a low-pass filter.
• Another way is by discretization in the spatial domain (assuming that f is extended by reflection outside of its domain): a discrete version of equation (11) can be obtained using the 5-point Laplacian, with (x, y) integer coordinates,
f(x, y) = g(x, y) + [f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y)],   ∀(x, y).   (12)
Equation (12) can be solved by direct methods or by iterative methods for linear systems. As an example of an iterative method, the simplest one is as follows: let g(x, y) be the given noisy data, defined for x = 1, ..., M and y = 1, ..., N.
(i) Start with an initial guess f^0(x, y), defined for all (x, y) in the original domain x = 1, ..., M and y = 1, ..., N. Then extend f^0 by reflection to the domain x = 0, ..., M + 1, y = 0, ..., N + 1 (sometimes we choose f^0 = g).
(ii) For integer n ≥ 0, and for all x = 1, ..., M, y = 1, ..., N:
f^{n+1}(x, y) = g(x, y) + [f^n(x + 1, y) + f^n(x − 1, y) + f^n(x, y + 1) + f^n(x, y − 1) − 4f^n(x, y)].
(iii) Impose boundary conditions by reflection:
Let f^{n+1}(0, y) = f^{n+1}(2, y) and f^{n+1}(M + 1, y) = f^{n+1}(M − 1, y) for all y = 1, 2, ..., N.
Let f^{n+1}(x, 0) = f^{n+1}(x, 2) and f^{n+1}(x, N + 1) = f^{n+1}(x, N − 1) for all x = 1, 2, ..., M.
Let f^{n+1}(0, 0) = f^{n+1}(2, 2), f^{n+1}(0, N + 1) = f^{n+1}(2, N − 1), f^{n+1}(M + 1, 0) = f^{n+1}(M − 1, 2), f^{n+1}(M + 1, N + 1) = f^{n+1}(M − 1, N − 1).
(iv) If the matrix norm satisfies ‖f^{n+1} − f^n‖ ≤ tolerance, stop and output f = f^{n+1} on the domain x = 1, ..., M and y = 1, ..., N.
(v) Otherwise, let n = n + 1 and repeat steps (ii)-(iv) until convergence.
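One convenient way to implement steps (i)-(v) in Matlab is through the equivalent Jacobi form of (12), 5f(x, y) = g(x, y) + (sum of the four neighbors); a sketch, not an optimized solver:

% Jacobi iteration for equation (12), with reflection boundary conditions.
g = double(imread('cameraman.tif'));
g = g + 10 * randn(size(g));               % noisy data
[M, N] = size(g);
f = g;                                     % initial guess f^0 = g
for it = 1:500
    fe = f([2, 1:M, M-1], [2, 1:N, N-1]);  % extend by reflection, step (iii)
    nb = fe(1:M, 2:N+1) + fe(3:M+2, 2:N+1) + fe(2:M+1, 1:N) + fe(2:M+1, 3:N+2);
    fnew = (g + nb) / 5;                   % Jacobi update of step (ii)
    if norm(fnew - f, 'fro') <= 1e-3, f = fnew; break; end   % step (iv)
    f = fnew;
end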
We want to show now that the solution f of (11) could also be obtained as a minimizer of the following discrete energy (taking into account the extension by reflection outside of the domain),
E(f) = Σ_{x=1}^{M} Σ_{y=1}^{N} [f(x, y) − g(x, y)]² + Σ_{x=1}^{M} Σ_{y=1}^{N} [(f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²].   (13)
Indeed, assume now that some discrete image function f is a minimizer of E. Then, since E is differentiable, we must have ∂E/∂f(x, y) = 0 at all points (x, y). Compute
∂E/∂f(x, y) = 2(f(x, y) − g(x, y)) + 2[f(x + 1, y) − f(x, y)](−1) + 2[f(x, y + 1) − f(x, y)](−1) + 2[f(x, y) − f(x − 1, y)] + 2[f(x, y) − f(x, y − 1)] = 0,
and rearranging the terms we obtain
f(x, y) = g(x, y) + [f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y)],
which is exactly equation (12). The quadratic regularization above smooths edges too strongly; replacing the squared gradient by the gradient magnitude (total variation) leads, in the continuous setting, to the equation
f = g + ∂/∂x( (1/|∇f|(x, y)) ∂f/∂x ) + ∂/∂y( (1/|∇f|(x, y)) ∂f/∂y ).   (14)
This is an anisotropic diffusion equation, which also comes from the following energy minimization:
min_f { J(f) = (1/2) ∫∫ (f(x, y) − g(x, y))² dxdy + ∫∫ |∇f| dxdy }.   (15)
When we deal with both denoising and deblurring, we modify (15) into
J(f) = (1/2) ∫∫ (h ∗ f(x, y) − g(x, y))² dxdy + λ ∫∫ |∇f| dxdy,   (16)
introduced by Rudin, Osher and Fatemi in [11, 12]. The first term in (16) is a data fidelity term,
while the second term is called total variation regularization. The parameter λ > 0 is a weight
between the fidelity term and the regularization term.
The fidelity term (1/2) ∫∫ (h ∗ f(x, y) − g(x, y))² dxdy is appropriate for additive Gaussian noise. If the image g is noisy due to additive Laplacian noise or due to impulse noise (salt-and-pepper noise), then the fidelity term is modified into the 1-norm ∫∫ |h ∗ f(x, y) − g(x, y)| dxdy.
We want now to show that the following anisotropic PDE
h̃ ∗ h ∗ f = h̃ ∗ g + λ ∂/∂x( (1/|∇f|(x, y)) ∂f/∂x ) + λ ∂/∂y( (1/|∇f|(x, y)) ∂f/∂y )   (17)
must be formally satisfied by a minimizer of the convex energy J in (16), where h̃(x, y) = h(−x, −y). To do this, we formally impose ∂J/∂f = 0, which will be an equivalent relation to (17).
Consider the discrete case for the purpose of illustration. Assume that the given noisy-blurry image g is a matrix of size M × N. We want to recover a denoised-deblurred image f as a matrix of size M × N. We denote H[f] = h ∗ f for simplicity, where H must be a linear operator, since h ∗ (f_1 + f_2) = h ∗ f_1 + h ∗ f_2, which is easy to verify. A discrete version of J is
J(f) = (1/2) Σ_{x=1}^{M} Σ_{y=1}^{N} (H[f](x, y) − g(x, y))² + λ Σ_{x=1}^{M} Σ_{y=1}^{N} √( (f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))² ),   (18)
assuming the space discretization steps Δx = Δy = 1, and x, y integers. Differentiating with respect to f(x, y), for (x, y) fixed, we obtain
∂J/∂f(x, y) = H^T(H[f] − g)(x, y)
+ λ [2(f(x + 1, y) − f(x, y))(−1) + 2(f(x, y + 1) − f(x, y))(−1)] / [2√((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)]
+ λ [2(f(x, y) − f(x − 1, y))] / [2√((f(x, y) − f(x − 1, y))² + (f(x − 1, y + 1) − f(x − 1, y))²)]
+ λ [2(f(x, y) − f(x, y − 1))] / [2√((f(x + 1, y − 1) − f(x, y − 1))² + (f(x, y) − f(x, y − 1))²)] = 0.
After simplifications, we obtain
∂J/∂f(x, y) = H^T(H[f] − g)(x, y)
− λ (f(x + 1, y) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)
− λ (f(x, y + 1) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)
− λ (f(x − 1, y) − f(x, y)) / √((f(x, y) − f(x − 1, y))² + (f(x − 1, y + 1) − f(x − 1, y))²)
− λ (f(x, y − 1) − f(x, y)) / √((f(x + 1, y − 1) − f(x, y − 1))² + (f(x, y) − f(x, y − 1))²) = 0,
or
H^T(H[f] − g)(x, y) = λ { (f(x + 1, y) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²) − (f(x, y) − f(x − 1, y)) / √((f(x, y) − f(x − 1, y))² + (f(x − 1, y + 1) − f(x − 1, y))²) }
+ λ { (f(x, y + 1) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²) − (f(x, y) − f(x, y − 1)) / √((f(x + 1, y − 1) − f(x, y − 1))² + (f(x, y) − f(x, y − 1))²) },
which can be seen as a discrete version of equation (17). But this is still a nonlinear difference
equation in f , for which we do not have explicit solutions. We apply an iterative explicit method
based on time-dependent gradient descent to solve it.
General gradient descent method. In order to solve the above equation for all (x, y), with associated boundary conditions, we apply the gradient descent method, based on the following general idea: assume that we need to solve
min_f J(f),
where f could be a number, a vector, or a function (in our case, f is a discrete image function, (x, y) ↦ f(x, y)). We also define a time-dependent equation for t ≥ 0. Let f(0) = f_0, and assume that f(t) solves
f′(t) = −J′(f(t)),   t > 0
(in our case, for image functions f, we would have (x, y, t) ↦ f(x, y, t), t ≥ 0, and the differential equation becomes ∂f/∂t = −J′(f(·, ·, t))).
We show that J(f(t)) is decreasing as t increases. Indeed,
d/dt J(f(t)) = J′(f(t)) f′(t) = J′(f(t)) (−J′(f(t))) = −|J′(f(t))|² ≤ 0,
thus J(f(t)) must be decreasing as t increases. So solving f′(t) = −J′(f(t)), t > 0, for f(t) decreases the original energy J(f), with f(t) → f as t → ∞, where f is a minimizer of J.
We apply this idea to our main problem, and we discretize the time t by nΔt, n integer. Let f^0(x, y) be an initial guess, defined for all (x, y) with 1 ≤ x ≤ M, 1 ≤ y ≤ N (and then extended by reflection). We want to have, as above, (f^{n+1} − f^n)/Δt = −J′(f^n), which becomes, for pixels (x, y) with x = 1, ..., M, y = 1, ..., N, and for n ≥ 0,
f^{n+1}(x, y) = f^n(x, y) − Δt (∂J/∂f(x, y))(f^n).
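A Matlab sketch of this time stepping for the pure denoising case (h = δ, so the fidelity gradient reduces to f − g), with forward/backward differences and a relatively large regularization of |∇f| to keep the explicit scheme stable; λ, Δt and the iteration count are illustrative choices:

% Explicit gradient descent for TV denoising: f_t = lambda*div(grad f/|grad f|) - (f - g).
g = double(imread('cameraman.tif'));
g = g + 15 * randn(size(g));
[M, N] = size(g);
lambda = 5;  dt = 0.01;  ep = 1;           % ep regularizes |grad f| = sqrt(fx^2+fy^2+ep)
f = g;
for it = 1:500
    fx = f([2:M, M], :) - f;               % forward differences (Neumann at the border)
    fy = f(:, [2:N, N]) - f;
    nr = sqrt(fx.^2 + fy.^2 + ep);
    px = fx ./ nr;  py = fy ./ nr;
    divx = px - px([1, 1:M-1], :);  divx(1, :) = px(1, :);   % backward divergence
    divy = py - py(:, [1, 1:N-1]);  divy(:, 1) = py(:, 1);
    f = f + dt * (lambda * (divx + divy) - (f - g));         % gradient descent step
end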
5.3.1 Computation of the first order optimality condition in the continuous case
We explain here the computation of the so-called "Euler-Lagrange equation" associated with the minimization of the energies that we have considered above, in the continuous case. Assume that f, g : Ω → R, where Ω is the image domain (a rectangle in the plane) and ∂Ω denotes its boundary.
We will use the notations f_x = ∂f/∂x, f_y = ∂f/∂y.
Consider a general energy of the form
J(f) = ∫_Ω e(x, y, f, f_x, f_y) dxdy,   (19)
where e is a given expression of the arguments (x, y, f, f_x, f_y). We minimize J over a set V of admissible functions; we do not enter into the details of defining V: V will be a normed vector space over R (a subspace of L²(Ω)), and we will have f ∈ V iff J(f) < ∞. The minimization problem is
min_{f∈V} J(f).
We assume that e is differentiable in all variables (or that e can be approximated by a differentiable function). The corresponding first order optimality condition (Euler-Lagrange equation) that must be satisfied by a minimizer f of J can be expressed as
∂e/∂f = ∂/∂x (∂e/∂f_x) + ∂/∂y (∂e/∂f_y)   ⇔   ∂J(f) = 0
inside Ω, where ∂J(f) = ∂e/∂f − ∂/∂x (∂e/∂f_x) − ∂/∂y (∂e/∂f_y), together with boundary conditions.
We show next the derivation of the Euler-Lagrange equation in two cases.
• Quadratic Regularization (denoising only)
Let
E(f) = ∫_Ω (f(x, y) − g(x, y))² dxdy + ∫_Ω |∇f|² dxdy = ∫_Ω (f − g)² dxdy + ∫_Ω [(f_x)² + (f_y)²] dxdy.
Define
V = {v ∈ L²(Ω) : E(v) < ∞}.
If f is a minimizer of E(f), then we must have E(f) ≤ E(f + εv), for all real numbers ε and all other functions v ∈ V. Denote s(ε) = E(f + εv); therefore we must have s(0) ≤ s(ε) for all ε, thus we must have s′(0) = 0, for all test functions v ∈ V.
We first compute
s(ε) = E(f + εv) = ∫_Ω (f(x, y) + εv(x, y) − g(x, y))² dxdy + ∫_Ω [((f + εv)_x)² + ((f + εv)_y)²] dxdy
= ∫_Ω (f(x, y) + εv(x, y) − g(x, y))² dxdy + ∫_Ω [(f_x + εv_x)² + (f_y + εv_y)²] dxdy.
We compute s′(ε):
s′(ε) = 2 ∫_Ω (f(x, y) + εv(x, y) − g(x, y)) v(x, y) dxdy + ∫_Ω [2(f_x + εv_x)v_x + 2(f_y + εv_y)v_y] dxdy.
We take ε = 0:
s′(0) = 2 ∫_Ω (f(x, y) − g(x, y)) v(x, y) dxdy + 2 ∫_Ω [f_x v_x + f_y v_y] dxdy = 0 for all v ∈ V.
Integrating by parts (with the Neumann boundary condition ∂f/∂n = 0 on ∂Ω), this gives ∫_Ω (f − g − ∆f) v dxdy = 0 for all v ∈ V, thus f = g + ∆f in Ω, recovering equation (11).
Computation of the adjoint H^T of H
For the convolution term, we assume that all image functions f, g, h are defined on the entire plane (by extension) and belong to L². The space L² is an inner product vector space associated with the product ⟨f, g⟩ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) g(x, y) dxdy, for any functions f and g in L². Also, H : L² → L² is the linear convolution operator defined by
Hf(x, y) = h ∗ f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(α, β) f(x − α, y − β) dα dβ.
By the definition of the adjoint, we must find another linear operator H^T such that
⟨Hf, g⟩ = ⟨f, H^T g⟩
for all functions f, g ∈ L². We have
⟨Hf, g⟩ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (h ∗ f)(x, y) g(x, y) dxdy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} [∫_{−∞}^{∞} ∫_{−∞}^{∞} h(α, β) f(x − α, y − β) dα dβ] g(x, y) dxdy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} [∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x − α, y − β) g(x, y) dxdy] h(α, β) dα dβ.
By the change of variables X = x − α, Y = y − β in the inner integral, this equals
∫_{−∞}^{∞} ∫_{−∞}^{∞} f(X, Y) [∫_{−∞}^{∞} ∫_{−∞}^{∞} h(α, β) g(X + α, Y + β) dα dβ] dX dY = ⟨f, h̃ ∗ g⟩,
thus H^T g = h̃ ∗ g, with h̃(x, y) = h(−x, −y).
6 Image Segmentation
Segmentation is a mid-level vision task, where the input is a pre-processed image f (sharp, without noise). The output could be an image g, or could no longer be an image but image attributes:
set of points representing the edges of f , boundaries of objects defined by implicit curves or by
sets of points, etc. Segmentation is the essential step before image analysis and recognition. Seg-
mentation can also be defined as the process of partitioning the input image f into its constituent
objects or regions.
The segmentation can be based on several different criteria:
- It can be based on discontinuity: partition an image based on abrupt changes in intensity
(edges)
- It can be based on similarity: partition an image into regions that are similar according to a
set of predefined rules (based on color, texture, etc).
6.1 The gradient edge detector
The image gradient ∇f = (∂f/∂x, ∂f/∂y) can be approximated by central finite differences, computed by the following two masks w in x and y, respectively:
[0 −1 0; 0 0 0; 0 1 0] in x,   [0 0 0; −1 0 1; 0 0 0] in y.
In image processing, since the input image f may still be noisy, a smoothed version of these
approximations is used, called the Sobel gradient operators, computed by the following two masks
w in x and y respectively:
[−1 −2 −1; 0 0 0; 1 2 1] in x,   [−1 0 1; −2 0 2; −1 0 1] in y.
Note that all these four masks are sharpening masks, since the sum of their coefficients is zero.
The gradient magnitude can be used as an edge detector. Indeed, in the areas where the image
f does not vary too much, the gradient magnitude |∇f | is small (close to zero); on the contrary,
in the areas where there are strong variations (at edges), the gradient magnitude |∇f | is large. We
define the output image g(x, y) = |∇f |(x, y) (it will show white edges over black background), or
a thresholded version of |∇f |.
Assuming that |∇f| is approximated by some of the above discrete operators, we define
g(x, y) = |∇f|(x, y) (discrete version),
followed by rescaling, or a thresholded gradient map:
g(x, y) = 255 if |∇f|(x, y) ≥ T, and g(x, y) = 0 if |∇f|(x, y) < T,
for a tolerance (threshold) T.
Note that the operation f ↦ g = |∇f| is nonlinear.
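A Matlab sketch of the thresholded Sobel gradient map (the threshold value is our choice):

% Gradient-based edge detection with the Sobel masks.
f = double(imread('cameraman.tif'));
wx = [-1 -2 -1; 0 0 0; 1 2 1];             % Sobel mask in x
wy = wx';                                  % Sobel mask in y
gmag = sqrt(conv2(f, wx, 'same').^2 + conv2(f, wy, 'same').^2);  % |grad f|
T = 0.3 * max(gmag(:));                    % tolerance T
g = 255 * (gmag >= T);                     % white edges over black background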
6.3 Boundary detection by curve evolution and active contours
We assume that we are given an image f : Ω → R that contains several objects forming the
foreground, and a background. Curve evolution can be used to detect boundaries of objects in
images, as follows:
(i) Start the process with an initial curve;
(ii) Move or deform the curve according to some criterion;
(iii) The curve has to stop on boundaries of objects.
Implicit Representation. Another way to represent a curve C which is the boundary of an open domain is as an isoline of a Lipschitz continuous function φ. Such a representation, called implicit representation or level set representation [10], allows for automatic changes of topology, and discretization on a fixed regular (rectangular) grid. On the other hand, the method is computationally more expensive, and it does not directly allow the representation of open curves. This is the approach that we will consider.
Let φ : Ω → R be a Lipschitz continuous function, and define C = {(x, y) ∈ Ω : φ(x, y) = 0}.
In other words, C is the zero isoline of the function φ. Note that here C can be made of several
connected components.
As an example, consider a circle C centered at (x_0, y_0) and of radius r; we can define
φ(x, y) = r − √((x − x_0)² + (y − y_0)²).
Then
φ(x, y) = 0 if (x, y) ∈ C (the circle);
φ(x, y) < 0 if (x, y) is outside the circle C;
φ(x, y) > 0 if (x, y) is inside the circle C.
We notice that for a general curve C which is the boundary of an open domain, we can define φ as the signed distance function to C:
φ(x, y) = dist((x, y), C) if (x, y) is inside C; −dist((x, y), C) if (x, y) is outside C; 0 if (x, y) ∈ C.
Note that, under some assumptions, the (signed) distance function to C satisfies the eikonal equa-
tion: |∇φ| = 1 over Ω in the sense of viscosity solutions.
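A Matlab sketch of the implicit representation of a circle, with a numerical check of the eikonal property |∇φ| = 1 (grid and radius are arbitrary):

% Signed distance function of a circle, positive inside and negative outside.
[X, Y] = meshgrid(linspace(-2, 2, 201));
x0 = 0;  y0 = 0;  r = 1;
phi = r - sqrt((X - x0).^2 + (Y - y0).^2);
[phix, phiy] = gradient(phi, 4/200);       % grid spacing 4/200
gradnorm = sqrt(phix.^2 + phiy.^2);        % approximately 1 away from the center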
Using the differentiable level set function φ, we can define several intrinsic geometric quantities of the curve C:
• Unit normal to the curve: N⃗ = (N_1, N_2) = −∇φ/|∇φ|.
• Curvature at (x, y) ∈ C: K(x, y) = ∂/∂x(φ_x/|∇φ|) + ∂/∂y(φ_y/|∇φ|) = (φ_xx φ_y² − 2φ_x φ_y φ_xy + φ_yy φ_x²) / (φ_x² + φ_y²)^{3/2}.
• Heaviside function: H(φ) = 1 if φ ≥ 0, H(φ) = 0 if φ < 0, and the delta function δ(φ) = H′(φ) in the weak sense.
• Area enclosed by the curve: A{φ ≥ 0} = ∫_Ω H(φ(x, y)) dxdy.
• Length of the curve: L{φ(x, y) = 0} = ∫_Ω |∇H(φ(x, y))|.
• If f : Ω → R is the given image over Ω, then we can define the means of f over the regions {φ ≥ 0} and {φ ≤ 0}:
c_1 = ∫_Ω f(x, y) H(φ(x, y)) dxdy / ∫_Ω H(φ(x, y)) dxdy,   c_2 = ∫_Ω f(x, y)(1 − H(φ(x, y))) dxdy / ∫_Ω (1 − H(φ(x, y))) dxdy.
For curve evolution, we need to work with a family of curves C(t), over time t ≥ 0. This
family of curves is represented implicitly, as isolines of the function φ(x, y, t), (x, y) ∈ Ω, t ≥ 0,
such that
C(t) = {(x, y) ∈ Ω : φ(x, y, t) = 0}.
Assume that C(t) = (x(t, s), y(t, s)) moves in the normal direction, according to the equation
C′(t) = F N⃗,
where F is a scalar speed function. Assume also that C(t) is implicitly represented by φ: thus, φ(x(t, s), y(t, s), t) = 0. Differentiating with respect to t, we obtain
(d/dt) φ(x(t, s), y(t, s), t) = 0,
or
∂φ/∂t + (∂φ/∂x)(∂x/∂t) + (∂φ/∂y)(∂y/∂t) = 0,
but since C′(t) = (∂x/∂t, ∂y/∂t) = F N⃗, we obtain
∂φ/∂t + N_1 F ∂φ/∂x + N_2 F ∂φ/∂y = 0,
or
∂φ/∂t + F ∇φ · N⃗ = 0.
Substituting N⃗ by −∇φ/|∇φ|, we finally obtain
∂φ/∂t = F |∇φ|,
which is the main level set equation. The speed F can depend on (x, y), on the curvature K(x, y),
could be a constant, or could depend on the image f to be segmented.
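As a basic illustration, a Matlab sketch of motion by curvature, F = K, using the curvature formula above; note that K|∇φ| = (φ_xx φ_y² − 2φ_x φ_y φ_xy + φ_yy φ_x²)/(φ_x² + φ_y²), so no 3/2 power is needed (the initial shape, time step and small ep are our choices):

% Level set evolution phi_t = K |grad phi| (motion by curvature).
[X, Y] = meshgrid(linspace(-2, 2, 100));
phi = 1 - max(abs(X), abs(Y));             % initial zero level set: a square
[M, N] = size(phi);
ip = [2:M, M];  im = [1, 1:M-1];           % shifted indices (replicated at the border)
jp = [2:N, N];  jm = [1, 1:N-1];
dt = 0.05;  ep = 1e-8;
for it = 1:400
    px  = (phi(ip, :) - phi(im, :)) / 2;   % central differences (grid spacing 1)
    py  = (phi(:, jp) - phi(:, jm)) / 2;
    pxx = phi(ip, :) - 2 * phi + phi(im, :);
    pyy = phi(:, jp) - 2 * phi + phi(:, jm);
    pxy = (phi(ip, jp) + phi(im, jm) - phi(ip, jm) - phi(im, jp)) / 4;
    num = pxx .* py.^2 - 2 * px .* py .* pxy + pyy .* px.^2;
    phi = phi + dt * num ./ (px.^2 + py.^2 + ep);   % phi_t = K |grad phi|
end
% the square shrinks and rounds off as t increases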
Active contours for boundary detection. Usually models are based on two concepts: (i) regularization of the curve (such as motion by curvature) and (ii) a stopping criterion.
Edge-based models. Let g(x, y) = 1 / (1 + |∇(G_σ ∗ f)(x, y)|²), where G_σ is a Gaussian kernel. This is an "edge-function": it is close to zero (small) near edges (where the gradient magnitude is high) and large away from edges. Thus such models segment objects from the background based on discontinuities in the image f.
• Model proposed by Caselles et al. in [3]:
φ(x, y, 0) = φ_0(x, y),   ∂φ/∂t = g(x, y) K(x, y) |∇φ|.   (24)
Here the speed function is F = g(x, y)K(x, y). If g(x, y) > 0 (away from edges), the curve moves with a speed proportional to the mean curvature. If g(x, y) ≈ 0 (on an edge), F ≈ 0, thus the curve stops on boundaries of objects (edges). This nonlinear partial differential equation does not come from an energy minimization.
• Geodesic active contours (Caselles et al. [4], Kichenassamy et al. [9]): find a curve C that minimizes the weighted length term
min_C ∫_C g(C) ds.
By this method, a minimizer C should overlap with edges in the image f, since the energy is minimal when g(C) ≈ 0. In the level set implicit representation, we define C(t) = {(x, y) : φ(x, y, t) = 0}, thus the model becomes
min_φ { J(φ) = ∫_Ω g(x, y) |∇H(φ(x, y))| dxdy }.
The first order extremality condition associated with the minimization, combined with gradient descent ∂φ/∂t = −∂J(φ), gives
φ(x, y, 0) = φ_0(x, y),   ∂φ/∂t = δ(φ) [ ∂/∂x( (g(x, y)/|∇φ|) ∂φ/∂x ) + ∂/∂y( (g(x, y)/|∇φ|) ∂φ/∂y ) ].
It is possible to extend the motion to all level lines of φ (even if we are interested only in the zero-level line), substituting δ(φ) by |∇φ|, to obtain the more geometric motion
φ(x, y, 0) = φ_0(x, y),   ∂φ/∂t = |∇φ| [ ∂/∂x( (g(x, y)/|∇φ|) ∂φ/∂x ) + ∂/∂y( (g(x, y)/|∇φ|) ∂φ/∂y ) ].   (25)
We notice that the difference between models (24) and (25) is that the stopping edge-function
g is outside the curvature term in (24), while it is inside the curvature term in (25). Model (25) has
a better attraction of the curve C towards edges.
For both of these models, the initial curve defined by φ0 (x, y) = 0 has to enclose the objects to
be detected. Also, these models have more difficulty in segmenting strongly concave shapes.
Region-based models. This model [5, 6] is based on the partitioning of the input image f into "objects" and "background" based on pixel intensity similarity, and no longer on discontinuities. The following energy is minimized:
min_{C, c_1, c_2} E(C, c_1, c_2) = ∫_{inside(C)} (f − c_1)² dxdy + ∫_{outside(C)} (f − c_2)² dxdy + λ Length(C).   (26)
We assume that the foreground can be described by the region where f(x, y) ≈ constant c_1, and the background can be described by the region where f(x, y) ≈ constant c_2, with c_1 ≠ c_2.
Thus the first two terms form the stopping (segmentation) criterion. The last term is a regular-
ization (penalty on the length of the curve).
In the implicit representation, model (26) can be expressed as
min_{φ, c_1, c_2} E(φ, c_1, c_2) = ∫_Ω (f − c_1)² H(φ) dxdy + ∫_Ω (f − c_2)² (1 − H(φ)) dxdy + λ ∫_Ω |∇H(φ)|.   (27)
Minimizing the energy E(φ, c_1, c_2) with respect to its variables, we obtain the associated extremality conditions
c_1 = ∫_Ω f(x, y) H(φ) dxdy / ∫_Ω H(φ) dxdy,   c_2 = ∫_Ω f(x, y)(1 − H(φ)) dxdy / ∫_Ω (1 − H(φ)) dxdy,
and for the minimization in φ we use gradient descent, ∂φ/∂t = −∂E/∂φ, or, after simplifications,
φ(x, y, 0) = φ_0(x, y),   ∂φ/∂t = δ(φ) [ −(f − c_1)² + (f − c_2)² + λ K(x, y) ].
This model is less dependent on the choice of the initial curve (this does not have to enclose
the objects). Also, interior contours and strongly concave shapes are automatically detected.
For more details we refer to [5, 6].
References
[1] A. Buades, B. Coll, J.M. Morel, A non-local algorithm for image denoising, IEEE Computer Vision and Pattern Recognition 2005, Vol. 2, pp. 60-65, 2005.
[2] A. Buades, B. Coll, J.M. Morel, A review of image denoising methods, with a new one, Multiscale Modeling and Simulation, Vol. 4 (2), pp. 490-530, 2006.
[3] V. Caselles, F. Catte, T. Coll, F. Dibos, A geometric model for active contours, Numerische Mathematik, 66, pp. 1-31, 1993.
[4] V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, International Journal of Computer Vision, 22 (1), pp. 61-79, 1997.
[5] T. Chan and L. Vese, An active contour model without edges, Scale-Space Theories in Computer Vision, Lecture Notes in Computer Science, 1682:141-151, 1999.
[6] T.F. Chan and L.A. Vese, Active contours without edges, IEEE Transactions on Image Processing, 10 (2), Feb. 2001, pp. 266-277.
[7] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 3rd edition, Prentice-Hall, 2008.
[8] R.C. Gonzalez, R.E. Woods and S.L. Eddins, Digital Image Processing Using Matlab, 2nd edition, Prentice-Hall, 2010.
[9] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, A. Yezzi, Conformal curvature flows: From phase transitions to active vision, Archive for Rational Mechanics and Analysis, 134 (3), pp. 275-301, 1996.
[10] S. Osher and J.A. Sethian, Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations, J. Comput. Phys., 79, pp. 12-49, 1988.
[11] L. Rudin, S. Osher, E. Fatemi, Nonlinear Total Variation Based Noise Removal Algorithms, Physica D: Nonlinear Phenomena, Vol. 60, Issues 1-4, pp. 259-268, 1992.
[12] L. Rudin and S. Osher, Total Variation Based Image Restoration with Free Local Constraints, ICIP (1) 1994, pp. 31-35.