
CIS 769

DIGITAL IMAGE PROCESSING

Ahmet M. Eskicioglu
Department of CIS
Brooklyn College

1
OVERVIEW

‰ Classification of graphic files


‰ What is a digital image?
‰ What is digital image processing?
‰ Image enhancement
‰ Image restoration
‰ Color image processing
‰ Image compression
‰ Morphological image processing
‰ Image segmentation
‰ Representation and description
‰ Object recognition

2
TEXTBOOKS

‰ Required Textbook
 R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall,
2002.
‰ Recommended Textbooks
 W. K. Pratt, Digital Image Processing, John Wiley and Sons, 2001.
 T. Bose, Digital Signal and Image Processing, John Wiley and Sons,
2004.
 I. J. Cox, M. L. Miller and J. A. Bloom, Digital Watermarking, Morgan
Kaufmann Publishers, 2001.
 F. Halsall, Multimedia Communications, Addison-Wesley, 2001.
 K. R. Rao, Zoran S. Bojkovic, Dragorad A. Milovanovic, Multimedia
Communication Systems, Prentice Hall, 2002.
 R. Steinmetz, K. Nahrstedt, Multimedia Fundamentals, Prentice Hall,
2002.
 M. Arnold, M. Schmucker, S. D. Wolthusen, Techniques and Applications
of Digital Watermarking and Content Protection, Artech House, 2003.
 Mark S. Drew, Ze-Nian Li, Fundamentals of Multimedia, Prentice Hall,
2004.

3
GRAPHIC FILES

‰ Graphic file types


 indexed system vs. RGB color
 bit-mapped vs. object-oriented
‰ Indexed system
 Associates one number with each color in the palette.
 Example: GIF format (8 bits per pixel, indexing a palette of up to 256 colors)
‰ RGB color system
 3 numbers associated with each color: R, G, B (primary colors)
 The color obtained is a mix of the 3 primary colors.
 Total number of possible colors: 256x256x256 (about 16.7 million)
 Example: JPEG
‰ Bit-mapped images (raster graphics)
 Images are created or captured as a set of samples.
 Each image is treated as a set of pixels each with a color.
 The information for each pixel is stored.
 Drawing and editing affects the pixels directly.
 A raster is a grid of x and y coordinates on a display space.
 Standardized format used in PC and Mac platforms.
 Examples: JPEG, GIF
‰ Object-oriented images (vector graphics)
 Creation of images through a sequence of commands or mathematical statements that place lines
and shapes in a given two- or three-dimensional space.
 The computer maintains a list of objects and their placements in space.
 A vector is a representation of both a quantity and a direction.
 A vector graphics file is created and saved as a sequence of vector statements.
 Objects can be moved around and edited.
 When an object-oriented image is printed, the description of the object is sent to the
processor in the printer which renders it in the printer’s resolution.
 One size fits all.
 Example: PostScript

4
EXAMPLE: BIT-MAPPED VS. OBJECT-ORIENTED

5
WHAT IS A DIGITAL IMAGE?

‰ A digital image can be defined as a function of two
variables, f(x,y), where x and y are spatial coordinates, and
the amplitude of f at a given pair of coordinates is called the
intensity of the image at that point. Every digital image is
composed of a finite number of elements, called pixels,
each with a particular location and a finite value.

6
WHAT IS DIGITAL IMAGE PROCESSING?

‰ Digital image processing: analysis, manipulation,


storage, and display of graphical images from sources
such as photographs, drawings, and video.
‰ The process comprises 3 steps
 Input: the image is either directly captured with a digital
camera or an analog image is digitized in a scanner to obtain
the binary data a computer can understand.
 Processing: may include any process including image
enhancement, restoration, compression, segmentation, or
watermarking as well as extraction of image features.
 Output: consists of the display/printing of the processed
image, or the features extracted from the input image.

7
IMAGE ENHANCEMENT

‰ The main objective of image enhancement is to process


an image to make it more suitable for a specific
application.
 For example, a method that is useful for enhancing X-ray
images may not be the best method for enhancing pictures
of Mars transmitted by a space probe (i.e., a satellite or other
spacecraft that is designed to explore the solar system and
transmit data back to earth).
‰ Two broad categories:
 Spatial domain methods: based on direct manipulation of
pixels in an image.
 Frequency domain methods: based on modifying the
Fourier transform of an image.
‰ Smoothing filters and sharpening filters.

8
A SPATIAL DOMAIN EXAMPLE

9
A FREQUENCY DOMAIN EXAMPLE

Filtering in the frequency domain:

1. Multiply f(x,y) by (-1)^(x+y)
2. Compute F(u,v)
3. Multiply F(u,v) by H(u,v)
4. Compute the inverse DFT
5. Take the real part
6. Multiply the result by (-1)^(x+y)

10
IMAGE RESTORATION

‰ Image enhancement: largely a subjective process.


 contrast stretching is considered an enhancement
technique.
‰ Image restoration: primarily an objective process.
 removal of image blur is considered a restoration technique.
‰ Spatial domain methods and frequency domain methods.

11
A SPATIAL DOMAIN EXAMPLE

12
COLOR IMAGE PROCESSING

‰ The colors that humans (and some other animals)
perceive in an object are determined by the nature of the
light reflected from the object.
‰ Two major areas:
 Full color: Image is acquired with a full-color sensor (e.g., a
color TV camera)
 Pseudo color: a color is assigned to a particular
monochrome intensity or range of intensities.
‰ A color model (color space, color system) is a
specification of a coordinate system and a subspace
within that system where each color is represented by a
point.

13
RGB MODEL

14
PSEUDO COLOR

15
IMAGE COMPRESSION

‰ Lossless compression
 Huffman coding
Symbol Probability Code
a2 0.4 1
a6 0.3 00
a1 0.1 011
a4 0.1 0100
a3 0.06 01010
a5 0.04 01011
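As a rough illustration of how such a code can be derived, here is a minimal Python sketch of the standard Huffman construction (repeatedly merge the two least probable nodes). The symbols and probabilities come from the table above; tie-breaking among equal probabilities is arbitrary, so individual codewords may differ from the table while the average length (2.2 bits/symbol) matches.

```python
import heapq, itertools

def huffman(probs):
    """Build Huffman codes by repeatedly merging the two least
    probable nodes; returns {symbol: bitstring}."""
    tie = itertools.count()                     # tie-breaker for equal probabilities
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
codes = huffman(probs)
print(codes)
print(sum(p * len(codes[s]) for s, p in probs.items()))   # 2.2 bits/symbol on average
```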

‰ Lossy compression
 JPEG
 JPEG 2000

16
IMAGE COMPRESSION

JPEG: based on Discrete Cosine Transform

10:1 30:1 50:1

JPEG 2000: based on discrete wavelet transform

10:1 30:1 50:1

17
MORPHOLOGICAL IMAGE PROCESSING

‰ Morphology: the study of the form and structure of


organisms.
‰ Mathematical morphology: extracting image
components that are useful in the representation and
description of region shape.
 Morphological operations for binary images
 Morphological operations for gray scale images
‰ An example of a basic morphological algorithm for
binary images is boundary extraction.
‰ 2 fundamental operations: dilation and erosion

18
IMAGE SEGMENTATION

‰ Segmentation subdivides an image into its constituent


regions.
‰ Two categories of segmentation algorithms:
 partitioning an image based on abrupt changes in intensity
(e.g., line detection)
 partitioning an image into regions that are similar according
to a set of criteria

19
LINE DETECTION

Suppose we are interested in finding all the lines in the wire-bond mask
that are one pixel thick and are oriented at -45°. Which one of the above
masks do we use?

Figure: strongest response of the mask; thresholding using the max
value in the image.

20
REGION GROWING

Region growing is a procedure that groups pixels or subregions into


larger regions based on predefined criteria: start with a set of “seed”
points and grow regions by appending to each seed those neighboring
pixels that have properties similar to the seed.

21
IMAGE WATERMARKING

‰ Watermarking (data hiding) is the process of embedding


data into a multimedia element such as image, audio or
video.
‰ A watermarking algorithm consists of:
 the watermark structure
 an embedding algorithm
 an extraction (detection) algorithm
‰ Watermarks can be embedded in the pixel domain or the
transform domain.
‰ Desired attributes
 invisibility
 robustness
 high data capacity
‰ Attacks and normal A/V processes: noise, filtering
(blurring, sharpening, etc.), resampling, scaling, rotation,
cropping and lossy compression.

22
EXAMPLE IN THE DCT DOMAIN

W: watermark to be embedded.
X: sequence of pixel values.
X_D and Y_D: row-concatenated DCT coefficients of X and Y.
a = scaling factor: determines the intensity of the watermark.

Y_D(i) = X_D(i) (1 + a W(i))

W*: extracted version of the watermark.
Z: possibly forged watermarked image.

W*(i) = (1/a) (Z_D(i)/X_D(i) - 1)   ==>   S(W, W*) = (W · W*) / sqrt(W* · W*)

T = user-defined threshold. If S > T, the image is authentic.
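A minimal NumPy sketch of these formulas. The coefficient vector XD below is randomly generated as a stand-in for the row-concatenated DCT coefficients (in practice it would come from a 2-D DCT of the image), and the threshold value is illustrative.

```python
import numpy as np

def embed(XD, W, a=0.1):
    """Y_D(i) = X_D(i) * (1 + a * W(i))"""
    return XD * (1.0 + a * W)

def extract(ZD, XD, a=0.1):
    """W*(i) = (Z_D(i)/X_D(i) - 1) / a"""
    return (ZD / XD - 1.0) / a

def similarity(W, Wstar):
    """S(W, W*) = (W . W*) / sqrt(W* . W*)"""
    return W @ Wstar / np.sqrt(Wstar @ Wstar)

rng = np.random.default_rng(0)
XD = rng.normal(10.0, 2.0, 1000)     # stand-in for DCT coefficients of X
W = rng.normal(0.0, 1.0, 1000)       # watermark sequence
ZD = embed(XD, W)                    # here Z is the unmodified watermarked image
S = similarity(W, extract(ZD, XD))
print(S, S > 6.0)                    # compare S against a user-defined threshold T
```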

23
SCALING FACTOR a = 0.1, 0.5, 1.0, 5.0

Original image

a = 0.1 a = 0.5 a = 1.0 a = 5.0

24
CHAPTER 1: INTRODUCTION

‰ Interest in DIP methods stems from 2 principal


applications:
 Improvement of pictorial information for human interpretation
 Processing of image data for storage, transmission, and
representation for autonomous machine perception.
‰ What is a digital image?
 f(x,y), x&y: spatial plane coordinates, f(x,y): intensity
 x,y, and f(x,y): finite, discrete quantities
 picture elements, image elements, pels, pixels.
 Humans: limited to the visual band of EM spectrum.
 Imaging machines: cover the entire spectrum.
 Scope of image processing
„ Both I and O are images?
„ Includes computer vision?
„ Image analysis is between image processing and computer vision
 Low-level (I&O: images) process, mid-level (I: image, O:
extracted attributes) process, high-level process.

25
BARTLANE CABLE PICTURE
TRANSMISSION SYSTEM

‰ Early 1920s: Bartlane cable picture transmission system


 Reduced the time required to transport a picture across the
Atlantic from more than a week to less than three hours.
 Pictures were coded for cable transmission and then
reconstructed at the receiving end on a telegraph printer fitted
with type faces simulating a halftone pattern.
 The early Bartlane systems were capable of coding images in
five distinct brightness levels. This was increased to fifteen
levels in 1929.

26
MORE TRANSATLANTIC TRANSMISSIONS
IN THE 1920s

27
OTHER HISTORICAL DEVELOPMENTS

‰ 1964: Pictures of the moon transmitted by Ranger 7 were


processed by a computer at the Jet Propulsion Lab to
correct various types of image distortion inherent in the
on-board TV camera.
‰ Experience with Ranger 7 led to an improvement in image
enhancement and restoration in other projects.
 Surveyor missions to the moon
 Mariner series of flyby missions to Mars
 Apollo manned flights to the moon

28
APPLICATIONS OF IMAGE PROCESSING

‰ Early 1970’s: Invention of computerized axial tomography


‰ Applications of image processing
 For human interpretation
„ Geography
„ Archeology.
„ Physics
„ Astronomy
„ Biology
„ Nuclear medicine
„ Law enforcement
„ Defense
 For machine perception
„ Automatic character recognition
„ Industrial machine vision for product assembly and inspection
„ Military reconnaissance
„ Automatic processing of fingerprints
„ Screening of X-rays and blood samples
„ Machine processing of aerial and satellite imagery for weather
prediction and environment assessment
29
ENERGY SOURCES FOR IMAGES

‰ Electromagnetic energy spectrum: Principal source of


energy
 The full range of frequencies, from radio waves to gamma rays,
that characterizes light.
 Gamma rays: highest energy, shortest wavelength
 Radio waves: lowest energy, longest wavelength
‰ Electromagnetic radiation can be described in terms of a
stream of photons, each traveling in a wave-like pattern,
moving at the speed of light and carrying some amount of
energy.
‰ The amount of energy a photon has makes it sometimes
behave more like a wave and sometimes more like a particle.
This is called the "wave-particle duality" of light.
‰ Low energy photons (such as radio) behave more like waves,
while higher energy photons (such as X-rays) behave more
like particles.
‰ Other sources of energy: acoustic, ultrasonic, electronic

30
ELECTROMAGNETIC SPECTRUM

31
GAMMA-RAY

Major uses: Nuclear medicine & astronomical observations

Bone scan: locates sites of bone pathology.

PET: another major modality of nuclear imaging. One sample of a
sequence that constitutes a 3-D rendition of the patient.

A star in the constellation Cygnus exploded about 15,000 years ago.
The superheated stationary gas cloud (Cygnus Loop) glows in a
spectacular array of colors.

Gamma radiation from a valve in a nuclear reactor.
32
X-RAY

X-rays are among the oldest sources of EM radiation used for imaging.

Major uses: Medical diagnostics, astronomy, angiography, CAT scans

Chest X-ray.
X-ray image of an electronic circuit board.
Aortic angiogram.
Head CAT slice.
Cygnus Loop imaged in the X-ray band.

33
ULTRAVIOLET

Major uses: lithography, industrial inspection, microscopy, lasers, biological imaging,


astronomical observations.

Fluorescence is a phenomenon discovered in the middle of the 19th century.

Fluorescence microscopy is an excellent method for studying materials that can be made to
fluoresce (either in their natural form or when treated with chemicals capable of fluorescing).

Fluorescence microscope image of normal corn.

Corn infected by “smut,” a disease of cereals, corn, etc.

Cygnus Loop imaged in the high-energy region of the ultraviolet band.

34
VISIBLE AND INFRARED – LIGHT MICROSCOPE

Major uses: light microscopy, astronomy, remote sensing, industry, law enforcement.
Cholesterol

Taxol (anticancer agent) Microprocessor

Nickel oxide thin film Surface of audio CD

Organic superconductor

35
VISIBLE AND INFRARED – REMOTE SENSING

Remote sensing usually includes several bands in the visual and
infrared regions.

LANDSAT obtains and transmits images of the Earth from space, for
purposes of monitoring environmental conditions.

Multispectral imaging: one image for each band in the above table.

The differences between visual and infrared image features are quite
noticeable.

36
VISIBLE AND INFRARED – WEATHER
OBSERVATION AND PREDICTION

An image of a hurricane taken by a satellite using sensors in the
visible and infrared bands.

37
INFRARED – HUMAN SETTLEMENTS
(THE AMERICAS)

These images are part of the Nighttime Lights of the World data set,
which provides a global inventory of human settlements.

38
INFRARED – HUMAN SETTLEMENTS
(OTHER PARTS OF THE WORLD)

39
VISIBLE – AUTOMATED VISUAL INSPECTION OF
MANUFACTURED GOODS

Pill container.
A circuit board controller (the black square is a missing component).
A clear plastic part with an unacceptable number of air pockets.
A bottle that is not filled up to an acceptable level.
An intraocular implant (replacement lens for the human eye).
A batch of cereal inspected for color and the presence of anomalies
such as burned flakes.

40
VISIBLE – OTHER APPLICATIONS

Paper
currency
A thumb
print

Automated
license plate
reading

41
MICROWAVE – IMAGING RADAR

Imaging radar

• has the unique ability to collect data over virtually any region at
any time, regardless of weather or ambient lighting conditions.

• works like a flash camera with its own illumination.

• uses an antenna and digital computer processing to record images.

Lhasa River: note the clarity and detail of the image, unencumbered
by clouds or other atmospheric conditions that normally interfere
with images in the visual band!

42
RADIO – MRI

Major uses: medicine and astronomy

Magnetic resonance imaging (MRI):

• places a patient in a powerful magnet and passes radio


waves through the body in short pulses.

• each pulse causes a responding pulse of radio waves to


be emitted by the patient’s tissues.

• produces a 2D picture of a section of the patient.

43
CRAB PULSAR IMAGES

In the summer of 1054 A.D., Chinese astronomers reported that a star in the
constellation of Taurus suddenly became as bright as the full Moon. Fading
slowly, it remained visible for over a year. It is now understood that a spectacular
supernova explosion - the detonation of a massive star whose remains are now
visible as the Crab Nebula - was responsible for the apparition. The core of the
star collapsed to form a rotating neutron star or pulsar, one of the most exotic
objects known to 20th century astronomy. Like a cosmic lighthouse, the rotating
Crab pulsar generates beams of radio, visible, x-ray and gamma-ray energy
which, as the name suggests, produce pulses as they sweep across our view.

Each image gives a totally different view of the Pulsar.

44
OTHER IMAGING MODALITIES:
ACOUSTIC IMAGING

Major uses: geological exploration, industry, medicine

This target is brighter than the


surrounding layers because the
change in density in the target
region is larger.

45
OTHER IMAGING MODALITIES:
ULTRASOUND IMAGING

Ultrasound imaging: millions of HF sound pulses and


echoes are sent and received each second.

46
OTHER IMAGING MODALITIES:
ELECTRON MICROSCOPY

Electron microscopes

• use a focused beam of electrons instead of light.


• are capable of very high magnification (10,000X or more).
• transmission electron microscope (TEM), scanning electron microscope (SEM)

47
OTHER IMAGING MODALITIES:
COMPUTER-GENERATED OBJECTS

Images that are not obtained from physical objects.

48
FUNDAMENTAL STEPS IN
DIGITAL IMAGE PROCESSING

49
WHAT ARE THE FUNDAMENTAL STEPS?

‰ Image acquisition: capturing an image in digital form.


‰ Image enhancement: making an image look better in a subjective
way.
‰ Image restoration: improving the appearance of an image
objectively.
‰ Color image processing: color models and basic color processing
‰ Wavelets and multiresolution processing: Wavelet transform in one
and two dimensions.
‰ Image compression: reducing the stored and transmitted image
data.
‰ Morphological image processing: extracting image components that
are useful in the representation and description of shape.
‰ Image segmentation: partitioning an image into its constituent parts
or objects.
‰ Representation and description: boundary representation vs. region
representation. Boundary descriptors vs. region descriptors.
‰ Recognition: assigning a label to an object based on its descriptors.

50
COMPONENTS OF AN
IMAGE PROCESSING SYSTEM

51
WHAT ARE THE COMPONENTS?

‰ Image sensors and specialized IP HW


 A physical device that is sensitive to the energy radiated by the
object.
 A digitizer that converts the output of the physical sensing
device into digital form.
 HW that performs other primitive operations.
‰ Computer: Can range from a PC to a supercomputer.
‰ Image processing SW: Specialized modules that perform
specific tasks.
‰ Mass storage
 Short term storage: main memory, frame buffers
 On-line storage: magnetic disks and optical disks
 Archival storage: magnetic tapes and optical disks in
“jukeboxes”
‰ Hard copy: all types of printers, film cameras, optical disks
(CDs and DVDs).
‰ Networking: bandwidth is the key consideration.
52
CHAPTER 2: DIGITAL IMAGE FUNDAMENTALS

Structure of the human eye (d ≈ 20mm)

3 membranes:
• cornea & sclera
• choroid
• retina

Cornea: tough, transparent tissue that covers the anterior surface of the eye.

Sclera: opaque membrane.

Choroid: contains a network of blood vessels.

Iris diaphragm: contracts or expands to control the amount of light.

Lens: suspended by fibers attached to the ciliary body (60-70% water, 6% fat,
protein). Absorbs ~8% of the visible light spectrum. Infrared and ultraviolet
light are also absorbed.

Retina: innermost membrane. Two types of receptors: cones and rods.

Cones: 6-7 million. Each connected to its own nerve end. Highly sensitive to
color. Cone vision is called photopic vision.

Rods: 75-150 million. Several are connected to a single nerve end. Serve to
give an overall picture of the field of view. Not involved in color vision.
Rod vision is called scotopic vision.

Blind spot: the area on the retina without receptors that respond to light.

Fovea: circular indentation in the retina (~1.5mm in diameter).
53
CONE AND ROD CELLS

cone cell rod cell

54
DISTRIBUTION OF RODS AND CONES

55
IMAGE FORMATION IN THE EYE

The receptors transform radiant energy into electrical impulses that
are decoded by the brain.

What is the height of the image?

Focal length varies from 14mm to 17mm.

56
BRIGHTNESS ADAPTATION

The range of light intensity levels to which the HVS can adapt is
enormous – on the order of 10^10!

Experimental evidence indicates that subjective brightness is a


logarithmic function of the light intensity.

However, the HVS cannot operate over such a range simultaneously.

Brightness adaptation: the total range of distinct intensity levels that


can be discriminated is rather small.
Brightness adaptation level:
the current sensitivity level
of the HVS for any given set
of conditions.

Short curve represents the


range of subjective
brightness the eye can
perceive at this level.

The transition from scotopic to


photopic vision is gradual (-3 to
-1 mL in log scale).

57
BRIGHTNESS DISCRIMINATION

How does the eye discriminate between changes in light intensity at a


specific adaptation level?

A classical experiment:
A subject looks at a flat, uniformly illuminated area (large enough to occupy the
entire field of view).
The intensity I can be varied.
∆I is added in the form of a short-duration flash that appears as a circle in the
middle.
If ∆I is not bright enough, the subject says “no.”
As ∆I gets stronger, the subject may say “yes.”
When ∆I is strong enough, the subject will say “yes” all the time.
Weber ratio: ∆Ic/ I, ∆Ic: increment of illumination discriminable 50% of the time with
background illumination I.

58
WEBER RATIO AS FUNCTION OF INTENSITY

Poor brightness Good brightness


discrimination: A large discrimination: A small value
value of ∆Ic/ I means a of ∆Ic/ I means a small
large percentage change percentage change in
in intensity is intensity is discriminable.
discriminable.

59
PERCEIVED BRIGHTNESS: NOT A SIMPLE
FUNCTION OF INTENSITY – MACH BANDS

The HVS tends to


undershoot or
overshoot around
the boundary of
regions of
different
intensities.

The intensities of the stripes are constant, but we perceive a
brightness pattern that is strongly scalloped, especially near the
boundaries.

Ernst Mach (1838-1916) described the


phenomenon in 1865.

60
PERCEIVED BRIGHTNESS: NOT A SIMPLE FUNCTION
OF INTENSITY – SIMULTANEOUS CONTRAST

A region’s perceived brightness does not simply depend on its


intensity.

61
OPTICAL ILLUSIONS: OTHER EXAMPLES OF
HUMAN PERCEPTION PHENOMENA

The outline of a square is seen clearly although there are no lines
defining such a figure.

A few lines are sufficient to give the illusion of a complete circle.

Two horizontal line segments have the same length but one appears
shorter than the other.

All lines that are oriented at 45° are equidistant and parallel.

62
LIGHT AND ELECTROMAGNETIC SPECTRUM

‰ Electromagnetic spectrum
 The range of colors we perceive in visible light represents a
very small portion of the spectrum.
 Radio waves with wavelengths billions of times longer.
 Gamma rays with wavelengths billions of times smaller.
‰ Wavelength, frequency and energy
 λ = c/ν, c: speed of light (2.998x10^8 m/s), ν: frequency
 E = hν, h: Planck's constant
‰ Electromagnetic waves can be visualized as
 propagating sinusoidal waves with wavelength λ.
 a stream of massless particles, each traveling in a wavelike
pattern and moving at the speed of light.
„ Each massless particle contains a bundle of energy called a photon.
„ Higher frequency electromagnetic phenomena carry more energy
per photon.
‰ The visible band: 0.43 µm (violet) – 0.79 µm (red)
‰ Six broad color regions: violet, blue, green, yellow, orange, red.

63
ELECTROMAGNETIC SPECTRUM

64
COLOR PERCEPTION

‰ The colors perceived in an object are determined by the


nature of light reflected from the object.
‰ A body that reflects light and is relatively balanced in all
visible wavelengths appears white.
‰ A green object reflects light with wavelengths primarily in the
[500-570] nm range, and absorbs most of the energy at other
wavelengths.
‰ Achromatic or monochromatic light: light that is void of
color.
‰ Three basic quantities to describe the quality of a chromatic
light source
 Radiance: total amount of energy that flows from the light
source - measured in watts (W)
 Luminance: a measure of the amount of energy an observer
perceives from a light source - measured in lumens (lm).
 Brightness: a subjective descriptor of light perception that is
impossible to measure (embodies the achromatic notion of
intensity).
65
IMAGE SENSING AND ACQUISITION

‰ If a sensor can be developed with the capability of detecting


energy radiated by a band of the EM spectrum, we can image
events in that band!
‰ The wavelength of an EM wave required to see an object must be <=
the size of the object!
 A water molecule's diameter is ~10^-10 m.
 To study molecules, we need a source capable of emitting in the
far ultraviolet or soft X-ray region!
‰ Illumination
 Visible light, radar, infrared, X-ray, etc.
 Ultrasound, etc.
‰ Scene elements
 Familiar 3-D objects
 Molecules, buried rock formations, human brain
‰ Illumination energy
 Reflected from objects: light reflected from a planar object
 Transmitted through objects: X-ray through a patient’s body
66
SENSORS

3 principal
sensor
arrangements

Illumination energy is
transformed into
digital images.

67
SINGLE SENSOR

Arrangement used in
high-precision scanning

68
SENSOR STRIPS

Typical arrangement in
most flat bed scanners

Basis for
medical and
industrial CAT

69
SENSOR ARRAYS

The energy from an illumination source is reflected from a scene
element.

The lens projects the viewed scene onto the lens focal plane.

The incoming energy is collected and focused onto an image plane.

The sensor array produces outputs proportional to the integral of
the light received at each sensor.

The digitized image is obtained.

70
A SIMPLE IMAGE FORMATION MODEL

‰ When an image is generated from a physical process, its


values are proportional to energy radiated by a physical
source.
‰ Hence, 0 < f(x,y) < ∞
‰ f(x,y) may be characterized by 2 components:
 Amount of source illumination incident on the scene being
viewed – illumination i(x,y)
 Amount of illumination reflected by the objects in the scene –
reflectance r(x,y)
 The 2 functions combine as a product: f(x,y) = i(x,y)r(x,y)
 0 < i(x,y) < ∞ (theoretical bound)
„ Typical values of i(x,y)
z On a clear day: 90,000 lm/m²
z On a cloudy day: 10,000 lm/m²
z On a clear evening: 0.1 lm/m²

 0 < r(x,y) < 1 (theoretical bound)


„ Typical values of r(x,y)
z 0.01 for black velvet
z 0.65 for stainless steel
z 0.80 for flat-white wall paint

71
GRAY-SCALE IMAGES

‰ (x0,y0) => l = f(x0,y0)

‰ Lmin ≤ l ≤ Lmax

‰ Lmin: positive, Lmax: finite

‰ Lmin= iminrmin

‰ Lmax= imaxrmax

‰ Lmin ≈ 10, Lmax ≈ 1000: typical limits for indoor values

‰ [Lmin, Lmax]: gray scale

‰ [0,L-1], l =0 is black and l = L-1 is white.

‰ Intermediate values: shades of gray

72
SAMPLING AND QUANTIZATION

‰ The output of most sensors is a continuous voltage
waveform.

waveform.
‰ To create a digital image, continuous sensed data should
be converted into digital form.
‰ Two processes: sampling and quantization.
‰ Consider a function f(x,y) representing an image.
‰ f(x,y) is continuous w.r.t.
 x and y coordinates
 Amplitude
‰ Sampling: digitizing the coordinate values
‰ Quantization: digitizing the amplitude values

73
GENERATING A DIGITAL IMAGE

74
IMAGE ACQUISITION WITH A SENSING ARRAY

75
DIGITAL IMAGE REPRESENTATION

An M x N digital image is represented as a matrix of samples whose
corner elements are f(0,0), f(0,N-1), f(M-1,0), and f(M-1,N-1).

L: # of discrete gray levels, L = 2^k

b = M x N x k (# of bits required to store the image)
76
STORAGE REQUIREMENTS

77
SPATIAL AND GRAY LEVEL RESOLUTION

Sampling is performed by deleting rows and columns from the original image.

78
RESAMPLING INTO 1024X1024 PIXELS

Resampling is performed by row and column duplication.

79
256/128/64/32 GRAY LEVELS

In these
images, the #
of samples is
constant but
the # of gray
levels was
reduced from False contouring
256 to 32.

80
16/8/4/2 GRAY LEVELS

In these
images, the #
of samples is
constant but
the # of gray
levels was
reduced from
16 to 2.

81
ISOPREFERENCE CURVES

Huang (1965) varied N and k simultaneously, and attempted to
quantify the effects on image quality:

Sets of these 3 types of images were generated by varying N and k.

Observers were asked to rank the images according to their
subjective quality.

Results were summarized in the form of isopreference curves in the
Nk-plane. Points on an isopreference curve correspond to images of
equal subjective quality.

82
ZOOMING DIGITAL IMAGES

‰ Zooming requires 2 steps:


 The creation of new pixel locations.
 The assignment of gray levels to these new locations.
 Nearest neighbor interpolation
„ A 500x500 image will be enlarged 1.5 times.
„ Lay an imaginary 750x750 grid over the original image.
„ The spacing is less than one pixel because the grid is fitted over a
smaller image.
„ For any point in the overlay, look for the closest pixel in the original
image, and assign its gray level to the new pixel in the grid.
„ When all the new pixels are assigned values, expand the overlay grid
to the original specified size to obtain the zoomed image.
„ Pixel replication is a special case that is applicable when the size of
the image needs to be increased an integer number of times.
„ Nearest neighbor interpolation is fast but it produces a checkerboard
effect!
 Bilinear interpolation - a slightly more sophisticated method (see the sketch below)
„ (x',y'): coordinates of a point in the zoomed image.
„ v(x',y'): gray level assigned to the point.
„ v(x',y') = ax' + by' + cx'y' + d (the 4 coefficients are determined using the 4
nearest neighbors of (x',y')).
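A minimal NumPy sketch of both interpolation rules for zooming a 2-D grayscale array; the function name and the edge handling are illustrative, not prescribed by the slides.

```python
import numpy as np

def zoom(img, factor, bilinear=True):
    """Zoom a 2-D grayscale array: map each new grid point back onto the
    original image, then interpolate (nearest neighbor or bilinear)."""
    M, N = img.shape
    out = np.zeros((int(M * factor), int(N * factor)))
    for xp in range(out.shape[0]):
        for yp in range(out.shape[1]):
            x, y = xp / factor, yp / factor          # position on the original grid
            if not bilinear:                         # nearest neighbor
                out[xp, yp] = img[min(round(x), M - 1), min(round(y), N - 1)]
            else:                                    # v = ax' + by' + cx'y' + d
                x0, y0 = min(int(x), M - 2), min(int(y), N - 2)
                dx, dy = x - x0, y - y0
                out[xp, yp] = ((1 - dx) * (1 - dy) * img[x0, y0]
                               + dx * (1 - dy) * img[x0 + 1, y0]
                               + (1 - dx) * dy * img[x0, y0 + 1]
                               + dx * dy * img[x0 + 1, y0 + 1])
    return out
```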
83
SHRINKING DIGITAL IMAGES

‰ Similar to image zooming.


‰ Shrinking an image an integer number of times
 Pixel replication is replaced by row&column deletion.
‰ Shrinking an image by a non-integer factor
 Expand the grid to fit over the original image.
 Do gray-level interpolation (nearest neighbor or bilinear).
 Shrink the grid back to its original specified size.
‰ It is possible to use more neighbors for interpolation.
 Points are fitted with a more complex surface.
 Generally gives smoother results.
 An important consideration for a number of applications.
„ Image generation for 3D graphics
„ Medical image processing
 The extra computational burden is seldom justifiable for
general-purpose digital image zooming and shrinking.
„ In general, bilinear interpolation is the method of choice.

84
IMAGE ZOOMING: NEAREST NEIGHBOR
INTERPOLATION VS. BILINEAR INTERPOLATION

85
NEIGHBORS OF A PIXEL

4 diagonal neighbors
of p, ND(p)

8-neighbors of p, N8(p) 4-neighbors of p, N4(p)

pixel p at (x,y)

86
ADJACENCY, CONNECTIVITY

‰ V: set of gray level values


‰ 3 types of adjacency
 4-adjacency: 2 pixels p and q with values from V are 4-adjacent
if q is in the set N4(p)
 8-adjacency: 2 pixels p and q with values from V are 8-adjacent
if q is in the set N8(p)
 m-adjacency: 2 pixels p and q with values from V are m-adjacent
if
„ q is in N4(p), or
„ q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values
are from V
‰ A digital path from pixel p with coordinates (x,y) to pixel q
with coordinates (s,t) is a sequence of distinct pixels with
coordinates (x0,y0), (x1,y1), …, (xn,yn), where (x0,y0)= (x,y)
and (xn,yn)=(s,t), and pixels (xi,yi) and (xi-1,yi-1) are adjacent
for 1 ≤ i ≤ n.

87
REGIONS, BOUNDARIES

‰ S: a subset of pixels in an image. Two pixels p and q are


said to be connected in S if there exists a path between
them consisting entirely of pixels in S.
 For any pixel p in S, the set of pixels that are connected to it in
S is called a connected component of S.
 If S has only one connected component, it is called a connected
set.
‰ R: a subset of pixels in an image. R is a region of the
image if R is a connected set.
‰ The boundary of a region R is the set of pixels in the region
that have one or more neighbors that are not in R.

88
DISTANCE MEASURES

‰ For pixels p at (x,y), q at (s,t), and z at (v,w), a distance
function D satisfies:
 D(p,q) ≥ 0 (D(p,q) = 0 iff p = q)
 D(p,q) = D(q,p)
 D(p,z) ≤ D(p,q) + D(q,z)

‰ Euclidean distance between p and q: De(p,q) = [(x-s)² + (y-t)²]^(1/2)

‰ D4 distance: D4(p,q) = |x-s| + |y-t|

‰ D8 distance: D8(p,q) = max (|x-s| , |y-t|)

‰ D4 and D8 distances between p and q are independent of any


paths that might exist between the points.
‰ For m-adjacency, Dm distance between two points is defined as
the shortest m-path between the points.
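A tiny Python sketch of the three distance measures (the function name is illustrative):

```python
def distances(p, q):
    """Euclidean, city-block (D4) and chessboard (D8) distances
    between pixels p = (x, y) and q = (s, t)."""
    (x, y), (s, t) = p, q
    De = ((x - s) ** 2 + (y - t) ** 2) ** 0.5
    D4 = abs(x - s) + abs(y - t)
    D8 = max(abs(x - s), abs(y - t))
    return De, D4, D8

print(distances((0, 0), (3, 4)))   # (5.0, 7, 4)
```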

89
LINEAR AND NONLINEAR OPERATIONS

‰ H: an operator whose I and O are images.


‰ f and g: any two images
‰ a and b: two scalars
‰ H is a linear operator if H(af + bg) = aH(f) + bH(g).
‰ Examples
 Sum of K images: operator is linear.
 Absolute value of the difference of 2 images: operator is not
linear.
‰ Linear operations are very important in image processing
because the theory is well-established.
‰ Nonlinear operations sometimes offer better performance,
but their theory is not as well understood!

90
CHAPTER 3: IMAGE ENHANCEMENT IN THE
SPATIAL DOMAIN

‰ Principal objective: to process an image so that the result


is more suitable than the original image for a specific
application.
‰ There is no general theory of image enhancement.
‰ The viewer is the ultimate judge of how well a particular
method works.
‰ Visual evaluation of image quality is a highly subjective
process!
‰ Spatial domain: aggregate of pixels composing an image.
‰ Spatial domain processes: g(x,y) = T[f(x,y)]

Processed Input
image image

Operator on f defined over


some neighborhood of f(x,y)

91
DEFINING A NEIGHBORHOOD

A square or rectangle is commonly


used in defining the neighborhood
about a point (x,y).

Operator T is applied at each location (x,y)


to produce the output g at that location.

Other neighborhood shapes


(e.g., approximation to a circle)
are sometimes used.

92
GRAY LEVEL TRANSFORMATION FUNCTION

Contrast stretching: produces an image of higher contrast than the
original.

Thresholding function: produces a binary image.

Neighborhood of size 1x1 => T is a gray level transformation function.

Larger neighborhoods => masks (filters)

93
BASIC GRAY LEVEL TRANSFORMATIONS

‰ Among the simplest image enhancement techniques.


‰ s = T(r): T maps a pixel value r into a pixel value s.
‰ Three basic types of transformations
 Linear (negative and identity)
 Logarithmic (log and inverse-log)
 Power law (nth power and nth root)

94
IMAGE NEGATIVES

Shows a small lesion.

s = L - 1 - r
Produces an equivalent of a photographic negative.
Enhances white or gray detail embedded in dark regions.

95
LOG TRANSFORMATIONS

8-bit display
Range of values: [0, 1.5x10^6]

Compresses the dynamic range of images with large variations in
pixel values.

s = c log(1 + r), r ≥ 0
Maps a narrow range of low gray-level values into a wider range.
Maps a wide range of high gray-level values into a narrower range.

96
POWER LAW TRANSFORMATIONS

s = c r^γ   or   s = c (r + ε)^γ,   with c, γ > 0

Curves with γ < 1 map a narrow range of dark input values into a
wider range of output values, and a wide range of light input values
into a narrower range of output values.

Curves with γ > 1 map a wide range of dark input values into a
narrow range of output values, and a narrow range of light input
values into a wider range of output values.
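A minimal NumPy sketch of the three basic transformation families on an 8-bit image; the scaling constants c are chosen here so the output stays in [0, L-1], a common convention rather than something the slides prescribe.

```python
import numpy as np

L = 256
r = np.arange(L, dtype=float)          # one sample of each input gray level

s_negative = (L - 1) - r                          # image negative: s = L - 1 - r
s_log = (L - 1) / np.log(L) * np.log1p(r)         # log transform: s = c log(1 + r)
gamma = 0.4                                       # gamma < 1 expands dark values
s_power = (L - 1) ** (1 - gamma) * r ** gamma     # power law: s = c r^gamma
```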
97
GAMMA CORRECTION

s = r^(1/2.5)

A variety of devices for image capture, printing and display respond
according to a power law.

Such systems tend to produce images that are darker than intended.

98
CONTRAST MANIPULATION

Original image is predominantly dark! An expansion of gray levels
is desired.

s = r^0.6, s = r^0.4, s = r^0.3

99
CONTRAST MANIPULATION

Original image has a washed-out appearance! A compression of gray
levels is desired.

s = r^3.0, s = r^4.0, s = r^5.0

100
PIECEWISE-LINEAR TRANSFORMATIONS

‰ A complementary approach to the previous methods.


‰ Advantage: The form of piecewise functions can be
arbitrarily complex.
‰ Disadvantage: requires considerably more user input.
‰ Contrast stretching
 Increases the dynamic range of gray levels in the image.
‰ Gray-level slicing
 Highlights a specific range of gray levels in the image.
‰ Bit-plane slicing
 Highlights the contribution made by specific bits.
 If each pixel is represented by 8 bits, the image is sliced into 8
planes.

101
CONTRAST STRETCHING

Typical transformation: (r1,s1) and (r2,s2) control the shape of the
function.

r1 = s1 and r2 = s2 => linear function (no change)

r1 = r2, s1 = 0, s2 = L - 1 => thresholding function

r1 ≤ r2 and s1 ≤ s2 => single-valued, monotonically increasing
function

(r1, s1) = (rmin, 0), (r2, s2) = (rmax, L - 1): stretch to the full range

(r1, s1) = (m, 0), (r2, s2) = (m, L - 1): threshold at the mean gray
level m

102
GRAY-LEVEL SLICING

Two approaches:

Display a high level for all


gray levels in the range of
interest and a low value for
all other gray levels.

Brighten the desired range


of gray levels but preserve
the background and gray
level tonalities.

A gray-scale image; transformation (a) applied.

103
BIT-PLANE SLICING

Slicing into 8 bit planes.

Higher-order bits contain the majority of the visually significant
data.

Other bit planes contain subtle details.
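A minimal NumPy sketch of bit-plane slicing for an 8-bit image:

```python
import numpy as np

def bit_planes(img):
    """Slice an 8-bit grayscale image into its 8 bit planes;
    planes[7] is the most significant (visually dominant) plane."""
    return [(img >> k) & 1 for k in range(8)]

img = np.random.default_rng(1).integers(0, 256, (4, 4), dtype=np.uint8)
planes = bit_planes(img)
restored = sum(p.astype(np.uint16) << k for k, p in enumerate(planes))
assert np.array_equal(restored, img)   # the 8 planes reconstruct the image
```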

104
HISTOGRAM PROCESSING

‰ Histogram of a digital image


 h(rk) = nk
 rk : kth gray level
 nk : # of pixels having gray level rk
‰ Normalized histogram
 p(rk) = nk / n
 Σ all components of a normalized histogram = 1.
‰ Histogram manipulation for image enhancement.
‰ Histogram data is also useful in other applications.
 Image compression
 Segmentation
‰ Histograms are simple to calculate in software.

105
FOUR IMAGE TYPES

Compare the four images


and their histograms.

106
HISTOGRAM EQUALIZATION

s_k = Σ_{j=0..k} (n_j / n),  k = 0, 1, 2, ..., L-1

Has the general tendency of spreading the histogram of the input
image.
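A minimal NumPy sketch of this transformation, with the usual extra step of scaling s_k from [0,1] to the display range [0, L-1]:

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: s_k = sum_{j<=k} n_j / n, scaled
    to [0, L-1] for display."""
    hist = np.bincount(img.ravel(), minlength=L)   # n_j for each gray level
    s = np.cumsum(hist) / img.size                 # running sum of n_j / n
    lut = np.round((L - 1) * s).astype(np.uint8)   # gray-level mapping r_k -> s_k
    return lut[img]
```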

107
AN EXAMPLE: LENA

108
HISTOGRAM MATCHING

Histogram
equalization

Is the result
any good?

There are applications for which histogram equalization is not the


best approach.

Sometimes it is useful to specify the shape of the histogram for the


processed image.

Histogram matching: method to generate a processed image that


has a specified histogram.

109
PROCEDURE FOR HISTOGRAM MATCHING

‰ Obtain the histogram of the given image.


‰ Precompute sk for each rk:
 sk = T(rk) = Σ_{j=0..k} pr(rj) = Σ_{j=0..k} (nj/n), k = 0,…,L-1
 nj: # of pixels with gray level rj
 n: total # of pixels in the image
 L: # of discrete gray levels
‰ Obtain the transformation function G from the given histogram:
 vk = G(zk) = Σ_{i=0..k} pz(zi) = sk, k = 0,…,L-1
‰ Precompute zk for each sk using the following iterative scheme:
 G is a hypothetical transformation.
 We need to find the zk that corresponds to each sk.
 zk must satisfy the equation G(zk) = sk, i.e., G(zk) - sk = 0
 Iterate on values of z to satisfy the equation for k = 0,…,L-1
„ Start with zk = ẑ, where ẑ is the smallest integer in [0,L-1],
and increment in integer values
„ For k = k+1, start with ẑ = zk
‰ rk → sk → zk (use the precomputed values for these mappings; a sketch follows below)
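A minimal NumPy sketch of the whole procedure; here the iterative search for z_k is replaced by an equivalent vectorized inversion (smallest z with G(z) ≥ s_k), and target_hist is any specified histogram.

```python
import numpy as np

def match_histogram(img, target_hist, L=256):
    """Histogram matching: r_k -> s_k via the image's CDF T, then
    s_k -> z_k by finding the smallest z with G(z) >= s_k."""
    s = np.cumsum(np.bincount(img.ravel(), minlength=L)) / img.size  # T(r_k)
    G = np.cumsum(target_hist) / np.sum(target_hist)                 # G(z_k)
    z = np.searchsorted(G, s)            # invert G(z_k) = s_k without iterating
    lut = np.clip(z, 0, L - 1).astype(np.uint8)
    return lut[img]
```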
110
SPECIFIED HISTOGRAM

Preserves the general


shape of the original
histogram but has a
smoother transition of
levels in the dark region.

G(z)
G-1(s)

Note that the low end of the


histogram has shifted right
toward the lighter region.

111
MAPPING FROM rk TO zk

112
LOCAL ENHANCEMENT WITH
HISTOGRAM PROCESSING

Histogram equalization
Histogram matching Global methods for overall enhancement

Sometimes it is necessary to enhance details over small areas.

1. Define a neighborhood around a pixel


2. At each location
• Compute the histogram
• Obtain the transformation function (equalization or specification)
• Use the function to map the gray level of the center pixel
3. Move the center to an adjacent pixel location.

Reveals the presence


of small squares
inside the larger dark
squares.

113
HISTOGRAM STATISTICS
FOR IMAGE ENHANCEMENT

Local mean and local variance can be used to develop simple yet
powerful enhancement techniques!

g(x,y) = E·f(x,y)  if m_Sxy ≤ k0·M_G and k1·D_G ≤ σ_Sxy ≤ k2·D_G
         f(x,y)    otherwise

m_Sxy, σ_Sxy: local mean and standard deviation over the
neighborhood S_xy; M_G: global mean; D_G: global standard deviation.

Example settings: E = 4.0, k0 = 0.4, k1 = 0.02, k2 = 0.4.
Selection of appropriate values requires experimentation (see the
sketch below).
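A direct (unoptimized) NumPy sketch of this rule, assuming a grayscale array and a 3x3 neighborhood:

```python
import numpy as np

def local_stats_enhance(f, n=3, E=4.0, k0=0.4, k1=0.02, k2=0.4):
    """Boost dark, low-contrast neighborhoods by E; leave others unchanged."""
    MG, DG = f.mean(), f.std()                     # global mean / std. dev.
    pad = n // 2
    fp = np.pad(f.astype(float), pad, mode="edge")
    g = f.astype(float).copy()
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            S = fp[x:x + n, y:y + n]               # neighborhood S_xy
            if S.mean() <= k0 * MG and k1 * DG <= S.std() <= k2 * DG:
                g[x, y] = E * f[x, y]
    return g
```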

114
ENHANCEMENT USING ARITHMETIC/LOGIC
OPERATIONS

‰ Arithmetic operations
 Subtraction and addition: the most useful in image enhancement
 Multiplication: used as a masking operation
 Division: multiplication of one image by the reciprocal of the
other.
‰ Logical operations
 AND and OR: used for masking
 NOT
 For these operations, gray-scale pixel values are processed as
strings of binary numbers.
 Frequently used in conjunction with morphological operations.

115
AND/OR MASKS

116
IMAGE SUBTRACTION

g(x,y) = f(x,y) – h(x,y) Two principal ways to scale:

1. Add 255 to every pixel


and divide by 2.

2. Find the min difference


and add its negative to
all the pixel values. Then
multiply each pixel by
255/Max (Max is the max
pixel value in the modified
difference image).

Contains more detail


than the darker image.

117
MASK MODE RADIOGRAPHY

The mask is an X-ray image of a region of a patient’s body captured


by an intensified TV camera.

A contrast medium is injected into the patient’s blood stream.

A series of images are taken of the same region.

The net effect of subtracting the mask from each sample: areas that are
different appear as enhanced detail in the output image.

118
IMAGE AVERAGING

g(x,y) = f(x,y) + η(x,y)

At every pair of coordinates,


the noise is uncorrelated and
has zero average value.

Averaging K different noisy images gi(x,y) yields an estimate of f(x,y).

The average approaches f(x,y) as the # of noisy images increases.

The images gi(x,y) should be aligned to avoid the introduction of artifacts.

As in the case of subtraction, scaling is needed.


ƒ The sum of K 8-bit images can range from 0 to 255K.
ƒ Image averaging may result in negative values.

119
GALAXY PAIR NGC 3314

Totally useless!

Averaging

120
DIFFERENCE IMAGES AND THEIR HISTOGRAMS

Mean and SD of the


difference images
decrease as the
value of K increases.

121
BASICS OF SPATIAL FILTERING

‰ The filter mask is moved from point to point in an image.


‰ At each point (x,y), the response of the filter is calculated.
‰ Linear spatial filtering
 3x3 mask
 R = ω(-1,-1)f(x-1,y-1) + ω(-1,0)f(x-1,y) + ω(0,0)f(x,y) + …
+ ω(1,0)f(x+1,y) + ω(1,1)f(x+1,y+1)
 ω(0,0) coincides with f(x,y) indicating that the mask is
centered at f(x,y).
‰ Our focus will be on masks of odd sizes.
‰ Image of size MxN, filter mask of size mxn.
g(x,y) = Σ_{s=-a..a} Σ_{t=-b..b} ω(s,t) f(x+s, y+t)

where a = (m-1)/2 and b = (n-1)/2
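A direct NumPy sketch of this sum; zero padding is used at the borders here, one of the alternatives discussed on the borders slide below.

```python
import numpy as np

def spatial_filter(f, w):
    """Linear spatial filtering with an odd m x n mask w:
    g(x,y) = sum_s sum_t w(s,t) f(x+s, y+t)."""
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    fp = np.pad(f.astype(float), ((a, a), (b, b)))   # zero padding
    g = np.zeros(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])
    return g
```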

122
3x3 MASK

123
FILTERING AT THE BORDERS

‰ What happens when the center of the filter approaches the


border of the image?
‰ Several alternatives:
1. Limit the excursions of the center of the mask to be at a distance no
less than (n-1)/2 pixels. The resulting filtered image will be smaller than
the original
2. Filter all pixels with only the section of the mask that is fully contained
in the image.
3. Pad the image by adding rows and columns of zeros (or other constant
gray level).
4. Pad the image by replicating rows and columns.

Additional Center of a 3x3 mask


rows & columns

124
SMOOTHING SPATIAL FILTERS

‰ Used for blurring and noise reduction.


‰ Smoothing linear filters
 Also called averaging filters or low pass filters.
 The value of every pixel is replaced by the average of the
gray levels in the neighborhood.
 Standard average and weighted average.
 Weighted averaging filter: Image of size MxN, filter of size
mxn.
a b

∑ ∑ ω ( s, t ) f ( x + s, y + t )
g ( x, y ) = s =− a t =−b
a b

∑ ∑ ω ( s, t )
s =− as =−b

‰ Order-statistics filters
 Nonlinear spatial filters
 Best-known example: Median filter
125
TWO 3X3 SMOOTHING FILTERS

A box filter and a weighted-average filter. The basic strategy behind
weighting the center more heavily is to reduce blurring.

126
FIVE DIFFERENT FILTER SIZES

The pronounced black border is the result of padding the border with
0's, and then trimming off the padded area.

127
BLURRING AND THRESHOLDING

THRESHOLD = 25% of highest


intensity in the blurred image

128
ORDER-STATISTICS FILTERS

‰ Response of the filter is based on ordering (ranking)


the pixels.
‰ The value of the center pixel is replaced by the value
determined by the ordering result.
‰ Median filters
 For certain types of noise, they provide excellent noise
reduction with less blurring than linear filters of similar
size (very effective in the presence of salt-and-pepper
noise).
 The median, ξ, of a set of values: half of the values ≤ ξ,
and half ≥ ξ.
 3x3 neighborhood: median is the 5th largest value.
 5x5 neighborhood: median is the 13th largest value.
 Principal function is to force points with distinct gray
levels to be more like their neighbors.
‰ Max filter: 100th percentile
‰ Min filter: 0th percentile
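A minimal NumPy sketch of the median filter; the max and min filters follow by swapping np.median for np.max or np.min.

```python
import numpy as np

def median_filter(f, n=3):
    """Replace each pixel by the median of its n x n neighborhood;
    effective against salt-and-pepper noise."""
    pad = n // 2
    fp = np.pad(f, pad, mode="edge")
    g = np.empty(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.median(fp[x:x + n, y:y + n])
    return g
```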

129
EFFECT OF AVERAGING AND MEDIAN FILTERS

Which one has removed the salt-and-pepper noise?

130
SHARPENING SPATIAL FILTERS

‰ Used for highlighting fine detail or enhancing detail that


has been blurred.
‰ Averaging is analogous to integration.
‰ Sharpening is analogous to differentiation.
‰ First derivatives
 f'(x) = f(x + 1) – f(x)
 Must be zero in flat areas
 Must be non-zero at the onset of a gray-level step or ramp
 Must be nonzero along ramps
‰ Second derivatives
 f''(x) = f(x + 1) + f(x - 1) – 2f(x)
 Must be zero in flat areas
 Must be non-zero at the onset and end of a gray-level step or
ramp
 Must be zero along ramps of constant slope

131
1ST AND 2ND DERIVATIVES

132

2ND DERIVATIVES FOR ENHANCEMENT – THE LAPLACIAN

∂²f/∂x² = f(x + 1,y) + f(x - 1,y) – 2f(x,y)

∂²f/∂y² = f(x,y + 1) + f(x,y - 1) – 2f(x,y)

∇²f = f(x + 1,y) + f(x - 1,y) + f(x,y + 1) + f(x,y - 1) – 4f(x,y)

g(x,y) = f(x,y) - ∇²f(x,y)  if the center coefficient of the mask is “-”
         f(x,y) + ∇²f(x,y)  if the center coefficient of the mask is “+”
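A minimal NumPy sketch of the one-pass composite implementation shown a couple of slides below (g = f - ∇²f, i.e., the 0/-1/5 mask); edge replication at the borders is an assumption.

```python
import numpy as np

def laplacian_sharpen(f):
    """g = f - laplacian(f), equivalent to one pass of the mask
    0 -1 0 / -1 5 -1 / 0 -1 0."""
    fp = np.pad(f.astype(float), 1, mode="edge")
    lap = (fp[2:, 1:-1] + fp[:-2, 1:-1] + fp[1:-1, 2:] + fp[1:-1, :-2]
           - 4.0 * fp[1:-1, 1:-1])          # 4-neighbor Laplacian
    return np.clip(f - lap, 0, 255)         # center coefficient is negative
```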

133
IMPLEMENTATIONS OF THE LAPLACIAN

134
SHARPENED NORTH POLE OF THE MOON

g(x,y) is usually implemented with one pass of a single composite mask:

g(x,y) = f(x,y) - ∇²f(x,y)

 0 -1  0
-1  5 -1
 0 -1  0

Small details are enhanced and the background tonality is perfectly
preserved.

135
TWO LAPLACIAN MASKS

136
UNSHARP MASKING & HIGH-BOOST FILTERING

Unsharp masking: f_s(x,y) = f(x,y) − f̄(x,y), where f̄ is a blurred
version of f.

High-boost filtering: f_hb(x,y) = A·f(x,y) − f̄(x,y)

137
AN APPLICATION OF BOOST FILTERING

-1 -1 -1

-1 A+8 -1

-1 -1 -1

Laplacian with A = 0

138
1ST DERIVATIVES FOR ENHANCEMENT – THE GRADIENT

∇f ≈ |Gx| + |Gy|

Roberts cross-gradient operators: ∇f ≈ |z9 – z5| + |z8 – z6|

Sobel operators:
∇f ≈ |(z7+2z8+z9) - (z1+2z2+z3)| + |(z3+2z6+z9) - (z1+2z4+z7)|

The weight value of 2 is used to achieve some smoothing by giving
more importance to the center point.

139
APPLICATION OF SOBEL OPERATORS

Gradient is frequently used in industrial inspection.

Constant or slowly varying shades


of gray have been eliminated.

140
COMBINING SPATIAL ENHANCEMENT METHODS

141
CHAPTER 4: IMAGE ENHANCEMENT IN THE
FREQUENCY DOMAIN

‰ French mathematician J. B. J. Fourier was born in 1768.
‰ He died of heart disease in 1830, at the age of 62.
‰ Fourier Series: Any periodic function can be expressed
as the sum of complex exponentials (sines and/or
cosines) of different frequencies, each multiplied by a
different coefficient.
 f(x): single variable, continuous function with period T0

f(x) = Σ_{k=-∞..∞} c_k e^(jΩ0kx),  Ω0 = 2π/T0

‰ Fourier Transform: Non-periodic functions can also be


represented in terms of complex exponentials.
 f(x): single variable, continuous function.

F(u) = ∫_{-∞}^{∞} f(x) e^(-j2πux) dx        f(x) = ∫_{-∞}^{∞} F(u) e^(j2πux) du

Fourier transform pair


142
FOURIER SERIES

143
FOURIER TRANSFORM IN 2 VARIABLES

Fourier Transform can easily be extended to 2 variables:

F(u,v) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x,y) e^(-j2π(ux + vy)) dx dy

f(x,y) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} F(u,v) e^(j2π(ux + vy)) du dv

144
ONE-DIMENSIONAL DFT AND ITS INVERSE

f(x), x = 0, 1, 2, ..., M−1 : discrete function of one variable

F(u) = (1/M) Σ_{x=0..M-1} f(x) e^(-j2πux/M),  u = 0, 1, ..., M−1

Approximately M² summations and multiplications to compute

f(x) = Σ_{u=0..M-1} F(u) e^(j2πux/M),  x = 0, 1, ..., M−1

The DFT and its inverse always exist.
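A direct O(M²) NumPy sketch of this definition, checked against NumPy's FFT (which omits the 1/M factor used here):

```python
import numpy as np

def dft(f):
    """F(u) = (1/M) sum_x f(x) exp(-j 2 pi u x / M) -- about M^2 operations."""
    M = len(f)
    x = np.arange(M)
    return np.array([np.sum(f * np.exp(-2j * np.pi * u * x / M)) for u in x]) / M

f = np.random.default_rng(2).random(64)
assert np.allclose(dft(f), np.fft.fft(f) / 64)   # same result up to the 1/M factor
```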

145
FREQUENCY DOMAIN &
FREQUENCY COMPONENTS

e^(jθ) = cos θ + j sin θ,   cos(−θ) = cos θ

⇒ F(u) = (1/M) Σ_{x=0..M-1} f(x) [cos(2πux/M) − j sin(2πux/M)],
u = 0, 1, 2, ..., M−1

The domain over which the values of F(u) range is called the frequency domain.

Each of the M terms of F(u) is called a frequency component of the transform.

u & F(u): frequency domain & frequency components.


x and f(x): time domain and time components.

A useful analogy: Compare Fourier transform to a glass prism.

ƒ Glass prism: a physical device that separates light into its various color components.

ƒ Fourier transform: a mathematical prism that separates a function into various


components based on frequency content.

146
FOURIER TRANSFORM
IN POLAR COORDINATES

F(u) = |F(u)| e^(-jφ(u)),

where |F(u)| = [R²(u) + I²(u)]^(1/2)   (magnitude)
and φ(u) = tan⁻¹[I(u)/R(u)]            (phase angle)

P(u) = |F(u)|² = R²(u) + I²(u)         (power spectrum)

147
A ONE-DIMENSIONAL EXAMPLE

It is common practice to multiply f(x) by (-1)^x before taking the
transform.

M = 1024, A = 1, K = 8

Important things to note:

1. The height of the


spectrum doubled as
the area under the
curve in the x-domain
doubled.
2. The # of zeros in the
spectrum in the same
interval doubled as
the length of the
function doubled.

M = 1024
A=1
K = 16

148
SAMPLES IN SPATIAL AND FREQUENCY
DOMAINS

f(x), x = 0, 1, ..., M−1 : M samples (equally spaced but arbitrary points)

f(x0): first point in the sequence; the samples are
f(x0), f(x0 + Δx), ..., f(x0 + [M−1]Δx), so f(x) = f(x0 + xΔx)

F(u), u = 0, 1, ..., M−1 : M samples

F(0): first point in the sequence; the samples are
F(0), F(0 + Δu), ..., F(0 + [M−1]Δu), so F(u) = F(uΔu)

Δu = 1/(MΔx)
149
TWO DIMENSIONAL DFT AND ITS INVERSE

F(u,v) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} f(x,y) e^(-j2π(ux/M + vy/N)),
u = 0, 1, ..., M−1, v = 0, 1, ..., N−1    (u, v: frequency variables)

f(x,y) = Σ_{u=0..M-1} Σ_{v=0..N-1} F(u,v) e^(j2π(ux/M + vy/N)),
x = 0, 1, ..., M−1, y = 0, 1, ..., N−1    (x, y: spatial variables)

Polar coordinates: |F(u,v)| = [R²(u,v) + I²(u,v)]^(1/2),
φ(u,v) = tan⁻¹[I(u,v)/R(u,v)]

Power spectrum: P(u,v) = |F(u,v)|² = R²(u,v) + I²(u,v)

150
SHIFTING OF ORIGIN

ℑ[f(x,y) (−1)^(x+y)] = F(u − M/2, v − N/2)

Taking the Fourier transform of f(x,y)(−1)^(x+y) shifts the origin of
F(u,v) to (M/2, N/2).

F(0,0) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} f(x,y)    (the dc component)

If f(x,y) is an image, the value of the DFT at the origin is equal to
the average gray level of the image.

f(x,y) real:  F(u,v) = F*(−u,−v)  =>  |F(u,v)| = |F(−u,−v)|
(the spectrum is symmetric)

Δu = 1/(MΔx) and Δv = 1/(NΔy)

151
CENTERING THE SPECTRUM

Image was multiplied by (-1)^(x+y) prior to computing the transform.

152
RELATIONSHIP BETWEEN THE DFT COMPONENTS
AND SPATIAL CHARACTERISTICS OF AN IMAGE

Frequency is directly related to the rate of change. DFT components
can be associated with patterns of intensity variations in an image.

2 principal features of this image:
1. Strong edges that run approximately at ±45°.
2. 2 white oxide protrusions.

In the spectrum, one pair of components corresponds to the strong
edges, one to the long protrusion, and one to the short protrusion.
153
BASIC STEPS IN
FREQUENCY DOMAIN FILTERING

Pre-processing (examples): multiplication by (-1)^(x+y), cropping to
even dimensions, gray-level scaling, conversion to floating point on
input; conversion to 8-bit format on output, etc.

H is a real filter (zero-phase-shift): each component of H multiplies
both the real and imaginary parts of the corresponding component of
F; imaginary components of the filter are set to zero.

1. Multiply the input image by (-1)^(x+y)
2. Compute F(u,v)
3. Multiply F(u,v) by a filter function H(u,v)
4. Compute the inverse DFT
5. Obtain the real part
6. Multiply the real part by (-1)^(x+y)
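A minimal NumPy sketch of these six steps, demonstrated with the notch filter described two slides below; the 8x8 size and the H used are illustrative.

```python
import numpy as np

def freq_filter(f, H):
    """Steps 1-6: center the spectrum with (-1)^(x+y), filter with a
    real H(u,v), invert, keep the real part, undo the centering."""
    x, y = np.meshgrid(np.arange(f.shape[0]), np.arange(f.shape[1]),
                       indexing="ij")
    shift = (-1.0) ** (x + y)
    F = np.fft.fft2(f * shift)         # steps 1-2 (centered spectrum)
    g = np.fft.ifft2(H * F)            # steps 3-4
    return np.real(g) * shift          # steps 5-6

f = np.random.default_rng(3).random((8, 8))
H = np.ones((8, 8)); H[4, 4] = 0.0     # notch filter: zero the centered dc term
g = freq_filter(f, H)
print(abs(g.mean()))                   # average value forced to (near) zero
```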

154
FILTERING IN THE FREQUENCY DOMAIN

‰ G(u,v) = H(u,v)F(u,v)
‰ H(u,v): real, zero-phase-shift filters
‰ Zero-phase-shift filters do not change the phase angle.
‰ In general, the components of F(u,v) are complex
quantities.
‰ Filtered image = ℑ−1 [G(u,v)]
‰ In general, the inverse DFT is complex.
‰ If the input image f and the filter H are real, the imaginary
components of the inverse DFT should all be zero.
‰ Due to round-off errors, the inverse DFT has parasitic
imaginary components. These are ignored.

155
AN INTRODUCTORY EXAMPLE OF FILTERING

Assume we want the average value of an image to be zero.


Also assume the transform has been centered.

ƒ F(0,0) = 0
ƒ H(u,v) = 0 if (u,v) = (M/2, N/2);  1 otherwise

notch filter: constant function with a notch at the origin

In reality, the average


of the displayed image
cannot be zero.

The display of this


image is made
possible by scaling
(making the most
negative value 0, and
scaling all other
values up from that).

156
LOW-PASS AND HIGH-PASS FILTERS

Low-pass filter: A filter that attenuates high frequencies while passing low frequencies.

High-pass filter: A filter that attenuates low frequencies while passing high frequencies.

blurring

Circularly
symmetric

sharpening

157
ADDING A CONSTANT TO A HIGH-PASS FILTER

A constant is added to the HP filter so that it does not completely eliminate F(0,0).

Notice the improvement!

158
CONVOLUTION

f(x,y) * h(x,y) = (1/MN) Σ_{m=0..M-1} Σ_{n=0..N-1} f(m,n) h(x−m, y−n)

Discrete convolution of 2 functions: for each displacement (x,y),
compute the sum of products over all values of m and n.

Convolution theorem:
f(x,y) * h(x,y) ⇔ F(u,v) H(u,v)
f(x,y) h(x,y) ⇔ F(u,v) * H(u,v)

159
IMPULSE FUNCTION OF STRENGTH A

An impulse function of strength A located at coordinates (x0,y0):
A δ(x − x0, y − y0)

Definition: Σ_{x=0..M-1} Σ_{y=0..N-1} s(x,y) A δ(x − x0, y − y0) = A s(x0,y0)

Unit impulse located at the origin:
Σ_{x=0..M-1} Σ_{y=0..N-1} s(x,y) δ(x,y) = s(0,0)

DFT of the unit impulse located at the origin:
F(u,v) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} δ(x,y) e^(-j2π(ux/M + vy/N)) = 1/MN
(a real constant)

160
FOURIER TRANSFORM PAIRS

f(x,y) * h(x,y) ⇔ F(u,v) H(u,v)
δ(x,y) * h(x,y) ⇔ ℑ[δ(x,y)] H(u,v)
(1/MN) h(x,y) ⇔ (1/MN) H(u,v)
h(x,y) ⇔ H(u,v)

Hence, filters in the spatial and frequency domains constitute a
Fourier transform pair.

If both filters are of the same size, it is more efficient


to do the filtering in the frequency domain.

However, we use much smaller filters in the spatial domain.

161
GAUSSIAN FILTERS

H(u) = A e^(-u²/2σ²)

Fourier transform pair:

h(x) = √(2π) σ A e^(-2π²σ²x²)

Both are real.

Gaussian curves are intuitive and easy to manipulate.

They behave reciprocally w.r.t. one another.

H(u) has a broad profile → h(x) has a narrow profile.

H(u) has a narrow profile → h(x) has a broad profile.

162
LOWPASS AND HIGHPASS GAUSSIAN FILTERS

H(u) = A e^(-u²/2σ1²) − B e^(-u²/2σ2²)

Fourier transform pair:

h(x) = √(2π) σ1 A e^(-2π²σ1²x²) − √(2π) σ2 B e^(-2π²σ2²x²)

The narrower the frequency domain filter, the more it will attenuate
the low frequencies, resulting in increased blurring.

We can implement lowpass filtering in the spatial domain by using a
mask with all positive coefficients. The corresponding highpass
spatial filter has both negative and positive coefficients.

163
SMOOTHING AND SHARPENING FILTERS

‰ Smoothing filters
 Ideal lowpass filters (very sharp)
 Butterworth lowpass filters
 Gaussian lowpass filters (very smooth)

Butterworth filter parameter: filter order

• High values: filter has the form of the ideal filter.

• Low values: filter has the form of the Gaussian filter.

‰ Sharpening filters
 Ideal highpass filters
 Butterworth highpass filters
 Gaussian highpass filters

164
2-D IDEAL LOWPASS FILTERS

H(u,v) = 1 if D(u,v) ≤ D0;  0 if D(u,v) > D0    (D0: a nonnegative quantity)

D(u,v): distance from (u,v) to the center of the frequency rectangle.

Image size: M x N

Center of the frequency rectangle: (u,v) = (M/2, N/2)

Distance to the center: D(u,v) = [(u – M/2)² + (v – N/2)²]^(1/2)

The lowpass filters in this chapter are radially symmetric.
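A minimal NumPy sketch of this transfer function, for use with the centered-spectrum filtering pipeline sketched earlier; the sizes and D0 are illustrative.

```python
import numpy as np

def ilpf(M, N, D0):
    """Ideal lowpass transfer function: H = 1 inside radius D0 around
    the center (M/2, N/2) of the frequency rectangle, 0 outside."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return (D <= D0).astype(float)

H = ilpf(512, 512, 30)   # pass everything within 30 pixels of the center
```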

165
2-D IDEAL LOWPASS FILTERS

Cutoff frequency

166
2-D IDEAL LOWPASS FILTERS AS A FUNCTION
OF CUTOFF FREQUENCIES

Total image power: P_T = Σ_{u=0..M-1} Σ_{v=0..N-1} P(u,v),
where P(u,v) = |F(u,v)|²

A circle of radius r encloses α percent of the power:
α = 100 [Σ_u Σ_v P(u,v) / P_T], where the summation is taken over the
values of (u,v) that lie inside the circle or on its boundary.

Radius (pixels)   % of power
5                 92.0
15                94.6
30                96.4
80                98.0
230               99.5

167
ILP FILTERING WITH RADII 5,15,30,80,230

Ringing is characteristic of ideal filters.

As the filter radius increases, less power is removed, and there is
less blurring.

Ideal lowpass filtering is not very practical, but such filters can
be implemented on a computer to study their behavior.

168
BLURRING AS A CONVOLUTION PROCESS

Gray-scale profile of a
horizontal scan line
through the center
of the spatial filter.

Obtained using the


filter to its right.

Gray-scale profile of a
diagonal scan line
through the center
of the filtered image.

169
BUTTERWORTH LOWPASS FILTERS

H(u,v) = 1 / (1 + [D(u,v)/D0]^(2n))     (BLPF of order n)

D(u,v) = [(u − M/2)² + (v − N/2)²]^(1/2)

The BLPF transfer function does not have a sharp discontinuity that
establishes a clear cutoff.

170
BLPF WITH ORDERS 1 THROUGH 4

171
BUTTERWORTH FILTERING WITH RADII 5,15,30,80,230

Ringing is not visible in


any of these images.

172
BLPFS OF ORDER 1,2,5,20

Order 1: no ringing, no negative values. Order 2: mild ringing, small negative values. Higher orders increasingly exhibit the characteristics of the ILPF.

To facilitate comparisons,
additional enhancing with
a gamma transformation
was applied to all images.

173
GAUSSIAN LOWPASS FILTERS

− D 2 ( u ,v ) / 2σ 2
H (u, v) = e σ is a measure of the spread
of the Gaussian curve.

D(u , v) = [(u − M / 2) 2 + (v − N / 2) 2 ]1 / 2
A = 1 (to be consistent with the other filters)

The inverse Fourier transform of the Gaussian lowpass filter is also Gaussian.

Ö A spatial Gaussian filter will have no ringing.
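A corresponding sketch (gaussian_lowpass is a hypothetical helper; the slides set σ = D0):

  def gaussian_lowpass(shape, D0):
      """Gaussian lowpass transfer function with sigma = D0."""
      M, N = shape
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D2 = u[:, None] ** 2 + v[None, :] ** 2
      return np.exp(-D2 / (2.0 * D0 ** 2))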

174
GLPFS WITH DIFFERENT σ VALUES

σ = D0

175
GAUSSIAN FILTERING WITH RADII 5,15,30,80,230

no ringing

176
PRACTICAL APPLICATION OF LPF:
MACHINE PERCEPTION

Machine recognition systems have difficulty in reading broken characters. (Smoothing here uses a GLPF with D0 = 80.)

177
PRACTICAL APPLICATION OF LPF:
PRINTING & PUBLISHING

Smoothed image looks


soft and pleasing!

178
PRACTICAL APPLICATION OF LPF:
PROCESSING SATELLITE AND AERIAL IMAGES

Boundaries between water bodies were caused by loop currents. Filtering reduces the scan lines; more blurring makes large features recognizable.

179
SHARPENING FREQUENCY DOMAIN FILTERS

Blurring is achieved by attenuating the HF components of the DFT of an image.

Sharpening is achieved by attenuating the LF components of the DFT of an image.

Only zero-phase-shift filters are considered here.

H hp (u , v ) = 1 − H lp (u , v )

When the lowpass filter attenuates frequencies, the highpass filter passes them (Hlp is the transfer function of the corresponding lowpass filter).
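Given any lowpass transfer function from the earlier sketches, the corresponding highpass filter is one line (gaussian_lowpass is the hypothetical helper defined above):

  H_hp = 1.0 - gaussian_lowpass((512, 512), D0=30)  # Hhp(u,v) = 1 - Hlp(u,v)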

Sharpening filters:

ƒ Ideal highpass filters


ƒ Butterworth highpass filters
ƒ Gaussian highpass filters

180
3 TYPES OF SHARPENING FILTERS

The Butterworth filter


represents a transition
between the sharpness
of the IF and the
smoothness of the GF.

181
CORRESPONDING SPATIAL DOMAIN FILTERS

To obtain the spatial domain representation of a frequency domain filter:

1. Multiply H(u,v) by (−1)^{u+v} for centering
2. Compute the inverse DFT
3. Multiply the real part by (−1)^{x+y}
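A sketch of this 3-step recipe in NumPy (spatial_filter_from_H is our own name):

  def spatial_filter_from_H(H):
      """Spatial filter corresponding to a centered H(u,v)."""
      M, N = H.shape
      sign = (-1.0) ** np.add.outer(np.arange(M), np.arange(N))
      h = np.fft.ifft2(H * sign)   # steps 1 and 2
      return np.real(h) * sign     # step 3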

182
IDEAL HIGHPASS FILTERS

0 if D(u,v) ≤ D0
H(u,v) =
1 if D(u,v) > D0

183
BUTTERWORTH HIGHPASS FILTERS

1
H (u , v) =
1 + [ D0 / D(u , v)]2n

184
GAUSSIAN HIGHPASS FILTERS

H(u,v) = 1 − e^{−D²(u,v)/2D0²}

Results are smoother than with the previous 2 filters.

185
CHAPTER 5: IMAGE RESTORATION

‰ The ultimate goal is to improve an image in some


predefined sense.
‰ Restoration
 The given image is degraded.
 Degradation is modeled.
 Inverse process is applied.
‰ No focus on sensor, digitizer, and display degradations.
‰ Some restoration techniques are best formulated in the
spatial domain while others are better suited for the
frequency domain.
 Spatial domain: additive noise.
 Frequency domain: blurring

186
A MODEL OF IMAGE
DEGRADATION/RESTORATION PROCESS

Linear & Degraded


Input Restored
position-invariant image
image image

g ( x , y ) = h ( x, y ) * f ( x, y ) + η ( x , y )

G (u , v) = H (u , v) F (u , v) + N (u , v)
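A toy sketch of the degradation model, assuming a centered transfer function H the same size as f (degrade is our own name; the noise is Gaussian here purely for illustration):

  def degrade(f, H, noise_sigma):
      """Simulate g(x,y) = h(x,y)*f(x,y) + eta(x,y) via G = H F + N."""
      F = np.fft.fftshift(np.fft.fft2(f))
      blurred = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
      return blurred + np.random.normal(0.0, noise_sigma, f.shape)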

187
NOISE MODELS

‰ Principal sources of noise in digital images arise during


acquisition and transmission.
 Acquisition: performance of image sensors is affected by a
variety of factors (environmental conditions, etc.) and by the
quality of the sensing elements.
 Transmission: images are corrupted due to interference in
the channel.
‰ Spatial properties of noise
‰ Frequency properties of noise
 White noise: Fourier spectrum of noise is constant.
‰ Assumption: noise is independent of spatial
coordinates.
 Partially invalid in some applications.
 No correlation between pixel values and noise.
 Exception: spatially periodic noise.

188
NOISE PROBABILITY DENSITY FUNCTIONS

We are concerned with the statistical behavior of the gray-level values of the noise component of the model.

These values may be considered random variables characterized by a probability density function (PDF).

Most common PDFs (with mean µ and variance σ²):

1. Gaussian (normal) noise
2. Rayleigh noise: µ = a + √(πb/4), σ² = b(4 − π)/4
3. Erlang (Gamma) noise: µ = b/a, σ² = b/a²
4. Exponential noise: µ = 1/a, σ² = 1/a²
5. Uniform noise: µ = (a + b)/2, σ² = (b − a)²/12
6. Impulse (salt-and-pepper) noise
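For experimentation, all six noise types can be sampled with NumPy's random generator; the parameter values below are arbitrary stand-ins:

  import numpy as np

  rng = np.random.default_rng(0)
  shape = (256, 256)
  gaussian    = rng.normal(0, 20, shape)
  rayleigh    = rng.rayleigh(20, shape)
  erlang      = rng.gamma(2, 20, shape)        # Erlang = Gamma with integer shape
  exponential = rng.exponential(20, shape)
  uniform     = rng.uniform(-20, 20, shape)
  salt_pepper = rng.choice([0, 128, 255], size=shape, p=[0.05, 0.90, 0.05])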

189
MODELING A BROAD RANGE OF NOISE
CORRUPTIONS

‰ The 6 PDFs are useful tools.


‰ Gaussian noise arises in an image due to factors such
as electronic circuit noise and sensor noise due to poor
illumination and/or high temperature.
‰ Rayleigh density is helpful in characterizing noise
phenomena in range imaging.
‰ Gamma and exponential densities find application in
laser imaging.
‰ Impulse noise is found in situations where quick
transients, such as faulty switching, take place during
imaging.
‰ Uniform density may be the least descriptive of practical
situations but it is quite useful as the basis for numerous
random number generators.

190
A TEST PATTERN

composed of simple, constant areas


that span the gray scale from black
to near white in 3 increments.

191
NOISY IMAGES AND THEIR HISTOGRAMS:
GAUSSIAN, RAYLEIGH, AND GAMMA

The parameters of
the noise were
chosen in each
case so that the
histogram
corresponding
to the 3 gray levels
in the test pattern
would start to
merge.

192
NOISY IMAGES AND THEIR HISTOGRAMS:
EXPONENTIAL, UNIFORM, AND IMPULSE

The parameters of
the noise were
chosen in each
case so that the
histogram
corresponding
to the 3 gray levels
in the test pattern
would start to
merge.

Extra peak: the


noise components
were pure black
and white, and the
lightest
component of the
test pattern is
light gray.

193
PERIODIC NOISE

Image is severely corrupted by


spatial sinusoidal noise of
various frequencies.

The Fourier Transform of a pure


sinusoid is a pair of conjugate
impulses located at the
conjugate frequencies of the
sine wave.

If the amplitude of a sine wave


in the spatial domain is strong
Periodic noise arises enough, we would see in the
typically from electrical or spectrum of the image a pair of
electromechanical impulses for each sine wave in
interference during the image.
acquisition.

This is the only spatially


dependent noise
considered here.

194
ESTIMATION OF NOISE PARAMETERS

‰ Periodic noise
 Inspection of the Fourier spectrum
 Inspection of the image (possible only in simple cases)
 Automated analysis
„ Noise spikes are exceptionally pronounced.
„ Some knowledge is available about the general location of the
frequency components.
‰ Noisy PDFs
 Parameters may be partially known from sensor specs
 Imaging system available
„ Capture a set of images of flat environments.
 Images are available
„ Crop small patches of reasonably constant gray level.
„ Obtain the histogram.
„ Compute mean and variance.
„ Gaussian PDF: Completely determined by the mean and variance.
„ Impulse noise: actual probability of occurrence of white and black
pixels is needed.
„ Others: Use the mean and variance to solve for a and b.
195
ESTIMATION OF PDF PARAMETERS
FROM SMALL PATCHES

How to use the data from the image strips?

• Calculate the mean and variance of the gray levels.


• Identify the closest PDF.
• If the shape is Gaussian, the mean and variance specifies the Gaussian PDF.
• For other shapes, use the mean and variance to solve for a and b.
• For impulse noise, obtain the probability of occurrence of white & black pixels.

196
SPATIAL DOMAIN FILTERING
FOR ADDITIVE NOISE

g ( x , y ) = f ( x , y ) + η ( x, y ) additive noise

G (u, v) = F (u, v) + N (u, v)

Mean filters

Arithmetic mean filter


Geometric mean filter
Harmonic mean filter
Contraharmonic mean filter

Order-statistics filters

Median filter
Max & min filters
Midpoint filter
Alpha-trimmed mean filter

Adaptive filters
Adaptive local noise reduction filter
Adaptive median filter

197
MEAN FILTERS

Arithmetic mean:
  f̂(x,y) = (1/mn) ∑_{(s,t)∈Sxy} g(s,t)

Geometric mean (comparable to the arithmetic mean but tends to lose less image detail):
  f̂(x,y) = [ ∏_{(s,t)∈Sxy} g(s,t) ]^{1/mn}

Harmonic mean (works well for salt noise but fails for pepper noise; OK for other types of noise as well):
  f̂(x,y) = mn / ∑_{(s,t)∈Sxy} 1/g(s,t)

Contraharmonic mean of order Q (+Q eliminates pepper noise, −Q eliminates salt noise — not simultaneously; Q = 0 ⇒ arithmetic mean, Q = −1 ⇒ harmonic mean):
  f̂(x,y) = ∑_{(s,t)∈Sxy} g(s,t)^{Q+1} / ∑_{(s,t)∈Sxy} g(s,t)^{Q}
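A compact SciPy sketch of the contraharmonic mean, which subsumes the arithmetic (Q=0) and harmonic (Q=−1) cases; contraharmonic is our own name, and the epsilon guard is an assumption for images containing zeros:

  import numpy as np
  from scipy.ndimage import uniform_filter

  def contraharmonic(g, size, Q):
      """Contraharmonic mean over a size x size window."""
      g = g.astype(float) + 1e-12            # guard against zero gray levels
      num = uniform_filter(g ** (Q + 1), size)
      den = uniform_filter(g ** Q, size)
      return num / den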

198
ARITHMETIC AND GEOMETRIC MEAN FILTERS

mean=0, variance=400

Geometric mean filter did


not blur the image as
much as the arithmetic
mean filter

199
SPATIAL FILTERING FOR ADDITIVE NOISE

pepper noise salt noise

Better job of
cleaning the
background at the
expense of blurring
the dark areas!

200
CONTRAHARMONIC FILTERING
WITH THE WRONG SIGN

Wrong sign for Q Wrong sign for Q

201
ORDER-STATISTICS FILTERS

Median filter (effective for bipolar and unipolar impulse noise):
  f̂(x,y) = median_{(s,t)∈Sxy} {g(s,t)}

Max filter (useful for finding the brightest points in an image):
  f̂(x,y) = max_{(s,t)∈Sxy} {g(s,t)}

Min filter (useful for finding the darkest points in an image):
  f̂(x,y) = min_{(s,t)∈Sxy} {g(s,t)}

Midpoint filter (works best for randomly distributed noise):
  f̂(x,y) = ½ [ max_{(s,t)∈Sxy} {g(s,t)} + min_{(s,t)∈Sxy} {g(s,t)} ]

Alpha-trimmed mean filter (d = 0 ⇒ arithmetic mean; d = (mn−1)/2 ⇒ median; other d useful for multiple types of noise):
  f̂(x,y) = (1/(mn − d)) ∑_{(s,t)∈Sxy} g_r(s,t)
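The first three filters are available directly in SciPy; the midpoint filter is a two-liner on top of them (midpoint_filter is our own name):

  from scipy.ndimage import median_filter, maximum_filter, minimum_filter

  def midpoint_filter(g, size):
      """Midpoint filter: average of the local max and local min."""
      g = g.astype(float)
      return 0.5 * (maximum_filter(g, size) + minimum_filter(g, size))

  # median_filter(g, size=3) gives one pass of a 3x3 median filter.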

202
3 PASSES OF MEDIAN FILTER
FOR IMPULSE NOISE

1ST pass
Significant
Improvement!

2nd pass 3rd pass


barely visible # of passes
noise points! should be as
low as
possible!

203
MAX & MIN FILTERS FOR PEPPER NOISE

Pepper noise is Better job!


reasonably removed.
The filter also removed
The filter also removed some white points from
some dark points from the borders of the light
the borders of the dark objects.
objects.

204
REDUCTION OF NOISE
WITH 4 TYPES OF FILTERS

Uniform noise Additional salt-and-


(mean=0, pepper noise
variance=800) (Pa=Pb=0.1)

Arithmetic mean Geometric mean


(no good) (no good)

Median Alpha-trimmed mean


(much better) (better than median)

205
ADAPTIVE FILTERS

‰ The filters discussed so far are applied to an image


independent of how image characteristics vary from one
point to another.
‰ 2 simple adaptive filters
 Adaptive, local noise reduction filter
 Adaptive median filter
‰ Their behavior changes based on statistics of the image
inside the filter region.
‰ Advantage: superior performance
‰ Disadvantage: increase in filter complexity

206
ADAPTIVE, LOCAL NOISE REDUCTION FILTER

Local region Sxy

g(x,y): value of noisy image at (x,y)


σ η 2: variance of the noise
mL: local mean of pixels in Sxy
σ L 2: local variance of pixels in Sxy

Behavior of the filter:

1. σ η2 = 0 Ö g(x,y)=f(x,y) – trivial case with zero noise


2. σL2 >> ση2 Ö ~g(x,y) – edges should be preserved
3. σL 2 ≈ ση 2 Ö mL – local noise is reduced by averaging

Definition:
  f̂(x,y) = g(x,y) − (ση²/σL²) [g(x,y) − mL]
Analysis of ση 2:

Needs to be known or estimated (but we seldom have exact knowledge).


Tacit assumption: ση2 ≤ σL2 (because Sxy⊂ g(x,y))
ση2 > σL2 Ö set ratio=1 (this makes the filter nonlinear but prevents meaningless results)
Another approach: ση2 > σL2 Ö allow negative values and rescale (results in loss of dynamic range in the image)
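A sketch of this filter with the ratio clipped at 1 (the nonlinear option above); adaptive_local is our own name, and local statistics come from box averages:

  import numpy as np
  from scipy.ndimage import uniform_filter

  def adaptive_local(g, size, noise_var):
      """Adaptive, local noise reduction filter over a size x size window."""
      g = g.astype(float)
      m_local = uniform_filter(g, size)
      var_local = uniform_filter(g ** 2, size) - m_local ** 2
      ratio = np.minimum(noise_var / np.maximum(var_local, 1e-12), 1.0)
      return g - ratio * (g - m_local)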
207
COMPARISON OF ADAPTIVE FILTER WITH
ARITHMETIC AND GEOMETRIC MEAN FILTERS

Best results:

Noise reduction is
comparable but the
restored image is
much sharper!

208
ADAPTIVE MEDIAN FILTER

zmin = min gray level value in Sxy
zmax = max gray level value in Sxy
zmed = median of gray levels in Sxy
zxy = gray level at (x,y)
Smax = max allowed size of Sxy

The filter works in 2 levels:

Level A (to check if zmed is an impulse):
  A1 = zmed − zmin
  A2 = zmed − zmax
  If A1 > 0 and A2 < 0, go to level B
  Else increase the window size
    If window size ≤ Smax, repeat level A
    Else output zxy

Level B (to check if zxy is an impulse):
  B1 = zxy − zmin
  B2 = zxy − zmax
  If B1 > 0 and B2 < 0, output zxy
  Else output zmed

3 main purposes:
1. To remove impulsive noise
2. To provide smoothing of non-impulsive noise
3. To reduce distortion (e.g., excessive thinning or thickening of object boundaries)
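A direct (unoptimized) sketch of the two-level algorithm; adaptive_median is our own name, and window clamping at the image border is an implementation assumption:

  import numpy as np

  def adaptive_median(g, s_max=7):
      """Adaptive median filter; window grows up to s_max x s_max (odd sizes)."""
      g = g.astype(float)
      out = g.copy()
      M, N = g.shape
      for x in range(M):
          for y in range(N):
              s = 3
              while True:
                  r = s // 2
                  w = g[max(0, x - r):x + r + 1, max(0, y - r):y + r + 1]
                  zmin, zmax, zmed = w.min(), w.max(), np.median(w)
                  if zmin < zmed < zmax:              # level A: zmed is not an impulse
                      if zmin < g[x, y] < zmax:       # level B: zxy is not an impulse
                          out[x, y] = g[x, y]
                      else:
                          out[x, y] = zmed
                      break
                  s += 2                              # grow the window
                  if s > s_max:                       # window exceeded Smax
                      out[x, y] = g[x, y]
                      break
      return out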

209
A SIMPLE EXAMPLE ADAPTIVE MEDIAN FILTER

center
point 10 20 20 zmin = 10

20 15 20 zmax = 100

20 25 100 zmed = 20

A1 = 20-10 = 10
A2 = 20-100 = -80
A1 > 0 & A2 < 0: zmin < zmed < zmax Hence, zmed cannot be an impulse.
Go to level B

B1 = 15-10 = 5
B2 = 15-100 = -85
B1 > 0 & B2 < 0: zmin < zxy < zmax Hence, zxy cannot be an impulse.

Output zxy = 15 these intermediate-level points are not changed

(B1 > 0 & B2 < 0) is false: zxy = zmin or zxy = zmax

Output zmed = 20
210
COMPARISON OF MEDIAN AND ADAPTIVE
MEDIAN FILTERS

Median filter Adaptive median filter

Noise reduction is
comparable but the filter
preserved sharpness!

211
FREQUENCY DOMAIN FILTERING
FOR PERIODIC NOISE

‰ Bandreject filters: remove or attenuate a band of


frequencies about the origin of the Fourier Transform.

‰ Bandpass filters: pass or strengthen a band of


frequencies about the origin of the Fourier Transform .

‰ Notch filters: reject or pass frequencies in predefined


neighborhoods about a center frequency.

212
BANDREJECT FILTERS

Ideal bandreject:
  H(u,v) = 1 if D(u,v) < D0 − W/2
           0 if D0 − W/2 ≤ D(u,v) ≤ D0 + W/2
           1 if D(u,v) > D0 + W/2

Butterworth bandreject of order n:
  H(u,v) = 1 / (1 + [D(u,v)W / (D²(u,v) − D0²)]^{2n})

Gaussian bandreject:
  H(u,v) = 1 − e^{−½[(D²(u,v) − D0²) / (D(u,v)W)]²}

A principal application: noise removal in situations where the general location of the noise
components in the frequency domain is approximately known.
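A sketch of the ideal case, following the earlier NumPy conventions (ideal_bandreject is our own name):

  import numpy as np

  def ideal_bandreject(shape, D0, W):
      """Ideal bandreject: zero inside the band [D0 - W/2, D0 + W/2]."""
      M, N = shape
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
      H = np.ones(shape)
      H[(D >= D0 - W / 2) & (D <= D0 + W / 2)] = 0.0
      return H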

213
APPLICATION OF A BANDREJECT FILTER

Restoration
is evident!

Note that it would not be possible to get equivalent results by


spatial domain filtering using small convolution masks!
214
BANDPASS FILTERS

H bp (u , v ) = 1 − H br (u , v )

Performs the
opposite operation
of a bandreject
filter.

Performing straight bandpass filtering on an image is not a common


procedure because it generally removes too much image detail.

Bandpass filtering is quite useful in isolating the effect on an image of


selected frequency bands.

215
APPLICATION OF A BANDPASS FILTER

Most image detail is lost


but the remaining
information is very useful
as it shows a noise
pattern that is close to
that of the noise that
corrupted the image!

Generated by:
• using the bandpass filter corresponding to
the bandreject filter in the previous
example
• taking the inverse transform

216
NOTCH FILTERS

‰ Notch filters must appear in symmetric pairs about the


origin in order to obtain meaningful results.
‰ The notch filter located at the origin is an exception.
‰ The # of pairs of notch filters is arbitrary.
‰ The shape of the notch areas is also arbitrary.
‰ Two classes
 Notch reject filters
 Notch pass filters
‰ Notch reject filters
 Ideal notch reject filters
 Butterworth notch reject filters
 Gaussian notch reject filters
‰ Notch pass filters
 Ideal notch pass filters
 Butterworth notch pass filters
 Gaussian notch pass filters
217
NOTCH REJECT FILTERS

Ideal notch reject:
  H(u,v) = 0 if D1(u,v) ≤ D0 or D2(u,v) ≤ D0
           1 otherwise

Butterworth notch reject of order n:
  H(u,v) = 1 / (1 + [D0² / (D1(u,v) D2(u,v))]^n)

Gaussian notch reject:
  H(u,v) = 1 − e^{−½[D1(u,v) D2(u,v) / D0²]}

Note that these 3 filters become highpass filters if u0 = v0 = 0.

D1(u,v) = [(u − M/2 − u0)² + (v − N/2 − v0)²]^{1/2}
D2(u,v) = [(u − M/2 + u0)² + (v − N/2 + v0)²]^{1/2}
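A sketch of an ideal notch reject filter for one symmetric pair of notches (ideal_notch_reject is our own name):

  import numpy as np

  def ideal_notch_reject(shape, D0, u0, v0):
      """Ideal notch reject with notches of radius D0 at (+-u0, +-v0)."""
      M, N = shape
      u = np.arange(M)[:, None]
      v = np.arange(N)[None, :]
      D1 = np.sqrt((u - M / 2 - u0) ** 2 + (v - N / 2 - v0) ** 2)
      D2 = np.sqrt((u - M / 2 + u0) ** 2 + (v - N / 2 + v0) ** 2)
      return ((D1 > D0) & (D2 > D0)).astype(float)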

218
NOTCH PASS FILTERS

H np (u , v) = 1 − H nr (u , v)

Performs the
opposite operation
of a notch reject
filter.

Note that the 3 filters become lowpass filters if u0 = v0 = 0.

219
APPLICATION OF A NOTCH PASS FILTER

Notch filtering can do a


better job to reduce the
scan lines without
introducing undesirable
blurriness.

We start with a simple


Spectrum of
ideal notch pass filter
the input
(superimposed on the
image
spectrum) to get an idea
about the noise
contribution.

Now that we have a


Inverse suitable notch pass
transform of filter, it can be used to
notch-pass obtain the
filtered result corresponding notch
reject filter. Hence, the
filtered image!

220
LINEAR, POSITION-INVARIANT DEGRADATIONS

‰ Linear, position-invariant techniques


 Many types of degradations can be approximated by linear,
position-invariant processes.
 Tools of linear system theory become available.
‰ Nonlinear, position-dependent techniques
 More general and usually more accurate
 Very difficult to solve or no known solutions
‰ Before restoration: g(x,y) = H[f(x,y)] + η(x,y)
‰ Position-invariant operator
 H[f(x-α, y-β)] = g(x-α, y-β), for any f(x,y), and any α & β.
 The response at any point in the image depends only on the value
of the input at that point, not on its position.
‰ Impulse response of H: H[δ(x-α, y-β)] = h(x,α,y,β)
‰ If the impulse response of a linear system is known, we can
compute the response to any input f!
 g(x,y) = h(x,y)*f(x,y)+ η(x,y) or G(u,v) = H(u,v)F(u,v)+ N(u,v)
 η(x,y): random values, independent of position.
221
DEGRADATION FUNCTION ESTIMATION

‰ Degradations are modeled as being the result of convolution.


‰ Restoration seeks to find filters that apply the process in
reverse.
‰ Linear image restoration: image deconvolution
‰ Restoration filters: deconvolution filters
‰ 3 principal ways to estimate the degradation function
 Observation
 Experimentation
 Mathematical modeling
‰ Image restoration using a degradation function
 Blind deconvolution
 The true degradation function is seldom known completely.

222
ESTIMATION BY IMAGE OBSERVATION

‰ Suppose we are given a degraded image without any


knowledge about the degradation function H.
‰ One way to estimate the function is to gather information
from the image itself.
‰ Estimation of the degradation function in a blurred image
 Look for areas of strong signal content (to reduce the effect
of noise).
 Choose a small section of the image with simple structures
(e.g., part of an object and the background).
 Construct an unblurred image of the same size and
characteristics of the observed subimage.
gs(x,y): observed subimage;  f̂s(x,y): constructed (unblurred) subimage

  Hs(u,v) = Gs(u,v) / F̂s(u,v)

 Suppose a plot of Hs(u,v) looks like a Butterworth filter.
 Construct a function H(u,v) on a larger scale but with the same shape.

223
ESTIMATION BY EXPERIMENTATION

‰ If equipment similar to the equipment used to acquire the
degraded image is available, it is possible to obtain an accurate
estimation of the degradation.
‰ Estimation of the degradation function
 Use various system settings until the image is degraded as
closely as possible to the image we want to restore.
 Obtain the impulse response of the degradation by imaging
an impulse.
 H(u,v) = G(u,v)/A

224
ESTIMATION BY MODELING

‰ Provides insight into the restoration problem.


‰ In some cases, models take into account environmental
conditions.

H(u,v) = e^{−k(u² + v²)^{5/6}}

A degradation model
based on the
k = 0.0025 physical
characteristics of
atmospheric
turbulence.

Sometimes, Gaussian
LPF is used to model
mild, uniform
k = 0.001 k = 0.00025 blurring.

225
AN EXAMPLE OF MODELING

  H(u,v) = [T / π(ua + vb)] sin[π(ua + vb)] e^{−jπ(ua + vb)}

Model derived from an image that has been blurred by uniform linear motion between the image and the sensor.

226
INVERSE FILTERING

‰ Direct inverse filtering: The simplest approach to restoration.


‰ The following derivation shows that we cannot recover the
undegraded image exactly because N(u,v) is a random
function whose Fourier Transform is not known!
  F̂(u,v) = G(u,v) / H(u,v)

  G(u,v) = H(u,v) F(u,v) + N(u,v)
  ⇒ F̂(u,v) = F(u,v) + N(u,v) / H(u,v)

‰ More bad news: If the degradation has zero or very small


values, the ratio N(u,v)/H(u,v) may dominate the estimate.
 To circumvent the problem, we can limit the filter frequencies to
values near the origin.
 H(0,0) = Mean of h(x,y): usually the highest value of H(u,v).
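A sketch of direct inverse filtering with the cutoff near the origin (inverse_filter is our own name; a Butterworth window, as in the next example, could replace the hard radial cutoff used here):

  import numpy as np

  def inverse_filter(g, H, radius):
      """Inverse filtering, limited to frequencies within `radius` of the center."""
      M, N = g.shape
      G = np.fft.fftshift(np.fft.fft2(g))
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
      F_hat = np.where(D <= radius, G / (H + 1e-12), 0.0)
      return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))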
227
AN EXAMPLE OF INVERSE FILTERING

Degradation function:

  H(u,v) = e^{−k[(u − M/2)² + (v − N/2)²]^{5/6}}, with k = 0.0025

The cutoff was implemented


by applying a Butterworth
lowpass function of order 10.

228
MINIMUM MEAN SQUARE ERROR (WIENER)
FILTERING

‰ Inverse filtering makes no explicit provision for handling


noise.
‰ Now we consider both images and noise as random
processes.
ˆ
‰ The objective: Find an estimate f of the uncorrupted image f
such that the mean squared error between them is minimized.
‰ Assumptions
 The image and the noise are uncorrelated.
 The image or the noise has zero mean.
 The gray levels in the estimate are a linear function of the levels
in the degraded image.
‰ The restored image is given by the inverse Fourier transform
of Fˆ (u , v) .
‰ Noise = zero Ö Wiener filter = inverse filter!

229
DERIVATION OF THE WIENER FILTER

Minimize e² = E{(f − f̂)²}

  ⇒ F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / ( |H(u,v)|² + Sη(u,v)/Sf(u,v) ) ] G(u,v)

H (u, v) = transform of the degradatio n function


G (u,v) = transform of the degraded image
Sη (u , v) =| N (u , v) |2 = power spectrum of the noise
S f (u , v) =| F (u, v) |2 = power spectrum of the undegraded image

If the 2 power spectra are not known, use Sη(u,v)/Sf(u,v) = K (a constant).
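A sketch of the parametric Wiener filter with the constant-K approximation (wiener is our own name; note that (1/H)|H|²/(|H|²+K) simplifies to conj(H)/(|H|²+K)):

  import numpy as np

  def wiener(g, H, K):
      """Parametric Wiener filter with S_eta/S_f approximated by constant K."""
      G = np.fft.fftshift(np.fft.fft2(g))
      F_hat = (np.conj(H) / (np.abs(H) ** 2 + K)) * G
      return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))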

230
AN EXAMPLE OF WIENER FILTERING

Degraded input image

K was chosen
interactively to yield
the best possible visual
results.

Previous results for comparison

231
AN EXAMPLE OF WIENER FILTERING

Blurred image with


additive Gaussian
noise (mean=0,
variance=650)

Noise variance is
reduced by one
order of magnitude

Noise variance is
reduced by five
orders of magnitude

Wiener filter with

H(u,v) = e^{−k[(u − M/2)² + (v − N/2)²]^{5/6}}

where k = 0.0025.

K was chosen interactively.


232
CHAPTER 6: COLOR IMAGE PROCESSING

‰ The use of color image processing is motivated by two


factors:
 Color is a powerful descriptor that often simplifies object
identification and extraction from a scene.
 Humans can discern thousands of color shades and intensities
(compared to ~only 2 dozen shades of gray!).
‰ Two major areas:
 Full-color processing: images are acquired with a full-color
sensor.
 Pseudo-color processing: a color is assigned to a particular
monochrome intensity or range of intensities.
‰ In the past, most digital image color processing was done
at the pseudo-color level.
‰ Today, full-color image processing techniques are used in
a broad range of applications.

233
COLOR FUNDAMENTALS

‰ Perception and interpretation of color is a


physiopsychological phenomenon that is not fully
understood.
‰ Sir Isaac Newton's discovery in 1666.
‰ Color spectrum can be divided into 6 broad regions:
 Violet, blue, green, yellow, orange, and red.
‰ The colors that human perceive in an object are
determined by the nature of the light reflected from that
object.
 Green objects reflect light with wavelengths in 500-570nm
while absorbing most of the energy at other wavelengths.
‰ Characterization of light is central to the science of color.
 Achromatic light: intensity is the only attribute.
 Chromatic light: spans the EM spectrum from 400nm to
700nm.

234
COLOR SPECTRUM

235
CHROMATIC LIGHT

‰ 3 basic quantities to describe the quality of a chromatic light


source:
 Radiance (W): total amount of energy that flows from the light
source.
 Luminance (lm): the amount of energy an observer perceives
from a light source.
 Brightness: a subjective descriptor that is impossible to measure
in practice.
‰ Example: light emitted from a source operating in the far
infrared region.
 Radiance: significant!
 Luminance: hardly perceived!
‰ The cones are responsible for color vision.
 65% are sensitive to red light.
 33% are sensitive to green light.
 2% are sensitive to blue light (blue cones are the most sensitive)

236
VISIBLE SPECTRUM

237
ABSORPTION OF LIGHT BY THE CONES

No single color may


be called red, green,
or blue.

238
PRIMARY AND SECONDARY COLORS OF LIGHT

Primary colors are


mixed to produce
secondary colors of
light (magenta, cyan,
yellow).

Secondary colors are


mixed to produce
primary colors of light
(red, green, blue).

239
COLOR TV

A color TV screen differs from a black-and-white


screen in three ways:

There are three electron beams that move


simultaneously across the screen. They are named
the red, green and blue beams.

The screen is not coated with a single sheet of


phosphor as in a black-and-white TV. Instead, the
screen is coated with red, green and blue
phosphors arranged in dots or stripes. If you turn
on your TV or computer monitor, and look closely
at the screen with a magnifying glass, you will be
able to see the dots or stripes.

On the inside of the tube, very close to the


phosphor coating, there is a thin metal screen
called a shadow mask. This mask is perforated with
very small holes that are aligned with the phosphor
dots (or stripes) on the screen.
When a color TV needs to create a red dot, it fires
the red beam at the red phosphor. Similarly for
green and blue dots.

To create a white dot, red, green and blue beams


are fired simultaneously -- the three colors mix
together to create white.

To create a black dot, all three beams are turned


off as they scan past the dot. All other colors on a
TV screen are combinations of red, green and
blue.

240
CIE CHROMATICITY DIAGRAM

A straight line
segment joining
any 2 points
defines all the
different color
variations that can
Any point within be obtained by
the diagram mixing these 2
colors additively.
represents some
mixture of
spectrum colors.

Points on the
boundary are
pure colors in
the visible Any color in the
spectrum. triangle can be
produced by
various
combinations of
the corner colors.

241
COLOR MODELS

‰ A color model is a specification of colors in a standard


way.
 A coordinate system and a subspace where each color is
represented by a single point.
‰ Most color models are oriented toward hardware or
applications.
‰ Hardware oriented models for DIP
 RGB (red, green, blue) model for color monitors and color
video cameras
 CMY (cyan, magenta, yellow) model for color printing
 CMYK (cyan, magenta, yellow, black) model for color
printing
 HSI (hue, saturation, intensity) decouples the color and
gray-scale information in an image.

242
RGB MODEL

Images represented
with the RGB color
model have 3
component images:

Red component
Green component
Blue component It is assumed that all color
values are normalized.
If 8 bits are used for
each pixel, we have Each color is represented by a
a 24-bit RGB image. point in or on the unit cube.

243
RGB 24-BIT COLOR CUBE

(28)3 = 16,777,216 colors

244
COLOR PLANES

A color image is
acquired using the
process in reverse
order.

R = 127
G = 0-255
B = 0-255

245
SAFE RGB COLORS

246
RGB SAFE-COLOR CUBE

Each surface has 36 colors: 6x36 = 216

247
CMY & CMYK COLOR MODELS

Secondary colors of light : Cyan, magenta, yellow


(primary colors of pigments)

Most devices (color printers, copiers, etc.) that deposit color pigments on
paper require CMY data input or perform an internal RGB to CMY
conversion.

RGB to CMY conversion (all color values are in the range [0,1]):

  [C]   [1]   [R]    light reflected from a surface coated with pure cyan does not contain red
  [M] = [1] − [G]    light reflected from a surface coated with pure magenta does not contain green
  [Y]   [1]   [B]    light reflected from a surface coated with pure yellow does not contain blue

Equal amounts of cyan, magenta, and yellow should produce black.


In practice, this combination produces muddy-looking black.
In order to produce true black, a fourth color is added to the model.

CMYK color model: cyan, magenta, yellow, and black

248
HSI COLOR MODEL

‰ RGB and CMY are suitable for hardware implementations.


‰ Unfortunately, they are not good for describing colors for
human interpretation.
‰ One does not refer to the color of a car by giving the % of
each of the primaries!
‰ Humans describe a color object by its hue, saturation, and
brightness.
 Hue: a color attribute that describes a pure color.
 Saturation: gives a measure of the degree to which pure color
is diluted by white light.
 Brightness: a subjective descriptor that is practically
impossible to measure.
‰ The HSI model decouples the intensity component from the
color-carrying information (hue & saturation).

249
CONCEPTUAL RELATIONSHIP BETWEEN RGB
AND HSI COLOR MODELS

In this plane segment, all the


points have the same hue but
different saturation and intensity.
This line is the (1,1,1)
intensity axis
joining black
and white
vertices.

The intensity
component of
this color As the planes moves up
point can be and down, the boundaries
determined defined by the
intersection of each plane
by passing a with the faces of the cube
plane (0,0,0) have either a triangular or
perpendicular hexagonal shape.
to the axis
and
containing
the point. Conclusion: H, S, and I values required to form the HSI
space can be obtained from the RGB cube.

250
HUE AND SATURATION IN THE HSI MODEL

View obtained by
looking at the
RGB cube down
its gray-scale It is not unusual to see
axis. HSI planes defined in
terms of a hexagon, a
triangle, or even a circle.

251
TRIANGULAR AND CIRCULAR COLOR PLANES
IN THE HSI MODEL

mid-point of the
vertical intensity axis

252
CONVERTING COLORS FROM RGB TO HSI

The RGB values have been normalized to the range [0,1].

H = θ if B ≤ G
    360 − θ if B > G

with θ = cos⁻¹{ ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^{1/2} }

(H can be normalized to the range [0,1] by dividing all the values by 360.)

S = 1 − 3 min(R, G, B) / (R + G + B)

I = (R + G + B) / 3
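A vectorized sketch of these formulas (rgb_to_hsi is our own name; the clip and epsilon guards are numerical assumptions):

  import numpy as np

  def rgb_to_hsi(r, g, b):
      """RGB components in [0,1] to HSI; H is returned in degrees [0, 360)."""
      num = 0.5 * ((r - g) + (r - b))
      den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
      theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
      h = np.where(b <= g, theta, 360.0 - theta)
      s = 1.0 - 3.0 * np.minimum(np.minimum(r, g), b) / (r + g + b + 1e-12)
      i = (r + g + b) / 3.0
      return h, s, i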

253
CONVERTING COLORS FROM HSI TO RGB

The HSI values are given in the interval [0,1].

The applicable equations depend on the values of H.

RG sector (0° ≤ H < 120°):
  B = I(1 − S)    R = I[1 + S cos H / cos(60° − H)]    G = 3I − (R + B)

GB sector (120° ≤ H < 240°): first subtract 120° from H.
  R = I(1 − S)    G = I[1 + S cos H / cos(60° − H)]    B = 3I − (R + G)

BR sector (240° ≤ H ≤ 360°): first subtract 240° from H.
  G = I(1 − S)    B = I[1 + S cos H / cos(60° − H)]    R = 3I − (G + B)
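A per-pixel sketch of the sector formulas (hsi_to_rgb is our own name; H is assumed in degrees):

  import math

  def hsi_to_rgb(h, s, i):
      """HSI to RGB for one pixel, following the three-sector formulas."""
      if h < 120:                         # RG sector
          b = i * (1 - s)
          r = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          g = 3 * i - (r + b)
      elif h < 240:                       # GB sector
          h -= 120
          r = i * (1 - s)
          g = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          b = 3 * i - (r + g)
      else:                               # BR sector
          h -= 240
          g = i * (1 - s)
          b = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          r = 3 * i - (g + b)
      return r, g, b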

254
PSEUDO-COLOR IMAGE PROCESSING

‰ Pseudo-color image processing: assignment of colors


to gray values based on a specified criterion.
‰ The principal use of pseudo-color is for human
visualization and interpretation of gray-scale events in
images.
‰ A principal motivation for using color is that humans can
discern thousands of color shades and intensities!
‰ Intensity slicing and color coding is a simple example of
pseudo-color image processing.
 The image is a 3-D function.
 Planes parallel to the coordinate plane are used.
‰ More general transformations achieve a wider range of
pseudo-color enhancement results.

255
INTENSITY SLICING

A different color is
assigned to each side of
the plane.

Algorithm:

[0, L−1]: gray scale; l0: black, lL−1: white

P planes: l1, l2, …, lP

The P planes partition the gray scale into P+1 intervals: V1, V2, …, VP+1

Color assignment: f(x,y) = ck if f(x,y) ∈ Vk
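A two-line sketch of intensity slicing with NumPy (intensity_slice is our own name; levels are the P slicing planes, colors an array of P+1 RGB triplets):

  import numpy as np

  def intensity_slice(f, levels, colors):
      """Assign color c_k to gray levels falling in interval V_k."""
      idx = np.digitize(f, levels)        # interval index for each pixel
      return np.asarray(colors)[idx]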

256
INTENSITY SLICING INTO 8 COLORS

A different color is assigned to each


region without regard for the
meaning of the gray levels in the
image.

257
INTENSITY SLICING INTO 2 COLORS

When there is a porosity or crack


in a weld, the full strength of the
X-rays going through the object
saturates the sensor on the other
side of the object.

Hence, gray levels of value 255 in


an 8-bit image coming from such
a system automatically imply a
problem with the weld!

A simple color coding:


One color to level 255.
Another color to all other gray
levels.

If a human is the ultimate


judge in inspecting welds,
this simple color coding
would result in lower error
rates!

258
INTENSITY SLICING INTO MULTIPLE COLORS

Average monthly
rainfall over a
period of 3 years

Much
easier to
interpret

259
3 INDEPENDENT COLOR TRANSFORMATIONS

260
PSEUDO-COLOR ENHANCEMENT: AN EXAMPLE

Why do we have the same Why do we have the same


color for the explosives and color for the explosives and
background? garment bag?

261
COMBINATION OF SEVERAL MONOCHROME
IMAGES INTO A SINGLE COLOR COMPOSITE

262
3 MONOCHROME IMAGES ARE COMBINED

visible red
visible green

visible blue near infrared

First 3 images
are combined Red component was
into an RGB replaced with the
image infrared image

Note that the infrared band


is strongly responsive to
the biomass components in
a scene

263
COMBINING IMAGES FROM A SPACECRAFT

One way to
combine the
sensed image
data is by how
they show This image was
differences in obtained by combining
surface chemical several of the sensor
composition. images from the Galileo
spacecraft.

Bright red
depicts material An analysis of
newly ejected individual images
from an active would not convey
volcano. similar information.

Surrounding
yellow materials
are older sulfur
deposits.

264
BASICS OF FULL-COLOR IMAGE PROCESSING

Two major categories of full-


color IP approaches
Example:
Each component image is
processed individually, and a Suppose the process is neighborhood
composite color image is averaging.
formed from the components.
Same result would be obtained using the
Color pixels (which are vectors) scalar and vector methods.
are directly processed.

Two conditions have to be


satisfied for per-color-
component and vector-based
processing to be equivalent.

The process has to be


applicable to both scalars and
vectors.

The operation on each


component of a vector must be
independent of the other
components.
265
COLOR TRANSFORMATIONS - FORMULATION

‰ g(x,y) = T[f(x,y)]: pixels values are triplets or quartets.


‰ si = Ti(r1,r2,…,rn):
 si and ri: variables denoting the color components
 {T1,T2,…,Tn}: set of transformation functions
 For RGB color space, n=3.
 For CMYK color space, n=4.
‰ Example:
 g(x,y)=kf(x,y), 0<k<1: intensity modification
 Any color space can be used.
 HSI color space: s3=kr3, s1=r1, s2=r2
 RGB color space: si=kri, i=1,2,3
 CMY color space: si=kri+(1-k), i=1,2,3
 In this case, although the HSI transformation involves the
fewest # of operations, the computations required to convert
an RGB or CMY(K) image more than offset the advantage!

266
COLOR-SPACE COMPONENTS OF A
FULL-COLOR IMAGE

CMYK

RGB

HSI

267
MODIFIED INTENSITY OF THE FULL-COLOR
IMAGE

Labels (H,S) and I


should be switched.

268
COLOR COMPLEMENTS

The hues directly opposite


one another on the color
circle are called
complements.

Complements are
analogous to gray-scale
negatives: they are useful
in enhancing detail
embedded in dark region of
a color image.

The RGB complement transformation functions


used here do not have a straightforward HSI color
space equivalent.

269
COLOR SLICING

Highlighting a specific range of colors in an image is useful for


separating objects from their surroundings.

The most straightforward approach is to extend the gray-level slicing


techniques.

si = Ti(r1,r2,…,rn): si is a function of all ri.

Colors of interest are enclosed by a cube:

  si = 0.5 if |rj − aj| > W/2 for any 1 ≤ j ≤ n
       ri  otherwise
  (i = 1, 2, …, n)

Colors of interest are enclosed by a sphere:

  si = 0.5 if ∑_{j=1}^{n} (rj − aj)² > R0²
       ri  otherwise
  (i = 1, 2, …, n)

The width of the cube and the radius of the sphere were determined
interactively.
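A sketch of the cube case for a normalized RGB image (color_slice_cube is our own name; the 0.5 fill is the neutral gray from the formulas above):

  import numpy as np

  def color_slice_cube(img, a, W):
      """Keep colors inside a cube of width W centered at prototype a."""
      outside = np.any(np.abs(img - np.asarray(a)) > W / 2, axis=-1)
      out = img.copy()
      out[outside] = 0.5                  # neutral gray outside the cube
      return out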
270
AN EXAMPLE OF COLOR SLICING

Edible parts of the strawberries are separated!

In each of the 2 cases below, a prototype red with RGB color


coordinate (0.6863,0.1608,0.1922) was selected from the most
prominent strawberry.

Reds within an Reds within an


RGB cube RGB sphere

271
HISTOGRAM PROCESSING

It is generally unwise to
histogram equalize the
components of a color
image independently.
This results in erroneous
color.

A more logical approach


is to spread the color
intensities uniformly,
leaving the colors
themselves (e.g., hues)
unchanged.

The HSI color space is


ideally suited to this
approach.

The intensity The saturation component is


component is increased after histogram
histogram equalized. equalization
272
COLOR IMAGE SMOOTHING

1
c ( x, y ) =
K
∑ c ( x, y )
( x , y )∈S xy

1
K
∑ R ( x, y )
( x , y )∈S xy
Smoothing by
neighborhood
1 averaging can be
c ( x, y ) = K
∑ G ( x, y ) carried out using
( x , y )∈S xy either individual color
planes or the RGB
1 color vectors.

K
∑ B ( x, y )
( x , y )∈S xy

273
AN RGB IMAGE AND ITS COLOR PLANES

274
HSI PLANES

275
SMOOTHED IMAGES

By smoothing only the intensity plane,


the pixels in the smoothed image
maintain their original hue and
saturation – and their original color!

276
COLOR IMAGE SHARPENING

∇ 2 R ( x, y )
The Laplacian of a full-
color image can be
∇ 2 [c( x, y )] = ∇ 2 G ( x, y ) obtained by computing
the Laplacian of each
component plane
separately.
∇ 2 B ( x, y )
The Laplacian

277
SHARPENED IMAGES

Obtained by combining the Laplacian of the


intensity plane with the unmodified hue and
saturation planes.

278
COLOR SEGMENTATION IN HSI COLOR SPACE

Assume we want to segment an


image based on color, and to
carry out the process on
individual planes.

HSI space:

Hue plane conveniently


represents the color.

Saturation plane is used as a


masking image.

Intensity plane is seldom used


because it carries no color
information. The product
image is
thresholded
The binary mask is generated by with T = 0.9
thresholding the saturation plane
with T = 0.1x(maximum value in
the saturation plane).

279
COLOR SEGMENTATION IN RGB COLOR SPACE

RGB color vectors generally result in better segmentation results.

A set of sample points representative of the colors of interest is given.

Obtain an estimate of the average color that will be segmented.

Classify each pixel in the given image according to the specified range.

One measure of similarity is the Euclidean distance:

D(z,a) = ||z − a|| = [(z − a)ᵀ(z − a)]^{1/2} = [(zR − aR)² + (zG − aG)² + (zB − aB)²]^{1/2}

Segment the pixels satisfying D(z,a) ≤ D0. A generalization is D(z,a) = [(z − a)ᵀ C⁻¹ (z − a)]^{1/2} ≤ D0, with C the covariance matrix of the samples; the enclosing box of the next slide is computationally much simpler.
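A one-line sketch of the Euclidean-distance test (segment_by_color is our own name; a is the mean color vector):

  import numpy as np

  def segment_by_color(img, a, D0):
      """Mask of pixels whose Euclidean distance to the mean color a is <= D0."""
      return np.linalg.norm(img - np.asarray(a), axis=-1) <= D0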

280
AN EXAMPLE OF COLOR SEGMENTATION
IN RGB COLOR SPACE

Compute the mean vector a


using the points contained
within the rectangle.

Center the box at a.

Compute the standard


deviation of the R,G,B values
of the sample points.

Determine the dimension


along each axis by
computing 1.25xSD.

σR: SD of red components

Dimension along the R-axis:

(aR – 1.25 σR) to (aR + 1.25σR)

281
COLOR EDGE DETECTION

The gradient introduced in Chapter 3 is not defined for vector quantities.

So, computing the gradient on individual planes, and then using the
results to form a color image will lead to erroneous results.

We need to define the gradient for c(x,y) = [R(x,y) G(x,y) B(x,y)]T

A method has been proposed by Di Zenzo in 1986:

Direction of max rate of change:
  θ = ½ tan⁻¹[ 2gxy / (gxx − gyy) ]

Value of the rate of change:
  F(θ) = { ½[ (gxx + gyy) + (gxx − gyy) cos 2θ + 2gxy sin 2θ ] }^{1/2}
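A sketch of the Di Zenzo gradient for an RGB image (color_gradient is our own name; gxx, gyy, gxy are the inner products of the per-channel partials, and arctan2 replaces tan⁻¹ for numerical robustness):

  import numpy as np

  def color_gradient(img):
      """Di Zenzo color gradient: direction theta and magnitude F(theta)."""
      dy = np.gradient(img, axis=0)       # per-channel partial derivatives
      dx = np.gradient(img, axis=1)
      gxx = np.sum(dx * dx, axis=-1)
      gyy = np.sum(dy * dy, axis=-1)
      gxy = np.sum(dx * dy, axis=-1)
      theta = 0.5 * np.arctan2(2 * gxy, gxx - gyy)
      F = np.sqrt(np.maximum(0.5 * ((gxx + gyy) + (gxx - gyy) * np.cos(2 * theta)
                                    + 2 * gxy * np.sin(2 * theta)), 0.0))
      return theta, F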

282
AN EXAMPLE OF COLOR EDGE DETECTION
USING 2 APPROACHES

The edge
detail is
more
complete!

Both
approaches
yielded
reasonable
results. Is the
extra detail
worth the added
computational
burden of the
vector
approach?

283
COMPONENT GRADIENT IMAGES

284
NOISE IN COLOR IMAGES

The noise models discussed in Chapter 5 are applicable to color


images.

Gaussian noise
Rayleigh noise
Erlang noise
Exponential noise
Uniform noise
Impulse (salt and pepper) noise

Usually, the noise content of a color image has the same


characteristics in each color channel.

However, it is possible for color channels to be affected differently


by noise.

285
GAUSSIAN NOISE IN A COLOR IMAGE

Fine grain noise tends


to be less noticeable in
color images.

286
NOISY RGB IMAGE CONVERTED TO HSI

The degradation in hue and saturation planes is due to the


nonlinearity of cos and min operations!

The intensity
plane is
slightly
smoother than
any of the 3
RGB noisy
planes.
1
I = ( R + G + B)
3
Compare

Image
averaging
reduces
random noise!

287
ONE NOISY RGB CHANNEL
AFFECTS ALL HSI PLANES

Salt & pepper noise

Probability of salt: 0.05


Probability of pepper: 0.05

The noise spreads


from the green
channel to all the HSI
planes.

288
EXCEPTIONS FOR COLOR VECTOR
PROCESSING

Filtering of full-color images can be carried out in 2 ways:

1. Per image basis


2. Color vector space

Example: Noise reduction with an averaging filter.

Other filters cannot be formulated in this manner.

Example: The class of order-statistics filters (which are non-linear)

How do we implement a median filter in color vector space?


How do we find a scheme for ordering vectors so that the median makes sense?
The process is considerably more complex when dealing with vectors!

The Plataniotis & Venetsanopoulos book is a good reference on:

• Vector ordering
• Some of the filters based on the ordering concept

289
COLOR IMAGE COMPRESSION

Compression reduces
the amount of data
required to represent a
digital image.

The data that are the


object of any
compression are the
components of each
color pixel.

Compressed with
JPEG 2000.

The compressed
image contains
only 1 data bit for
every 230 bits of
data in the original
image.

290
CHAPTER 7: IMAGE COMPRESSION

‰ Every day, an enormous amount of digital information is


stored, processed and transmitted.
‰ The entire catalog of the Library of Congress (the
world’s largest library) is available electronically.
‰ The storage and communications requirements for
information storage are immense!
‰ Image compression addresses the problem of reducing
the amount of data required to represent a digital image.
‰ Two broad categories of compression techniques:
 Lossless techniques
„ Useful in image archiving (e.g., storage of legal or medical
records)
„ No information is lost.
 Lossy techniques
„ Useful in applications such as broadcast TV,
videoconferencing, and facsimile transmission.
„ A certain amount of information loss is acceptable.

291
MULTIMEDIA STORAGE REQUIREMENTS
(WINDOW OF 640X480 PIXELS)

‰ Text
 2 bytes for every 8x8 pixel character
 # of characters/page = (640x480)/(8x8) = 4,800
 Storage/screen page = 4,800x2 = 9,600 bytes = 9.4 KB
‰ Vector images
 A typical image with 500 lines
 Each line is defined by its coordinates in x & y directions,
and by an 8-bit attribute field.
 Coordinates in the x direction require 10 bits: log2(640).
 Coordinates in the y direction require 9 bits: log2(480).
 Bits/line = 9 + 10 + 9 + 10 + 8 = 46 bits.
 Storage/screen page = 500x46/8 = 2,875 bytes = 2.8 KB
‰ Bit-mapped images
 256 different colors
 Storage/screen page = 640x480x1 = 307,200 bytes = 300 KB

292
MULTIMEDIA STORAGE REQUIREMENTS
(WINDOW OF 640X480 PIXELS)

‰ Speech of telephone quality


 Sampled at 8 kHz, and quantized using 8 bits/sample => 64 Kbits/s
 Storage/second = 64 Kbits / 8 = 8 KB
‰ Stereo audio of CD quality
 Sampled at 44.1 kHz, and quantized using 16 bits/sample.
 Storage/second = (2x44,100x16)/8 = 176,400 bytes = 172 KB
‰ Video
 25 frames/second
 1 luminance layer, 2 chrominance layers
 PAL standard: 625 lines x 833 pixels
 CCIR 601 (studio standard)
„ Luminance (Y) sampled at 13.5 MHz
„ Chrominance (R-Y & B-Y) sampled at 6.75 MHz
 Storage/second = 640x480x25x3 = 23,040,000 bytes = 22,500 KB

293
JPEG

‰ Joint Photographic Experts Group (JPEG)


‰ A standardized lossy image compression algorithm
‰ Based on DCT (Discrete Cosine Transform)
‰ Designed for compressing either full-color or gray-scale
images of natural, real-world scenes.
‰ GIF compression
 Best for images with flat areas of color (e.g., logos, line
drawings, simple cartoons).
 Lossless
 Based on LZW (Lempel-Ziv-Welsh)
‰ Each color layer is handled separately.
‰ The resolution of individual layers may be different.
‰ Video color models: YUV, YIQ, YCbCr
‰ Chroma subsampling: 4:4:4, 4:2:2, 4:2:0

294
FOUR MODES OF OPERATIONS

‰ Sequential baseline
 A simple and efficient algorithm.
 Adequate for most applications.
 The image is scanned in a raster scan fashion l-to-r/t-to-b.
‰ Progressive
 The image is encoded in multiple scans at the same spatial
resolution.
‰ Hierarchical
 The image is encoded at multiple spatial resolutions.
 Lower resolution images may be displayed without having to
decompress the image at a higher spatial resolution.
 Can be implemented using sequential, progressive, or lossless
modes.
‰ Lossless
 The image is encoded to guarantee exact recovery of every sample
value.
 Compression efficiency is inherently lower than those of lossy
methods.
295
SEQUENTIAL BASELINE

DCT Quantization
Color
components
8x8
blocks

Quantization
tables
Coding
tables

Header
Tables
DPCM
Entropy
coding
Data RLC

296
DCT & INVERSE DCT

DCT:

  F(u,v) = (2/N) C(u) C(v) ∑_{i=0}^{N−1} ∑_{j=0}^{N−1} f(i,j) cos[(2i+1)uπ / 2N] cos[(2j+1)vπ / 2N]

Inverse DCT:

  f̃(i,j) = (2/N) ∑_{u=0}^{N−1} ∑_{v=0}^{N−1} C(u) C(v) F(u,v) cos[(2i+1)uπ / 2N] cos[(2j+1)vπ / 2N]

where C(x) = 1/√2 if x = 0, and C(x) = 1 if x > 0. (The 2/N prefactor makes the pair orthonormal for this choice of C.)
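SciPy's orthonormal 2-D DCT matches this convention; a sketch of the round trip on a random 8x8 block (the level-shift value 128 follows the case study below):

  import numpy as np
  from scipy.fft import dctn, idctn

  block = np.random.randint(0, 256, (8, 8)).astype(float)
  F = dctn(block - 128, norm='ortho')      # level shift, then forward 2-D DCT
  rec = idctn(F, norm='ortho') + 128       # inverse DCT restores the block
  assert np.allclose(rec, block)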

297
A CASE STUDY: TEST IMAGE

298
A CASE STUDY: AN 8X8 BLOCK

52 55 61 66 70 61 64 73

63 59 66 90 109 85 69 72

62 59 68 113 144 104 66 73

63 58 71 122 154 106 70 69

67 61 68 104 126 88 68 70

79 65 60 70 77 68 58 75

85 71 64 59 55 61 65 83

87 79 69 68 65 76 78 94

299
A CASE STUDY: LEVEL SHIFTING

The quantity 2^{n−1} is subtracted from each pixel value (n = 8 ⇒ 2^{n−1} = 128).

-76 -73 -67 -62 -58 -67 -64 -55
-65 -69 -62 -38 -19 -43 -59 -56
-66 -69 -60 -15  16 -24 -62 -55
-65 -70 -57  -6  26 -22 -58 -59
-61 -67 -60 -24  -2 -40 -60 -58
-49 -63 -68 -58 -51 -65 -70 -53
-43 -57 -64 -69 -73 -67 -63 -45
-41 -49 -59 -60 -63 -52 -50 -34

300
A CASE STUDY: APPLICATION OF DCT

-415 -29 -62 25 55 -20 -1 3

7 -21 -62 9 11 -7 -6 6

-46 8 77 -25 -30 10 7 -5

-50 13 35 -15 -9 6 0 3

11 -8 -13 -2 -1 1 -4 1

-10 1 3 -3 -1 0 2 -1

-4 -1 2 -1 2 -3 1 -2

-1 -1 -1 -2 -1 -1 0 -1

301
A CASE STUDY: NORMALIZATION MATRIX

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

Z (u , v) 18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

302
A CASE STUDY: QUANTIZATION

DCT coefficients are quantized using the formula
  T̂(u,v) = round[ T(u,v) / Z(u,v) ]

-26  -3  -6   2   2   0   0   0
  1  -2  -4   0   0   0   0   0
 -3   1   5  -1  -1   0   0   0
 -4   1   2  -1   0   0   0   0
  1   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

(38 consecutive zeros in the zigzag ordering!)
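The quantize/denormalize pair is two lines of NumPy; the matrices below are stand-ins for the case-study DCT block T and normalization matrix Z above:

  import numpy as np

  Z = np.full((8, 8), 16.0)            # stand-in for the normalization matrix
  T = np.random.randn(8, 8) * 100      # stand-in for a DCT coefficient block
  T_hat = np.round(T / Z)              # encoder: quantization (lossy step)
  T_approx = T_hat * Z                 # decoder: denormalization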

303
A CASE STUDY: ZIGZAG REORDERING

[-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]

a special EOB
Huffman code word
indicates that the
remainder of the
coefficients are
zeros.

304
A CASE STUDY: PREPARATION FOR
ENTROPY CODING

‰ Entropy coding is lossless.


‰ DC and AC coefficients are treated differently.
‰ Differential Pulse Code Modulation (DPCM) on DC coefficients
 Each DPCM-coded DC coefficient is represented by a pair of
symbols: (CATEGORY, AMPLITUDE)
 CATEGORY: indicates the # of bits needed to represent the
coefficient.
 AMPLITUDE: contains the actual bits.
 The 1’s complement notation is used for negative numbers.
‰ Run-length coding (RLC) on AC coefficients
 RLC replaces each AC coefficient by a pair (RUNLENGTH, VALUE)
 RUNLENGTH: indicates the # of zeros in the run.
 VALUE: the next nonzero coefficient.
 The special pair (0,0) indicates the EOB after the last nonzero AC
coefficient.
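The CATEGORY symbol is just the bit length of the magnitude; a sketch (category is our own name):

  def category(value):
      """JPEG coding category: number of bits needed to represent |value|."""
      return 0 if value == 0 else int(abs(value)).bit_length()

  assert category(-9) == 4 and category(-3) == 2   # matches the case study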

305
A CASE STUDY: DPCM ON DC COEFFICIENTS

‰ Each 8x8 block has only one DC coefficient


‰ The DC coefficient is unlikely to change drastically within
a short distance.
‰ DPCM is an ideal scheme for coding the DC coefficients
‰ d0 = DC0, di = DCi+1 - DCi
‰ Coding of the first difference
 -26 (-17) = -9
 From Table 8.17, -9 lies in category 4
 For category K, additional K bits are needed.
 Positive difference: K LSBs
 Negative difference: K LSBs – 1 => 0111 – 1 = 0110
 DPCM coded DC code word = 1010110 (category base code 101 followed by amplitude 0110)

306
A CASE STUDY: RLC ON AC COEFFICIENTS

‰ Each 8x8 block has 63 AC coefficients.


‰ Most image blocks tend to have small high frequency
coefficients.
‰ These small coefficients are zeroed out by quantization.
‰ Coding of the first AC coefficient
 (RUNLENGTH, VALUE) = (0,-3)
 From Table 8.17, -3 lies in category 2
 From Table 8.19, (0/2) maps to 01
 For category K, additional K bits are needed.
 Positive difference: K LSBs
 Negative difference: K LSBs – 1 => 01-1 = 00
 RLC coded AC code word = 0100

307
A CASE STUDY: CODING CATEGORIES

308
A CASE STUDY: DC BASE CODES

309
A CASE STUDY: AC BASE CODES

310
A CASE STUDY: AC BASE CODES

311
A CASE STUDY: COMPLETELY CODED ARRAY

1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001

[-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]

001 100101 11100110 110110 0110 11110100 000 1010

# of bits needed to store the 8x8 block = 64x8 = 512

# of bits after JPEG compression = 92

Compression ratio = 512/92 => 5.6:1

312
A CASE STUDY: DECOMPRESSION BEGINS

-26 -3 -6 2 2 0 0 0

1 -2 -4 0 0 0 0 0

-3 1 5 -1 -1 0 0 0

-4 1 2 -1 0 0 0 0

1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

313
A CASE STUDY: DENORMALIZATION

-416 -33 -60 32 48 0 0 0

12 -24 -56 0 0 0 0 0

-42 13 80 -24 -40 0 0 0

-56 17 44 -29 0 0 0 0

18 0 0 0 0 0 0 0

T̃(u,v) = T̂(u,v) Z(u,v)     0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

314
A CASE STUDY: INVERSE DCT

-70 -64 -61 -64 -69 -66 -58 -50

-72 -73 -61 -39 -30 -40 -54 -59

-68 -78 -58 -9 13 -12 -48 -64

-59 -77 -57 0 22 -13 -51 -60

-54 -75 -64 -23 -13 -44 -63 -56

-52 -71 -72 -54 -54 -71 -71 -54

-45 -59 -70 -68 -67 -67 -61 -50

-35 -47 -61 -66 -60 -48 -44 -44

315
A CASE STUDY: LEVEL SHIFTING

The quantity 2^{n−1} is added to each pixel value (n = 8 ⇒ 2^{n−1} = 128).

58  64  67  64  59  62  70  78
56  55  67  89  98  88  74  69
60  50  70 119 141 116  80  64
69  51  71 128 149 115  77  68
74  53  64 105 115  84  65  72
76  57  56  74  75  57  57  74
83  69  59  60  61  61  67  78
93  81  67  62  69  80  84  84

316
A CASE STUDY: DIFFERENCE

52 55 61 66 70 61 64 73 58 64 67 64 59 62 70 78

63 59 66 90 109 85 69 72 56 55 67 89 98 88 74 69

62 59 68 113 144 104 66 73 60 50 70 119 141 116 80 64

63 58 71 122 154 106 70 69 69 51 71 128 149 115 77 68

67 61 68 104 126 88 68 70 74 53 64 105 115 84 65 72

79 65 60 70 77 68 58 75 76 57 56 74 75 57 57 74

85 71 64 59 55 61 65 83 83 69 59 60 61 61 67 78

87 79 69 68 65 76 78 94 93 81 67 62 69 80 84 84

-6 -9 -6 2 11 -1 -6 -5

7 4 -1 1 11 -3 -5 3

2 9 -2 -6 -3 -12 -14 9

-6 7 0 -4 -5 -9 -7 1

-7 8 4 -1 11 4 3 -2

3 8 4 -4 2 11 1 1

2 2 5 -1 -6 0 -2 5

-6 -2 2 6 -4 -4 -6 10

317
JPEG 2000

‰ Major limitation of JPEG: coding performance degrades


rapidly at low bit rates (especially below 0.25 bpp).
‰ The latest series of standards from the JPEG committee.
‰ Designed to address a variety of applications including
color facsimile, printing, scanning, digital photography,
remote sensing, and medical imagery.
‰ Based on the Discrete Wavelet Transform (DWT).
‰ Addresses a number of problems including:
 Low bit rate compression
 Lossless and lossy compression
 Computer-generated imagery
 Error resilience for transmission in noisy environments
‰ Core compression algorithm: EBCOT (Embedded Block
Coding with Optimized Truncation)

318
JPEG 2000: BLOCK DIAGRAM

Image Compressed
data data

Entropy
DWT Quantization
coding

Tiling DWT on each tile

DC level
shifting

319
1-DIM DISCRETE WAVELET TRANSFORM (DWT)

Wφ(j0, k) = (1/√M) ∑_x f(x) φ_{j0,k}(x)   : approximation coefficients

Wψ(j, k) = (1/√M) ∑_x f(x) ψ_{j,k}(x), j ≥ j0   : detail coefficients

f(x) = (1/√M) ∑_k Wφ(j0, k) φ_{j0,k}(x) + (1/√M) ∑_{j=j0} ∑_k Wψ(j, k) ψ_{j,k}(x)

f(x), φ_{j0,k}(x), and ψ_{j,k}(x) are functions of the discrete variable x = 0, 1, 2, …, M−1.

Normally, j0 = 0, and M is a power of 2 (M = 2^J).

Summations are performed over x = 0, 1, 2, …, M−1; j = 0, 1, 2, …, J−1; and k = 0, 1, …, 2^j − 1.
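A round-trip sketch using the PyWavelets package (the Haar wavelet and signal here are arbitrary illustrations):

  import numpy as np
  import pywt  # PyWavelets

  f = np.arange(8, dtype=float)
  cA, cD = pywt.dwt(f, 'haar')       # approximation and detail coefficients
  rec = pywt.idwt(cA, cD, 'haar')    # perfect reconstruction
  assert np.allclose(rec, f)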

320
1-DIM FAST WAVELET TRANSFORM (FWT)

Computationally efficient implementation of DWT.

321
2-DIM DISCRETE WAVELET TRANSFORM (DWT)

ϕ ( x, y ) : 2 - dim scaling function

ψ H ( x, y ), ψ V ( x, y ), ψ D ( x, y ) : 2 - dim wavelets

ϕ ( x, y ) = ϕ ( x)ϕ ( y ) : separable scaling function

ψ H ( x, y ) = ψ ( x)ϕ ( y ) : measures variations along columns

ψ V ( x, y ) = ϕ ( x)ψ ( y ) : measures variations along rows

ψ ( x, y ) = ψ ( x)ψ ( y )
D
: measures variations along diagonals

322
2-DIM DISCRETE WAVELET TRANSFORM (DWT)

φ_{j,m,n}(x,y) = 2^{j/2} φ(2^j x − m, 2^j y − n)
ψ^i_{j,m,n}(x,y) = 2^{j/2} ψ^i(2^j x − m, 2^j y − n),   i = {H, V, D}

Wφ(j0, m, n) = (1/√(MN)) ∑_{x=0}^{M−1} ∑_{y=0}^{N−1} f(x,y) φ_{j0,m,n}(x,y)

W^i_ψ(j, m, n) = (1/√(MN)) ∑_{x=0}^{M−1} ∑_{y=0}^{N−1} f(x,y) ψ^i_{j,m,n}(x,y),   i = {H, V, D}

f(x,y) = (1/√(MN)) ∑_m ∑_n Wφ(j0, m, n) φ_{j0,m,n}(x,y)
       + (1/√(MN)) ∑_{i=H,V,D} ∑_{j=j0} ∑_m ∑_n W^i_ψ(j, m, n) ψ^i_{j,m,n}(x,y)

Normally, j0 = 0, and M = N = 2^J.

Summations are performed over x = 0, 1, 2, …, M−1; j = 0, 1, 2, …, J−1; and m, n = 0, 1, …, 2^j − 1.
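The 2-D counterpart in PyWavelets yields exactly the approximation plus H/V/D detail subbands described above (the image here is a random stand-in):

  import numpy as np
  import pywt

  img = np.random.rand(128, 128)
  cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')      # one decomposition level
  coeffs = pywt.wavedec2(img, 'haar', level=3)   # 3-level decomposition (cf. the FWT example)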

323
2-DIM FAST WAVELET TRANSFORM (FWT)

1st and 2nd level decomposition

324
A FWT EXAMPLE

128x128
computer-
generated 1st level
image decomposition

2nd level 3rd level


decomposition decomposition

325
CHAPTER 8: MORPHOLOGICAL IMAGE PROCESSING

‰ Morphology: a branch of biology that deals with the form


and structure of animals and plants.
‰ Mathematical morphology: extraction of image
components that are useful in the representation and
description of regional shape.
 Boundaries
 Skeletons
 Convex hull
‰ The language of mathematical morphology is set theory.
‰ Sets in mathematical morphology represent objects in an
image.
‰ In binary images, the sets are members of Z2.
‰ Until this chapter, the I/O of DIP methods were images.
‰ Now, we have a transition to DIP methods where I are
images but O are attributes extracted from those images.

326

BASIC CONCEPTS FROM SET THEORY

‰ A ∈ Z2.
‰ a = (a1, a2) ∈ A.
‰ Empty set: the set with no elements.
‰ w = (w1,w2) & C = {w|w = -d, d ∈ D}: C is the set of elements w
s.t. w is formed by multiplying each of the 2 coordinates of all
the elements of D by -1.
‰ A ⊆ B: A is a subset of B.
‰ C = A U B: C is the union of A and B.
‰ D = A ∩ B: D is the intersection of A and B.
‰ A ∩ B = ∅: A and B are disjoint sets.
‰ Ac = {w|w ∉ A}: The complement of A.
‰ A – B = {w|w ∈ A, w ∉ B} = A ∩ Bc : The difference of A and B.
‰ B̂ = {w|w = -b, b ∈ B}: Reflection of B.
‰ (A)z = {c|c = a + z, a ∈ A}: Translation of A by z = (z1,z2)

327
BASIC SET OPERATIONS

328
3 BASIC LOGICAL OPERATIONS

Other logical operations can be constructed using the above


definitions.

Example:

p q p XOR q

0 0 0
0 1 1
1 0 1
1 1 0

329
LOGICAL OPERATIONS BETWEEN
BINARY IMAGES

p q p AND q

0 0 0
0 1 0
1 0 0
1 1 1

330
DILATION

‰ Dilation and erosion: Operations fundamental to


morphological processing.
‰ Dilation expands an image, and erosion shrinks it.
‰ Many morphological algorithms in this chapter are based
on these 2 primitives.
‰ A ⊕ B = {z | ( Bˆ ) z ∩ A ≠ ∅}: dilation of A and B
 Obtain the reflection of B about its origin, and shift this
reflection by z.
 The set of all displacements z s.t. B̂ and A overlap by at least
one element.
 B: the structural element
‰ Alternative definitions of dilation

 A ⊕ B = {w ∈ Z 2 | w = a + b, a ∈ A, b ∈ B}

 A ⊕ B = ∪_{b∈B} (A)b

331
2 EXAMPLES OF DILATION

The structural element B and


its reflection are equal
because B is symmetric w.r.t
its origin.

332
AN APPLICATION OF DILATION:
BRIDGING GAPS

333
EROSION

‰ A ⊖ B = {z | (B)z ⊆ A}: Erosion of A by B.
 The set of all points z s.t. B, translated by z, is contained in A.
‰ Alternative definitions of erosion

 A ⊖ B = {w ∈ Z² | w + b ∈ A, ∀b ∈ B}

 A ⊖ B = ∩_{b∈B} (A)−b

‰ Dilation and erosion are duals of each other w.r.t. set


complementation and reflection:

(A ⊖ B)ᶜ = Aᶜ ⊕ B̂
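Both primitives are available in SciPy; a toy sketch on a small binary image (A and B here are arbitrary illustrations):

  import numpy as np
  from scipy.ndimage import binary_dilation, binary_erosion

  A = np.zeros((9, 9), dtype=bool)
  A[3:6, 3:6] = True                          # a small square object
  B = np.ones((3, 3), dtype=bool)             # 3x3 structuring element
  dilated = binary_dilation(A, structure=B)   # expands A
  eroded = binary_erosion(A, structure=B)     # shrinks A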

334
2 EXAMPLES OF EROSION

335
AN APPLICATION OF EROSION

In general, dilation does not fully restore eroded objects.

Erosion Dilation
Structure element: Structure element:
13X13 pixels 13X13 pixels
(all 1’s) (all 1’s)
336
OPENING & CLOSING

‰ opening and closing: 2 other important morphological


operations.
‰ Opening A by B: A ∘ B = (A ⊖ B) ⊕ B
 Erosion of A by B, followed by a dilation of the result by B.
 Generally smoothes the contour of an object
 Breaks narrow isthmuses
 Eliminates thin protrusions
‰ Closing of A by B: A • B = (A ⊕ B) ⊖ B
 Dilation of A by B, followed by an erosion of the result by B.
 Also tends to smooth sections of contours.
 Generally fuses narrow breaks and long thin gulfs
 Eliminates small holes
 Fills gaps in the contour
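SciPy composes the two primitives directly; a sketch reusing A and B from the previous example:

  from scipy.ndimage import binary_opening, binary_closing

  opened = binary_opening(A, structure=B)    # erosion followed by dilation
  closed = binary_closing(A, structure=B)    # dilation followed by erosion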

337
GEOMETRIC INTERPRETATION OF OPENING

the union
of all
translates
of B that
fit into A.

338
GEOMETRIC INTERPRETATION OF CLOSING

A point w ∈ A•B ⇔ every translate (B)z containing w satisfies (B)z ∩ A ≠ ∅

339
OPENING & CLOSING OPERATIONS

Opening of A by B (erosion followed by dilation): the bridge and the
protruding elements were eliminated. A circular structuring element
results in smoothing of parts of the object. Outward pointing corners
were rounded; inward pointing corners were not affected.

Closing of A by B (dilation followed by erosion): the size of the
intrusion was significantly reduced. Inward pointing corners were
rounded; outward pointing corners were not affected.

340
PROPERTIES SATISFIED BY
OPENING & CLOSING

‰ Opening and closing are duals of each other w.r.t. set
complementation and reflection:

(A • B)c = (Ac ∘ B̂)
‰ The opening operation satisfies the following properties:
 A ° B is a subset of A.
 If C ⊆ D, then (C ° B) ⊆ (D ° B).
 (A ° B) ° B = A ° B
‰ The closing operation satisfies the following properties:
 A is a subset of A • B.
 If C ⊆ D, then (C • B) ⊆ (D • B).
 (A • B) • B = A • B

341
A MORPHOLOGICAL FILTER:
OPENING FOLLOWED BY CLOSING

Morphological operations can be used to construct filters. In the
fingerprint image, the noise manifests itself as light elements on a
dark background and as dark elements on the light components of the
fingerprint.

The opening operation reduced the noise components in size or deleted
them completely; however, the size of the dark spots increased, and
new gaps appeared between the fingerprint ridges.

The subsequent closing restored most of the gaps, but the ridges were
thickened. The net result: the noise has been eliminated with minimal
distortion to the fingerprint.
342
HIT-OR-MISS TRANSFORMATION

Set A is the union of 3 disjoint sets. W is a small window, and
(W − X) is the local background of X w.r.t. W.

A ⊖ X: the set of all locations of the origin of X at which X found a
match (hit) in A.

Ac ⊖ (W − X): the set of all locations of the origin of (W − X) at
which (W − X) found a match (hit) in Ac.

Generalized notation: B = (B1, B2), with B1 = X and B2 = (W − X).

A ∗ B = (A ⊖ B1) ∩ (Ac ⊖ B2)

A ∗ B contains all the origin points at which B1 found a match in A
and B2 found a match in Ac.

343
SOME BASIC MORPHOLOGICAL ALGORITHMS

‰ Boundary extraction

‰ Region filling

‰ Extraction of connected components

‰ Convex hull

‰ Thinning

‰ Thickening

‰ Skeletons

‰ Pruning

344
BOUNDARY EXTRACTION

β(A) = A − (A ⊖ B)

The 3x3 structuring element of all 1's is commonly used but not
unique.

When the origin of B is on the edge of the set A, part of B may be
outside the image. How do we handle this situation? Normal treatment:
assume that values outside the borders of the image are 0.

345
AN APPLICATION OF BOUNDARY EXTRACTION

The boundary is 1 pixel thick.

Structuring element: 3x3, all 1's.

346
REGION FILLING

A: a set whose elements are the 8-connected boundary points of a
region. The objective is to fill the entire region with 1's.

X0 = p,  Xk = (Xk-1 ⊕ B) ∩ Ac,  k = 1, 2, 3, ...

The algorithm terminates at step k when Xk = Xk-1.

The union of Xk and A contains the filled region and its boundary.
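A direct SciPy sketch of this iteration (the square test boundary and
the seed location are illustrative):

import numpy as np
from scipy.ndimage import binary_dilation

def region_fill(A, seed):
    # X_k = (X_{k-1} dilated by B) intersected with A^c, until X_k = X_{k-1}
    B = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=bool)  # 4-connected cross
    Ac = ~A.astype(bool)
    X = np.zeros_like(Ac); X[seed] = True
    while True:
        X_next = binary_dilation(X, structure=B) & Ac
        if (X_next == X).all():
            return X | A.astype(bool)   # the filled region together with its boundary
        X = X_next

A = np.zeros((7, 7), dtype=int)
A[1, 1:6] = A[5, 1:6] = 1; A[1:6, 1] = A[1:6, 5] = 1   # a square 8-connected boundary
print(region_fill(A, (3, 3)).sum())                    # 25: the 5x5 region is now solid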

347
AN APPLICATION OF REGION FILLING

Such an image may result from thresholding a scene containing
polished spheres into 2 levels. We will eliminate the reflections by
region filling.

348
EXTRACTION OF CONNECTED COMPONENTS

Extraction of connected components in a binary image is central to
many automated image analysis applications.

Y is a connected component in A, and p is a known point of Y. The
shape of B assumes 8-connectivity between pixels.

X0 = p,  Xk = (Xk-1 ⊕ B) ∩ A,  k = 1, 2, 3, ...

The algorithm terminates at step k when Xk = Xk-1, and then Y = Xk.
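In practice the same result comes from a labeling routine; a sketch
using scipy.ndimage.label under 8-connectivity (the tiny test array
is illustrative):

import numpy as np
from scipy.ndimage import label

A = np.array([[1, 1, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 0]], dtype=int)
eight = np.ones((3, 3), dtype=int)        # structuring element defining 8-connectivity
labels, n = label(A, structure=eight)     # every connected component gets its own id
print(n)                                  # 2 components under 8-connectivity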

349
AN APPLICATION OF
CONNECTED COMPONENT EXTRACTION

Connected components are frequently used for automated inspection.

Analysis of the size of the remaining objects shows that 4 of the
connected components are dominant in size.

350
CONVEX HULL

‰ A set A is said to be convex if the straight line segment


joining any 2 points lies entirely within A.
‰ The convex hull H of an arbitrary set S is the smallest
convex set containing S.
‰ The set difference H-S is called the convex deficiency of
S.
‰ The convex hull and convex deficiency are useful for
object description.
‰ The procedure for obtaining the convex hull, C(A), of a set A:
 Starting with X0i = A, iteratively apply the hit-or-miss transform
with Bi. When the iteration converges, perform the union with A, and
call the result Di.
 C(A) = union of all Di's.

351
CONVEX HULL: AN EXAMPLE

Algorithm for obtaining the convex hull, C(A), of a set A:

Bi, i = 1, 2, 3, 4: the 4 structuring elements

X0i = A, i = 1, 2, 3, 4 (the initial points)

Xki = (Xk-1i ∗ Bi) ∪ A,  i = 1, 2, 3, 4, and k = 1, 2, 3, ...

Di = Xconvi, where "conv" indicates convergence: Xki = Xk-1i

C(A) = D1 ∪ D2 ∪ D3 ∪ D4

352
SHORTCOMING OF THE ALGORITHM FOR
OBTAINING THE CONVEX HULL

Shortcoming of the algorithm: the convex hull can grow beyond the
min dimensions required to guarantee convexity.

A simple approach to reduce this effect: limit growth so that it does
not extend past the vertical and horizontal dimensions of the
original set of points.

Boundaries of greater complexity can be used to limit growth even
further in images with more detail, e.g., the max dimensions of the
original set of points along the horizontal, vertical, and diagonal
directions can be used.

353
EXTENSIONS TO GRAY SCALE IMAGES

‰ The following basic operations will be extended to gray


scale images:
 Dilation
 Erosion
 Opening
 Closing
‰ These operations will be used to develop several
morphological algorithms.
 Boundary extraction
 Region partitioning
 Smoothing
 Sharpening
‰ f(x,y): input image, b(x,y): a structuring element

354
GRAY SCALE DILATION

( f ⊕ b)( s, t ) = max{ f ( s − x, t − y ) + b( x, y ) | ( s − x), (t − y ) ∈ D f ; ( x, y ) ∈ Db }


D f : domain of f
Db : domain of b

f and b are functions, and not sets.

The condition (s − x), (t − y) ∈ Df and (x, y) ∈ Db is analogous to
the binary requirement that the 2 sets overlap by at least 1 element.

The general effect of performing dilation on a gray scale image:

• If the values of b are > 0, the output image tends to be brighter.

• Dark details are either reduced or eliminated, depending on how
their values and shapes relate to the b used for dilation.

355
1-D EXAMPLE OF GRAY SCALE DILATION

( f ⊕ b)( s ) = max{ f ( s − x) + b( x) | ( s − x) ∈ D f ; x ∈ Db }

Unlike the binary case, f, not b, is shifted. Conceptually, f sliding
past b is no different from b sliding past f.

The actual mechanics of gray scale dilation are easier to visualize
if b is the function that slides past f: at each position of b, the
value of dilation at that point is the max of f + b in the interval
spanned by b.

356
GRAY SCALE EROSION

(f ⊖ b)(s, t) = min{f(s + x, t + y) − b(x, y) | (s + x), (t + y) ∈ Df; (x, y) ∈ Db}

Df: domain of f
Db: domain of b

The condition (s + x), (t + y) ∈ Df and (x, y) ∈ Db is analogous to
the binary requirement that b be completely contained in A.

The general effect of performing erosion on a gray scale image:

• If the values of b are > 0, the output image tends to be darker.

• Bright details that are smaller in area than the structuring
element are reduced, with the degree of reduction determined by the
gray level values surrounding the bright detail, and by the shape and
values of b.
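A small SciPy sketch of both operations with a flat (all-zero)
structuring element, so they reduce to a local max and a local min
(the test array is illustrative):

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

f = np.array([[3, 3, 1, 9],
              [3, 7, 1, 1],
              [2, 2, 2, 8]], dtype=float)
b = np.zeros((3, 3))                      # flat structuring element

print(grey_dilation(f, structure=b))      # local max of f + b: brightens, shrinks dark spots
print(grey_erosion(f, structure=b))       # local min of f - b: darkens, shrinks bright spots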

357
1-D EXAMPLE OF GRAY SCALE EROSION

(f ⊖ b)(s) = min{f(s + x) − b(x) | (s + x) ∈ Df; x ∈ Db}

Unlike the binary case, f, not b, is shifted. Conceptually, f sliding
past b is no different from b sliding past f.

At each position of b, the value of erosion at that point is the min
of f − b in the interval spanned by b.

(f ⊖ b)c(s, t) = (f c ⊕ b̂)(s, t)

Gray scale dilation and erosion are duals w.r.t. function
complementation and reflection, where f c = −f(x, y) and
b̂ = b(−x, −y).

358
AN EXAMPLE OF GRAY SCALE
DILATION & EROSION

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.

The dilated image is brighter, and the sizes of the dark features are
reduced. The eroded image is darker, and the sizes of the small,
bright features are reduced.

359
OPENING AND CLOSING FOR
GRAY SCALE IMAGES

f ∘ b = (f ⊖ b) ⊕ b : the opening of image f by subimage b

f • b = (f ⊕ b) ⊖ b : the closing of image f by subimage b

(f • b)c = f c ∘ b̂ : the opening and closing of gray scale images
are duals w.r.t. function complementation and reflection.

Opening and closing of images have a simple geometric interpretation.
Suppose we view an image function f(x,y) in 3D perspective; the image
appears as a discrete surface of height f at (x,y).

Opening: rolling a ball against the underside of the surface.
Closing: rolling a ball on top of the surface.

360
GEOMETRIC INTERPRETATION OF
OPENING & CLOSING

To simplify the illustration, a scan line of a gray scale image is
shown as a continuous function.

Opening: rolling a ball against the underside of the surface. All the
peaks that were narrow w.r.t. the diameter of the ball were reduced
in amplitude and sharpness.

Closing: rolling a ball on top of the surface. The peaks are left in
their original form.

361
OPENING AND CLOSING OF
A GRAY SCALE IMAGE

f ∘ b = (f ⊖ b) ⊕ b: the initial erosion removes the small details,
but it also darkens the image; the subsequent dilation increases the
overall intensity without reintroducing the details removed by
erosion. Opening is generally used to remove small light details from
an image while leaving the overall gray levels and larger bright
features relatively undisturbed.

f • b = (f ⊕ b) ⊖ b: the initial dilation removes the dark details,
but it also brightens the image; the subsequent erosion darkens the
image without reintroducing the details removed by dilation. Closing
is generally used to remove dark details from an image while leaving
bright features relatively undisturbed.

362
MORPHOLOGICAL SMOOTHING

One way to achieve smoothing is to perform a morphological opening
followed by a closing. The net result of these 2 operations is to
remove or attenuate both bright and dark artifacts or noise.

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.

363
MORPHOLOGICAL GRADIENT

g = (f ⊕ b) − (f ⊖ b)    (dilation minus erosion)

The morphological gradient highlights sharp gray level transitions in
the input image.

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.
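A one-function sketch of g = (f ⊕ b) − (f ⊖ b) with a flat
structuring element (the step-edge test image is illustrative):

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def morph_gradient(f, size=(3, 3)):
    # dilation minus erosion, flat structuring element of the given size
    return grey_dilation(f, size=size) - grey_erosion(f, size=size)

f = np.zeros((5, 5)); f[:, 2:] = 100.0    # a vertical step edge
print(morph_gradient(f)[2])               # [0. 100. 100. 0. 0.]: response only at the edge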

364
TOP-HAT TRANSFORMATION

h = f − (f ∘ b)

Note the enhancement of detail in the background region below the
lower part of the horse's head.

365
TEXTURAL SEGMENTATION

Closing tends to remove dark details from an image.

Procedure:

1. Close the image by using successively larger structuring elements.
2. When the size of the structuring element corresponds to that of the small blobs, the small blobs are
removed from the image, leaving a light background in the area previously occupied by them.
3. At this point, only the larger blobs and the light background on the left and between the large blobs
themselves remain.
4. A single opening is performed with a structuring element that is large in relation to the separation between
the large blobs.
5. This operation removes the light patches between the large blobs, leaving a large region on the right
consisting of the large dark blobs and the now equally dark patches between these blobs.
6. The process has produced a light region on the left and a dark region on the right.
7. A simple threshold yields the boundary between the 2 textural regions, which can be superimposed
on the original image.

366
GRANULOMETRY

Granulometry: a field that deals principally with determining the size distribution of particles in an image.

Procedure:

1. Opening operations with structuring elements of increasing size are performed on the original image.
2. The difference between the original image and its opening is computed after each pass with a
different structuring element.
3. At the end of the process, these differences are normalized, and then used to construct a histogram of
the particle size distribution.

The approach is based on the idea that opening operations of a particular size have the most effect on
regions of the input image that contain particles of similar size. Thus, a measure of the relative number of
such particles is obtained by computing the difference between the input and output images.

In the example, the image contains light objects of 3 different sizes. The objects are overlapping and
too cluttered to enable detection of individual particles. The resulting histogram indicates the presence
of 3 predominant particle sizes.

367
CHAPTER 9: IMAGE SEGMENTATION

‰ Segmentation subdivides an image into its constituent


regions or objects.
‰ Image segmentation algorithms are generally based on
one of 2 basic properties of intensity values:
 Discontinuity
 Similarity
‰ Algorithms based on discontinuity: the approach is to
partition an image based on abrupt changes in intensity
(e.g., edges).
‰ Algorithms based on similarity: the approach is to
partition an image into regions that are similar according
to a predefined set of criteria.
‰ Segmentation bridges the gap between low-level image
processing and high-level image processing.

368
DETECTION OF DISCONTINUITIES

‰ 3 basic types of gray-level discontinuities:


 Points
 Lines
 Edges
‰ The most common way to look for discontinuities is to use
a mask.

A 3x3 mask with coefficients w1, …, w9 is overlaid on a 3x3 region of
the image with gray levels z1, …, z9:

w1 w2 w3        z1 z2 z3
w4 w5 w6        z4 z5 z6
w7 w8 w9        z7 z8 z9

‰ Response of the mask at any point in the image: R = Σ wi zi, where
zi is the gray level of the pixel associated with mask coefficient wi.

369
POINT DETECTION

A point is detected at the location on which the mask is centered if |R| ≥ T, T > 0.

The sum of the mask coefficients is 0.

Example: a single black pixel embedded within the porosity is found
by applying the mask and thresholding with T = 90% of the highest
absolute pixel value in the mask response image.
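A sketch of the whole slide in SciPy: correlate the image with the
point mask, then threshold |R| at 90% of its maximum (the synthetic
image is illustrative):

import numpy as np
from scipy.ndimage import correlate

mask = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]], dtype=float)   # point mask; coefficients sum to 0

img = np.zeros((7, 7)); img[3, 3] = 10.0       # an isolated bright point
R = correlate(img, mask)                       # R = sum of w_i * z_i at each pixel
T = 0.9 * np.abs(R).max()                      # T = 90% of the highest |R|
print(np.argwhere(np.abs(R) >= T))             # [[3 3]]: the point is detected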
370
LINE DETECTION

Mask responses

R1 R2 R3 R4

At a certain point, |Ri| > |Rj|, for all j ≠ i: the point is more
likely associated with a line in the direction of mask i.

371
AN EXAMPLE OF LINE DETECTION

We are interested in finding all the lines that are 1 pixel thick and
are oriented at -45°.

The absolute value of the mask response is thresholded with T = its
max value.

372
EDGE DETECTION

‰ Edge detection is the most common approach for


detecting meaningful discontinuities in gray level.
‰ Intuitively, an edge is a set of connected pixels that lie on
the boundary of 2 regions.
‰ We will use first and second digital derivatives for edge
detection in images.
‰ Difference between an edge and a boundary:
 Edge is local.
 Boundary is global.
‰ In practice, edges are blurred.
 The degree of blurring is determined by a number of factors.
„ Quality of the acquisition system
„ Sampling rate
„ Illumination conditions
‰ Hence, edges are modeled as having a ramplike profile.

373
MODELING OF AN EDGE

The thickness of the edge is determined by the length of the ramp.
Sharp edges tend to be thin; blurred edges tend to be thick.

374
DETAIL NEAR AN EDGE

First derivatives: the magnitude can be used to detect the presence
of an edge at a point.

Second derivatives: the sign can be used to determine whether an edge
pixel lies on the dark or light side of an edge. The second
derivative produces 2 values for every edge (an undesirable feature),
and its zero crossing occurs near the midpoint of the edge.
375
SENSITIVITY OF DERIVATIVES TO NOISE

Image segments (first column) with their 1st derivatives (second
column) and 2nd derivatives (third column):

Noise free
Gaussian noise: mean = 0, σ = 0.1
Gaussian noise: mean = 0, σ = 1.0
Gaussian noise: mean = 0, σ = 10.0

Although the noise is almost invisible in the images, both
derivatives are sensitive to it.

376
DEFINITION OF AN EDGE POINT

‰ To be classified as a meaningful edge point, the transition


in gray level associated with that point has to be
significantly stronger than the background at that point.
‰ Two definitions of edge point
 Its 1st order derivative > T.
 Zero crossing of its 2nd order derivative.
‰ Edge: a set of edge points that are connected according
to a predefined criterion of connectedness.
‰ Edge segment: the edge is short relative to the
dimensions of the image.
‰ The edge definitions do NOT guarantee success in finding
edges in an image. They simply give us a formalism to
look for them.

377
THE GRADIENT

∇f = [Gx^2 + Gy^2]^(1/2)   (not preferred: more expensive to compute)

∇f ≈ |Gx| + |Gy|           (more attractive)

Direction: α(x,y) = tan^-1(Gy/Gx)

2x2 filters (Roberts): Gx = (z9 − z5), Gy = (z8 − z6)

3x3 filters (Prewitt): Gx = (z7+z8+z9) − (z1+z2+z3), Gy = (z3+z6+z9) − (z1+z4+z7)

3x3 filters (Sobel): Gx = (z7+2z8+z9) − (z1+2z2+z3), Gy = (z3+2z6+z9) − (z1+2z4+z7)
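A sketch of the Sobel filters exactly as written above, using
correlation so the masks map one-to-one onto the z1..z9 sums (the
gradient magnitude uses the cheaper |Gx| + |Gy| form):

import numpy as np
from scipy.ndimage import correlate

Kx = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], float)   # Gx = (z7+2z8+z9) - (z1+2z2+z3)
Ky = np.array([[-1,  0,  1],
               [-2,  0,  2],
               [-1,  0,  1]], float)   # Gy = (z3+2z6+z9) - (z1+2z4+z7)

def sobel_edges(f):
    Gx, Gy = correlate(f, Kx), correlate(f, Ky)
    mag = np.abs(Gx) + np.abs(Gy)      # approximate gradient magnitude
    alpha = np.arctan2(Gy, Gx)         # edge direction alpha(x, y)
    return mag, alpha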

378
DIAGONAL EDGE DETECTION

379
AN EXAMPLE OF HORIZONTAL & VERTICAL
EDGE DETECTION

1200x1600 pixels. One gradient component shows the roof tile,
horizontal brick joints, and horizontal segments of the windows; the
other shows the corner near the wall, the lamp post, vertical brick
joints, and vertical segments of the windows. The directionality of
the 2 components is evident.

380
PRINCIPAL EDGE DETECTION

The contribution made to image detail by the wall bricks is
significant. This level of detail is often not desirable; one way to
reduce it is to smooth the image.

After smoothing, the response of each mask shows almost no
contribution due to the bricks, and the principal edges dominate the
result. Note that averaging caused the response of all edges to be
weaker.

381
EMPHASIS ON DIAGONAL EDGE DETECTION

If it is important to emphasize the diagonal edges, a diagonal mask
can be used. The stronger diagonal response is evident, and the
response in the horizontal and vertical directions is weaker.

382
THE LAPLACIAN

z1 z2 z3
z4 z5 z6    3x3 region of the image
z7 z8 z9

 0 -1  0
-1  4 -1    ∇2f = 4z5 − (z2 + z4 + z6 + z8)
 0 -1  0

-1 -1 -1
-1  8 -1    ∇2f = 8z5 − (z1 + z2 + z3 + z4 + z6 + z7 + z8 + z9)
-1 -1 -1    (diagonal neighbors included)

383
ROLE OF THE LAPLACIAN IN SEGMENTATION

‰ In general, the Laplacian is not used in its original form for


edge detection for several reasons:
 It is unacceptably sensitive to noise.
 Its magnitude produces double edges (an undesirable effect).
 It is unable to detect edge direction.
‰ Hence, the role of the Laplacian in segmentation consists of:
 Using its zero-crossing property for edge location
 Using it to check whether a pixel is on the dark or light side of an
edge
‰ Consider the smoothing function h(r) = −exp(−r^2/2σ^2), where
r^2 = x^2 + y^2 and σ is the standard deviation.

‰ The Laplacian of h (the Laplacian of a Gaussian):

∇2h(r) = −[(r^2 − σ^2)/σ^4] exp(−r^2/2σ^2)

384
THE LAPLACIAN OF A GAUSSIAN (LoG)

The second derivative is a linear operation: convolving an image with ∇2h is the
same as convolving the image with h first and then computing the Laplacian of the
result.

A mask that
approximates ∇2h.

The coefficients sum to zero.
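A sketch of LoG edge finding along these lines: gaussian_laplace
fuses the Gaussian smoothing and the Laplacian, and sign changes
between neighbors mark the zero crossings (sigma and the neighbor
test are illustrative choices):

import numpy as np
from scipy.ndimage import gaussian_laplace

def log_zero_crossings(f, sigma=2.0):
    g = gaussian_laplace(f.astype(float), sigma)   # LoG response
    s = np.sign(g)
    zc = np.zeros(g.shape, dtype=bool)
    zc[:-1, :] |= (s[:-1, :] * s[1:, :]) < 0       # vertical sign change
    zc[:, :-1] |= (s[:, :-1] * s[:, 1:]) < 0       # horizontal sign change
    return zc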

385
COMPARISON OF 2 APPROACHES
FOR EDGE FINDING

Comparison between the Sobel gradient (b) and the zero-crossings (g)
of the LoG, obtained via a spatial smoothing function and a Laplacian
mask, followed by thresholding of the LoG response:

1. Edges in the zero-crossing image are thinner.
2. The edges determined by zero-crossings form closed loops (a
serious drawback of the method).
3. The computation of zero-crossings presents a challenge in general.
More sophisticated techniques are often required to obtain acceptable
results.
4. Zero-crossing methods are of interest because of their noise
reduction capabilities and potential for rugged performance.
5. Because of the noted limitations, gradient-based edge finding
techniques are still used more frequently.
386
EDGE LINKING & BOUNDARY DETECTION

‰ The edge detection methods normally yield pixels lying only


on edges.
‰ In practice, this set of pixels seldom characterizes an edge
completely for several reasons:
 Noise
 Breaks in the edge from non-uniform illumination
 Other reasons that introduce spurious intensity discontinuities.
‰ Edge detection algorithms are followed by linking
procedures to assemble edge pixels into meaningful edges.
‰ We’ll look at several basic approaches.
 Local processing
 Global processing via the Hough Transform
 Global processing via graph-theoretic techniques

387
LOCAL PROCESSING

‰ A simple approach for linking edge points is to analyze the


characteristics of pixels in a small neighborhood (3x3, 5x5)
about every point (x,y) labeled an edge point.
‰ All points that are similar according to a predefined set of
criteria are linked.
‰ This forms an edge of pixels that share the specified criteria.
‰ 2 principal properties for establishing similarity:
 The strength of the response of the gradient operator
 The direction of the gradient operator
‰ An edge pixel at (x0,y0) in the predefined neighborhood of (x,y)
is similar in magnitude to the pixel at (x,y) if
| ∇f(x,y) - ∇f(x0,y0)| ≤ E, E >0.
‰ An edge pixel at (x0,y0) in the predefined neighborhood of (x,y)
has an angle similar to the pixel at (x,y) if
|α(x,y) - α(x0,y0)| ≤ A, A >0.
‰ The point at (x0,y0) is linked to the pixel at (x,y) if both criteria
are satisfied.
388
AN EXAMPLE OF LOCAL PROCESSING

The objective is to find rectangles whose sizes make them candidates
for license plates (in the US, the width-to-height ratio is 2:1).

Gx and Gy are computed, and edge points are linked with E = 25 and
A = 15. The license plate is one of the few rectangles detected.
389
THRESHOLDING

Image thresholding has intuitive properties and a simple
implementation, so it enjoys a central position in applications of
image segmentation.

We now introduce thresholding in a more formal way.

Single threshold (an image composed of light objects on a dark
background):
f(x,y) > T: an object point
f(x,y) ≤ T: a background point

Multiple thresholds (an image composed of 2 types of light objects on
a dark background):
T1 < f(x,y) ≤ T2: (x,y) ∈ object class 1
f(x,y) > T2: (x,y) ∈ object class 2
f(x,y) ≤ T1: (x,y) ∈ the background

In general, segmentation problems requiring multiple thresholds are
best solved using region growing methods.

390
THRESHOLDING FUNCTION

‰ Thresholding may be viewed as an operation that involves tests
against a function T of the form

T = T[x, y, p(x,y), f(x,y)]

where p(x,y) denotes some local property of (x,y), e.g., the average
gray level of a neighborhood centered at (x,y).

‰ A thresholded image is defined as

g(x,y) = 1 if f(x,y) > T, and g(x,y) = 0 if f(x,y) ≤ T
‰ T depends only on f: the threshold is global.
‰ T depends on both f and p: the threshold is local.
‰ T depends on f, p, and (x,y): the threshold is dynamic or
adaptive.

391
THE ROLE OF ILLUMINATION

Simple image formation model: f(x,y) = i(x,y) r(x,y), where i is the
illumination (0 < i(x,y) < ∞) and r is the reflectance
(0 < r(x,y) < 1).

The histogram of r(x,y) is easy to segment. Under nonuniform
illumination i(x,y), the histogram of the product i(x,y)r(x,y) can be
difficult to segment.
392
BASIC GLOBAL THRESHOLDING

The chosen threshold T achieved a clean segmentation by eliminating
the shadows and leaving only the objects themselves.

This type of global thresholding is expected to be successful in
highly controlled environments. For example, in industrial
applications, illumination can be controlled.

393
AUTOMATIC DETERMINATION OF T

‰ In the previous example, the value of T was chosen by a


visual inspection of the histogram.
‰ Algorithm to determine T automatically
1. Select an initial estimate for T.
2. Segment the image using T.
„ G1: all pixel values > T
„ G2: all pixel values ≤ T
3. Compute µ 1 and µ 2 in regions G1 and G2.
4. Compute a new threshold value: T = 0.5(µ1+ µ 2)
5. Repeat steps (2)-(4) until ∆T ≤ T0.
‰ Selecting the initial estimate of T (step 1)
 Background and objects occupy comparable areas: use the average
gray level of the image.
 Objects are small compared to the area occupied by the background
(or vice versa): use a value midway between the max and min gray
levels.
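A sketch of the iteration (here T0 is the convergence tolerance of
step 5, and the bimodal test data are synthetic):

import numpy as np

def auto_threshold(f, T0=0.5):
    T = f.mean()                          # initial estimate: average gray level
    while True:
        mu1, mu2 = f[f > T].mean(), f[f <= T].mean()   # means of G1 and G2
        T_new = 0.5 * (mu1 + mu2)
        if abs(T_new - T) <= T0:          # stop when the change in T is small
            return T_new
        T = T_new

rng = np.random.default_rng(0)
f = np.concatenate([rng.normal(60, 10, 500), rng.normal(180, 10, 500)])
print(round(auto_threshold(f), 1))        # lands near 120, between the two modes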

394
AN EXAMPLE OF AUTOMATIC
DETERMINATION OF T

Note the clear valley in the histogram. With T0 = 0, the algorithm
converged after 3 iterations to T = 125.4; T = 125 was used to
segment the image.

395
BASIC ADAPTIVE THRESHOLDING

‰ Imaging factors (uneven illumination, etc.) can transform


a perfectly segmentable histogram into a histogram that
cannot be partitioned effectively by a single global
threshold.
‰ How do we handle such situations?
‰ One approach is to divide the image into subimages, and
utilize a different threshold for each subimage.
‰ Key issues
 How do we subdivide the image?
 How do we estimate the threshold for each subimage?
‰ The threshold used for each pixel depends on the
location of the pixel.
‰ Hence, this type of thresholding is adaptive.

396
AN EXAMPLE OF BASIC
ADAPTIVE THRESHOLDING

The image is divided into 16 subimages. The global threshold is
manually placed in the valley of the histogram.

Subimages not containing a boundary: σ2 < 75. Subimages containing a
boundary: σ2 > 100.

For each subimage with σ2 > 100, T is automatically determined, with
T0 midway between the min and max gray levels. All subimages with
σ2 < 100 are treated as one composite image, and a single T is
estimated for them using the same algorithm.
397
SUBDIVISION OF A PROBLEM SUBIMAGE

The problem subimage is subdivided further: one part has a clearly
bimodal histogram, while the other is almost unimodal.

398
THRESHOLDS BASED ON SEVERAL VARIABLES

‰ Multispectral thresholding: generalization of gray level


thresholding
‰ In color images, each pixel is characterized by 3 RGB
values.
‰ 3D histogram:
 An image with 3 RGB components, each having 16 possible
levels.
 16x16x16 cube
 Each cell of the cube contains the # of pixels whose RGB
components have values corresponding to the coordinates
defining the location of the cell.
 Each entry is divided by the total # of pixels in the image to
form a normalized histogram.
‰ Color segmentation can be based on any color model.
 HSI model: segmentation using the H and S components is
attractive because it involves 2D data clusters.

399
AN EXAMPLE OF COLOR SEGMENTATION

The color image has 3 16-level RGB components. One result is obtained
by thresholding about the histogram cluster corresponding to facial
tones; the other by thresholding about a cluster close to the red
axis.

T: a distance of one cell. Pixels with components outside the cell
are shown black; pixels with components inside the cell are shown
white.
400
REGION-BASED SEGMENTATION

‰ The objective of segmentation is to partition an image into


regions.
‰ Previous approaches
 Finding boundaries between regions based on discontinuities in
gray levels
 Thresholding based on the distribution of pixel properties
‰ We will now discuss techniques that are based on finding the
regions directly.
‰ Basic formulation
 R: the entire image region; n subregions: R1, R2, …, Rn
 ∪ Ri = R
 Ri is a connected region, i = 1, 2, …, n
 Ri ∩ Rj = ∅ for all i and j, i ≠ j
 P(Ri) = TRUE, i = 1, 2, …, n, where P(Ri) is a logical predicate
defined over the points in Ri
 P(Ri ∪ Rj) = FALSE for i ≠ j

401
REGION GROWING

‰ Region growing: a procedure that groups pixels or


subregions into larger regions based on predefined criteria.
‰ Basic approach
 Start with a set of “seed” points
 Grow regions by appending to each seed those neighboring
pixels with similar properties
‰ Selection of the seed points
 Often depends on the nature of the problem
 No information is available:
„ Compute at every pixel the same set of properties that will be used to
assign pixels to regions during the growing process.
„ If the result of the computations shows clusters of values, the pixels
whose properties place them near the centroid of these clusters can
be used as seeds.
‰ Selection of similarity criteria depends on (1) problem under
consideration, and (2) type of image data available.
‰ Connectivity or adjacency information is often needed.
‰ Formulation of a stopping rule
402
AN EXAMPLE OF REGION GROWING

The weld image shows several cracks and porosities; the histogram of
(a) guides the choices below.

Selection of initial seeds: all pixels with a value of 255.

Selection of criteria for region growing:

1. |any pixel − seed| < 65

2. To be included in a region, the pixel has to be 8-connected to at
least one pixel in that region.

No stopping rules were needed: the criteria for region growing were
sufficient to isolate the features of interest.
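A compact sketch of these two criteria (seeding at 255, tolerance 65,
8-connectivity), implemented by labeling the candidate mask and
keeping the components that contain a seed; the tiny test image is
illustrative:

import numpy as np
from scipy.ndimage import label

def region_grow(f, seed_value=255, tol=65):
    candidates = np.abs(f.astype(int) - seed_value) < tol   # criterion 1
    labels, _ = label(candidates, structure=np.ones((3, 3), int))  # 8-connectivity
    seed_labels = np.unique(labels[f == seed_value])
    return np.isin(labels, seed_labels[seed_labels > 0])    # criterion 2

f = np.zeros((6, 6), dtype=int)
f[1:3, 1:3] = 255          # seed pixels
f[3, 3] = 200              # similar and 8-connected: joins the region
f[5, 5] = 200              # similar but disconnected: excluded
print(region_grow(f)[3, 3], region_grow(f)[5, 5])   # True False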
403
REGION SPLITTING AND MERGING

‰ The previous method grows regions from a set of seed


points.
‰ An alternative is to subdivide an image initially into a set of
arbitrary, disjoint regions, and then merge and/or split the regions
in an attempt to satisfy the conditions stated earlier.
‰ Split/merge algorithm
 R: the entire region, P: the selected predicate
 If P(R) = FALSE, divide the image into 4 quadrants.
 At any step
„ Split into 4 disjoint quadrants any region Ri for which
P(Ri ) = FALSE.
„ Merge any adjacent regions Rj and Rk for which
P(RjURk) = TRUE.
„ Stop when no further merging or splitting is possible.
‰ Several variations of the algorithm are possible.

404
QUADTREE REPRESENTATION

A quadtree: a tree in which each node has exactly 4 descendants. The
root corresponds to the entire image.

405
AN EXAMPLE OF REGION
SPLITTING AND MERGING

P(Ri) = TRUE if at least 80% of the pixels in Ri have the property
|zj − mi| ≤ 2σi, where zj is the gray level of the jth pixel in Ri
and mi is the mean gray level of Ri.

If P(Ri) = TRUE, the values of all pixels in Ri are set to mi.

The split-and-merge segmentation is perfect. With simple thresholding
(T placed midway between the two principal peaks of the histogram),
the shading and the stem were erroneously eliminated.
406
SEGMENTATION BY MORPHOLOGICAL
WATERSHEDS

‰ We have discussed segmentation based on 3 principal concepts:


 Detection of discontinuities
 Thresholding
 Region processing
‰ Each approach has advantages and disadvantages
 Speed is an advantage of global thresholding
 The need for postprocessing is a disadvantage in methods based on detecting
discontinuities in gray levels
‰ Segmentation by watersheds embodies many of the concepts of the other 3
approaches, and often produces more stable segmentation results.
‰ Topographic and hydrology concepts have proven useful in the development
of region segmentation methods.
‰ A monochrome image is considered to be an altitude surface.
 High-amplitude pixels correspond to ridge points.
 Low-amplitude pixels correspond to valley points.
‰ If a drop of water were to fall on any point of the altitude surface, it would
move to a lower altitude until it reached a local minimum.
‰ The accumulation of water in the vicinity of a local minimum is called a
catchment basin.
‰ All points that drain into a common catchment basin are part of the same
watershed.

407
CHAPTER 10: DIGITAL IMAGE WATERMARKING

‰ Data Hiding: process of embedding data


(watermark) into multimedia such as image, video,
and audio.
 Invisibility: degree of distortion introduced by the
watermark.
 Robustness: resistance against attacks and normal
A/V processes.
„ noise, filtering, resampling, scaling, rotation, cropping
„ lossy compression
„ A-D-A & D-A-D conversions
 Capacity: amount of data that can be represented by
the embedded watermark
‰ Typical use of watermarks
 identification of the origin of content
 tracing illegally distributed copies of the content
 disabling unauthorized access to the content
408
WATERMARKING SYSTEM

The watermark W is embedded into the image I, under control of the
key k, to produce the watermarked image Iw. After distribution and a
possible attack, the received image Iw* is the input to detection or
extraction, which also uses the key k.

409
CLASSIFICATION

‰ Domain type
 Pixel: pixel values are modified to embed the watermark.
 Transform: transform coefficients are modified to embed the
watermark. Recent popular transforms are the Discrete Cosine
Transform (DCT), the Discrete Wavelet Transform (DWT), and the
Discrete Fourier Transform (DFT).
‰ Watermark type
 Pseudo random number (PRN) sequence (having a normal distribution
with zero mean and unit variance): allows the detector to
statistically check the presence or absence of a watermark. A PRN
sequence is generated by feeding the generator with a secret seed.
 Visual watermark: the watermark is actually reconstructed, and its
visual quality is evaluated.
‰ Information type (what extraction requires)
 Non-blind: both the original image and the secret key(s).
 Semi-blind: the watermark and the secret key(s).
 Blind: only the secret key(s).

410
POPULAR TRANSFORMS

‰ In transform coding, the signal is mapped from the


spatial domain into the transform domain.
‰ Each transform
 decorrelates the pixels
 results in energy compaction
 concentrates the energy in the low-frequency zone of
the transform domain.
‰ Discrete Cosine Transform (DCT)
 Used in JPEG
‰ Discrete Wavelet Transform (DWT)
 Used in JPEG 2000
‰ Discrete Fourier Transform (DFT)
‰ DCT, DWT & DFT are commonly used in digital
watermarking.

411
DCT DOMAIN WATERMARKING

‰ Watermark embedding
 W: watermark to be embedded.
 X: sequence of pixel values.
 XD and YD: row-concatenated DCT coefficients of X and Y.
 a = scaling factor: determines the intensity of the watermark.

YD(i) = XD(i)(1 + aW(i))

‰ Watermark extraction
 W*: extracted version of the watermark.
 ZD: DCT coefficients of the possibly forged watermarked image.

W*(i) = (1/a)(ZD(i)/XD(i) − 1)  ==>  S(W, W*) = (W · W*)/sqrt(W* · W*)

 T = user-defined threshold.
 If S > T, the image is authentic.
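A 1-D sketch of the scheme with SciPy's DCT (the random pixel
sequence, the orthonormal DCT, and the seed are illustrative; the
similarity is the usual normalized correlation):

import numpy as np
from scipy.fft import dct, idct

def embed(x, w, a=0.1):
    XD = dct(x, norm='ortho')
    return idct(XD * (1 + a * w), norm='ortho')   # Y_D(i) = X_D(i)(1 + a W(i))

def extract(z, x, a=0.1):
    ZD, XD = dct(z, norm='ortho'), dct(x, norm='ortho')
    return (ZD / XD - 1) / a                      # W*(i) = (Z_D/X_D - 1)/a

rng = np.random.default_rng(1)
x = rng.uniform(50, 200, 256)                     # stand-in pixel sequence
w = rng.standard_normal(256)                      # PRN watermark, N(0, 1)
w_star = extract(embed(x, w), x)
S = w @ w_star / np.sqrt(w_star @ w_star)         # compare S against the threshold T
print(round(float(S), 2))                         # about sqrt(256) = 16 here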

412
DCT DOMAIN WATERMARKING

S>T

413
SCALING FACTOR a = 0.1, 0.5, 1.0, 5.0

Original image

a = 0.1 a = 0.5 a = 1.0 a = 5.0

414
CONCLUSIONS

‰ The scaling factor a is a critical system parameter.


 If a is too small, the image is not distorted but the robustness
of the scheme is low.
 If a is too large, the image is distorted but the robustness of
the scheme is high.
‰ Modification of low-frequency coefficients
 Distorts the image
 Gives the hacker a clue about where the watermark is
embedded
‰ Modification of high-frequency coefficients
 No distortion
 The watermark cannot be detected after attacks like JPEG
compression.

415
DWT DOMAIN WATERMARKING

‰ Each level decomposition produces 4 bands.


 LL, HL, LH, HH
‰ The LL band can further be decomposed for another level
of decomposition.
‰ This process is continued until the desired number of
levels determined by the application is reached.
‰ Depending on the watermarking scheme, some of the
DWT coefficients are modified to embed the watermark.

DWT decomposition with 2 levels: the first level produces the HL1,
LH1, and HH1 bands, and decomposing the remaining LL1 band again
produces LL2, HL2, LH2, and HH2.

416
MULTIPLE WATERMARKING
IN THE DWT DOMAIN - FIRST PAPER

‰ A recent paper on DWT-based multiple watermarking


 R. Mehul and R. Priti, “Discrete Wavelet Transform
Based Multiple Watermarking Scheme,” Proceedings
of IEEE Region 10 Technical Conference on Convergent
Technologies for the Asia-Pacific, Bangalore, India,
October 14-17, 2003.
 Embedding in low frequencies
„ robust with respect to attacks that have low pass
characteristics (filtering, lossy compression)
 Embedding in middle and high frequencies
„ robust with respect to nonlinear deformations of the gray
scale (contrast/brightness adjustment, gamma correction,
and histogram equalization)
 Experiments
„ Binary watermark “CO” is embedded in the second level LL
„ Binary watermark “EP” is embedded in the second level HH
 Idea is good but implementation has a serious flaw!
417
MULTIPLE WATERMARKING IN THE DWT
DOMAIN - GENERALIZATION

‰ P. Tao and A. M. Eskicioglu, “A Robust Multiple


Watermarking Scheme in the DWT Domain,” Optics East
2004 Symposium, Internet Multimedia Management
Systems V Conference, Philadelphia, PA, October 25-28,
2004.
‰ A generalization of Mehul & Priti paper
‰ Where is the watermark embedded?
 First level decomposition
 Second level decomposition
‰ What is the visual watermark?
 BC (embedded in all 4 quadrants)
 A (for rewatermarking)
‰ Comparison of extraction from first and second
decomposition levels

418
ALGORITHM

‰ Input: cover image I = (aij, 0 ≤ i,j ≤ 2n) and binary visual
watermark W = (wij ∈ {0,1}, 0 ≤ i,j ≤ n).
‰ Embedding:
 Using the two-dimensional separable dyadic DWT, obtain the desired
level of decomposition of the cover image I.
 Modify the DWT coefficients Vij^k in the LL, HL, LH, and HH bands:
Vw,ij^k = Vij^k + αk Wij, i,j = 1, …, n, and k = 1, 2, 3, 4.
 Apply the inverse DWT to obtain the watermarked cover image Iw.
‰ Output: watermarked cover image Iw.

‰ Input: watermarked (and possibly attacked) cover image Iw*.
‰ Extraction:
 Using the two-dimensional separable dyadic DWT, obtain the desired
level of decomposition of Iw*.
 Extract the binary visual watermark from the LL, HL, LH, and HH
bands: Wij* = (Vw,ij*^k − Vij^k)/αk, i,j = 1, …, n, and k = 1, 2, 3, 4.
 If Wij* > 0.5 then Wij* = 1, else Wij* = 0.
‰ Output: binary visual watermark.
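A one-level sketch of the algorithm with PyWavelets (the Haar
wavelet, a single α for all bands, and the toy sizes are assumptions;
pywt.dwt2 returns the approximation band plus three detail bands):

import numpy as np
import pywt

def embed(I, W, alpha=2.0):
    cA, (cH, cV, cD) = pywt.dwt2(I.astype(float), 'haar')
    marked = [V + alpha * W for V in (cA, cH, cV, cD)]   # V_w = V + alpha * W per band
    return pywt.idwt2((marked[0], tuple(marked[1:])), 'haar')

def extract(Iw, I, alpha=2.0):
    o = pywt.dwt2(I.astype(float), 'haar'); m = pywt.dwt2(Iw.astype(float), 'haar')
    pairs = zip((m[0], *m[1]), (o[0], *o[1]))
    return [((mb - ob) / alpha > 0.5).astype(int) for mb, ob in pairs]  # binarize W*

rng = np.random.default_rng(0)
I = rng.uniform(0, 255, (8, 8))                  # stand-in cover image (2n x 2n)
W = rng.integers(0, 2, (4, 4))                   # n x n binary visual watermark
print(all((w == W).all() for w in extract(embed(I, W), I)))  # True before any attack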
419
COVER IMAGE AND VISUAL WATERMARKS

(a) Cover image Goldhill
(b) Watermarked Goldhill (PSNR = 42.40)
(c) Watermark 1, used in 13 attacks
(d) Watermark 2, used in the rewatermarking and collusion attacks
(e)-(h) The watermarks extracted from the LL, HL, LH, and HH bands,
respectively, before the attacks; the SR values shown are 1.000,
0.977, 0.985, and 0.986. The number below each watermark is the
Similarity Ratio (SR) value.

SR = S/(S+D)
S: # of matching pixel values in the compared images
D: # of different pixel values in the compared images

420
ATTACKS – 1ST LEVEL

Attacked images and PSNR (dB), 1st level:

JPEG 75 (Matlab): 34.96; JPEG 50 (Matlab): 33.11; JPEG 25 (Matlab): 31.27;
Blur (3,3) (Matlab): 29.63; Gaussian Noise [0 0.001] (Matlab): 29.74;
Rescale 512 → 256 → 512 (Matlab): 19.81;
Histogram Equalization (Matlab): 17.42; Intensity Adj. ([0 0.8],[0 1]) (Matlab): 18.87;
Gamma Correction 1.5 (Matlab): 17.90;
Rotate 20° (Matlab): 11.44; Crop (Matlab): 11.88; Pixelate 2 (mosaic) (Photoshop): 30.13;
Sharpen (Photoshop): 31.21; Rewatermark (Matlab): 38.51; Collusion (Matlab): 45.35
421
RECOVERED WATERMARKS – 1ST LEVEL

Recovered watermarks after each attack (1st level): JPEG Quality
75/50/25, Blur (3,3), Gaussian Noise [0 0.001], Histogram
Equalization, Intensity Adj. ([0 0.8],[0 1]), Gamma correction 1.5,
Rotate 20° (Matlab), Rescale 512 → 256 → 512, Crop, Pixelate 2
(mosaic), Sharpen (Photoshop), Rewatermark, Collusion.
422
ATTACKS – 2nd LEVEL

Attacked images and PSNR (dB), 2nd level:

JPEG 75 (Matlab): 34.85; JPEG 50 (Matlab): 33.04; JPEG 25 (Matlab): 31.22;
Blur (3,3) (Matlab): 29.60; Gaussian Noise [0 0.001] (Matlab): 29.73;
Rescale 512 → 256 → 512 (Matlab): 19.80;
Histogram Equalization (Matlab): 17.42; Intensity Adj. ([0 0.8],[0 1]) (Matlab): 18.88;
Gamma Correction 1.5 (Matlab): 17.89;
Rotate 20° (Matlab): 11.45; Crop (Matlab): 11.88; Pixelate 2 (mosaic) (Photoshop): 30.08;
Sharpen (Photoshop): 31.41; Rewatermark (Matlab): 37.76; Collusion (Matlab): 44.08

423
RECOVERED WATERMARKS – 2nd LEVEL

Recovered watermarks after each attack (2nd level): JPEG Quality
75/50/25, Blur (3,3), Gaussian Noise [0 0.001], Histogram
Equalization, Intensity Adj. ([0 0.8],[0 1]), Gamma correction 1.5,
Rotate 20°, Rescale 512 → 256 → 512, Crop, Pixelate 2 (mosaic),
Sharpen, Rewatermark, Collusion.

424
CONCLUSIONS

‰ Watermarks embedded in the lowest frequencies are more robust


against one group of attacks.
 JPEG compression, blurring, adding Gaussian noise, rescaling, rotation,
cropping, pixelation, and sharpening
‰ Watermarks embedded in the highest frequencies are more robust
against another group of attacks.
 histogram equalization, intensity adjustment, and gamma correction
‰ The watermarks extracted from the four bands are identical for the
remaining attacks.
 rewatermarking and collusion
‰ Embedding in the mid frequencies is also resistant to certain attacks.
 histogram equalization, intensity adjustment, gamma correction, and
sharpening
‰ Watermark embedding at the first level of decomposition appears
advantageous:
 All the frequency bands are modified.
 The quality of the corresponding images is almost identical.
 The extracted watermarks are more textured with better subjective visual
quality.

425
DWT-SVD DOMAIN WATERMARKING

E. Ganic and A. M. Eskicioglu, “Robust DWT-SVD Domain Image


Watermarking: Embedding Data in All Frequencies,” ACM Multimedia and
Security Workshop 2004, Magdeburg, Germany, September 20-21, 2004.

2-D Discrete Wavelet Transform (DWT)


1. Each level decomposition produces 4 bands – LL, HL, LH, and HH.
2. The LL band can further be decomposed for another level of decomposition.

LL2 HL2

HL1

LH2 HH2

LH1 HH1

Singular Value Decomposition (SVD)

1. Every matrix A can be decomposed into a product of 3 matrices: A = UΣV^T.
2. Columns of U: left singular vectors of A.
3. Columns of V: right singular vectors of A.
4. Diagonal entries of Σ: the singular values λi of A.

A = λ1 u1 v1^T + λ2 u2 v2^T + … + λr ur vr^T

426


EMBEDDING AND EXTRACTION

Watermark embedding

1. Using the DWT, decompose the cover image into 4 subbands: LL, HL, LH, and HH.
2. Apply SVD to each subband image: A^k = Ua^k Σ^k Va^kT, k = 1, 2, 3, 4.
3. Apply SVD to the visual watermark: W = Uw Σw Vw^T.
4. Modify the singular values in each subband: λi*^k = λi^k + αk λwi, i = 1, …, n.
5. Construct the watermarked image: A*^k = Ua^k Σ*^k Va^kT.

Watermark extraction

1. Decompose the watermarked cover image into 4 subbands: LL, HL, LH, and HH.
2. Apply SVD to each subband image: A*^k = Ua^k Σ*^k Va^kT.
3. Extract the singular values from each subband: λwi^k = (λi*^k − λi^k)/αk, i = 1, …, n.
4. Construct the four visual watermarks using the singular vectors of W: W^k = Uw Σw^k Vw^T.
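A sketch of the embedding half with NumPy's SVD and PyWavelets (a
single α for all four bands and the Haar wavelet are simplifying
assumptions; square subbands are assumed so U Σ V^T recomposes
directly):

import numpy as np
import pywt

def embed_svd(I, W, alpha=0.1):
    cA, (cH, cV, cD) = pywt.dwt2(I.astype(float), 'haar')
    lw = np.linalg.svd(np.asarray(W, float), compute_uv=False)  # watermark singular values
    bands = []
    for A in (cA, cH, cV, cD):
        U, s, Vt = np.linalg.svd(A)          # A = U diag(s) V^T
        s[:lw.size] += alpha * lw            # lambda* = lambda + alpha * lambda_W
        bands.append(U @ np.diag(s) @ Vt)
    return pywt.idwt2((bands[0], tuple(bands[1:])), 'haar')

rng = np.random.default_rng(0)
I = rng.uniform(0, 255, (64, 64))
W = rng.uniform(0, 255, (32, 32))            # watermark sized like one subband
print(round(float(np.abs(embed_svd(I, W) - I).max()), 2))   # max change in the cover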

427
COVER IMAGE AND VISUAL WATERMARK

Cover image Lena Cameraman Watermarked Lena Constructed Watermarks

428
ATTACKS

Blur 5x5 (xnview), Noise 0.3 (xnview), Pixelate 2 (mosaic)
(Photoshop), JPEG 30:1 (xnview), JPEG2000 50:1 (xnview), Sharpen 80
(xnview), Rescale 256 (xnview), Rotate 20° (Photoshop), Crop on both
sides (Photoshop), Contrast -25 (Photoshop), Histogram Equalization
(Photoshop), Gamma correction 0.60 (ImageReady)

429
EXTRACTED WATERMARKS

Values under the constructed watermarks, per subband (LL, HL / LH, HH):

Gaussian Blur 5x5: 0.885, -0.235 / -0.184, -0.420
Gaussian Noise 0.3: 0.865, 0.207 / 0.271, 0.277
Pixelate 2 (mosaic): 1.000, -0.448 / -0.380, -0.424
JPEG 30:1: 0.993, 0.003 / 0.141, -0.381
JPEG2000 50:1: 0.989, 0.065 / 0.064, -0.402
Sharpen 80: 0.528, 0.553 / 0.631, 0.699

430
EXTRACTED WATERMARKS

Values under the constructed watermarks, per subband (LL, HL / LH, HH):

Rescale 512 → 256 → 512: 0.940, -0.256 / -0.211, -0.437
Rotate 20°: -0.353, -0.003 / 0.963, -0.335
Crop on both sides: -0.979, 0.982 / 0.945, 0.985
Contrast -20: 0.158, 0.017 / 0.376, 0.738
Histogram Equalization: 0.586, 0.657 / 0.716, 0.823
Gamma Correction 0.60: -0.942, 0.946 / 0.987, 0.997

431
BEST EXTRACTIONS

Gaussian Blur 5x5, Gaussian Noise 0.3, Pixelate 2 (mosaic), JPEG
30:1, JPEG2000 50:1, Sharpen 80, Rescale 512 → 256 → 512, Rotate 20°,
Crop on both sides, Contrast -20, Histogram Equalization, Gamma
Correction 0.60

432
PURE SVD DOMAIN WATERMARKING

Gaussian Blur 5x5, Gaussian Noise 0.3, Pixelate 2 (mosaic), JPEG
30:1, JPEG2000 50:1, Sharpen 80, Rescale 512 → 256 → 512, Rotate 20°,
Crop on both sides, Contrast -20, Histogram Equalization, Gamma
Correction 0.60

433
CONCLUSIONS

‰ Watermark embedding in the LL subband is resistant to


one group of attacks.
 Gaussian blur, Gaussian noise, pixelation, JPEG
compression, JPEG2000 compression, and rescaling
‰ Watermark embedding in the HH subband is resistant to
another group of attacks.
 sharpening, cropping, contrast adjustment, histogram
equalization, and gamma correction
‰ Watermark embedding in the LH subband is resistant to
the rotation attack.
‰ Observers can evaluate the quality of constructed
watermarks either subjectively or objectively.
 Subjective evaluation: the reference watermark is
compared with the watermark constructed after an attack.
 Objective evaluation: statistical measures like Pearson’s
correlation coefficient can be used, not requiring the
singular vectors of the watermark image.

434
DFT DOMAIN WATERMARKING

‰ DWT or DCT domain semi-blind or blind watermarking


schemes are not robust against geometric attacks (rotation,
translation, scaling, etc.).
‰ Because of the properties of the DFT, recent research has
focused on DFT-based watermarking.
‰ Three approaches
 Both a template and a watermark are embedded (Pereira & Pun, 2000).
„ The template contains no information but is used to detect the
transformation applied to the watermarked image.
 No template, the watermark is resistant to most geometric attacks
(Caldelli, Barni, Bartolini & Piva, 2000)
 Circular watermark (Solachidis & Pitas, 1999; Licks & Jordan,
2000)
„ To guarantee that the inverse DFT is real, the watermark must satisfy
the following property: W(u,v)=W(N-u,N-v)

435
EMBEDDING MULTIPLE CIRCULAR WATERMARKS
IN THE DFT DOMAIN

‰ In two papers, a circularly symmetric watermark is


embedded in the DFT domain
 Solachidis and Pitas [1999]: a 2D circularly symmetric
sequence in a ring covering the middle frequencies
 Licks and Jordan [2000]: use a watermark in the form of a
circle with a radius that corresponds to higher frequencies of
the image
‰ E. Ganic and A. M. Eskicioglu, “A DFT-Based Semi-Blind
Multiple Watermarking Scheme for Images,” 4th New York
Metro Area Networking Workshop, The Graduate Center,
CUNY, September 10, 2004.
 We extend the multiple watermarking idea by inserting two
circular watermarks in the DFT domain.
 Test image: 256x256 Lena
 Two radii:
„ 32 (corresponds to lower frequencies)
„ 96 (corresponds to higher frequencies)

436
MAJOR DISADVANTAGE OF CIRCULAR WATERMARKS

Watermarked Lena Magnitudes of DFT coefficients

437
TEST IMAGE

Original Lena Watermarked Lena

438
ATTACKS

JPEG Gaussian noise blurring Gamma correction

Resizing Histogram equalization Contrast adjustment Rotation

Cropping

Scaling

439
DETECTION

‰ Presence of the watermark is detected using the correlation

c = Σu Σv W(u, v) Mw*(u, v),  u, v = 1, …, N

‰ Decision rule
 H0: the image is watermarked with W if c > T
 H1: the image is not watermarked with W if c < T
‰ Threshold T = (µ0 + µ1)/2
 µ0: the expected value of the Gaussian probability density function
(pdf) associated with hypothesis H0
 µ1: the expected value of the Gaussian pdf associated with
hypothesis H1
‰ Detection anomalies
 False positives: detecting the watermark in an unmarked image
 False negatives: not detecting the watermark in a marked image

440
EXPERIMENTAL RESULTS:
THRESHOLDS AND FALSE NEGATIVES

Radius = 96 Radius = 32
T % T %
JPEG 0.086 48 0.228 12
Gaussian noise 0.110 37 0.206 18
blurring 0.120 51 0.228 13
resizing 0.093 55 0.227 13
histogram equalization 0.272 1 0.267 14
contrast adjustment 0.273 0 0.232 11
gamma correction 0.271 0 0.231 11
scaling 0.251 1 0.233 11
rotation 0.142 35 0.174 42
cropping 0.154 21 0.150 34

441
EXPERIMENTAL RESULTS:
THRESHOLDS AND FALSE POSITIVES

Radius = 96 Radius = 32
T % T %
JPEG 0.086 40 0.228 7
Gaussian noise 0.110 24 0.206 13
blurring 0.120 41 0.228 8
resizing 0.093 45 0.227 8
histogram equalization 0.272 0 0.267 4
contrast adjustment 0.273 0 0.232 6
gamma correction 0.271 0 0.231 6
scaling 0.251 0 0.233 7
rotation 0.142 23 0.174 26
cropping 0.154 8 0.150 31

442
CONCLUSIONS

‰ Embedded in higher frequencies


 the percentage of false negatives or positives is higher for one
group of attacks
„ JPEG, Gaussian noise, blurring, and resizing
 the percentage of false negatives or positives is lower for
another group of attacks
„ histogram equalization, contrast adjustment, gamma
correction, scaling, rotation, and cropping
‰ Embedding in lower frequencies
 the percentage of false negatives or positives is lower for one
group of attacks
„ JPEG, Gaussian noise, blurring, and resizing
 the percentage of false negatives or positives is higher for
another group of attacks
„ histogram equalization, contrast adjustment, gamma
correction, scaling, rotation, and cropping
‰ For all attacks, the percentages of false positives are lower
than false negatives.

443
AN EFFECTIVE BLIND WATERMARKING SYSTEM
FOR COLOR IMAGES

‰ J. Kusyk and A.M. Eskicioglu, “An Effective Blind Watermarking


Scheme for Color Images through Comparison of DFT
Coefficients,” ACM Multimedia and Security Workshop 2005, New
York, NY, August 1-2, 2005.
‰ Proposed algorithm
‰ Attacks
 Cropping
 Histogram equalization
 Low pass filtering
 Gamma correction
 Gaussian noise
 Intensity adjustment
 JPEG compression
 Resizing
 Rotation
 Rescaling
 Fragmentation
‰ Experiments
‰ Conclusions
444
EMBEDDING

‰ Compute the DFT of the NxN cover image.


‰ Move the origin to the center.
‰ Obtain the magnitudes of DFT coefficients.
‰ Divide the NxN matrix of magnitudes into four (N/2)x(N/2) matrices Mul, Mur, Mll,
Mlr. ul: upper left, ur: upper right, ll: lower left, lr: lower right.
‰ Define three frequency bands: low, middle, and high.
‰ Embed a visual binary watermark in these three bands by determining the
embedding locations.
‰ In each band:
 Choose a magnitude a in matrix Mul, and the corresponding magnitude b in matrix Mur.
 Compute the mean m = (a+b)/2, and choose the value of the parameter p.
 Embedding bit 1: if a < m − (p/2)m, do not modify a and b;
otherwise set a = m − (p/2)m and b = m + (p/2)m.
 Embedding bit 0: if a > m + (p/2)m, do not modify a and b;
otherwise set a = m + (p/2)m and b = m − (p/2)m.
(A sketch of this pairwise rule follows the procedure.)
‰ Copy the modified magnitudes in matrix Mul to matrix Mlr.
‰ Copy the modified magnitudes in matrix Mur to matrix Mll.
‰ Obtain the DFT coefficients of the entire image using the modified magnitudes.
‰ Compute the inverse DFT.
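A sketch of the per-bit magnitude adjustment described above (pure
Python; a and b stand for one symmetric magnitude pair, and p for the
band's strength parameter):

def embed_bit(a, b, bit, p=0.6):
    # push the pair apart around its mean m by (p/2)*m, per the rule above
    m = (a + b) / 2.0
    d = (p / 2.0) * m
    if bit == 1:                              # want a < b at extraction time
        return (a, b) if a < m - d else (m - d, m + d)
    return (a, b) if a > m + d else (m + d, m - d)   # bit 0: want a > b

def extract_bit(a, b):
    return 0 if a > b else 1

a2, b2 = embed_bit(120.0, 100.0, bit=1)
print(a2, b2, extract_bit(a2, b2))            # 77.0 143.0 1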

445
EXTRACTION

‰ Compute the DFT of the NxN watermarked (and possibly attacked) image.
‰ Move the origin to the center.
‰ Obtain the magnitudes of DFT coefficients.
‰ Divide the NxN matrix of magnitudes into four (N/2)x(N/2) matrices Mul, Mur, Mll,
Mlr.
‰ Use the three frequency bands and the embedding locations defined in the
embedding process: low, middle, and high.
‰ In each band, if a > b then bit = 0 else bit = 1.

446
EMBEDDING THE WATERMARK
INTO LUMINANCE LAYER

A 12x12 binary watermark is embedded in the luminance layer of a
512x512 cover image; every 4th magnitude in the chosen locations is
modified. Three frequency areas are marked in Mul at (20,20),
(100,100), and (180,180), with p ranging over 10-60%, 60-90%, and
120% from the high frequency area toward the low frequency area.

The NxN magnitude matrix is split into quadrants:

Mul Mur
Mll Mlr

The modified magnitudes in Mul are copied to Mlr, and the modified
magnitudes in Mur are copied to Mll.
447
WATERMARKING A FULL COLOR IMAGE
USING MATLAB

Cover image: 512x512 Lena Watermarked Lena: PSNR=46.83dB

Cover image DFT coefficient magnitudes Watermarked image DFT coefficient magnitudes
448
EXTRACTIONS FROM UNATTACKED
LUMINANCE LAYER

Extractions from the unattacked luminance layer:

High frequency band: SR = 0.986
Mid frequency band: SR = 1.000
Low frequency band: SR = 1.000

Similarity Ratio (SR) = S/(S+D)
S: # of matching pixel values in the compared images
D: # of different pixel values in the compared images

449
EXTRACTIONS FROM UNWATERMARKED
LUMINANCE LAYER

Extractions from an unwatermarked luminance layer (no watermark
present, so SR is near chance):

High frequency band: SR = 0.521
Mid frequency band: SR = 0.458
Low frequency band: SR = 0.486

450
EXTRACTIONS FROM CROPPED
LUMINANCE LAYER

Attack: cropping.

High frequency band: SR = 0.910
Mid frequency band: SR = 0.944
Low frequency band: SR = 0.826

451
EXTRACTIONS FROM HISTOGRAM EQUALIZED
LUMINANCE LAYER

Attack: histogram equalization.

High frequency band: SR = 0.896
Mid frequency band: SR = 0.979
Low frequency band: SR = 0.972

452
EXTRACTIONS FROM LOW PASS FILTERED
LUMINANCE LAYER

Attack: low pass filtering (3x3).

High frequency band: SR = 0.465
Mid frequency band: SR = 0.625
Low frequency band: SR = 0.993

453
EXTRACTIONS FROM GAMMA CORRECTED
LUMINANCE LAYER

Attack: gamma correction (1.5).

High frequency band: SR = 0.958
Mid frequency band: SR = 0.972
Low frequency band: SR = 0.951

454
EXTRACTIONS FROM GAUSSIAN NOISY
LUMINANCE LAYER

Attack: Gaussian noise (0, 0.001).

High frequency band: SR = 0.535
Mid frequency band: SR = 0.806
Low frequency band: SR = 0.958

455
EXTRACTIONS FROM INTENSITY ADJUSTED
LUMINANCE LAYER

Attack: intensity adjustment ([0 0.8], [0 1]).

High frequency band: SR = 0.958
Mid frequency band: SR = 0.993
Low frequency band: SR = 0.986

456
EXTRACTIONS FROM JPEG COMPRESSED
LUMINANCE LAYER

Attack: JPEG compression (Q = 50).

High frequency band: SR = 0.458
Mid frequency band: SR = 0.486
Low frequency band: SR = 0.979

457
EXTRACTIONS FROM RESIZED
LUMINANCE LAYER

Attack: resizing (512 → 256 → 512).

High frequency band: SR = 0.549
Mid frequency band: SR = 0.667
Low frequency band: SR = 0.993

458
EXTRACTIONS FROM ROTATED
LUMINANCE LAYER

Attack: rotation (5°).

High frequency band: SR = 0.618
Mid frequency band: SR = 0.924
Low frequency band: SR = 0.931

459
EXTRACTIONS FROM SCALED
LUMINANCE LAYER

Attack: scaling (512 → 1024).

High frequency band: SR = 0.819
Mid frequency band: SR = 0.958
Low frequency band: SR = 0.972
460
EXTRACTIONS FROM FRAGMENTED &
TRANSLATED LUMINANCE LAYER

Attack: fragmentation and translation.

High frequency band: SR = 0.986
Mid frequency band: SR = 1.000
Low frequency band: SR = 1.000

461
EXTRACTIONS FROM COLLUDED
LUMINANCE LAYER

Attack: collusion (3 images added and scaled; the inputs include
copies watermarked with "X" at 46.83dB and with "I" at 47.43dB).

High frequency band: SR = 0.736
Mid frequency band: SR = 0.708
Low frequency band: SR = 0.701
462
EXTRACTIONS FROM REWATERMARKED
LUMINANCE LAYER

Attack: rewatermarking (the image watermarked with "X" at 46.83dB is
rewatermarked with "I" at 43.41dB). Two extractions per band:

High frequency band: SR = 0.938 and SR = 1.000
Mid frequency band: SR = 0.993 and SR = 0.993
Low frequency band: SR = 1.000 and SR = 1.000
463
BEST EXTRACTIONS

Cropping: SR = 0.944; H. equalization: SR = 0.979; Low pass
filtering: SR = 0.993; Gamma correction: SR = 0.972; Gaussian noise:
SR = 0.958; Int. adjustment: SR = 0.993; JPEG compression:
SR = 0.979; Resizing: SR = 0.993; Rotation (4°): SR = 0.951;
Scaling: SR = 0.972; Fragmentation: SR = 1.000
464
CONCLUSIONS

‰ A DFT-based blind watermarking scheme is presented.


 Magnitudes of DFT coefficients are compared.
 A binary logo is embedded in the luminance layer.
‰ The same watermark is embedded in three frequency bands.
‰ The proposed blind scheme is robust and secure.
 Robustness
„ For each attack, there are three extractions.
„ The extraction with the best visual quality is used.
 Security
„ The hacker does not know the embedding location or what is embedded.
„ In a binary watermark, each pixel value may be either “0” or “1”.
„ For a 512x512 cover image, a brute force attack needs 2^(255x255) trials.
‰ In our implementation, the value of p is the smallest for embedding in the
location closest to the origin. As the embedding location moves away from the
origin, p is assigned a larger value.
‰ Extractions from the lowest frequency band are best for one group of attacks.
 low pass filtering, adding Gaussian noise, JPEG compression, resizing, rotation, and
scaling
‰ Extractions from the middle frequency band are best for another group of
attacks.
 Cropping, histogram equalization, gamma correction, intensity adjustment
‰ Extractions from the fragmented image are identical to extractions from the
unattacked image.

465
