
CIS 769

DIGITAL IMAGE PROCESSING

Ahmet M. Eskicioglu
Department of CIS
Brooklyn College

1
OVERVIEW

‰ Classification of graphic files


‰ What is a digital image?
‰ What is digital image processing?
‰ Image enhancement
‰ Image restoration
‰ Color image processing
‰ Image compression
‰ Morphological image processing
‰ Image segmentation
‰ Representation and description
‰ Object recognition

2
TEXTBOOKS

‰ Required Textbook
 R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall,
2002.
‰ Recommended Textbooks
 W. K. Pratt, Digital Image Processing, John Wiley and Sons, 2001.
 T. Bose, Digital Signal and Image Processing, John Wiley and Sons,
2004.
 I. J. Cox, M. L. Miller and J. A. Bloom, Digital Watermarking, Morgan
Kaufmann Publishers, 2001.
 F. Halsall, Multimedia Communications, Addison-Wesley, 2001.
 K. R. Rao, Zoran S. Bojkovic, Dragorad A. Milovanovic, Multimedia
Communication Systems, Prentice Hall, 2002.
 R. Steinmetz, K. Nahrstedt, Multimedia Fundamentals, Prentice Hall,
2002.
 M. Arnold, M. Schmucker, S. D. Wolthusen, Techniques and Applications
of Digital Watermarking and Content Protection, Artech House, 2003.
 Mark S. Drew, Ze-Nian Li, Fundamentals of Multimedia, Prentice Hall,
2004.

3
GRAPHIC FILES

‰ Graphic file types


 indexed system vs. RGB color
 bit-mapped vs. object-oriented
‰ Indexed system
 Associates one number with each color in the palette.
 Example: GIF format (8 bits per pixel, indexing a palette of up to 256 colors)
‰ RGB color system
 3 numbers associated with each color: R, G, B (primary colors)
 The color obtained is a mix of the 3 primary colors.
 Total number of possible colors: 256x256x256 (about 16.7 million)
 Example: JPEG
‰ Bit-mapped images (raster graphics)
 Images are created or captured as a set of samples.
 Each image is treated as a set of pixels each with a color.
 The information for each pixel is stored.
 Drawing and editing affects the pixels directly.
 A raster is a grid of x and y coordinates on a display space.
 Standardized format used in PC and Mac platforms.
 Examples: JPEG, GIF
‰ Object-oriented images (vector graphics)
 Creation of images through a sequence of commands or mathematical statements that place lines
and shapes in a given two- or three-dimensional space.
 The computer maintains a list of objects and their placements in space.
 A vector is a representation of both a quantity and a direction.
 A vector graphics file is created and saved as a sequence of vector statements.
 Objects can be moved around and edited.
 When an object-oriented image is printed, the description of the object is sent to the
processor in the printer which renders it in the printer’s resolution.
 One size fits all.
 Example: PostScript

4
EXAMPLE: BIT-MAPPED VS. OBJECT-ORIENTED

5
WHAT IS A DIGITAL IMAGE?

‰ A digital image can be defined as a function of two
variables, f(x,y), where x and y are spatial coordinates, and
the amplitude of f at a given pair of coordinates is called the
intensity of the image at that point. Every digital image is
composed of a finite number of elements, called pixels,
each with a particular location and a finite value.

6
WHAT IS DIGITAL IMAGE PROCESSING?

‰ Digital image processing: analysis, manipulation,


storage, and display of graphical images from sources
such as photographs, drawings, and video.
‰ The process comprises 3 steps
 Input: the image is either directly captured with a digital
camera or an analog image is digitized in a scanner to obtain
the binary data a computer can understand.
 Processing: may include any process including image
enhancement, restoration, compression, segmentation, or
watermarking as well as extraction of image features.
 Output: consists of the display/printing of the processed
image, or the features extracted from the input image.

7
IMAGE ENHANCEMENT

‰ The main objective of image enhancement is to process


an image to make it more suitable for a specific
application.
 For example, a method that is useful for enhancing X-ray
images may not be the best method for enhancing pictures
of Mars transmitted by a space probe (i.e., a satellite or other
spacecraft that is designed to explore the solar system and
transmit data back to earth).
‰ Two broad categories:
 Spatial domain methods: based on direct manipulation of
pixels in an image.
 Frequency domain methods: based on modifying the
Fourier transform of an image.
‰ Smoothing filters and sharpening filters.

8
A SPATIAL DOMAIN EXAMPLE

9
A FREQUENCY DOMAIN EXAMPLE

Filtering in the frequency domain:

1. Multiply f(x,y) by (-1)^(x+y)
2. Compute F(u,v)
3. Multiply F(u,v) by H(u,v)
4. Compute the inverse DFT
5. Take the real part
6. Multiply the result by (-1)^(x+y)

10
IMAGE RESTORATION

‰ Image enhancement: largely a subjective process.


 contrast stretching is considered an enhancement
technique.
‰ Image restoration: primarily an objective process.
 removal of image blur is considered a restoration technique.
‰ Spatial domain methods and frequency domain methods.

11
A SPATIAL DOMAIN EXAMPLE

12
COLOR IMAGE PROCESSING

‰ The colors that humans (and some other animals)
perceive in an object are determined by the nature of the
light reflected from the object.
‰ Two major areas:
 Full color: Image is acquired with a full-color sensor (e.g., a
color TV camera)
 Pseudo color: a color is assigned to a particular
monochrome intensity or range of intensities.
‰ A color model (color space, color system) is a
specification of a coordinate system and a subspace
within that system where each color is represented by a
point.

13
RGB MODEL

14
PSEUDO COLOR

15
IMAGE COMPRESSION

‰ Lossless compression
 Huffman coding
Symbol Probability Code
a2 0.4 1
a6 0.3 00
a1 0.1 011
a4 0.1 0100
a3 0.06 01010
a5 0.04 01011
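As a rough illustration of how such a code can be derived, here is a minimal Python sketch of the standard Huffman construction (repeatedly merge the two least probable nodes). The symbols and probabilities come from the table above; tie-breaking among equal probabilities is arbitrary, so individual codewords may differ from the table while the average length (2.2 bits/symbol) matches.

```python
import heapq, itertools

def huffman(probs):
    """Build Huffman codes by repeatedly merging the two least
    probable nodes; returns {symbol: bitstring}."""
    tie = itertools.count()                     # tie-breaker for equal probabilities
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
codes = huffman(probs)
print(codes)
print(sum(p * len(codes[s]) for s, p in probs.items()))   # 2.2 bits/symbol on average
```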

‰ Lossy compression
 JPEG
 JPEG 2000

16
IMAGE COMPRESSION

JPEG: based on Discrete Cosine Transform

10:1 30:1 50:1

JPEG 2000: based on discrete wavelet transform

10:1 30:1 50:1

17
MORPHOLOGICAL IMAGE PROCESSING

‰ Morphology: the study of the form and structure of


organisms.
‰ Mathematical morphology: extracting image
components that are useful in the representation and
description of region shape.
 Morphological operations for binary images
 Morphological operations for gray scale images
‰ An example of a basic morphological algorithm for
binary images is boundary extraction.
‰ 2 fundamental operations: dilation and erosion

18
IMAGE SEGMENTATION

‰ Segmentation subdivides an image into its constituent


regions.
‰ Two categories of segmentation algorithms:
 partitioning an image based on abrupt changes in intensity
(e.g., line detection)
 partitioning an image into regions that are similar according
to a set of criteria

19
LINE DETECTION

Suppose we are interested in finding all the lines in the wire-bond mask
that are one pixel thick and are oriented at -45°. Which one of the above
masks do we use?

Figure: strongest response of the mask; thresholding using the max
value in the image.

20
REGION GROWING

Region growing is a procedure that groups pixels or subregions into


larger regions based on predefined criteria: start with a set of “seed”
points and grow regions by appending to each seed those neighboring
pixels that have properties similar to the seed.

21
IMAGE WATERMARKING

‰ Watermarking (data hiding) is the process of embedding


data into a multimedia element such as image, audio or
video.
‰ A watermarking algorithm consists of:
 the watermark structure
 an embedding algorithm
 an extraction (detection) algorithm
‰ Watermarks can be embedded in the pixel domain or the
transform domain.
‰ Desired attributes
 invisibility
 robustness
 high data capacity
‰ Attacks and normal A/V processes: noise, filtering
(blurring, sharpening, etc.), resampling, scaling, rotation,
cropping and lossy compression.

22
EXAMPLE IN THE DCT DOMAIN

W: watermark to be embedded.
X: sequence of pixel values.
X_D and Y_D: row-concatenated DCT coefficients of X and Y.
a = scaling factor: determines the intensity of the watermark.

Y_D(i) = X_D(i) (1 + a W(i))

W*: extracted version of the watermark.
Z: possibly forged watermarked image.

W*(i) = (1/a) (Z_D(i)/X_D(i) - 1)   ==>   S(W, W*) = (W · W*) / sqrt(W* · W*)

T = user-defined threshold. If S > T, the image is authentic.
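A minimal NumPy sketch of these formulas. The coefficient vector XD below is randomly generated as a stand-in for the row-concatenated DCT coefficients (in practice it would come from a 2-D DCT of the image), and the threshold value is illustrative.

```python
import numpy as np

def embed(XD, W, a=0.1):
    """Y_D(i) = X_D(i) * (1 + a * W(i))"""
    return XD * (1.0 + a * W)

def extract(ZD, XD, a=0.1):
    """W*(i) = (Z_D(i)/X_D(i) - 1) / a"""
    return (ZD / XD - 1.0) / a

def similarity(W, Wstar):
    """S(W, W*) = (W . W*) / sqrt(W* . W*)"""
    return W @ Wstar / np.sqrt(Wstar @ Wstar)

rng = np.random.default_rng(0)
XD = rng.normal(10.0, 2.0, 1000)     # stand-in for DCT coefficients of X
W = rng.normal(0.0, 1.0, 1000)       # watermark sequence
ZD = embed(XD, W)                    # here Z is the unmodified watermarked image
S = similarity(W, extract(ZD, XD))
print(S, S > 6.0)                    # compare S against a user-defined threshold T
```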

23
SCALING FACTOR a = 0.1, 0.5, 1.0, 5.0

Original image

a = 0.1 a = 0.5 a = 1.0 a = 5.0

24
CHAPTER 1: INTRODUCTION

‰ Interest in DIP methods stems from 2 principal


applications:
 Improvement of pictorial information for human interpretation
 Processing of image data for storage, transmission, and
representation for autonomous machine perception.
‰ What is a digital image?
 f(x,y), x&y: spatial plane coordinates, f(x,y): intensity
 x,y, and f(x,y): finite, discrete quantities
 picture elements, image elements, pels, pixels.
 Humans: limited to the visual band of EM spectrum.
 Imaging machines: cover the entire spectrum.
 Scope of image processing
„ Both I and O are images?
„ Includes computer vision?
„ Image analysis is between image processing and computer vision
 Low-level (I&O: images) process, mid-level (I: image, O:
extracted attributes) process, high-level process.

25
BARTLANE CABLE PICTURE
TRANSMISSION SYSTEM

‰ Early 1920s: Bartlane cable picture transmission system


 Reduced the time required to transport a picture across the
Atlantic from more than a week to less than three hours.
 Pictures were coded for cable transmission and then
reconstructed at the receiving end on a telegraph printer fitted
with type faces simulating a halftone pattern.
 The early Bartlane systems were capable of coding images in
five distinct brightness levels. This was increased to fifteen
levels in 1929.

26
MORE TRANSATLANTIC TRANSMISSIONS
IN THE 1920s

27
OTHER HISTORICAL DEVELOPMENTS

‰ 1964: Pictures of the moon transmitted by Ranger 7 were


processed by a computer at the Jet Propulsion Lab to
correct various types of image distortion inherent in the
on-board TV camera.
‰ Experience with Ranger 7 led to an improvement in image
enhancement and restoration in other projects.
 Surveyor missions to the moon
 Mariner series of flyby missions to Mars
 Apollo manned flights to the moon

28
APPLICATIONS OF IMAGE PROCESSING

‰ Early 1970’s: Invention of computerized axial tomography


‰ Applications of image processing
 For human interpretation
„ Geography
„ Archeology.
„ Physics
„ Astronomy
„ Biology
„ Nuclear medicine
„ Law enforcement
„ Defense
 For machine perception
„ Automatic character recognition
„ Industrial machine vision for product assembly and inspection
„ Military reconnaissance
„ Automatic processing of fingerprints
„ Screening of X-rays and blood samples
„ Machine processing of aerial and satellite imagery for weather
prediction and environment assessment
29
ENERGY SOURCES FOR IMAGES

‰ Electromagnetic energy spectrum: Principal source of


energy
 The full range of frequencies, from radio waves to gamma rays,
that characterizes light.
 Gamma rays: highest energy, shortest wavelength
 Radio waves: lowest energy, longest wavelength
‰ Electromagnetic radiation can be described in terms of a
stream of photons, each traveling in a wave-like pattern,
moving at the speed of light and carrying some amount of
energy.
‰ The amount of energy a photon has makes it sometimes
behave more like a wave and sometimes more like a particle.
This is called the "wave-particle duality" of light.
‰ Low energy photons (such as radio) behave more like waves,
while higher energy photons (such as X-rays) behave more
like particles.
‰ Other sources of energy: acoustic, ultrasonic, electronic

30
ELECTROMAGNETIC SPECTRUM

31
GAMMA-RAY

Major uses: Nuclear medicine & astronomical observations

Bone scan: locates sites of bone pathology.

PET: another major modality of nuclear imaging. One sample of a
sequence that constitutes a 3-D rendition of the patient.

A star in the constellation Cygnus exploded about 15,000 years ago.
The superheated stationary gas cloud (Cygnus Loop) glows in a
spectacular array of colors.

Gamma radiation from a valve in a nuclear reactor.
32
X-RAY

X-rays are among the oldest sources of EM radiation used for imaging.

Major uses: Medical diagnostics, astronomy, angiography, CAT scans

Chest X-ray.
X-ray image of an electronic circuit board.
Aortic angiogram.
Head CAT slice.
Cygnus Loop imaged in the X-ray band.

33
ULTRAVIOLET

Major uses: lithography, industrial inspection, microscopy, lasers, biological imaging,


astronomical observations.

Fluorescence is a phenomenon discovered in the middle of the 19th century.

Fluorescence microscopy is an excellent method for studying materials that can be made to
fluoresce (either in their natural form or when treated with chemicals capable of fluorescing).

Fluorescence microscope image of normal corn.

Corn infected by “smut,” a disease of cereals, corn, etc.

Cygnus Loop imaged in the high-energy region of the ultraviolet band.

34
VISIBLE AND INFRARED – LIGHT MICROSCOPE

Major uses: light microscopy, astronomy, remote sensing, industry, law enforcement.
Cholesterol

Taxol (anticancer agent) Microprocessor

Nickel oxide thin film Surface of audio CD

Organic superconductor

35
VISIBLE AND INFRARED – REMOTE SENSING

Remote sensing usually includes several bands in the visual and
infrared regions.

LANDSAT obtains and transmits images of the Earth from space, for
purposes of monitoring environmental conditions.

Multispectral imaging: one image for each band in the above table.

The differences between visual and infrared image features are quite
noticeable.

36
VISIBLE AND INFRARED – WEATHER
OBSERVATION AND PREDICTION

An image of a hurricane taken by a satellite using sensors in the
visible and infrared bands.

37
INFRARED – HUMAN SETTLEMENTS
(THE AMERICAS)

These images are part of the Nighttime Lights of the World data set,
which provides a global inventory of human settlements.

38
INFRARED – HUMAN SETTLEMENTS
(OTHER PARTS OF THE WORLD)

39
VISIBLE – AUTOMATED VISUAL INSPECTION OF
MANUFACTURED GOODS

Pill container.
A circuit board controller (the black square is a missing component).
A clear plastic part with an unacceptable number of air pockets.
A bottle that is not filled up to an acceptable level.
An intraocular implant (replacement lens for the human eye).
A batch of cereal inspected for color and the presence of anomalies
such as burned flakes.

40
VISIBLE – OTHER APPLICATIONS

Paper
currency
A thumb
print

Automated
license plate
reading

41
MICROWAVE – IMAGING RADAR

Imaging radar

• has the unique ability to collect data over virtually any region at
any time, regardless of weather or ambient lighting conditions.

• works like a flash camera with its own illumination.

• uses an antenna and digital computer processing to record images.

Lhasa River: note the clarity and detail of the image, unencumbered
by clouds or other atmospheric conditions that normally interfere
with images in the visual band!

42
RADIO – MRI

Major uses: medicine and astronomy

Magnetic resonance imaging (MRI):

• places a patient in a powerful magnet and passes radio


waves through the body in short pulses.

• each pulse causes a responding pulse of radio waves to


be emitted by the patient’s tissues.

• produces a 2D picture of a section of the patient.

43
CRAB PULSAR IMAGES

In the summer of 1054 A.D., Chinese astronomers reported that a star in the
constellation of Taurus suddenly became as bright as the full Moon. Fading
slowly, it remained visible for over a year. It is now understood that a spectacular
supernova explosion - the detonation of a massive star whose remains are now
visible as the Crab Nebula - was responsible for the apparition. The core of the
star collapsed to form a rotating neutron star or pulsar, one of the most exotic
objects known to 20th century astronomy. Like a cosmic lighthouse, the rotating
Crab pulsar generates beams of radio, visible, x-ray and gamma-ray energy
which, as the name suggests, produce pulses as they sweep across our view.

Each image gives a totally different view of the Pulsar.

44
OTHER IMAGING MODALITIES:
ACOUSTIC IMAGING

Major uses: geological exploration, industry, medicine

This target is brighter than the


surrounding layers because the
change in density in the target
region is larger.

45
OTHER IMAGING MODALITIES:
ULTRASOUND IMAGING

Ultrasound imaging: millions of HF sound pulses and


echoes are sent and received each second.

46
OTHER IMAGING MODALITIES:
ELECTRON MICROSCOPY

Electron microscopes

• use a focused beam of electrons instead of light.


• are capable of very high magnification (10,000X or more).
• transmission electron microscope (TEM), scanning electron microscope (SEM)

47
OTHER IMAGING MODALITIES:
COMPUTER-GENERATED OBJECTS

Images that are not obtained from physical objects.

48
FUNDAMENTAL STEPS IN
DIGITAL IMAGE PROCESSING

49
WHAT ARE THE FUNDAMENTAL STEPS?

‰ Image acquisition: capturing an image in digital form.


‰ Image enhancement: making an image look better in a subjective
way.
‰ Image restoration: improving the appearance of an image
objectively.
‰ Color image processing: color models and basic color processing
‰ Wavelets and multiresolution processing: Wavelet transform in one
and two dimensions.
‰ Image compression: reducing the stored and transmitted image
data.
‰ Morphological image processing: extracting image components that
are useful in the representation and description of shape.
‰ Image segmentation: partitioning an image into its constituent parts
or objects.
‰ Representation and description: boundary representation vs. region
representation. Boundary descriptors vs. region descriptors.
‰ Recognition: assigning a label to an object based on its descriptors.

50
COMPONENTS OF AN
IMAGE PROCESSING SYSTEM

51
WHAT ARE THE COMPONENTS?

‰ Image sensors and specialized IP HW


 A physical device that is sensitive to the energy radiated by the
object.
 A digitizer that converts the output of the physical sensing
device into digital form.
 HW that performs other primitive operations.
‰ Computer: Can range from a PC to a supercomputer.
‰ Image processing SW: Specialized modules that perform
specific tasks.
‰ Mass storage
 Short term storage: main memory, frame buffers
 On-line storage: magnetic disks and optical disks
 Archival storage: magnetic tapes and optical disks in
“jukeboxes”
‰ Hard copy: all types of printers, film cameras, optical disks
(CDs and DVDs).
‰ Networking: bandwidth is the key consideration.
52
CHAPTER 2: DIGITAL IMAGE FUNDAMENTALS

Structure of the human eye (d ≈ 20mm)

3 membranes:
• cornea & sclera
• choroid
• retina

Cornea: tough, transparent tissue that covers the anterior surface of the eye.

Sclera: opaque membrane.

Choroid: contains a network of blood vessels.

Iris diaphragm: contracts or expands to control the amount of light.

Lens: suspended by fibers attached to the ciliary body (60-70% water, 6% fat,
protein). Absorbs ~8% of the visible light spectrum. Infrared and ultraviolet
light are also absorbed.

Retina: innermost membrane. Two types of receptors: cones and rods.

Cones: 6-7 million. Each connected to its own nerve end. Highly sensitive to
color. Cone vision is called photopic vision.

Rods: 75-150 million. Several are connected to a single nerve end. Serve to
give an overall picture of the field of view. Not involved in color vision.
Rod vision is called scotopic vision.

Blind spot: the area on the retina without receptors that respond to light.

Fovea: circular indentation in the retina (~1.5mm in diameter).
53
CONE AND ROD CELLS

cone cell rod cell

54
DISTRIBUTION OF RODS AND CONES

55
IMAGE FORMATION IN THE EYE

The receptors transform radiant energy into electrical impulses that
are decoded by the brain.

What is the height of the image?

Focal length varies from 14mm to 17mm.

56
BRIGHTNESS ADAPTATION

The range of light intensity levels to which the HVS can adapt is
enormous – on the order of 10^10!

Experimental evidence indicates that subjective brightness is a


logarithmic function of the light intensity.

However, the HVS cannot operate over such a range simultaneously.

Brightness adaptation: the total range of distinct intensity levels that


can be discriminated is rather small.
Brightness adaptation level:
the current sensitivity level
of the HVS for any given set
of conditions.

Short curve represents the


range of subjective
brightness the eye can
perceive at this level.

The transition from scotopic to


photopic vision is gradual (-3 to
-1 mL in log scale).

57
BRIGHTNESS DISCRIMINATION

How does the eye discriminate between changes in light intensity at a


specific adaptation level?

A classical experiment:
A subject looks at a flat, uniformly illuminated area (large enough to occupy the
entire field of view).
The intensity I can be varied.
∆I is added in the form of a short-duration flash that appears as a circle in the
middle.
If ∆I is not bright enough, the subject says “no.”
As ∆I gets stronger, the subject may say “yes.”
When ∆I is strong enough, the subject will say “yes” all the time.
Weber ratio: ∆Ic/ I, ∆Ic: increment of illumination discriminable 50% of the time with
background illumination I.

58
WEBER RATIO AS FUNCTION OF INTENSITY

Poor brightness Good brightness


discrimination: A large discrimination: A small value
value of ∆Ic/ I means a of ∆Ic/ I means a small
large percentage change percentage change in
in intensity is intensity is discriminable.
discriminable.

59
PERCEIVED BRIGHTNESS: NOT A SIMPLE
FUNCTION OF INTENSITY – MACH BANDS

The HVS tends to


undershoot or
overshoot around
the boundary of
regions of
different
intensities.

The intensities of the stripes are constant, but we perceive a
brightness pattern that is strongly scalloped, especially near the
boundaries.

Ernst Mach (1838-1916) described the


phenomenon in 1865.

60
PERCEIVED BRIGHTNESS: NOT A SIMPLE FUNCTION
OF INTENSITY – SIMULTANEOUS CONTRAST

A region’s perceived brightness does not simply depend on its


intensity.

61
OPTICAL ILLUSIONS: OTHER EXAMPLES OF
HUMAN PERCEPTION PHENOMENA

The outline of a square is seen clearly although there are no lines
defining such a figure.

A few lines are sufficient to give the illusion of a complete circle.

Two horizontal line segments have the same length but one appears
shorter than the other.

All lines that are oriented at 45° are equidistant and parallel.

62
LIGHT AND ELECTROMAGNETIC SPECTRUM

‰ Electromagnetic spectrum
 The range of colors we perceive in visible light represents a
very small portion of the spectrum.
 Radio waves with wavelengths billions of times longer.
 Gamma rays with wavelengths billions of times smaller.
‰ Wavelength, frequency and energy
 λ = c/ν, c: speed of light (2.998x10^8 m/s), ν: frequency
 E = hν, h: Planck's constant
‰ Electromagnetic waves can be visualized as
 propagating sinusoidal waves with wavelength λ.
 a stream of massless particles, each traveling in a wavelike
pattern and moving at the speed of light.
„ Each massless particle contains a bundle of energy called a photon.
„ Higher frequency electromagnetic phenomena carry more energy
per photon.
‰ The visible band: 0.43 µm (violet) – 0.79 µm (red)
‰ Six broad color regions: violet, blue, green, yellow, orange, red.

63
ELECTROMAGNETIC SPECTRUM

64
COLOR PERCEPTION

‰ The colors perceived in an object are determined by the


nature of light reflected from the object.
‰ A body that reflects light and is relatively balanced in all
visible wavelengths appears white.
‰ A green object reflects light with wavelengths primarily in the
[500-570] nm range, and absorbs most of the energy at other
wavelengths.
‰ Achromatic or monochromatic light: light that is void of
color.
‰ Three basic quantities to describe the quality of a chromatic
light source
 Radiance: total amount of energy that flows from the light
source - measured in watts (W)
 Luminance: a measure of the amount of energy an observer
perceives from a light source - measured in lumens (lm).
 Brightness: a subjective descriptor of light perception that is
impossible to measure (embodies the achromatic notion of
intensity).
65
IMAGE SENSING AND ACQUISITION

‰ If a sensor can be developed with the capability of detecting


energy radiated by a band of the EM spectrum, we can image
events in that band!
‰ The wavelength of an EM wave required to see an object must be <=
the size of the object!
 A water molecule's diameter is ~10^-10 m.
 To study molecules, we need a source capable of emitting in the
far ultraviolet or soft X-ray region!
‰ Illumination
 Visible light, radar, infrared, X-ray, etc.
 Ultrasound, etc.
‰ Scene elements
 Familiar 3-D objects
 Molecules, buried rock formations, human brain
‰ Illumination energy
 Reflected from objects: light reflected from a planar object
 Transmitted through objects: X-ray through a patient’s body
66
SENSORS

3 principal
sensor
arrangements

Illumination energy is
transformed into
digital images.

67
SINGLE SENSOR

Arrangement used in
high-precision scanning

68
SENSOR STRIPS

Typical arrangement in
most flat bed scanners

Basis for
medical and
industrial CAT

69
SENSOR ARRAYS

The energy from an illumination source is reflected from a scene
element.

The lens projects the viewed scene onto the lens focal plane.

The incoming energy is collected and focused onto an image plane.

The sensor array produces outputs proportional to the integral of
the light received at each sensor.

The digitized image is obtained.

70
A SIMPLE IMAGE FORMATION MODEL

‰ When an image is generated from a physical process, its


values are proportional to energy radiated by a physical
source.
‰ Hence, 0 < f(x,y) < ∞
‰ f(x,y) may be characterized by 2 components:
 Amount of source illumination incident on the scene being
viewed – illumination i(x,y)
 Amount of illumination reflected by the objects in the scene –
reflectance r(x,y)
 The 2 functions combine as a product: f(x,y) = i(x,y)r(x,y)
 0 < i(x,y) < ∞ (theoretical bound)
„ Typical values of i(x,y)
z On a clear day: 90,000 lm/m²
z On a cloudy day: 10,000 lm/m²
z On a clear evening: 0.1 lm/m²

 0 < r(x,y) < 1 (theoretical bound)


„ Typical values of r(x,y)
z 0.01 for black velvet
z 0.65 for stainless steel
z 0.80 for flat-white wall paint

71
GRAY-SCALE IMAGES

‰ (x0,y0) => l = f(x0,y0)

‰ Lmin ≤ l ≤ Lmax

‰ Lmin: positive, Lmax: finite

‰ Lmin= iminrmin

‰ Lmax= imaxrmax

‰ Lmin ≈ 10, Lmax ≈ 1000: typical limits for indoor values

‰ [Lmin, Lmax]: gray scale

‰ [0,L-1], l =0 is black and l = L-1 is white.

‰ Intermediate values: shades of gray

72
SAMPLING AND QUANTIZATION

‰ The output of most sensors is a continuous voltage
waveform.

waveform.
‰ To create a digital image, continuous sensed data should
be converted into digital form.
‰ Two processes: sampling and quantization.
‰ Consider a function f(x,y) representing an image.
‰ f(x,y) is continuous w.r.t.
 x and y coordinates
 Amplitude
‰ Sampling: digitizing the coordinate values
‰ Quantization: digitizing the amplitude values

73
GENERATING A DIGITAL IMAGE

74
IMAGE ACQUISITION WITH A SENSING ARRAY

75
DIGITAL IMAGE REPRESENTATION

An M x N digital image is represented as a matrix of samples whose
corner elements are f(0,0), f(0,N-1), f(M-1,0), and f(M-1,N-1).

L: # of discrete gray levels, L = 2^k

b = M x N x k (# of bits required to store the image)
76
STORAGE REQUIREMENTS

77
SPATIAL AND GRAY LEVEL RESOLUTION

Sampling is performed by deleting rows and columns from the original image.

78
RESAMPLING INTO 1024X1024 PIXELS

Resampling is performed by row and column duplication.

79
256/128/64/32 GRAY LEVELS

In these
images, the #
of samples is
constant but
the # of gray
levels was
reduced from False contouring
256 to 32.

80
16/8/4/2 GRAY LEVELS

In these
images, the #
of samples is
constant but
the # of gray
levels was
reduced from
16 to 2.

81
ISOPREFERENCE CURVES

Huang (1965) varied N and k simultaneously, and attempted to
quantify the effects on image quality:

Sets of these 3 types of images were generated by varying N and k.

Observers were asked to rank the images according to their
subjective quality.

Results were summarized in the form of isopreference curves in the
Nk-plane. Points on an isopreference curve correspond to images of
equal subjective quality.

82
ZOOMING DIGITAL IMAGES

‰ Zooming requires 2 steps:


 The creation of new pixel locations.
 The assignment of gray levels to these new locations.
 Nearest neighbor interpolation
„ A 500x500 image will be enlarged 1.5 times.
„ Lay an imaginary 750x750 grid over the original image.
„ The spacing is less than one pixel because the grid is fitted over a
smaller image.
„ For any point in the overlay, look for the closest pixel in the original
image, and assign its gray level to the new pixel in the grid.
„ When all the new pixels are assigned values, expand the overlay grid
to the original specified size to obtain the zoomed image.
„ Pixel replication is a special case that is applicable when the size of
the image needs to be increased an integer number of times.
„ Nearest neighbor interpolation is fast but it produces a checkerboard
effect!
 Bilinear interpolation - a slightly more sophisticated method (see the sketch below)
„ (x',y'): coordinates of a point in the zoomed image.
„ v(x',y'): gray level assigned to the point.
„ v(x',y') = ax' + by' + cx'y' + d (the 4 coefficients are determined using the 4
nearest neighbors of (x',y')).
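A minimal NumPy sketch of both interpolation rules for zooming a 2-D grayscale array; the function name and the edge handling are illustrative, not prescribed by the slides.

```python
import numpy as np

def zoom(img, factor, bilinear=True):
    """Zoom a 2-D grayscale array: map each new grid point back onto the
    original image, then interpolate (nearest neighbor or bilinear)."""
    M, N = img.shape
    out = np.zeros((int(M * factor), int(N * factor)))
    for xp in range(out.shape[0]):
        for yp in range(out.shape[1]):
            x, y = xp / factor, yp / factor          # position on the original grid
            if not bilinear:                         # nearest neighbor
                out[xp, yp] = img[min(round(x), M - 1), min(round(y), N - 1)]
            else:                                    # v = ax' + by' + cx'y' + d
                x0, y0 = min(int(x), M - 2), min(int(y), N - 2)
                dx, dy = x - x0, y - y0
                out[xp, yp] = ((1 - dx) * (1 - dy) * img[x0, y0]
                               + dx * (1 - dy) * img[x0 + 1, y0]
                               + (1 - dx) * dy * img[x0, y0 + 1]
                               + dx * dy * img[x0 + 1, y0 + 1])
    return out
```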
83
SHRINKING DIGITAL IMAGES

‰ Similar to image zooming.


‰ Shrinking an image an integer number of times
 Pixel replication is replaced by row&column deletion.
‰ Shrinking an image by a non-integer factor
 Expand the grid to fit over the original image.
 Do gray-level interpolation (nearest neighbor or bilinear).
 Shrink the grid back to its original specified size.
‰ It is possible to use more neighbors for interpolation.
 Points are fitted with a more complex surface.
 Generally gives smoother results.
 An important consideration for a number of applications.
„ Image generation for 3D graphics
„ Medical image processing
 The extra computational burden is seldom justifiable for
general-purpose digital image zooming and shrinking.
„ In general, bilinear interpolation is the method of choice.

84
IMAGE ZOOMING: NEAREST NEIGHBOR
INTERPOLATION VS. BILINEAR INTERPOLATION

85
NEIGHBORS OF A PIXEL

4 diagonal neighbors
of p, ND(p)

8-neighbors of p, N8(p) 4-neighbors of p, N4(p)

pixel p at (x,y)

86
ADJACENCY, CONNECTIVITY

‰ V: set of gray level values


‰ 3 types of adjacency
 4-adjacency: 2 pixels p and q with values from V are 4-adjacent
if q is in the set N4(p)
 8-adjacency: 2 pixels p and q with values from V are 8-adjacent
if q is in the set N8(p)
 m-adjacency: 2 pixels p and q with values from V are m-adjacent
if
„ q is in N4(p), or
„ q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values
are from V
‰ A digital path from pixel p with coordinates (x,y) to pixel q
with coordinates (s,t) is a sequence of distinct pixels with
coordinates (x0,y0), (x1,y1), …, (xn,yn), where (x0,y0)= (x,y)
and (xn,yn)=(s,t), and pixels (xi,yi) and (xi-1,yi-1) are adjacent
for 1 ≤ i ≤ n.

87
REGIONS, BOUNDARIES

‰ S: a subset of pixels in an image. Two pixels p and q are


said to be connected in S if there exists a path between
them consisting entirely of pixels in S.
 For any pixel p in S, the set of pixels that are connected to it in
S is called a connected component of S.
 If S has only one connected component, it is called a connected
set.
‰ R: a subset of pixels in an image. R is a region of the
image if R is a connected set.
‰ The boundary of a region R is the set of pixels in the region
that have one or more neighbors that are not in R.

88
DISTANCE MEASURES

‰ For pixels p at (x,y), q at (s,t), and z at (v,w), a distance
function D satisfies:
 D(p,q) ≥ 0 (D(p,q) = 0 iff p = q)
 D(p,q) = D(q,p)
 D(p,z) ≤ D(p,q) + D(q,z)

‰ Euclidean distance between p and q: De(p,q) = [(x-s)² + (y-t)²]^(1/2)

‰ D4 distance: D4(p,q) = |x-s| + |y-t|

‰ D8 distance: D8(p,q) = max (|x-s| , |y-t|)

‰ D4 and D8 distances between p and q are independent of any


paths that might exist between the points.
‰ For m-adjacency, Dm distance between two points is defined as
the shortest m-path between the points.
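A tiny Python sketch of the three distance measures (the function name is illustrative):

```python
def distances(p, q):
    """Euclidean, city-block (D4) and chessboard (D8) distances
    between pixels p = (x, y) and q = (s, t)."""
    (x, y), (s, t) = p, q
    De = ((x - s) ** 2 + (y - t) ** 2) ** 0.5
    D4 = abs(x - s) + abs(y - t)
    D8 = max(abs(x - s), abs(y - t))
    return De, D4, D8

print(distances((0, 0), (3, 4)))   # (5.0, 7, 4)
```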

89
LINEAR AND NONLINEAR OPERATIONS

‰ H: an operator whose I and O are images.


‰ f and g: any two images
‰ a and b: two scalars
‰ H is a linear operator if H(af + bg) = aH(f) + bH(g).
‰ Examples
 Sum of K images: operator is linear.
 Absolute value of the difference of 2 images: operator is not
linear.
‰ Linear operations are very important in image processing
because the theory is well-established.
‰ Nonlinear operations sometimes offer better performance,
but their theory is not as well understood!

90
CHAPTER 3: IMAGE ENHANCEMENT IN THE
SPATIAL DOMAIN

‰ Principal objective: to process an image so that the result


is more suitable than the original image for a specific
application.
‰ There is no general theory of image enhancement.
‰ The viewer is the ultimate judge of how well a particular
method works.
‰ Visual evaluation of image quality is a highly subjective
process!
‰ Spatial domain: aggregate of pixels composing an image.
‰ Spatial domain processes: g(x,y) = T[f(x,y)]

Processed Input
image image

Operator on f defined over


some neighborhood of f(x,y)

91
DEFINING A NEIGHBORHOOD

A square or rectangle is commonly


used in defining the neighborhood
about a point (x,y).

Operator T is applied at each location (x,y)


to produce the output g at that location.

Other neighborhood shapes


(e.g., approximation to a circle)
are sometimes used.

92
GRAY LEVEL TRANSFORMATION FUNCTION

Contrast stretching: produces an image of higher contrast than the
original.

Thresholding function: produces a binary image.

Neighborhood of size 1x1 => T is a gray level transformation function.

Larger neighborhoods => masks (filters)

93
BASIC GRAY LEVEL TRANSFORMATIONS

‰ Among the simplest image enhancement techniques.


‰ s = T(r): T maps a pixel value r into a pixel value s.
‰ Three basic types of transformations
 Linear (negative and identity)
 Logarithmic (log and inverse-log)
 Power law (nth power and nth root)

94
IMAGE NEGATIVES

Shows a small lesion.

s = L - 1 - r
Produces an equivalent of a photographic negative.
Enhances white or gray detail embedded in dark regions.

95
LOG TRANSFORMATIONS

8-bit display
Range of values: [0, 1.5x10^6]

Compresses the dynamic range of images with large variations in
pixel values.

s = c log(1 + r), r ≥ 0
Maps a narrow range of low gray-level values into a wider range.
Maps a wide range of high gray-level values into a narrower range.

96
POWER LAW TRANSFORMATIONS

s = c r^γ   or   s = c (r + ε)^γ,   with c, γ > 0

Curves with γ < 1 map a narrow range of dark input values into a
wider range of output values, and a wide range of light input values
into a narrower range of output values.

Curves with γ > 1 map a wide range of dark input values into a
narrow range of output values, and a narrow range of light input
values into a wider range of output values.
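A minimal NumPy sketch of the three basic transformation families on an 8-bit image; the scaling constants c are chosen here so the output stays in [0, L-1], a common convention rather than something the slides prescribe.

```python
import numpy as np

L = 256
r = np.arange(L, dtype=float)          # one sample of each input gray level

s_negative = (L - 1) - r                          # image negative: s = L - 1 - r
s_log = (L - 1) / np.log(L) * np.log1p(r)         # log transform: s = c log(1 + r)
gamma = 0.4                                       # gamma < 1 expands dark values
s_power = (L - 1) ** (1 - gamma) * r ** gamma     # power law: s = c r^gamma
```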
97
GAMMA CORRECTION

s = r^(1/2.5)

A variety of devices for image capture, printing and display respond
according to a power law.

Such systems tend to produce images that are darker than intended.

98
CONTRAST MANIPULATION

Original image is predominantly dark! An expansion of gray levels
is desired.

s = r^0.6, s = r^0.4, s = r^0.3

99
CONTRAST MANIPULATION

Original image has a washed-out appearance! A compression of gray
levels is desired.

s = r^3.0, s = r^4.0, s = r^5.0

100
PIECEWISE-LINEAR TRANSFORMATIONS

‰ A complementary approach to the previous methods.


‰ Advantage: The form of piecewise functions can be
arbitrarily complex.
‰ Disadvantage: requires considerably more user input.
‰ Contrast stretching
 Increases the dynamic range of gray levels in the image.
‰ Gray-level slicing
 Highlights a specific range of gray levels in the image.
‰ Bit-plane slicing
 Highlights the contribution made by specific bits.
 If each pixel is represented by 8 bits, the image is sliced into 8
planes.

101
CONTRAST STRETCHING

Typical transformation: (r1,s1) and (r2,s2) control the shape of the
function.

r1 = s1 and r2 = s2 => linear function (no change)

r1 = r2, s1 = 0, s2 = L - 1 => thresholding function

r1 ≤ r2 and s1 ≤ s2 => single-valued, monotonically increasing
function

(r1, s1) = (rmin, 0), (r2, s2) = (rmax, L - 1): stretch to the full range

(r1, s1) = (m, 0), (r2, s2) = (m, L - 1): threshold at the mean gray
level m

102
GRAY-LEVEL SLICING

Two approaches:

Display a high level for all


gray levels in the range of
interest and a low value for
all other gray levels.

Brighten the desired range


of gray levels but preserve
the background and gray
level tonalities.

A gray-scale image; transformation (a) applied.

103
BIT-PLANE SLICING

Slicing into 8 bit planes.

Higher-order bits contain the majority of the visually significant
data.

Other bit planes contain subtle details.
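A minimal NumPy sketch of bit-plane slicing for an 8-bit image:

```python
import numpy as np

def bit_planes(img):
    """Slice an 8-bit grayscale image into its 8 bit planes;
    planes[7] is the most significant (visually dominant) plane."""
    return [(img >> k) & 1 for k in range(8)]

img = np.random.default_rng(1).integers(0, 256, (4, 4), dtype=np.uint8)
planes = bit_planes(img)
restored = sum(p.astype(np.uint16) << k for k, p in enumerate(planes))
assert np.array_equal(restored, img)   # the 8 planes reconstruct the image
```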

104
HISTOGRAM PROCESSING

‰ Histogram of a digital image


 h(rk) = nk
 rk : kth gray level
 nk : # of pixels having gray level rk
‰ Normalized histogram
 p(rk) = nk / n
 Σ all components of a normalized histogram = 1.
‰ Histogram manipulation for image enhancement.
‰ Histogram data is also useful in other applications.
 Image compression
 Segmentation
‰ Histograms are simple to calculate in software.

105
FOUR IMAGE TYPES

Compare the four images


and their histograms.

106
HISTOGRAM EQUALIZATION

s_k = Σ_{j=0..k} (n_j / n),  k = 0, 1, 2, ..., L-1

Has the general tendency of spreading the histogram of the input
image.
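A minimal NumPy sketch of this transformation, with the usual extra step of scaling s_k from [0,1] to the display range [0, L-1]:

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: s_k = sum_{j<=k} n_j / n, scaled
    to [0, L-1] for display."""
    hist = np.bincount(img.ravel(), minlength=L)   # n_j for each gray level
    s = np.cumsum(hist) / img.size                 # running sum of n_j / n
    lut = np.round((L - 1) * s).astype(np.uint8)   # gray-level mapping r_k -> s_k
    return lut[img]
```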

107
AN EXAMPLE: LENA

108
HISTOGRAM MATCHING

Histogram
equalization

Is the result
any good?

There are applications for which histogram equalization is not the


best approach.

Sometimes it is useful to specify the shape of the histogram for the


processed image.

Histogram matching: method to generate a processed image that


has a specified histogram.

109
PROCEDURE FOR HISTOGRAM MATCHING

‰ Obtain the histogram of the given image.


‰ Precompute sk for each rk:
 sk = T(rk) = Σ_{j=0..k} pr(rj) = Σ_{j=0..k} (nj/n), k = 0,…,L-1
 nj: # of pixels with gray level rj
 n: total # of pixels in the image
 L: # of discrete gray levels
‰ Obtain the transformation function G from the given histogram:
 vk = G(zk) = Σ_{i=0..k} pz(zi) = sk, k = 0,…,L-1
‰ Precompute zk for each sk using the following iterative scheme:
 G is a hypothetical transformation.
 We need to find the zk that corresponds to each sk.
 zk must satisfy the equation G(zk) = sk, i.e., G(zk) - sk = 0
 Iterate on values of z to satisfy the equation for k = 0,…,L-1
„ Start with zk = ẑ, where ẑ is the smallest integer in [0,L-1],
and increment in integer values
„ For k = k+1, start with ẑ = zk
‰ rk → sk → zk (use the precomputed values for these mappings; a sketch follows below)
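A minimal NumPy sketch of the whole procedure; here the iterative search for z_k is replaced by an equivalent vectorized inversion (smallest z with G(z) ≥ s_k), and target_hist is any specified histogram.

```python
import numpy as np

def match_histogram(img, target_hist, L=256):
    """Histogram matching: r_k -> s_k via the image's CDF T, then
    s_k -> z_k by finding the smallest z with G(z) >= s_k."""
    s = np.cumsum(np.bincount(img.ravel(), minlength=L)) / img.size  # T(r_k)
    G = np.cumsum(target_hist) / np.sum(target_hist)                 # G(z_k)
    z = np.searchsorted(G, s)            # invert G(z_k) = s_k without iterating
    lut = np.clip(z, 0, L - 1).astype(np.uint8)
    return lut[img]
```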
110
SPECIFIED HISTOGRAM

Preserves the general


shape of the original
histogram but has a
smoother transition of
levels in the dark region.

G(z)
G-1(s)

Note that the low end of the


histogram has shifted right
toward the lighter region.

111
MAPPING FROM rk TO zk

112
LOCAL ENHANCEMENT WITH
HISTOGRAM PROCESSING

Histogram equalization
Histogram matching Global methods for overall enhancement

Sometimes it is necessary to enhance details over small areas.

1. Define a neighborhood around a pixel


2. At each location
• Compute the histogram
• Obtain the transformation function (equalization or specification)
• Use the function to map the gray level of the center pixel
3. Move the center to an adjacent pixel location.

Reveals the presence


of small squares
inside the larger dark
squares.

113
HISTOGRAM STATISTICS
FOR IMAGE ENHANCEMENT

Local mean and local variance can be used to develop simple yet
powerful enhancement techniques!

g(x,y) = E·f(x,y)  if m_Sxy ≤ k0·M_G and k1·D_G ≤ σ_Sxy ≤ k2·D_G
         f(x,y)    otherwise

m_Sxy, σ_Sxy: local mean and standard deviation over the
neighborhood S_xy; M_G: global mean; D_G: global standard deviation.

Example settings: E = 4.0, k0 = 0.4, k1 = 0.02, k2 = 0.4.
Selection of appropriate values requires experimentation (see the
sketch below).
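A direct (unoptimized) NumPy sketch of this rule, assuming a grayscale array and a 3x3 neighborhood:

```python
import numpy as np

def local_stats_enhance(f, n=3, E=4.0, k0=0.4, k1=0.02, k2=0.4):
    """Boost dark, low-contrast neighborhoods by E; leave others unchanged."""
    MG, DG = f.mean(), f.std()                     # global mean / std. dev.
    pad = n // 2
    fp = np.pad(f.astype(float), pad, mode="edge")
    g = f.astype(float).copy()
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            S = fp[x:x + n, y:y + n]               # neighborhood S_xy
            if S.mean() <= k0 * MG and k1 * DG <= S.std() <= k2 * DG:
                g[x, y] = E * f[x, y]
    return g
```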

114
ENHANCEMENT USING ARITHMETIC/LOGIC
OPERATIONS

‰ Arithmetic operations
 Subtraction and addition: the most useful in image enhancement
 Multiplication: used as a masking operation
 Division: multiplication of one image by the reciprocal of the
other.
‰ Logical operations
 AND and OR: used for masking
 NOT
 For these operations, gray-scale pixel values are processed as
strings of binary numbers.
 Frequently used in conjunction with morphological operations.

115
AND/OR MASKS

116
IMAGE SUBTRACTION

g(x,y) = f(x,y) – h(x,y) Two principal ways to scale:

1. Add 255 to every pixel


and divide by 2.

2. Find the min difference


and add its negative to
all the pixel values. Then
multiply each pixel by
255/Max (Max is the max
pixel value in the modified
difference image).

Contains more detail


than the darker image.

117
MASK MODE RADIOGRAPHY

The mask is an X-ray image of a region of a patient’s body captured


by an intensified TV camera.

A contrast medium is injected into the patient’s blood stream.

A series of images are taken of the same region.

The net effect of subtracting the mask from each sample: areas that are
different appear as enhanced detail in the output image.

118
IMAGE AVERAGING

g(x,y) = f(x,y) + η(x,y)

At every pair of coordinates,


the noise is uncorrelated and
has zero average value.

Averaging K different noisy images gi(x,y) yields an estimate of f(x,y).

The average approaches f(x,y) as the # of noisy images increases.

The images gi(x,y) should be aligned to avoid the introduction of artifacts.

As in the case of subtraction, scaling is needed.


ƒ The sum of K 8-bit images can range from 0 to 255K.
ƒ Image averaging may result in negative values.

119
GALAXY PAIR NGC 3314

Totally useless!

Averaging

120
DIFFERENCE IMAGES AND THEIR HISTOGRAMS

Mean and SD of the


difference images
decrease as the
value of K increases.

121
BASICS OF SPATIAL FILTERING

‰ The filter mask is moved from point to point in an image.


‰ At each point (x,y), the response of the filter is calculated.
‰ Linear spatial filtering
 3x3 mask
 R = ω(-1,-1)f(x-1,y-1) + ω(-1,0)f(x-1,y) + ω(0,0)f(x,y) + …
+ ω(1,0)f(x+1,y) + ω(1,1)f(x+1,y+1)
 ω(0,0) coincides with f(x,y) indicating that the mask is
centered at f(x,y).
‰ Our focus will be on masks of odd sizes.
‰ Image of size MxN, filter mask of size mxn.
g(x,y) = Σ_{s=-a..a} Σ_{t=-b..b} ω(s,t) f(x+s, y+t)

where a = (m-1)/2 and b = (n-1)/2
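A direct NumPy sketch of this sum; zero padding is used at the borders here, one of the alternatives discussed on the borders slide below.

```python
import numpy as np

def spatial_filter(f, w):
    """Linear spatial filtering with an odd m x n mask w:
    g(x,y) = sum_s sum_t w(s,t) f(x+s, y+t)."""
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    fp = np.pad(f.astype(float), ((a, a), (b, b)))   # zero padding
    g = np.zeros(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])
    return g
```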

122
3x3 MASK

123
FILTERING AT THE BORDERS

‰ What happens when the center of the filter approaches the


border of the image?
‰ Several alternatives:
1. Limit the excursions of the center of the mask to be at a distance no
less than (n-1)/2 pixels. The resulting filtered image will be smaller than
the original
2. Filter all pixels with only the section of the mask that is fully contained
in the image.
3. Pad the image by adding rows and columns of zeros (or other constant
gray level).
4. Pad the image by replicating rows and columns.

Additional Center of a 3x3 mask


rows & columns

124
SMOOTHING SPATIAL FILTERS

‰ Used for blurring and noise reduction.


‰ Smoothing linear filters
 Also called averaging filters or low pass filters.
 The value of every pixel is replaced by the average of the
gray levels in the neighborhood.
 Standard average and weighted average.
 Weighted averaging filter: Image of size MxN, filter of size
mxn.
a b

∑ ∑ ω ( s, t ) f ( x + s, y + t )
g ( x, y ) = s =− a t =−b
a b

∑ ∑ ω ( s, t )
s =− as =−b

‰ Order-statistics filters
 Nonlinear spatial filters
 Best-known example: Median filter
125
TWO 3X3 SMOOTHING FILTERS

A box filter and a weighted-average filter. The basic strategy behind
weighting the center more heavily is to reduce blurring.

126
FIVE DIFFERENT FILTER SIZES

The pronounced black border is the result of padding the border with
0's, and then trimming off the padded area.

127
BLURRING AND THRESHOLDING

THRESHOLD = 25% of highest


intensity in the blurred image

128
ORDER-STATISTICS FILTERS

‰ Response of the filter is based on ordering (ranking)


the pixels.
‰ The value of the center pixel is replaced by the value
determined by the ordering result.
‰ Median filters
 For certain types of noise, they provide excellent noise
reduction with less blurring than linear filters of similar
size (very effective in the presence of salt-and-pepper
noise).
 The median, ξ, of a set of values: half of the values ≤ ξ,
and half ≥ ξ.
 3x3 neighborhood: median is the 5th largest value.
 5x5 neighborhood: median is the 13th largest value.
 Principal function is to force points with distinct gray
levels to be more like their neighbors.
‰ Max filter: 100th percentile
‰ Min filter: 0th percentile
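A minimal NumPy sketch of the median filter; the max and min filters follow by swapping np.median for np.max or np.min.

```python
import numpy as np

def median_filter(f, n=3):
    """Replace each pixel by the median of its n x n neighborhood;
    effective against salt-and-pepper noise."""
    pad = n // 2
    fp = np.pad(f, pad, mode="edge")
    g = np.empty(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.median(fp[x:x + n, y:y + n])
    return g
```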

129
EFFECT OF AVERAGING AND MEDIAN FILTERS

Which one has removed the salt-and-pepper noise?

130
SHARPENING SPATIAL FILTERS

‰ Used for highlighting fine detail or enhancing detail that


has been blurred.
‰ Averaging is analogous to integration.
‰ Sharpening is analogous to differentiation.
‰ First derivatives
 f'(x) = f(x + 1) – f(x)
 Must be zero in flat areas
 Must be non-zero at the onset of a gray-level step or ramp
 Must be nonzero along ramps
‰ Second derivatives
 f''(x) = f(x + 1) + f(x - 1) – 2f(x)
 Must be zero in flat areas
 Must be non-zero at the onset and end of a gray-level step or
ramp
 Must be zero along ramps of constant slope

131
1ST AND 2ND DERIVATIVES

132

2ND DERIVATIVES FOR ENHANCEMENT – THE LAPLACIAN

∂²f/∂x² = f(x + 1,y) + f(x - 1,y) – 2f(x,y)

∂²f/∂y² = f(x,y + 1) + f(x,y - 1) – 2f(x,y)

∇²f = f(x + 1,y) + f(x - 1,y) + f(x,y + 1) + f(x,y - 1) – 4f(x,y)

g(x,y) = f(x,y) - ∇²f(x,y)  if the center coefficient of the mask is “-”
         f(x,y) + ∇²f(x,y)  if the center coefficient of the mask is “+”
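A minimal NumPy sketch of the one-pass composite implementation shown a couple of slides below (g = f - ∇²f, i.e., the 0/-1/5 mask); edge replication at the borders is an assumption.

```python
import numpy as np

def laplacian_sharpen(f):
    """g = f - laplacian(f), equivalent to one pass of the mask
    0 -1 0 / -1 5 -1 / 0 -1 0."""
    fp = np.pad(f.astype(float), 1, mode="edge")
    lap = (fp[2:, 1:-1] + fp[:-2, 1:-1] + fp[1:-1, 2:] + fp[1:-1, :-2]
           - 4.0 * fp[1:-1, 1:-1])          # 4-neighbor Laplacian
    return np.clip(f - lap, 0, 255)         # center coefficient is negative
```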

133
IMPLEMENTATIONS OF THE LAPLACIAN

134
SHARPENED NORTH POLE OF THE MOON

g(x,y) is usually implemented with one pass of a single composite mask:

g(x,y) = f(x,y) - ∇²f(x,y)

 0 -1  0
-1  5 -1
 0 -1  0

Small details are enhanced and the background tonality is perfectly
preserved.

135
TWO LAPLACIAN MASKS

136
UNSHARP MASKING & HIGH-BOOST FILTERING

Unsharp masking: f_s(x,y) = f(x,y) − f̄(x,y), where f̄ is a blurred
version of f.

High-boost filtering: f_hb(x,y) = A·f(x,y) − f̄(x,y)

137
AN APPLICATION OF BOOST FILTERING

-1 -1 -1

-1 A+8 -1

-1 -1 -1

Laplacian with A = 0

138
1ST DERIVATIVES FOR ENHANCEMENT – THE GRADIENT

∇f ≈ |Gx| + |Gy|

Roberts cross-gradient operators: ∇f ≈ |z9 – z5| + |z8 – z6|

Sobel operators:
∇f ≈ |(z7+2z8+z9) - (z1+2z2+z3)| + |(z3+2z6+z9) - (z1+2z4+z7)|

The weight value of 2 is used to achieve some smoothing by giving
more importance to the center point.

139
APPLICATION OF SOBEL OPERATORS

Gradient is frequently used in industrial inspection.

Constant or slowly varying shades


of gray have been eliminated.

140
COMBINING SPATIAL ENHANCEMENT METHODS

141
CHAPTER 4: IMAGE ENHANCEMENT IN THE
FREQUENCY DOMAIN

‰ French mathematician J. B. J. Fourier was born in 1768.
‰ He died of heart disease in 1830, at the age of 62.
‰ Fourier Series: Any periodic function can be expressed
as the sum of complex exponentials (sines and/or
cosines) of different frequencies, each multiplied by a
different coefficient.
 f(x): single variable, continuous function with period T0

f(x) = Σ_{k=-∞..∞} c_k e^(jΩ0kx),  Ω0 = 2π/T0

‰ Fourier Transform: Non-periodic functions can also be


represented in terms of complex exponentials.
 f(x): single variable, continuous function.

F(u) = ∫_{-∞}^{∞} f(x) e^(-j2πux) dx        f(x) = ∫_{-∞}^{∞} F(u) e^(j2πux) du

Fourier transform pair


142
FOURIER SERIES

143
FOURIER TRANSFORM IN 2 VARIABLES

Fourier Transform can easily be extended to 2 variables:

F(u,v) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x,y) e^(-j2π(ux + vy)) dx dy

f(x,y) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} F(u,v) e^(j2π(ux + vy)) du dv

144
ONE-DIMENSIONAL DFT AND ITS INVERSE

f(x), x = 0, 1, 2, ..., M−1 : discrete function of one variable

F(u) = (1/M) Σ_{x=0..M-1} f(x) e^(-j2πux/M),  u = 0, 1, ..., M−1

Approximately M² summations and multiplications to compute

f(x) = Σ_{u=0..M-1} F(u) e^(j2πux/M),  x = 0, 1, ..., M−1

The DFT and its inverse always exist.
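A direct O(M²) NumPy sketch of this definition, checked against NumPy's FFT (which omits the 1/M factor used here):

```python
import numpy as np

def dft(f):
    """F(u) = (1/M) sum_x f(x) exp(-j 2 pi u x / M) -- about M^2 operations."""
    M = len(f)
    x = np.arange(M)
    return np.array([np.sum(f * np.exp(-2j * np.pi * u * x / M)) for u in x]) / M

f = np.random.default_rng(2).random(64)
assert np.allclose(dft(f), np.fft.fft(f) / 64)   # same result up to the 1/M factor
```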

145
FREQUENCY DOMAIN &
FREQUENCY COMPONENTS

e^(jθ) = cos θ + j sin θ,   cos(−θ) = cos θ

⇒ F(u) = (1/M) Σ_{x=0..M-1} f(x) [cos(2πux/M) − j sin(2πux/M)],
u = 0, 1, 2, ..., M−1

The domain over which the values of F(u) range is called the frequency domain.

Each of the M terms of F(u) is called a frequency component of the transform.

u & F(u): frequency domain & frequency components.


x and f(x): time domain and time components.

A useful analogy: Compare Fourier transform to a glass prism.

ƒ Glass prism: a physical device that separates light into its various color components.

ƒ Fourier transform: a mathematical prism that separates a function into various


components based on frequency content.

146
FOURIER TRANSFORM
IN POLAR COORDINATES

F(u) = |F(u)| e^(-jφ(u)),

where |F(u)| = [R²(u) + I²(u)]^(1/2)   (magnitude)
and φ(u) = tan⁻¹[I(u)/R(u)]            (phase angle)

P(u) = |F(u)|² = R²(u) + I²(u)         (power spectrum)

147
A ONE-DIMENSIONAL EXAMPLE

It is common practice to multiply f(x) by (-1)^x before taking the
transform.

M = 1024, A = 1, K = 8

Important things to note:

1. The height of the


spectrum doubled as
the area under the
curve in the x-domain
doubled.
2. The # of zeros in the
spectrum in the same
interval doubled as
the length of the
function doubled.

M = 1024
A=1
K = 16

148
SAMPLES IN SPATIAL AND FREQUENCY
DOMAINS

f(x), x = 0, 1, ..., M−1 : M samples (equally spaced but arbitrary points)

f(x0): first point in the sequence; the samples are
f(x0), f(x0 + Δx), ..., f(x0 + [M−1]Δx), so f(x) = f(x0 + xΔx)

F(u), u = 0, 1, ..., M−1 : M samples

F(0): first point in the sequence; the samples are
F(0), F(0 + Δu), ..., F(0 + [M−1]Δu), so F(u) = F(uΔu)

Δu = 1/(MΔx)
149
TWO DIMENSIONAL DFT AND ITS INVERSE

F(u,v) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} f(x,y) e^(-j2π(ux/M + vy/N)),
u = 0, 1, ..., M−1, v = 0, 1, ..., N−1    (u, v: frequency variables)

f(x,y) = Σ_{u=0..M-1} Σ_{v=0..N-1} F(u,v) e^(j2π(ux/M + vy/N)),
x = 0, 1, ..., M−1, y = 0, 1, ..., N−1    (x, y: spatial variables)

Polar coordinates: |F(u,v)| = [R²(u,v) + I²(u,v)]^(1/2),
φ(u,v) = tan⁻¹[I(u,v)/R(u,v)]

Power spectrum: P(u,v) = |F(u,v)|² = R²(u,v) + I²(u,v)

150
SHIFTING OF ORIGIN

ℑ[f(x,y) (−1)^(x+y)] = F(u − M/2, v − N/2)

Taking the Fourier transform of f(x,y)(−1)^(x+y) shifts the origin of
F(u,v) to (M/2, N/2).

F(0,0) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} f(x,y)    (the dc component)

If f(x,y) is an image, the value of the DFT at the origin is equal to
the average gray level of the image.

f(x,y) real:  F(u,v) = F*(−u,−v)  =>  |F(u,v)| = |F(−u,−v)|
(the spectrum is symmetric)

Δu = 1/(MΔx) and Δv = 1/(NΔy)

151
CENTERING THE SPECTRUM

Image was multiplied by (-1)^(x+y) prior to computing the transform.

152
RELATIONSHIP BETWEEN THE DFT COMPONENTS
AND SPATIAL CHARACTERISTICS OF AN IMAGE

Frequency is directly related to the rate of change. DFT components
can be associated with patterns of intensity variations in an image.

2 principal features of this image:
1. Strong edges that run approximately at ±45°.
2. 2 white oxide protrusions.

In the spectrum, one pair of components corresponds to the strong
edges, one to the long protrusion, and one to the short protrusion.
153
BASIC STEPS IN
FREQUENCY DOMAIN FILTERING

Pre-processing (examples): multiplication by (-1)^(x+y), cropping to
even dimensions, gray-level scaling, conversion to floating point on
input; conversion to 8-bit format on output, etc.

H is a real filter (zero-phase-shift): each component of H multiplies
both the real and imaginary parts of the corresponding component of
F; imaginary components of the filter are set to zero.

1. Multiply the input image by (-1)^(x+y)
2. Compute F(u,v)
3. Multiply F(u,v) by a filter function H(u,v)
4. Compute the inverse DFT
5. Obtain the real part
6. Multiply the real part by (-1)^(x+y)
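A minimal NumPy sketch of these six steps, demonstrated with the notch filter described two slides below; the 8x8 size and the H used are illustrative.

```python
import numpy as np

def freq_filter(f, H):
    """Steps 1-6: center the spectrum with (-1)^(x+y), filter with a
    real H(u,v), invert, keep the real part, undo the centering."""
    x, y = np.meshgrid(np.arange(f.shape[0]), np.arange(f.shape[1]),
                       indexing="ij")
    shift = (-1.0) ** (x + y)
    F = np.fft.fft2(f * shift)         # steps 1-2 (centered spectrum)
    g = np.fft.ifft2(H * F)            # steps 3-4
    return np.real(g) * shift          # steps 5-6

f = np.random.default_rng(3).random((8, 8))
H = np.ones((8, 8)); H[4, 4] = 0.0     # notch filter: zero the centered dc term
g = freq_filter(f, H)
print(abs(g.mean()))                   # average value forced to (near) zero
```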

154
FILTERING IN THE FREQUENCY DOMAIN

‰ G(u,v) = H(u,v)F(u,v)
‰ H(u,v): real, zero-phase-shift filters
‰ Zero-phase-shift filters do not change the phase angle.
‰ In general, the components of F(u,v) are complex
quantities.
‰ Filtered image = ℑ−1 [G(u,v)]
‰ In general, the inverse DFT is complex.
‰ If the input image f and the filter H are real, the imaginary
components of the inverse DFT should all be zero.
‰ Due to round-off errors, the inverse DFT has parasitic
imaginary components. These are ignored.

155
AN INTRODUCTORY EXAMPLE OF FILTERING

Assume we want the average value of an image to be zero.


Also assume the transform has been centered.

ƒ F(0,0) = 0
ƒ H(u,v) = 0 if (u,v) = (M/2, N/2);  1 otherwise

notch filter: constant function with a notch at the origin

In reality, the average


of the displayed image
cannot be zero.

The display of this


image is made
possible by scaling
(making the most
negative value 0, and
scaling all other
values up from that).

156
LOW-PASS AND HIGH-PASS FILTERS

Low-pass filter: A filter that attenuates high frequencies while passing low frequencies.

High-pass filter: A filter that attenuates low frequencies while passing high frequencies.

blurring

Circularly
symmetric

sharpening

157
ADDING A CONSTANT TO A HIGH-PASS FILTER

A constant is added to the HP filter so that it does not completely eliminate F(0,0).

Notice the improvement!

158
CONVOLUTION

f(x,y) * h(x,y) = (1/MN) Σ_{m=0..M-1} Σ_{n=0..N-1} f(m,n) h(x−m, y−n)

Discrete convolution of 2 functions: for each displacement (x,y),
compute the sum of products over all values of m and n.

Convolution theorem:
f(x,y) * h(x,y) ⇔ F(u,v) H(u,v)
f(x,y) h(x,y) ⇔ F(u,v) * H(u,v)

159
IMPULSE FUNCTION OF STRENGTH A

An impulse function of strength A located at coordinates (x0,y0):
A δ(x − x0, y − y0)

Definition: Σ_{x=0..M-1} Σ_{y=0..N-1} s(x,y) A δ(x − x0, y − y0) = A s(x0,y0)

Unit impulse located at the origin:
Σ_{x=0..M-1} Σ_{y=0..N-1} s(x,y) δ(x,y) = s(0,0)

DFT of the unit impulse located at the origin:
F(u,v) = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} δ(x,y) e^(-j2π(ux/M + vy/N)) = 1/MN
(a real constant)

160
FOURIER TRANSFORM PAIRS

f(x,y) * h(x,y) ⇔ F(u,v) H(u,v)
δ(x,y) * h(x,y) ⇔ ℑ[δ(x,y)] H(u,v)
(1/MN) h(x,y) ⇔ (1/MN) H(u,v)
h(x,y) ⇔ H(u,v)

Hence, filters in the spatial and frequency domains constitute a
Fourier transform pair.

If both filters are of the same size, it is more efficient


to do the filtering in the frequency domain.

However, we use much smaller filters in the spatial domain.

161
GAUSSIAN FILTERS

H(u) = A e^(-u²/2σ²)

Fourier transform pair:

h(x) = √(2π) σ A e^(-2π²σ²x²)

Both are real.

Gaussian curves are intuitive and easy to manipulate.

They behave reciprocally w.r.t. one another.

H(u) has a broad profile → h(x) has a narrow profile.

H(u) has a narrow profile → h(x) has a broad profile.

162
LOWPASS AND HIGHPASS GAUSSIAN FILTERS

H(u) = A e^(-u²/2σ1²) − B e^(-u²/2σ2²)

Fourier transform pair:

h(x) = √(2π) σ1 A e^(-2π²σ1²x²) − √(2π) σ2 B e^(-2π²σ2²x²)

The narrower the frequency domain filter, the more it will attenuate
the low frequencies, resulting in increased blurring.

We can implement lowpass filtering in the spatial domain by using a
mask with all positive coefficients. The corresponding highpass
spatial filter has both negative and positive coefficients.

163
SMOOTHING AND SHARPENING FILTERS

‰ Smoothing filters
 Ideal lowpass filters (very sharp)
 Butterworth lowpass filters
 Gaussian lowpass filters (very smooth)

Butterworth filter parameter: filter order

• High values: filter has the form of the ideal filter.

• Low values: filter has the form of the Gaussian filter.

‰ Sharpening filters
 Ideal highpass filters
 Butterworth highpass filters
 Gaussian highpass filters

164
2-D IDEAL LOWPASS FILTERS

H(u,v) = 1 if D(u,v) ≤ D0;  0 if D(u,v) > D0    (D0: a nonnegative quantity)

D(u,v): distance from (u,v) to the center of the frequency rectangle.

Image size: M x N

Center of the frequency rectangle: (u,v) = (M/2, N/2)

Distance to the center: D(u,v) = [(u – M/2)² + (v – N/2)²]^(1/2)

The lowpass filters in this chapter are radially symmetric.
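A minimal NumPy sketch of this transfer function, for use with the centered-spectrum filtering pipeline sketched earlier; the sizes and D0 are illustrative.

```python
import numpy as np

def ilpf(M, N, D0):
    """Ideal lowpass transfer function: H = 1 inside radius D0 around
    the center (M/2, N/2) of the frequency rectangle, 0 outside."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return (D <= D0).astype(float)

H = ilpf(512, 512, 30)   # pass everything within 30 pixels of the center
```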

165
2-D IDEAL LOWPASS FILTERS

Cutoff frequency

166
2-D IDEAL LOWPASS FILTERS AS A FUNCTION
OF CUTOFF FREQUENCIES

Total image power: P_T = Σ_{u=0..M-1} Σ_{v=0..N-1} P(u,v),
where P(u,v) = |F(u,v)|²

A circle of radius r encloses α percent of the power:
α = 100 [Σ_u Σ_v P(u,v) / P_T], where the summation is taken over the
values of (u,v) that lie inside the circle or on its boundary.

Radius (pixels)   % of power
5                 92.0
15                94.6
30                96.4
80                98.0
230               99.5

167
ILP FILTERING WITH RADII 5,15,30,80,230

Ringing is characteristic of ideal filters.

As the filter radius increases, less power is removed, and there is
less blurring.

Ideal lowpass filtering is not very practical, but such filters can
be implemented on a computer to study their behavior.

168
BLURRING AS A CONVOLUTION PROCESS

Gray-scale profile of a
horizontal scan line
through the center
of the spatial filter.

Obtained using the


filter to its right.

Gray-scale profile of a
diagonal scan line
through the center
of the filtered image.

169
BUTTERWORTH LOWPASS FILTERS

H(u,v) = 1 / (1 + [D(u,v)/D0]^(2n))     (BLPF of order n)

D(u,v) = [(u − M/2)² + (v − N/2)²]^(1/2)

The BLPF transfer function does not have a sharp discontinuity that
establishes a clear cutoff.

170
BLPF WITH ORDERS 1 THROUGH 4

171
BUTTERWORTH FILTERING WITH RADII 5,15,30,80,230

Ringing is not visible in


any of these images.

172
BLPFS OF ORDER 1,2,5,20

Order 1: no ringing, no negative values. Order 2: mild ringing, small negative values. Higher orders increasingly exhibit the characteristics of the ILPF.

To facilitate comparisons,
additional enhancing with
a gamma transformation
was applied to all images.

173
GAUSSIAN LOWPASS FILTERS

− D 2 ( u ,v ) / 2σ 2
H (u, v) = e σ is a measure of the spread
of the Gaussian curve.

D(u , v) = [(u − M / 2) 2 + (v − N / 2) 2 ]1 / 2
A = 1 (to be consistent with the other filters)

The inverse Fourier transform of the Gaussian lowpass filter is also Gaussian.

Ö A spatial Gaussian filter will have no ringing.
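A corresponding sketch (gaussian_lowpass is a hypothetical helper; the slides set σ = D0):

  def gaussian_lowpass(shape, D0):
      """Gaussian lowpass transfer function with sigma = D0."""
      M, N = shape
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D2 = u[:, None] ** 2 + v[None, :] ** 2
      return np.exp(-D2 / (2.0 * D0 ** 2))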

174
GLPFS WITH DIFFERENT σ VALUES

σ = D0

175
GAUSSIAN FILTERING WITH RADII 5,15,30,80,230

no ringing

176
PRACTICAL APPLICATION OF LPF:
MACHINE PERCEPTION

Machine recognition systems have difficulty in reading broken characters. (Smoothing here uses a GLPF with D0 = 80.)

177
PRACTICAL APPLICATION OF LPF:
PRINTING & PUBLISHING

Smoothed image looks


soft and pleasing!

178
PRACTICAL APPLICATION OF LPF:
PROCESSING SATELLITE AND AERIAL IMAGES

Boundaries between water bodies were caused by loop currents. Filtering reduces the scan lines; more blurring makes large features recognizable.

179
SHARPENING FREQUENCY DOMAIN FILTERS

Blurring is achieved by attenuating the HF components of the DFT of an image.

Sharpening is achieved by attenuating the LF components of the DFT of an image.

Only zero-phase-shift filters are considered here.

H hp (u , v ) = 1 − H lp (u , v )

When the lowpass filter attenuates frequencies, the highpass filter passes them (Hlp is the transfer function of the corresponding lowpass filter).
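Given any lowpass transfer function from the earlier sketches, the corresponding highpass filter is one line (gaussian_lowpass is the hypothetical helper defined above):

  H_hp = 1.0 - gaussian_lowpass((512, 512), D0=30)  # Hhp(u,v) = 1 - Hlp(u,v)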

Sharpening filters:

ƒ Ideal highpass filters


ƒ Butterworth highpass filters
ƒ Gaussian highpass filters

180
3 TYPES OF SHARPENING FILTERS

The Butterworth filter


represents a transition
between the sharpness
of the IF and the
smoothness of the GF.

181
CORRESPONDING SPATIAL DOMAIN FILTERS

To obtain the spatial domain representation of a frequency domain filter:

1. Multiply H(u,v) by (−1)^{u+v} for centering
2. Compute the inverse DFT
3. Multiply the real part by (−1)^{x+y}
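A sketch of this 3-step recipe in NumPy (spatial_filter_from_H is our own name):

  def spatial_filter_from_H(H):
      """Spatial filter corresponding to a centered H(u,v)."""
      M, N = H.shape
      sign = (-1.0) ** np.add.outer(np.arange(M), np.arange(N))
      h = np.fft.ifft2(H * sign)   # steps 1 and 2
      return np.real(h) * sign     # step 3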

182
IDEAL HIGHPASS FILTERS

0 if D(u,v) ≤ D0
H(u,v) =
1 if D(u,v) > D0

183
BUTTERWORTH HIGHPASS FILTERS

1
H (u , v) =
1 + [ D0 / D(u , v)]2n

184
GAUSSIAN HIGHPASS FILTERS

H(u,v) = 1 − e^{−D²(u,v)/2D0²}

Results are smoother than with the previous 2 filters.

185
CHAPTER 5: IMAGE RESTORATION

‰ The ultimate goal is to improve an image in some


predefined sense.
‰ Restoration
 The given image is degraded.
 Degradation is modeled.
 Inverse process is applied.
‰ No focus on sensor, digitizer, and display degradations.
‰ Some restoration techniques are best formulated in the
spatial domain while others are better suited for the
frequency domain.
 Spatial domain: additive noise.
 Frequency domain: blurring

186
A MODEL OF IMAGE
DEGRADATION/RESTORATION PROCESS

Linear & Degraded


Input Restored
position-invariant image
image image

g ( x , y ) = h ( x, y ) * f ( x, y ) + η ( x , y )

G (u , v) = H (u , v) F (u , v) + N (u , v)
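A toy sketch of the degradation model, assuming a centered transfer function H the same size as f (degrade is our own name; the noise is Gaussian here purely for illustration):

  def degrade(f, H, noise_sigma):
      """Simulate g(x,y) = h(x,y)*f(x,y) + eta(x,y) via G = H F + N."""
      F = np.fft.fftshift(np.fft.fft2(f))
      blurred = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
      return blurred + np.random.normal(0.0, noise_sigma, f.shape)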

187
NOISE MODELS

‰ Principal sources of noise in digital images arise during


acquisition and transmission.
 Acquisition: performance of image sensors is affected by a
variety of factors (environmental conditions, etc.) and by the
quality of the sensing elements.
 Transmission: images are corrupted due to interference in
the channel.
‰ Spatial properties of noise
‰ Frequency properties of noise
 White noise: Fourier spectrum of noise is constant.
‰ Assumption: noise is independent of spatial
coordinates.
 Partially invalid in some applications.
 No correlation between pixel values and noise.
 Exception: spatially periodic noise.

188
NOISE PROBABILITY DENSITY FUNCTIONS

We are concerned with the statistical behavior of the gray-level values of the noise component of the model.

These values may be considered random variables characterized by a probability density function (PDF).

Most common PDFs (with mean µ and variance σ²):

1. Gaussian (normal) noise
2. Rayleigh noise: µ = a + √(πb/4), σ² = b(4 − π)/4
3. Erlang (Gamma) noise: µ = b/a, σ² = b/a²
4. Exponential noise: µ = 1/a, σ² = 1/a²
5. Uniform noise: µ = (a + b)/2, σ² = (b − a)²/12
6. Impulse (salt-and-pepper) noise
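For experimentation, all six noise types can be sampled with NumPy's random generator; the parameter values below are arbitrary stand-ins:

  import numpy as np

  rng = np.random.default_rng(0)
  shape = (256, 256)
  gaussian    = rng.normal(0, 20, shape)
  rayleigh    = rng.rayleigh(20, shape)
  erlang      = rng.gamma(2, 20, shape)        # Erlang = Gamma with integer shape
  exponential = rng.exponential(20, shape)
  uniform     = rng.uniform(-20, 20, shape)
  salt_pepper = rng.choice([0, 128, 255], size=shape, p=[0.05, 0.90, 0.05])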

189
MODELING A BROAD RANGE OF NOISE
CORRUPTIONS

‰ The 6 PDFs are useful tools.


‰ Gaussian noise arises in an image due to factors such
as electronic circuit noise and sensor noise due to poor
illumination and/or high temperature.
‰ Rayleigh density is helpful in characterizing noise
phenomena in range imaging.
‰ Gamma and exponential densities find application in
laser imaging.
‰ Impulse noise is found in situations where quick
transients, such as faulty switching, take place during
imaging.
‰ Uniform density may be the least descriptive of practical
situations but it is quite useful as the basis for numerous
random number generators.

190
A TEST PATTERN

composed of simple, constant areas


that span the gray scale from black
to near white in 3 increments.

191
NOISY IMAGES AND THEIR HISTOGRAMS:
GAUSSIAN, RAYLEIGH, AND GAMMA

The parameters of
the noise were
chosen in each
case so that the
histogram
corresponding
to the 3 gray levels
in the test pattern
would start to
merge.

192
NOISY IMAGES AND THEIR HISTOGRAMS:
EXPONENTIAL, UNIFORM, AND IMPULSE

The parameters of
the noise were
chosen in each
case so that the
histogram
corresponding
to the 3 gray levels
in the test pattern
would start to
merge.

Extra peak: the


noise components
were pure black
and white, and the
lightest
component of the
test pattern is
light gray.

193
PERIODIC NOISE

Image is severely corrupted by


spatial sinusoidal noise of
various frequencies.

The Fourier Transform of a pure


sinusoid is a pair of conjugate
impulses located at the
conjugate frequencies of the
sine wave.

If the amplitude of a sine wave


in the spatial domain is strong
Periodic noise arises enough, we would see in the
typically from electrical or spectrum of the image a pair of
electromechanical impulses for each sine wave in
interference during the image.
acquisition.

This is the only spatially


dependent noise
considered here.

194
ESTIMATION OF NOISE PARAMETERS

‰ Periodic noise
 Inspection of the Fourier spectrum
 Inspection of the image (possible only in simple cases)
 Automated analysis
„ Noise spikes are exceptionally pronounced.
„ Some knowledge is available about the general location of the
frequency components.
‰ Noisy PDFs
 Parameters may be partially known from sensor specs
 Imaging system available
„ Capture a set of images of flat environments.
 Images are available
„ Crop small patches of reasonably constant gray level.
„ Obtain the histogram.
„ Compute mean and variance.
„ Gaussian PDF: Completely determined by the mean and variance.
„ Impulse noise: actual probability of occurrence of white and black
pixels is needed.
„ Others: Use the mean and variance to solve for a and b.
195
ESTIMATION OF PDF PARAMETERS
FROM SMALL PATCHES

How to use the data from the image strips?

• Calculate the mean and variance of the gray levels.


• Identify the closest PDF.
• If the shape is Gaussian, the mean and variance specifies the Gaussian PDF.
• For other shapes, use the mean and variance to solve for a and b.
• For impulse noise, obtain the probability of occurrence of white & black pixels.

196
SPATIAL DOMAIN FILTERING
FOR ADDITIVE NOISE

g ( x , y ) = f ( x , y ) + η ( x, y ) additive noise

G (u, v) = F (u, v) + N (u, v)

Mean filters

Arithmetic mean filter


Geometric mean filter
Harmonic mean filter
Contraharmonic mean filter

Order-statistics filters

Median filter
Max & min filters
Midpoint filter
Alpha-trimmed mean filter

Adaptive filters
Adaptive local noise reduction filter
Adaptive median filter

197
MEAN FILTERS

Arithmetic mean:
  f̂(x,y) = (1/mn) ∑_{(s,t)∈Sxy} g(s,t)

Geometric mean (comparable to the arithmetic mean but tends to lose less image detail):
  f̂(x,y) = [ ∏_{(s,t)∈Sxy} g(s,t) ]^{1/mn}

Harmonic mean (works well for salt noise but fails for pepper noise; OK for other types of noise as well):
  f̂(x,y) = mn / ∑_{(s,t)∈Sxy} 1/g(s,t)

Contraharmonic mean of order Q (+Q eliminates pepper noise, −Q eliminates salt noise — not simultaneously; Q = 0 ⇒ arithmetic mean, Q = −1 ⇒ harmonic mean):
  f̂(x,y) = ∑_{(s,t)∈Sxy} g(s,t)^{Q+1} / ∑_{(s,t)∈Sxy} g(s,t)^{Q}
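A compact SciPy sketch of the contraharmonic mean, which subsumes the arithmetic (Q=0) and harmonic (Q=−1) cases; contraharmonic is our own name, and the epsilon guard is an assumption for images containing zeros:

  import numpy as np
  from scipy.ndimage import uniform_filter

  def contraharmonic(g, size, Q):
      """Contraharmonic mean over a size x size window."""
      g = g.astype(float) + 1e-12            # guard against zero gray levels
      num = uniform_filter(g ** (Q + 1), size)
      den = uniform_filter(g ** Q, size)
      return num / den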

198
ARITHMETIC AND GEOMETRIC MEAN FILTERS

mean=0, variance=400

Geometric mean filter did


not blur the image as
much as the arithmetic
mean filter

199
SPATIAL FILTERING FOR ADDITIVE NOISE

pepper noise salt noise

Better job of
cleaning the
background at the
expense of blurring
the dark areas!

200
CONTRAHARMONIC FILTERING
WITH THE WRONG SIGN

Wrong sign for Q Wrong sign for Q

201
ORDER-STATISTICS FILTERS

Median filter (effective for bipolar and unipolar impulse noise):
  f̂(x,y) = median_{(s,t)∈Sxy} {g(s,t)}

Max filter (useful for finding the brightest points in an image):
  f̂(x,y) = max_{(s,t)∈Sxy} {g(s,t)}

Min filter (useful for finding the darkest points in an image):
  f̂(x,y) = min_{(s,t)∈Sxy} {g(s,t)}

Midpoint filter (works best for randomly distributed noise):
  f̂(x,y) = ½ [ max_{(s,t)∈Sxy} {g(s,t)} + min_{(s,t)∈Sxy} {g(s,t)} ]

Alpha-trimmed mean filter (d = 0 ⇒ arithmetic mean; d = (mn−1)/2 ⇒ median; other d useful for multiple types of noise):
  f̂(x,y) = (1/(mn − d)) ∑_{(s,t)∈Sxy} g_r(s,t)
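The first three filters are available directly in SciPy; the midpoint filter is a two-liner on top of them (midpoint_filter is our own name):

  from scipy.ndimage import median_filter, maximum_filter, minimum_filter

  def midpoint_filter(g, size):
      """Midpoint filter: average of the local max and local min."""
      g = g.astype(float)
      return 0.5 * (maximum_filter(g, size) + minimum_filter(g, size))

  # median_filter(g, size=3) gives one pass of a 3x3 median filter.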

202
3 PASSES OF MEDIAN FILTER
FOR IMPULSE NOISE

1ST pass
Significant
Improvement!

2nd pass 3rd pass


barely visible # of passes
noise points! should be as
low as
possible!

203
MAX & MIN FILTERS FOR PEPPER NOISE

Pepper noise is Better job!


reasonably removed.
The filter also removed
The filter also removed some white points from
some dark points from the borders of the light
the borders of the dark objects.
objects.

204
REDUCTION OF NOISE
WITH 4 TYPES OF FILTERS

Uniform noise Additional salt-and-


(mean=0, pepper noise
variance=800) (Pa=Pb=0.1)

Arithmetic mean Geometric mean


(no good) (no good)

Median Alpha-trimmed mean


(much better) (better than median)

205
ADAPTIVE FILTERS

‰ The filters discussed so far are applied to an image


independent of how image characteristics vary from one
point to another.
‰ 2 simple adaptive filters
 Adaptive, local noise reduction filter
 Adaptive median filter
‰ Their behavior changes based on statistics of the image
inside the filter region.
‰ Advantage: superior performance
‰ Disadvantage: increase in filter complexity

206
ADAPTIVE, LOCAL NOISE REDUCTION FILTER

Local region Sxy

g(x,y): value of noisy image at (x,y)


σ η 2: variance of the noise
mL: local mean of pixels in Sxy
σ L 2: local variance of pixels in Sxy

Behavior of the filter:

1. σ η2 = 0 Ö g(x,y)=f(x,y) – trivial case with zero noise


2. σL2 >> ση2 Ö ~g(x,y) – edges should be preserved
3. σL 2 ≈ ση 2 Ö mL – local noise is reduced by averaging

Definition:
  f̂(x,y) = g(x,y) − (ση²/σL²) [g(x,y) − mL]
Analysis of ση 2:

Needs to be known or estimated (but we seldom have exact knowledge).


Tacit assumption: ση2 ≤ σL2 (because Sxy⊂ g(x,y))
ση2 > σL2 Ö set ratio=1 (this makes the filter nonlinear but prevents meaningless results)
Another approach: ση2 > σL2 Ö allow negative values and rescale (results in loss of dynamic range in the image)
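A sketch of this filter with the ratio clipped at 1 (the nonlinear option above); adaptive_local is our own name, and local statistics come from box averages:

  import numpy as np
  from scipy.ndimage import uniform_filter

  def adaptive_local(g, size, noise_var):
      """Adaptive, local noise reduction filter over a size x size window."""
      g = g.astype(float)
      m_local = uniform_filter(g, size)
      var_local = uniform_filter(g ** 2, size) - m_local ** 2
      ratio = np.minimum(noise_var / np.maximum(var_local, 1e-12), 1.0)
      return g - ratio * (g - m_local)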
207
COMPARISON OF ADAPTIVE FILTER WITH
ARITHMETIC AND GEOMETRIC MEAN FILTERS

Best results:

Noise reduction is
comparable but the
restored image is
much sharper!

208
ADAPTIVE MEDIAN FILTER

zmin = min gray level value in Sxy
zmax = max gray level value in Sxy
zmed = median of gray levels in Sxy
zxy = gray level at (x,y)
Smax = max allowed size of Sxy

The filter works in 2 levels:

Level A (to check if zmed is an impulse):
  A1 = zmed − zmin
  A2 = zmed − zmax
  If A1 > 0 and A2 < 0, go to level B
  Else increase the window size
    If window size ≤ Smax, repeat level A
    Else output zxy

Level B (to check if zxy is an impulse):
  B1 = zxy − zmin
  B2 = zxy − zmax
  If B1 > 0 and B2 < 0, output zxy
  Else output zmed

3 main purposes:
1. To remove impulsive noise
2. To provide smoothing of non-impulsive noise
3. To reduce distortion (e.g., excessive thinning or thickening of object boundaries)
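A direct (unoptimized) sketch of the two-level algorithm; adaptive_median is our own name, and window clamping at the image border is an implementation assumption:

  import numpy as np

  def adaptive_median(g, s_max=7):
      """Adaptive median filter; window grows up to s_max x s_max (odd sizes)."""
      g = g.astype(float)
      out = g.copy()
      M, N = g.shape
      for x in range(M):
          for y in range(N):
              s = 3
              while True:
                  r = s // 2
                  w = g[max(0, x - r):x + r + 1, max(0, y - r):y + r + 1]
                  zmin, zmax, zmed = w.min(), w.max(), np.median(w)
                  if zmin < zmed < zmax:              # level A: zmed is not an impulse
                      if zmin < g[x, y] < zmax:       # level B: zxy is not an impulse
                          out[x, y] = g[x, y]
                      else:
                          out[x, y] = zmed
                      break
                  s += 2                              # grow the window
                  if s > s_max:                       # window exceeded Smax
                      out[x, y] = g[x, y]
                      break
      return out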

209
A SIMPLE EXAMPLE ADAPTIVE MEDIAN FILTER

center
point 10 20 20 zmin = 10

20 15 20 zmax = 100

20 25 100 zmed = 20

A1 = 20-10 = 10
A2 = 20-100 = -80
A1 > 0 & A2 < 0: zmin < zmed < zmax Hence, zmed cannot be an impulse.
Go to level B

B1 = 15-10 = 5
B2 = 15-100 = -85
B1 > 0 & B2 < 0: zmin < zxy < zmax Hence, zxy cannot be an impulse.

Output zxy = 15 these intermediate-level points are not changed

(B1 > 0 & B2 < 0) is false: zxy = zmin or zxy = zmax

Output zmed = 20
210
COMPARISON OF MEDIAN AND ADAPTIVE
MEDIAN FILTERS

Median filter Adaptive median filter

Noise reduction is
comparable but the filter
preserved sharpness!

211
FREQUENCY DOMAIN FILTERING
FOR PERIODIC NOISE

‰ Bandreject filters: remove or attenuate a band of


frequencies about the origin of the Fourier Transform.

‰ Bandpass filters: pass or strengthen a band of


frequencies about the origin of the Fourier Transform .

‰ Notch filters: reject or pass frequencies in predefined


neighborhoods about a center frequency.

212
BANDREJECT FILTERS

Ideal bandreject:
  H(u,v) = 1 if D(u,v) < D0 − W/2
           0 if D0 − W/2 ≤ D(u,v) ≤ D0 + W/2
           1 if D(u,v) > D0 + W/2

Butterworth bandreject of order n:
  H(u,v) = 1 / (1 + [D(u,v)W / (D²(u,v) − D0²)]^{2n})

Gaussian bandreject:
  H(u,v) = 1 − e^{−½[(D²(u,v) − D0²) / (D(u,v)W)]²}

A principal application: noise removal in situations where the general location of the noise
components in the frequency domain is approximately known.
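A sketch of the ideal case, following the earlier NumPy conventions (ideal_bandreject is our own name):

  import numpy as np

  def ideal_bandreject(shape, D0, W):
      """Ideal bandreject: zero inside the band [D0 - W/2, D0 + W/2]."""
      M, N = shape
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
      H = np.ones(shape)
      H[(D >= D0 - W / 2) & (D <= D0 + W / 2)] = 0.0
      return H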

213
APPLICATION OF A BANDREJECT FILTER

Restoration
is evident!

Note that it would not be possible to get equivalent results by


spatial domain filtering using small convolution masks!
214
BANDPASS FILTERS

H bp (u , v ) = 1 − H br (u , v )

Performs the
opposite operation
of a bandreject
filter.

Performing straight bandpass filtering on an image is not a common


procedure because it generally removes too much image detail.

Bandpass filtering is quite useful in isolating the effect on an image of


selected frequency bands.

215
APPLICATION OF A BANDPASS FILTER

Most image detail is lost


but the remaining
information is very useful
as it shows a noise
pattern that is close to
that of the noise that
corrupted the image!

Generated by:
• using the bandpass filter corresponding to
the bandreject filter in the previous
example
• taking the inverse transform

216
NOTCH FILTERS

‰ Notch filters must appear in symmetric pairs about the


origin in order to obtain meaningful results.
‰ The notch filter located at the origin is an exception.
‰ The # of pairs of notch filters is arbitrary.
‰ The shape of the notch areas is also arbitrary.
‰ Two classes
 Notch reject filters
 Notch pass filters
‰ Notch reject filters
 Ideal notch reject filters
 Butterworth notch reject filters
 Gaussian notch reject filters
‰ Notch pass filters
 Ideal notch pass filters
 Butterworth notch pass filters
 Gaussian notch pass filters
217
NOTCH REJECT FILTERS

Ideal notch reject:
  H(u,v) = 0 if D1(u,v) ≤ D0 or D2(u,v) ≤ D0
           1 otherwise

Butterworth notch reject of order n:
  H(u,v) = 1 / (1 + [D0² / (D1(u,v) D2(u,v))]^n)

Gaussian notch reject:
  H(u,v) = 1 − e^{−½[D1(u,v) D2(u,v) / D0²]}

Note that these 3 filters become highpass filters if u0 = v0 = 0.

D1(u,v) = [(u − M/2 − u0)² + (v − N/2 − v0)²]^{1/2}
D2(u,v) = [(u − M/2 + u0)² + (v − N/2 + v0)²]^{1/2}
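A sketch of an ideal notch reject filter for one symmetric pair of notches (ideal_notch_reject is our own name):

  import numpy as np

  def ideal_notch_reject(shape, D0, u0, v0):
      """Ideal notch reject with notches of radius D0 at (+-u0, +-v0)."""
      M, N = shape
      u = np.arange(M)[:, None]
      v = np.arange(N)[None, :]
      D1 = np.sqrt((u - M / 2 - u0) ** 2 + (v - N / 2 - v0) ** 2)
      D2 = np.sqrt((u - M / 2 + u0) ** 2 + (v - N / 2 + v0) ** 2)
      return ((D1 > D0) & (D2 > D0)).astype(float)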

218
NOTCH PASS FILTERS

H np (u , v) = 1 − H nr (u , v)

Performs the
opposite operation
of a notch reject
filter.

Note that the 3 filters become lowpass filters if u0 = v0 = 0.

219
APPLICATION OF A NOTCH PASS FILTER

Notch filtering can do a


better job to reduce the
scan lines without
introducing undesirable
blurriness.

We start with a simple


Spectrum of
ideal notch pass filter
the input
(superimposed on the
image
spectrum) to get an idea
about the noise
contribution.

Now that we have a


Inverse suitable notch pass
transform of filter, it can be used to
notch-pass obtain the
filtered result corresponding notch
reject filter. Hence, the
filtered image!

220
LINEAR, POSITION-INVARIANT DEGRADATIONS

‰ Linear, position-invariant techniques


 Many types of degradations can be approximated by linear,
position-invariant processes.
 Tools of linear system theory become available.
‰ Nonlinear, position-dependent techniques
 More general and usually more accurate
 Very difficult to solve or no known solutions
‰ Before restoration: g(x,y) = H[f(x,y)] + η(x,y)
‰ Position-invariant operator
 H[f(x-α, y-β)] = g(x-α, y-β), for any f(x,y), and any α & β.
 The response at any point in the image depends only on the value
of the input at that point, not on its position.
‰ Impulse response of H: H[δ(x-α, y-β)] = h(x,α,y,β)
‰ If the impulse response of a linear system is known, we can
compute the response to any input f!
 g(x,y) = h(x,y)*f(x,y)+ η(x,y) or G(u,v) = H(u,v)F(u,v)+ N(u,v)
 η(x,y): random values, independent of position.
221
DEGRADATION FUNCTION ESTIMATION

‰ Degradations are modeled as being the result of convolution.


‰ Restoration seeks to find filters that apply the process in
reverse.
‰ Linear image restoration: image deconvolution
‰ Restoration filters: deconvolution filters
‰ 3 principal ways to estimate the degradation function
 Observation
 Experimentation
 Mathematical modeling
‰ Image restoration using a degradation function
 Blind deconvolution
 The true degradation function is seldom known completely.

222
ESTIMATION BY IMAGE OBSERVATION

‰ Suppose we are given a degraded image without any


knowledge about the degradation function H.
‰ One way to estimate the function is to gather information
from the image itself.
‰ Estimation of the degradation function in a blurred image
 Look for areas of strong signal content (to reduce the effect
of noise).
 Choose a small section of the image with simple structures
(e.g., part of an object and the background).
 Construct an unblurred image of the same size and
characteristics of the observed subimage.
gs(x,y): observed subimage;  f̂s(x,y): constructed (unblurred) subimage

  Hs(u,v) = Gs(u,v) / F̂s(u,v)

 Suppose a plot of Hs(u,v) looks like a Butterworth filter.
 Construct a function H(u,v) on a larger scale but with the same shape.

223
ESTIMATION BY EXPERIMENTATION

‰ If equipment similar to the equipment used to acquire the
degraded image is available, it is possible to obtain an accurate
estimation of the degradation.
‰ Estimation of the degradation function
 Use various system settings until the image is degraded as
closely as possible to the image we want to restore.
 Obtain the impulse response of the degradation by imaging
an impulse.
 H(u,v) = G(u,v)/A

224
ESTIMATION BY MODELING

‰ Provides insight into the restoration problem.


‰ In some cases, models take into account environmental
conditions.

H(u,v) = e^{−k(u² + v²)^{5/6}}

A degradation model
based on the
k = 0.0025 physical
characteristics of
atmospheric
turbulence.

Sometimes, Gaussian
LPF is used to model
mild, uniform
k = 0.001 k = 0.00025 blurring.

225
AN EXAMPLE OF MODELING

  H(u,v) = [T / π(ua + vb)] sin[π(ua + vb)] e^{−jπ(ua + vb)}

Model derived from an image that has been blurred by uniform linear motion between the image and the sensor.

226
INVERSE FILTERING

‰ Direct inverse filtering: The simplest approach to restoration.


‰ The following derivation shows that we cannot recover the
undegraded image exactly because N(u,v) is a random
function whose Fourier Transform is not known!
  F̂(u,v) = G(u,v) / H(u,v)

  G(u,v) = H(u,v) F(u,v) + N(u,v)
  ⇒ F̂(u,v) = F(u,v) + N(u,v) / H(u,v)

‰ More bad news: If the degradation has zero or very small


values, the ratio N(u,v)/H(u,v) may dominate the estimate.
 To circumvent the problem, we can limit the filter frequencies to
values near the origin.
 H(0,0) = Mean of h(x,y): usually the highest value of H(u,v).
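A sketch of direct inverse filtering with the cutoff near the origin (inverse_filter is our own name; a Butterworth window, as in the next example, could replace the hard radial cutoff used here):

  import numpy as np

  def inverse_filter(g, H, radius):
      """Inverse filtering, limited to frequencies within `radius` of the center."""
      M, N = g.shape
      G = np.fft.fftshift(np.fft.fft2(g))
      u = np.arange(M) - M / 2
      v = np.arange(N) - N / 2
      D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
      F_hat = np.where(D <= radius, G / (H + 1e-12), 0.0)
      return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))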
227
AN EXAMPLE OF INVERSE FILTERING

Degradation function:

  H(u,v) = e^{−k[(u − M/2)² + (v − N/2)²]^{5/6}}, with k = 0.0025

The cutoff was implemented


by applying a Butterworth
lowpass function of order 10.

228
MINIMUM MEAN SQUARE ERROR (WIENER)
FILTERING

‰ Inverse filtering makes no explicit provision for handling


noise.
‰ Now we consider both images and noise as random
processes.
ˆ
‰ The objective: Find an estimate f of the uncorrupted image f
such that the mean squared error between them is minimized.
‰ Assumptions
 The image and the noise are uncorrelated.
 The image or the noise has zero mean.
 The gray levels in the estimate are a linear function of the levels
in the degraded image.
‰ The restored image is given by the inverse Fourier transform
of Fˆ (u , v) .
‰ Noise = zero Ö Wiener filter = inverse filter!

229
DERIVATION OF THE WIENER FILTER

Minimize e² = E{(f − f̂)²}

  ⇒ F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / ( |H(u,v)|² + Sη(u,v)/Sf(u,v) ) ] G(u,v)

H (u, v) = transform of the degradatio n function


G (u,v) = transform of the degraded image
Sη (u , v) =| N (u , v) |2 = power spectrum of the noise
S f (u , v) =| F (u, v) |2 = power spectrum of the undegraded image

If the 2 power spectra are not known, use Sη(u,v)/Sf(u,v) = K (a constant).
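A sketch of the parametric Wiener filter with the constant-K approximation (wiener is our own name; note that (1/H)|H|²/(|H|²+K) simplifies to conj(H)/(|H|²+K)):

  import numpy as np

  def wiener(g, H, K):
      """Parametric Wiener filter with S_eta/S_f approximated by constant K."""
      G = np.fft.fftshift(np.fft.fft2(g))
      F_hat = (np.conj(H) / (np.abs(H) ** 2 + K)) * G
      return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))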

230
AN EXAMPLE OF WIENER FILTERING

Degraded input image

K was chosen
interactively to yield
the best possible visual
results.

Previous results for comparison

231
AN EXAMPLE OF WIENER FILTERING

Blurred image with


additive Gaussian
noise (mean=0,
variance=650)

Noise variance is
reduced by one
order of magnitude

Noise variance is
reduced by five
orders of magnitude

Wiener filter with

H(u,v) = e^{−k[(u − M/2)² + (v − N/2)²]^{5/6}}

where k = 0.0025.

K was chosen interactively.


232
CHAPTER 6: COLOR IMAGE PROCESSING

‰ The use of color image processing is motivated by two


factors:
 Color is a powerful descriptor that often simplifies object
identification and extraction from a scene.
 Humans can discern thousands of color shades and intensities
(compared to ~only 2 dozen shades of gray!).
‰ Two major areas:
 Full-color processing: images are acquired with a full-color
sensor.
 Pseudo-color processing: a color is assigned to a particular
monochrome intensity or range of intensities.
‰ In the past, most digital image color processing was done
at the pseudo-color level.
‰ Today, full-color image processing techniques are used in
a broad range of applications.

233
COLOR FUNDAMENTALS

‰ Perception and interpretation of color is a


physiopsychological phenomenon that is not fully
understood.
‰ Sir Isaac Newton's discovery in 1666.
‰ Color spectrum can be divided into 6 broad regions:
 Violet, blue, green, yellow, orange, and red.
‰ The colors that human perceive in an object are
determined by the nature of the light reflected from that
object.
 Green objects reflect light with wavelengths in 500-570nm
while absorbing most of the energy at other wavelengths.
‰ Characterization of light is central to the science of color.
 Achromatic light: intensity is the only attribute.
 Chromatic light: spans the EM spectrum from 400nm to
700nm.

234
COLOR SPECTRUM

235
CHROMATIC LIGHT

‰ 3 basic quantities to describe the quality of a chromatic light


source:
 Radiance (W): total amount of energy that flows from the light
source.
 Luminance (lm): the amount of energy an observer perceives
from a light source.
 Brightness: a subjective descriptor that is impossible to measure
in practice.
‰ Example: light emitted from a source operating in the far
infrared region.
 Radiance: significant!
 Luminance: hardly perceived!
‰ The cones are responsible for color vision.
 65% are sensitive to red light.
 33% are sensitive to green light.
 2% are sensitive to blue light (blue cones are the most sensitive)

236
VISIBLE SPECTRUM

237
ABSORPTION OF LIGHT BY THE CONES

No single color may


be called red, green,
or blue.

238
PRIMARY AND SECONDARY COLORS OF LIGHT

Primary colors are


mixed to produce
secondary colors of
light (magenta, cyan,
yellow).

Secondary colors are


mixed to produce
primary colors of light
(red, green, blue).

239
COLOR TV

A color TV screen differs from a black-and-white


screen in three ways:

There are three electron beams that move


simultaneously across the screen. They are named
the red, green and blue beams.

The screen is not coated with a single sheet of


phosphor as in a black-and-white TV. Instead, the
screen is coated with red, green and blue
phosphors arranged in dots or stripes. If you turn
on your TV or computer monitor, and look closely
at the screen with a magnifying glass, you will be
able to see the dots or stripes.

On the inside of the tube, very close to the


phosphor coating, there is a thin metal screen
called a shadow mask. This mask is perforated with
very small holes that are aligned with the phosphor
dots (or stripes) on the screen.
When a color TV needs to create a red dot, it fires
the red beam at the red phosphor. Similarly for
green and blue dots.

To create a white dot, red, green and blue beams


are fired simultaneously -- the three colors mix
together to create white.

To create a black dot, all three beams are turned


off as they scan past the dot. All other colors on a
TV screen are combinations of red, green and
blue.

240
CIE CHROMATICITY DIAGRAM

A straight line
segment joining
any 2 points
defines all the
different color
variations that can
Any point within be obtained by
the diagram mixing these 2
colors additively.
represents some
mixture of
spectrum colors.

Points on the
boundary are
pure colors in
the visible Any color in the
spectrum. triangle can be
produced by
various
combinations of
the corner colors.

241
COLOR MODELS

‰ A color model is a specification of colors in a standard


way.
 A coordinate system and a subspace where each color is
represented by a single point.
‰ Most color models are oriented toward hardware or
applications.
‰ Hardware oriented models for DIP
 RGB (red, green, blue) model for color monitors and color
video cameras
 CMY (cyan, magenta, yellow) model for color printing
 CMYK (cyan, magenta, yellow, black) model for color
printing
 HSI (hue, saturation, intensity) decouples the color and
gray-scale information in an image.

242
RGB MODEL

Images represented
with the RGB color
model have 3
component images:

Red component
Green component
Blue component It is assumed that all color
values are normalized.
If 8 bits are used for
each pixel, we have Each color is represented by a
a 24-bit RGB image. point in or on the unit cube.

243
RGB 24-BIT COLOR CUBE

(28)3 = 16,777,216 colors

244
COLOR PLANES

A color image is
acquired using the
process in reverse
order.

R = 127
G = 0-255
B = 0-255

245
SAFE RGB COLORS

246
RGB SAFE-COLOR CUBE

Each surface has 36 colors: 6x36 = 216

247
CMY & CMYK COLOR MODELS

Secondary colors of light : Cyan, magenta, yellow


(primary colors of pigments)

Most devices (color printers, copiers, etc.) that deposit color pigments on
paper require CMY data input or perform an internal RGB to CMY
conversion.

RGB to CMY conversion (all color values are in the range [0,1]):

  [C]   [1]   [R]    light reflected from a surface coated with pure cyan does not contain red
  [M] = [1] − [G]    light reflected from a surface coated with pure magenta does not contain green
  [Y]   [1]   [B]    light reflected from a surface coated with pure yellow does not contain blue

Equal amounts of cyan, magenta, and yellow should produce black.


In practice, this combination produces muddy-looking black.
In order to produce true black, a fourth color is added to the model.

CMYK color model: cyan, magenta, yellow, and black

248
HSI COLOR MODEL

‰ RGB and CMY are suitable for hardware implementations.


‰ Unfortunately, they are not good for describing colors for
human interpretation.
‰ One does not refer to the color of a car by giving the % of
each of the primaries!
‰ Humans describe a color object by its hue, saturation, and
brightness.
 Hue: a color attribute that describes a pure color.
 Saturation: gives a measure of the degree to which pure color
is diluted by white light.
 Brightness: a subjective descriptor that is practically
impossible to measure.
‰ The HSI model decouples the intensity component from the
color-carrying information (hue & saturation).

249
CONCEPTUAL RELATIONSHIP BETWEEN RGB
AND HSI COLOR MODELS

In this plane segment, all the


points have the same hue but
different saturation and intensity.
This line is the (1,1,1)
intensity axis
joining black
and white
vertices.

The intensity
component of
this color As the planes moves up
point can be and down, the boundaries
determined defined by the
intersection of each plane
by passing a with the faces of the cube
plane (0,0,0) have either a triangular or
perpendicular hexagonal shape.
to the axis
and
containing
the point. Conclusion: H, S, and I values required to form the HSI
space can be obtained from the RGB cube.

250
HUE AND SATURATION IN THE HSI MODEL

View obtained by
looking at the
RGB cube down
its gray-scale It is not unusual to see
axis. HSI planes defined in
terms of a hexagon, a
triangle, or even a circle.

251
TRIANGULAR AND CIRCULAR COLOR PLANES
IN THE HSI MODEL

mid-point of the
vertical intensity axis

252
CONVERTING COLORS FROM RGB TO HSI

The RGB values have been normalized to the range [0,1].

H = θ if B ≤ G
    360 − θ if B > G

with θ = cos⁻¹{ ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^{1/2} }

(H can be normalized to the range [0,1] by dividing all the values by 360.)

S = 1 − 3 min(R, G, B) / (R + G + B)

I = (R + G + B) / 3
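A vectorized sketch of these formulas (rgb_to_hsi is our own name; the clip and epsilon guards are numerical assumptions):

  import numpy as np

  def rgb_to_hsi(r, g, b):
      """RGB components in [0,1] to HSI; H is returned in degrees [0, 360)."""
      num = 0.5 * ((r - g) + (r - b))
      den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
      theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
      h = np.where(b <= g, theta, 360.0 - theta)
      s = 1.0 - 3.0 * np.minimum(np.minimum(r, g), b) / (r + g + b + 1e-12)
      i = (r + g + b) / 3.0
      return h, s, i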

253
CONVERTING COLORS FROM HSI TO RGB

The HSI values are given in the interval [0,1].

The applicable equations depend on the values of H.

RG sector (0° ≤ H < 120°):
  B = I(1 − S)    R = I[1 + S cos H / cos(60° − H)]    G = 3I − (R + B)

GB sector (120° ≤ H < 240°): first subtract 120° from H.
  R = I(1 − S)    G = I[1 + S cos H / cos(60° − H)]    B = 3I − (R + G)

BR sector (240° ≤ H ≤ 360°): first subtract 240° from H.
  G = I(1 − S)    B = I[1 + S cos H / cos(60° − H)]    R = 3I − (G + B)
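A per-pixel sketch of the sector formulas (hsi_to_rgb is our own name; H is assumed in degrees):

  import math

  def hsi_to_rgb(h, s, i):
      """HSI to RGB for one pixel, following the three-sector formulas."""
      if h < 120:                         # RG sector
          b = i * (1 - s)
          r = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          g = 3 * i - (r + b)
      elif h < 240:                       # GB sector
          h -= 120
          r = i * (1 - s)
          g = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          b = 3 * i - (r + g)
      else:                               # BR sector
          h -= 240
          g = i * (1 - s)
          b = i * (1 + s * math.cos(math.radians(h)) / math.cos(math.radians(60 - h)))
          r = 3 * i - (g + b)
      return r, g, b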

254
PSEUDO-COLOR IMAGE PROCESSING

‰ Pseudo-color image processing: assignment of colors


to gray values based on a specified criterion.
‰ The principal use of pseudo-color is for human
visualization and interpretation of gray-scale events in
images.
‰ A principal motivation for using color is that humans can
discern thousands of color shades and intensities!
‰ Intensity slicing and color coding is a simple example of
pseudo-color image processing.
 The image is a 3-D function.
 Planes parallel to the coordinate plane are used.
‰ More general transformations achieve a wider range of
pseudo-color enhancement results.

255
INTENSITY SLICING

A different color is
assigned to each side of
the plane.

Algorithm:

[0, L−1]: gray scale; l0: black, lL−1: white

P planes: l1, l2, …, lP

The P planes partition the gray scale into P+1 intervals: V1, V2, …, VP+1

Color assignment: f(x,y) = ck if f(x,y) ∈ Vk
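A two-line sketch of intensity slicing with NumPy (intensity_slice is our own name; levels are the P slicing planes, colors an array of P+1 RGB triplets):

  import numpy as np

  def intensity_slice(f, levels, colors):
      """Assign color c_k to gray levels falling in interval V_k."""
      idx = np.digitize(f, levels)        # interval index for each pixel
      return np.asarray(colors)[idx]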

256
INTENSITY SLICING INTO 8 COLORS

A different color is assigned to each


region without regard for the
meaning of the gray levels in the
image.

257
INTENSITY SLICING INTO 2 COLORS

When there is a porosity or crack


in a weld, the full strength of the
X-rays going through the object
saturates the sensor on the other
side of the object.

Hence, gray levels of value 255 in


an 8-bit image coming from such
a system automatically imply a
problem with the weld!

A simple color coding:


One color to level 255.
Another color to all other gray
levels.

If a human is the ultimate


judge in inspecting welds,
this simple color coding
would result in lower error
rates!

258
INTENSITY SLICING INTO MULTIPLE COLORS

Average monthly
rainfall over a
period of 3 years

Much
easier to
interpret

259
3 INDEPENDENT COLOR TRANSFORMATIONS

260
PSEUDO-COLOR ENHANCEMENT: AN EXAMPLE

Why do we have the same Why do we have the same


color for the explosives and color for the explosives and
background? garment bag?

261
COMBINATION OF SEVERAL MONOCHROME
IMAGES INTO A SINGLE COLOR COMPOSITE

262
3 MONOCHROME IMAGES ARE COMBINED

visible red
visible green

visible blue near infrared

First 3 images
are combined Red component was
into an RGB replaced with the
image infrared image

Note that the infrared band


is strongly responsive to
the biomass components in
a scene

263
COMBINING IMAGES FROM A SPACECRAFT

One way to
combine the
sensed image
data is by how
they show This image was
differences in obtained by combining
surface chemical several of the sensor
composition. images from the Galileo
spacecraft.

Bright red
depicts material An analysis of
newly ejected individual images
from an active would not convey
volcano. similar information.

Surrounding
yellow materials
are older sulfur
deposits.

264
BASICS OF FULL-COLOR IMAGE PROCESSING

Two major categories of full-


color IP approaches
Example:
Each component image is
processed individually, and a Suppose the process is neighborhood
composite color image is averaging.
formed from the components.
Same result would be obtained using the
Color pixels (which are vectors) scalar and vector methods.
are directly processed.

Two conditions have to be


satisfied for per-color-
component and vector-based
processing to be equivalent.

The process has to be


applicable to both scalars and
vectors.

The operation on each


component of a vector must be
independent of the other
components.
265
COLOR TRANSFORMATIONS - FORMULATION

‰ g(x,y) = T[f(x,y)]: pixels values are triplets or quartets.


‰ si = Ti(r1,r2,…,rn):
 si and ri: variables denoting the color components
 {T1,T2,…,Tn}: set of transformation functions
 For RGB color space, n=3.
 For CMYK color space, n=4.
‰ Example:
 g(x,y)=kf(x,y), 0<k<1: intensity modification
 Any color space can be used.
 HSI color space: s3=kr3, s1=r1, s2=r2
 RGB color space: si=kri, i=1,2,3
 CMY color space: si=kri+(1-k), i=1,2,3
 In this case, although the HSI transformation involves the
fewest # of operations, the computations required to convert
an RGB or CMY(K) image more than offset the advantage!

266
COLOR-SPACE COMPONENTS OF A
FULL-COLOR IMAGE

CMYK

RGB

HSI

267
MODIFIED INTENSITY OF THE FULL-COLOR
IMAGE

Labels (H,S) and I


should be switched.

268
COLOR COMPLEMENTS

The hues directly opposite


one another on the color
circle are called
complements.

Complements are
analogous to gray-scale
negatives: they are useful
in enhancing detail
embedded in dark region of
a color image.

The RGB complement transformation functions


used here do not have a straightforward HSI color
space equivalent.

269
COLOR SLICING

Highlighting a specific range of colors in an image is useful for


separating objects from their surroundings.

The most straightforward approach is to extend the gray-level slicing


techniques.

si = Ti(r1,r2,…,rn): si is a function of all ri.

Colors of interest are enclosed by a cube:

  si = 0.5 if |rj − aj| > W/2 for any 1 ≤ j ≤ n
       ri  otherwise
  (i = 1, 2, …, n)

Colors of interest are enclosed by a sphere:

  si = 0.5 if ∑_{j=1}^{n} (rj − aj)² > R0²
       ri  otherwise
  (i = 1, 2, …, n)

The width of the cube and the radius of the sphere were determined
interactively.
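A sketch of the cube case for a normalized RGB image (color_slice_cube is our own name; the 0.5 fill is the neutral gray from the formulas above):

  import numpy as np

  def color_slice_cube(img, a, W):
      """Keep colors inside a cube of width W centered at prototype a."""
      outside = np.any(np.abs(img - np.asarray(a)) > W / 2, axis=-1)
      out = img.copy()
      out[outside] = 0.5                  # neutral gray outside the cube
      return out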
270
AN EXAMPLE OF COLOR SLICING

Edible parts of the strawberries are separated!

In each of the 2 cases below, a prototype red with RGB color


coordinate (0.6863,0.1608,0.1922) was selected from the most
prominent strawberry.

Reds within an Reds within an


RGB cube RGB sphere

271
HISTOGRAM PROCESSING

It is generally unwise to
histogram equalize the
components of a color
image independently.
This results in erroneous
color.

A more logical approach


is to spread the color
intensities uniformly,
leaving the colors
themselves (e.g., hues)
unchanged.

The HSI color space is


ideally suited to this
approach.

The intensity The saturation component is


component is increased after histogram
histogram equalized. equalization
272
COLOR IMAGE SMOOTHING

1
c ( x, y ) =
K
∑ c ( x, y )
( x , y )∈S xy

1
K
∑ R ( x, y )
( x , y )∈S xy
Smoothing by
neighborhood
1 averaging can be
c ( x, y ) = K
∑ G ( x, y ) carried out using
( x , y )∈S xy either individual color
planes or the RGB
1 color vectors.

K
∑ B ( x, y )
( x , y )∈S xy

273
AN RGB IMAGE AND ITS COLOR PLANES

274
HSI PLANES

275
SMOOTHED IMAGES

By smoothing only the intensity plane,


the pixels in the smoothed image
maintain their original hue and
saturation – and their original color!

276
COLOR IMAGE SHARPENING

∇ 2 R ( x, y )
The Laplacian of a full-
color image can be
∇ 2 [c( x, y )] = ∇ 2 G ( x, y ) obtained by computing
the Laplacian of each
component plane
separately.
∇ 2 B ( x, y )
The Laplacian

277
SHARPENED IMAGES

Obtained by combining the Laplacian of the


intensity plane with the unmodified hue and
saturation planes.

278
COLOR SEGMENTATION IN HSI COLOR SPACE

Assume we want to segment an


image based on color, and to
carry out the process on
individual planes.

HSI space:

Hue plane conveniently


represents the color.

Saturation plane is used as a


masking image.

Intensity plane is seldom used


because it carries no color
information. The product
image is
thresholded
The binary mask is generated by with T = 0.9
thresholding the saturation plane
with T = 0.1x(maximum value in
the saturation plane).

279
COLOR SEGMENTATION IN RGB COLOR SPACE

RGB color vectors generally result in better segmentation results.

A set of sample points representative of the colors of interest is given.

Obtain an estimate of the average color that will be segmented.

Classify each pixel in the given image according to the specified range.

One measure of similarity is the Euclidean distance:

D(z,a) = ||z − a|| = [(z − a)ᵀ(z − a)]^{1/2} = [(zR − aR)² + (zG − aG)² + (zB − aB)²]^{1/2}

Segment the pixels satisfying D(z,a) ≤ D0. A generalization is D(z,a) = [(z − a)ᵀ C⁻¹ (z − a)]^{1/2} ≤ D0, with C the covariance matrix of the samples; the enclosing box of the next slide is computationally much simpler.
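A one-line sketch of the Euclidean-distance test (segment_by_color is our own name; a is the mean color vector):

  import numpy as np

  def segment_by_color(img, a, D0):
      """Mask of pixels whose Euclidean distance to the mean color a is <= D0."""
      return np.linalg.norm(img - np.asarray(a), axis=-1) <= D0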

280
AN EXAMPLE OF COLOR SEGMENTATION
IN RGB COLOR SPACE

Compute the mean vector a


using the points contained
within the rectangle.

Center the box at a.

Compute the standard


deviation of the R,G,B values
of the sample points.

Determine the dimension


along each axis by
computing 1.25xSD.

σR: SD of red components

Dimension along the R-axis:

(aR – 1.25 σR) to (aR + 1.25σR)

281
COLOR EDGE DETECTION

The gradient introduced in Chapter 3 is not defined for vector quantities.

So, computing the gradient on individual planes, and then using the
results to form a color image will lead to erroneous results.

We need to define the gradient for c(x,y) = [R(x,y) G(x,y) B(x,y)]T

A method has been proposed by Di Zenzo in 1986:

Direction of max rate of change:
  θ = ½ tan⁻¹[ 2gxy / (gxx − gyy) ]

Value of the rate of change:
  F(θ) = { ½[ (gxx + gyy) + (gxx − gyy) cos 2θ + 2gxy sin 2θ ] }^{1/2}
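A sketch of the Di Zenzo gradient for an RGB image (color_gradient is our own name; gxx, gyy, gxy are the inner products of the per-channel partials, and arctan2 replaces tan⁻¹ for numerical robustness):

  import numpy as np

  def color_gradient(img):
      """Di Zenzo color gradient: direction theta and magnitude F(theta)."""
      dy = np.gradient(img, axis=0)       # per-channel partial derivatives
      dx = np.gradient(img, axis=1)
      gxx = np.sum(dx * dx, axis=-1)
      gyy = np.sum(dy * dy, axis=-1)
      gxy = np.sum(dx * dy, axis=-1)
      theta = 0.5 * np.arctan2(2 * gxy, gxx - gyy)
      F = np.sqrt(np.maximum(0.5 * ((gxx + gyy) + (gxx - gyy) * np.cos(2 * theta)
                                    + 2 * gxy * np.sin(2 * theta)), 0.0))
      return theta, F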

282
AN EXAMPLE OF COLOR EDGE DETECTION
USING 2 APPROACHES

The edge
detail is
more
complete!

Both
approaches
yielded
reasonable
results. Is the
extra detail
worth the added
computational
burden of the
vector
approach?

283
COMPONENT GRADIENT IMAGES

284
NOISE IN COLOR IMAGES

The noise models discussed in Chapter 5 are applicable to color


images.

Gaussian noise
Rayleigh noise
Erlang noise
Exponential noise
Uniform noise
Impulse (salt and pepper) noise

Usually, the noise content of a color image has the same


characteristics in each color channel.

However, it is possible for color channels to be affected differently


by noise.

285
GAUSSIAN NOISE IN A COLOR IMAGE

Fine grain noise tends


to be less noticeable in
color images.

286
NOISY RGB IMAGE CONVERTED TO HSI

The degradation in hue and saturation planes is due to the


nonlinearity of cos and min operations!

The intensity
plane is
slightly
smoother than
any of the 3
RGB noisy
planes.
1
I = ( R + G + B)
3
Compare

Image
averaging
reduces
random noise!

287
ONE NOISY RGB CHANNEL
AFFECTS ALL HSI PLANES

Salt & pepper noise

Probability of salt: 0.05


Probability of pepper: 0.05

The noise spreads


from the green
channel to all the HSI
planes.

288
EXCEPTIONS FOR COLOR VECTOR
PROCESSING

Filtering of full-color images can be carried out in 2 ways:

1. Per image basis


2. Color vector space

Example: Noise reduction with an averaging filter.

Other filters cannot be formulated in this manner.

Example: The class of order-statistics filters (which are non-linear)

How do we implement a median filter in color vector space?


How do we find a scheme for ordering vectors so that the median makes sense?
The process is considerably more complex when dealing with vectors!

The Plataniotis & Venetsanopoulos book is a good reference on:

• Vector ordering
• Some of the filters based on the ordering concept

289
COLOR IMAGE COMPRESSION

Compression reduces
the amount of data
required to represent a
digital image.

The data that are the


object of any
compression are the
components of each
color pixel.

Compressed with
JPEG 2000.

The compressed
image contains
only 1 data bit for
every 230 bits of
data in the original
image.

290
CHAPTER 7: IMAGE COMPRESSION

‰ Every day, an enormous amount of digital information is


stored, processed and transmitted.
‰ The entire catalog of the Library of Congress (the
world’s largest library) is available electronically.
‰ The storage and communications requirements for
information storage are immense!
‰ Image compression addresses the problem of reducing
the amount of data required to represent a digital image.
‰ Two broad categories of compression techniques:
 Lossless techniques
„ Useful in image archiving (e.g., storage of legal or medical
records)
„ No information is lost.
 Lossy techniques
„ Useful in applications such as broadcast TV,
videoconferencing, and facsimile transmission.
„ A certain amount of information loss is acceptable.

291
MULTIMEDIA STORAGE REQUIREMENTS
(WINDOW OF 640X480 PIXELS)

‰ Text
 2 bytes for every 8x8 pixel character
 # of characters/page = (640x480)/(8x8) = 4,800
 Storage/screen page = 4,800x2 = 9,600 bytes = 9.4 KB
‰ Vector images
 A typical image with 500 lines
 Each line is defined by its coordinates in x & y directions,
and by an 8-bit attribute field.
 Coordinates in the x direction require 10 bits: log2(640).
 Coordinates in the y direction require 9 bits: log2(480).
 Bits/line = 9 + 10 + 9 + 10 + 8 = 46 bits.
 Storage/screen page = 500x46/8 = 2,875 bytes = 2.8 KB
‰ Bit-mapped images
 256 different colors
 Storage/screen page = 640x480x1 = 307,200 bytes = 300 KB

292
MULTIMEDIA STORAGE REQUIREMENTS
(WINDOW OF 640X480 PIXELS)

‰ Speech of telephone quality


 Sampled at 8 kHz, and quantized using 8 bits/sample => 64 Kbits/s
 Storage/second = 64 Kbits / 8 = 8 KB
‰ Stereo audio of CD quality
 Sampled at 44.1 kHz, and quantized using 16 bits/sample.
 Storage/second = (2x44,100x16)/8 = 176,400 bytes = 172 KB
‰ Video
 25 frames/second
 1 luminance layer, 2 chrominance layers
 PAL standard: 625 lines x 833 pixels
 CCIR 601 (studio standard)
„ Luminance (Y) sampled at 13.5 MHz
„ Chrominance (R-Y & B-Y) sampled at 6.75 MHz
 Storage/second = 640x480x25x3 = 23,040,000 bytes = 22,500 KB

293
JPEG

‰ Joint Photographic Experts Group (JPEG)


‰ A standardized lossy image compression algorithm
‰ Based on DCT (Discrete Cosine Transform)
‰ Designed for compressing either full-color or gray-scale
images of natural, real-world scenes.
‰ GIF compression
 Best for images with flat areas of color (e.g., logos, line
drawings, simple cartoons).
 Lossless
 Based on LZW (Lempel-Ziv-Welsh)
‰ Each color layer is handled separately.
‰ The resolution of individual layers may be different.
‰ Video color models: YUV, YIQ, YCbCr
‰ Chroma subsampling: 4:4:4, 4:2:2, 4:2:0

294
FOUR MODES OF OPERATIONS

‰ Sequential baseline
 A simple and efficient algorithm.
 Adequate for most applications.
 The image is scanned in a raster scan fashion l-to-r/t-to-b.
‰ Progressive
 The image is encoded in multiple scans at the same spatial
resolution.
‰ Hierarchical
 The image is encoded at multiple spatial resolutions.
 Lower resolution images may be displayed without having to
decompress the image at a higher spatial resolution.
 Can be implemented using sequential, progressive, or lossless
modes.
‰ Lossless
 The image is encoded to guarantee exact recovery of every sample
value.
 Compression efficiency is inherently lower than those of lossy
methods.
295
SEQUENTIAL BASELINE

DCT Quantization
Color
components
8x8
blocks

Quantization
tables
Coding
tables

Header
Tables
DPCM
Entropy
coding
Data RLC

296
DCT & INVERSE DCT

DCT:

  F(u,v) = (2/N) C(u) C(v) ∑_{i=0}^{N−1} ∑_{j=0}^{N−1} f(i,j) cos[(2i+1)uπ / 2N] cos[(2j+1)vπ / 2N]

Inverse DCT:

  f̃(i,j) = (2/N) ∑_{u=0}^{N−1} ∑_{v=0}^{N−1} C(u) C(v) F(u,v) cos[(2i+1)uπ / 2N] cos[(2j+1)vπ / 2N]

where C(x) = 1/√2 if x = 0, and C(x) = 1 if x > 0. (The 2/N prefactor makes the pair orthonormal for this choice of C.)
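SciPy's orthonormal 2-D DCT matches this convention; a sketch of the round trip on a random 8x8 block (the level-shift value 128 follows the case study below):

  import numpy as np
  from scipy.fft import dctn, idctn

  block = np.random.randint(0, 256, (8, 8)).astype(float)
  F = dctn(block - 128, norm='ortho')      # level shift, then forward 2-D DCT
  rec = idctn(F, norm='ortho') + 128       # inverse DCT restores the block
  assert np.allclose(rec, block)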

297
A CASE STUDY: TEST IMAGE

298
A CASE STUDY: AN 8X8 BLOCK

52 55 61 66 70 61 64 73

63 59 66 90 109 85 69 72

62 59 68 113 144 104 66 73

63 58 71 122 154 106 70 69

67 61 68 104 126 88 68 70

79 65 60 70 77 68 58 75

85 71 64 59 55 61 65 83

87 79 69 68 65 76 78 94

299
A CASE STUDY: LEVEL SHIFTING

The quantity 2^{n−1} is subtracted from each pixel value (n = 8 ⇒ 2^{n−1} = 128).

-76 -73 -67 -62 -58 -67 -64 -55
-65 -69 -62 -38 -19 -43 -59 -56
-66 -69 -60 -15  16 -24 -62 -55
-65 -70 -57  -6  26 -22 -58 -59
-61 -67 -60 -24  -2 -40 -60 -58
-49 -63 -68 -58 -51 -65 -70 -53
-43 -57 -64 -69 -73 -67 -63 -45
-41 -49 -59 -60 -63 -52 -50 -34

300
A CASE STUDY: APPLICATION OF DCT

-415 -29 -62 25 55 -20 -1 3

7 -21 -62 9 11 -7 -6 6

-46 8 77 -25 -30 10 7 -5

-50 13 35 -15 -9 6 0 3

11 -8 -13 -2 -1 1 -4 1

-10 1 3 -3 -1 0 2 -1

-4 -1 2 -1 2 -3 1 -2

-1 -1 -1 -2 -1 -1 0 -1

301
A CASE STUDY: NORMALIZATION MATRIX

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

Z (u , v) 18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

302
A CASE STUDY: QUANTIZATION

DCT coefficients are quantized using the formula
  T̂(u,v) = round[ T(u,v) / Z(u,v) ]

-26  -3  -6   2   2   0   0   0
  1  -2  -4   0   0   0   0   0
 -3   1   5  -1  -1   0   0   0
 -4   1   2  -1   0   0   0   0
  1   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

(38 consecutive zeros in the zigzag ordering!)
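The quantize/denormalize pair is two lines of NumPy; the matrices below are stand-ins for the case-study DCT block T and normalization matrix Z above:

  import numpy as np

  Z = np.full((8, 8), 16.0)            # stand-in for the normalization matrix
  T = np.random.randn(8, 8) * 100      # stand-in for a DCT coefficient block
  T_hat = np.round(T / Z)              # encoder: quantization (lossy step)
  T_approx = T_hat * Z                 # decoder: denormalization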

303
A CASE STUDY: ZIGZAG REORDERING

[-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]

a special EOB
Huffman code word
indicates that the
remainder of the
coefficients are
zeros.

304
A CASE STUDY: PREPARATION FOR
ENTROPY CODING

‰ Entropy coding is lossless.


‰ DC and AC coefficients are treated differently.
‰ Differential Pulse Code Modulation (DPCM) on DC coefficients
 Each DPCM-coded DC coefficient is represented by a pair of
symbols: (CATEGORY, AMPLITUDE)
 CATEGORY: indicates the # of bits needed to represent the
coefficient.
 AMPLITUDE: contains the actual bits.
 The 1’s complement notation is used for negative numbers.
‰ Run-length coding (RLC) on AC coefficients
 RLC replaces each AC coefficient by a pair (RUNLENGTH, VALUE)
 RUNLENGTH: indicates the # of zeros in the run.
 VALUE: the next nonzero coefficient.
 The special pair (0,0) indicates the EOB after the last nonzero AC
coefficient.
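The CATEGORY symbol is just the bit length of the magnitude; a sketch (category is our own name):

  def category(value):
      """JPEG coding category: number of bits needed to represent |value|."""
      return 0 if value == 0 else int(abs(value)).bit_length()

  assert category(-9) == 4 and category(-3) == 2   # matches the case study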

305
A CASE STUDY: DPCM ON DC COEFFICIENTS

‰ Each 8x8 block has only one DC coefficient


‰ The DC coefficient is unlikely to change drastically within
a short distance.
‰ DPCM is an ideal scheme for coding the DC coefficients
‰ d0 = DC0, di = DCi+1 - DCi
‰ Coding of the first difference
 -26 (-17) = -9
 From Table 8.17, -9 lies in category 4
 For category K, additional K bits are needed.
 Positive difference: K LSBs
 Negative difference: K LSBs – 1 => 0111 – 1 = 0110
 DPCM coded DC code word = 1010110 (category base code 101 followed by amplitude 0110)

306
A CASE STUDY: RLC ON AC COEFFICIENTS

‰ Each 8x8 block has 63 AC coefficients.


‰ Most image blocks tend to have small high frequency
coefficients.
‰ These small coefficients are zeroed out by quantization.
‰ Coding of the first AC coefficient
 (RUNLENGTH, VALUE) = (0,-3)
 From Table 8.17, -3 lies in category 2
 From Table 8.19, (0/2) maps to 01
 For category K, additional K bits are needed.
 Positive difference: K LSBs
 Negative difference: K LSBs – 1 => 01-1 = 00
 RLC coded AC code word = 0100

307
A CASE STUDY: CODING CATEGORIES

308
A CASE STUDY: DC BASE CODES

309
A CASE STUDY: AC BASE CODES

310
A CASE STUDY: AC BASE CODES

311
A CASE STUDY: COMPLETELY CODED ARRAY

1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001

[-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]

001 100101 11100110 110110 0110 11110100 000 1010

# of bits needed to store the 8x8 block = 64x8 = 512

# of bits after JPEG compression = 92

Compression ratio = 512/92 => 5.6:1

312
A CASE STUDY: DECOMPRESSION BEGINS

-26 -3 -6 2 2 0 0 0

1 -2 -4 0 0 0 0 0

-3 1 5 -1 -1 0 0 0

-4 1 2 -1 0 0 0 0

1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

313
A CASE STUDY: DENORMALIZATION

-416 -33 -60 32 48 0 0 0

12 -24 -56 0 0 0 0 0

-42 13 80 -24 -40 0 0 0

-56 17 44 -29 0 0 0 0

18 0 0 0 0 0 0 0

T̃(u,v) = T̂(u,v) Z(u,v)     0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

314
A CASE STUDY: INVERSE DCT

-70 -64 -61 -64 -69 -66 -58 -50

-72 -73 -61 -39 -30 -40 -54 -59

-68 -78 -58 -9 13 -12 -48 -64

-59 -77 -57 0 22 -13 -51 -60

-54 -75 -64 -23 -13 -44 -63 -56

-52 -71 -72 -54 -54 -71 -71 -54

-45 -59 -70 -68 -67 -67 -61 -50

-35 -47 -61 -66 -60 -48 -44 -44

315
A CASE STUDY: LEVEL SHIFTING

The quantity 2^{n−1} is added to each pixel value (n = 8 ⇒ 2^{n−1} = 128).

58  64  67  64  59  62  70  78
56  55  67  89  98  88  74  69
60  50  70 119 141 116  80  64
69  51  71 128 149 115  77  68
74  53  64 105 115  84  65  72
76  57  56  74  75  57  57  74
83  69  59  60  61  61  67  78
93  81  67  62  69  80  84  84

316
A CASE STUDY: DIFFERENCE

52 55 61 66 70 61 64 73 58 64 67 64 59 62 70 78

63 59 66 90 109 85 69 72 56 55 67 89 98 88 74 69

62 59 68 113 144 104 66 73 60 50 70 119 141 116 80 64

63 58 71 122 154 106 70 69 69 51 71 128 149 115 77 68

67 61 68 104 126 88 68 70 74 53 64 105 115 84 65 72

79 65 60 70 77 68 58 75 76 57 56 74 75 57 57 74

85 71 64 59 55 61 65 83 83 69 59 60 61 61 67 78

87 79 69 68 65 76 78 94 93 81 67 62 69 80 84 84

-6 -9 -6 2 11 -1 -6 -5

7 4 -1 1 11 -3 -5 3

2 9 -2 -6 -3 -12 -14 9

-6 7 0 -4 -5 -9 -7 1

-7 8 4 -1 11 4 3 -2

3 8 4 -4 2 11 1 1

2 2 5 -1 -6 0 -2 5

-6 -2 2 6 -4 -4 -6 10

317
JPEG 2000

‰ Major limitation of JPEG: coding performance degrades


rapidly at low bit rates (especially below 0.25 bpp).
‰ The latest series of standards from the JPEG committee.
‰ Designed to address a variety of applications including
color facsimile, printing, scanning, digital photography,
remote sensing, and medical imagery.
‰ Based on the Discrete Wavelet Transform (DWT).
‰ Addresses a number of problems including:
 Low bit rate compression
 Lossless and lossy compression
 Computer-generated imagery
 Error resilience for transmission in noisy environments
‰ Core compression algorithm: EBCOT (Embedded Block
Coding with Optimized Truncation)

318
JPEG 2000: BLOCK DIAGRAM

Image Compressed
data data

Entropy
DWT Quantization
coding

Tiling DWT on each tile

DC level
shifting

319
1-DIM DISCRETE WAVELET TRANSFORM (DWT)

Wφ(j0, k) = (1/√M) ∑_x f(x) φ_{j0,k}(x)   : approximation coefficients

Wψ(j, k) = (1/√M) ∑_x f(x) ψ_{j,k}(x), j ≥ j0   : detail coefficients

f(x) = (1/√M) ∑_k Wφ(j0, k) φ_{j0,k}(x) + (1/√M) ∑_{j=j0} ∑_k Wψ(j, k) ψ_{j,k}(x)

f(x), φ_{j0,k}(x), and ψ_{j,k}(x) are functions of the discrete variable x = 0, 1, 2, …, M−1.

Normally, j0 = 0, and M is a power of 2 (M = 2^J).

Summations are performed over x = 0, 1, 2, …, M−1; j = 0, 1, 2, …, J−1; and k = 0, 1, …, 2^j − 1.
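A round-trip sketch using the PyWavelets package (the Haar wavelet and signal here are arbitrary illustrations):

  import numpy as np
  import pywt  # PyWavelets

  f = np.arange(8, dtype=float)
  cA, cD = pywt.dwt(f, 'haar')       # approximation and detail coefficients
  rec = pywt.idwt(cA, cD, 'haar')    # perfect reconstruction
  assert np.allclose(rec, f)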

320
1-DIM FAST WAVELET TRANSFORM (FWT)

Computationally efficient implementation of DWT.

321
2-DIM DISCRETE WAVELET TRANSFORM (DWT)

ϕ ( x, y ) : 2 - dim scaling function

ψ H ( x, y ), ψ V ( x, y ), ψ D ( x, y ) : 2 - dim wavelets

ϕ ( x, y ) = ϕ ( x)ϕ ( y ) : separable scaling function

ψ H ( x, y ) = ψ ( x)ϕ ( y ) : measures variations along columns

ψ V ( x, y ) = ϕ ( x)ψ ( y ) : measures variations along rows

ψ ( x, y ) = ψ ( x)ψ ( y )
D
: measures variations along diagonals

322
2-DIM DISCRETE WAVELET TRANSFORM (DWT)

φ_{j,m,n}(x,y) = 2^{j/2} φ(2^j x − m, 2^j y − n)
ψ^i_{j,m,n}(x,y) = 2^{j/2} ψ^i(2^j x − m, 2^j y − n),   i = {H, V, D}

Wφ(j0, m, n) = (1/√(MN)) ∑_{x=0}^{M−1} ∑_{y=0}^{N−1} f(x,y) φ_{j0,m,n}(x,y)

W^i_ψ(j, m, n) = (1/√(MN)) ∑_{x=0}^{M−1} ∑_{y=0}^{N−1} f(x,y) ψ^i_{j,m,n}(x,y),   i = {H, V, D}

f(x,y) = (1/√(MN)) ∑_m ∑_n Wφ(j0, m, n) φ_{j0,m,n}(x,y)
       + (1/√(MN)) ∑_{i=H,V,D} ∑_{j=j0} ∑_m ∑_n W^i_ψ(j, m, n) ψ^i_{j,m,n}(x,y)

Normally, j0 = 0, and M = N = 2^J.

Summations are performed over x = 0, 1, 2, …, M−1; j = 0, 1, 2, …, J−1; and m, n = 0, 1, …, 2^j − 1.
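The 2-D counterpart in PyWavelets yields exactly the approximation plus H/V/D detail subbands described above (the image here is a random stand-in):

  import numpy as np
  import pywt

  img = np.random.rand(128, 128)
  cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')      # one decomposition level
  coeffs = pywt.wavedec2(img, 'haar', level=3)   # 3-level decomposition (cf. the FWT example)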

323
2-DIM FAST WAVELET TRANSFORM (FWT)

1st and 2nd level decomposition

324
A FWT EXAMPLE

128x128
computer-
generated 1st level
image decomposition

2nd level 3rd level


decomposition decomposition

325
CHAPTER 8: MORPHOLOGICAL IMAGE PROCESSING

‰ Morphology: a branch of biology that deals with the form


and structure of animals and plants.
‰ Mathematical morphology: extraction of image
components that are useful in the representation and
description of regional shape.
 Boundaries
 Skeletons
 Convex hull
‰ The language of mathematical morphology is set theory.
‰ Sets in mathematical morphology represent objects in an
image.
‰ In binary images, the sets are members of Z2.
‰ Until this chapter, the I/O of DIP methods were images.
‰ Now, we have a transition to DIP methods where I are
images but O are attributes extracted from those images.

326

BASIC CONCEPTS FROM SET THEORY

‰ A ∈ Z2.
‰ a = (a1, a2) ∈ A.
‰ Empty set: the set with no elements.
‰ w = (w1,w2) & C = {w|w = -d, d ∈ D}: C is the set of elements w
s.t. w is formed by multiplying each of the 2 coordinates of all
the elements of D by -1.
‰ A ⊆ B: A is a subset of B.
‰ C = A U B: C is the union of A and B.
‰ D = A ∩ B: D is the intersection of A and B.
‰ A ∩ B = ∅: A and B are disjoint sets.
‰ Ac = {w|w ∉ A}: The complement of A.
‰ A – B = {w|w ∈ A, w ∉ B} = A ∩ Bc : The difference of A and B.
‰ B̂ = {w|w = -b, b ∈ B}: Reflection of B.
‰ (A)z = {c|c = a + z, a ∈ A}: Translation of A by z = (z1,z2)

327
BASIC SET OPERATIONS

328
3 BASIC LOGICAL OPERATIONS

Other logical operations can be constructed using the above


definitions.

Example:

p q p XOR q

0 0 0
0 1 1
1 0 1
1 1 0

329
LOGICAL OPERATIONS BETWEEN
BINARY IMAGES

p q p AND q

0 0 0
0 1 0
1 0 0
1 1 1

330
DILATION

‰ Dilation and erosion: Operations fundamental to


morphological processing.
‰ Dilation expands an image, and erosion shrinks it.
‰ Many morphological algorithms in this chapter are based
on these 2 primitives.
‰ A ⊕ B = {z | ( Bˆ ) z ∩ A ≠ ∅}: dilation of A and B
 Obtain the reflection of B about its origin, and shift this
reflection by z.
 The set of all displacements z s.t. B̂ and A overlap by at least
one element.
 B: the structural element
‰ Alternative definitions of dilation

 A ⊕ B = {w ∈ Z 2 | w = a + b, a ∈ A, b ∈ B}

 A ⊕ B = ∪_{b∈B} (A)b

331
2 EXAMPLES OF DILATION

The structural element B and


its reflection are equal
because B is symmetric w.r.t
its origin.

332
AN APPLICATION OF DILATION:
BRIDGING GAPS

333
EROSION

‰ A ⊖ B = {z | (B)z ⊆ A}: Erosion of A by B.
 The set of all points z s.t. B, translated by z, is contained in A.
‰ Alternative definitions of erosion

 A ⊖ B = {w ∈ Z² | w + b ∈ A, ∀b ∈ B}

 A ⊖ B = ∩_{b∈B} (A)−b

‰ Dilation and erosion are duals of each other w.r.t. set


complementation and reflection:

(A ⊖ B)ᶜ = Aᶜ ⊕ B̂
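Both primitives are available in SciPy; a toy sketch on a small binary image (A and B here are arbitrary illustrations):

  import numpy as np
  from scipy.ndimage import binary_dilation, binary_erosion

  A = np.zeros((9, 9), dtype=bool)
  A[3:6, 3:6] = True                          # a small square object
  B = np.ones((3, 3), dtype=bool)             # 3x3 structuring element
  dilated = binary_dilation(A, structure=B)   # expands A
  eroded = binary_erosion(A, structure=B)     # shrinks A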

334
2 EXAMPLES OF EROSION

335
AN APPLICATION OF EROSION

In general, dilation does not fully restore eroded objects.

Erosion Dilation
Structure element: Structure element:
13X13 pixels 13X13 pixels
(all 1’s) (all 1’s)
336
OPENING & CLOSING

‰ opening and closing: 2 other important morphological


operations.
‰ Opening A by B: A ∘ B = (A ⊖ B) ⊕ B
 Erosion of A by B, followed by a dilation of the result by B.
 Generally smoothes the contour of an object
 Breaks narrow isthmuses
 Eliminates thin protrusions
‰ Closing of A by B: A • B = (A ⊕ B) ⊖ B
 Dilation of A by B, followed by an erosion of the result by B.
 Also tends to smooth sections of contours.
 Generally fuses narrow breaks and long thin gulfs
 Eliminates small holes
 Fills gaps in the contour
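SciPy composes the two primitives directly; a sketch reusing A and B from the previous example:

  from scipy.ndimage import binary_opening, binary_closing

  opened = binary_opening(A, structure=B)    # erosion followed by dilation
  closed = binary_closing(A, structure=B)    # dilation followed by erosion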

337
GEOMETRIC INTERPRETATION OF OPENING

the union
of all
translates
of B that
fit into A.

338
GEOMETRIC INTERPRETATION OF CLOSING

A point w ∈ A•B ⇔ every translate (B)z containing w satisfies (B)z ∩ A ≠ ∅

339
OPENING & CLOSING OPERATIONS

Opening of A by B (erosion followed by dilation): the bridge and the
protruding elements were eliminated. A circular structuring element
results in smoothing of parts of the object. Outward pointing corners
were rounded; inward pointing corners were not affected.

Closing of A by B (dilation followed by erosion): the size of the
intrusion was significantly reduced. Inward pointing corners were
rounded; outward pointing corners were not affected.

340
PROPERTIES SATISFIED BY
OPENING & CLOSING

‰ Opening and closing are duals of each other w.r.t. set
complementation and reflection:

(A • B)c = (Ac ∘ B̂)
‰ The opening operation satisfies the following properties:
 A ° B is a subset of A.
 If C ⊆ D, then (C ° B) ⊆ (D ° B).
 (A ° B) ° B = A ° B
‰ The closing operation satisfies the following properties:
 A is a subset of A • B.
 If C ⊆ D, then (C • B) ⊆ (D • B).
 (A • B) • B = A • B

341
A MORPHOLOGICAL FILTER:
OPENING FOLLOWED BY CLOSING

Morphological operations can be used to construct filters. In the
fingerprint image, the noise manifests itself as light elements on a
dark background and as dark elements on the light components of the
fingerprint.

The opening operation reduced the noise components in size or deleted
them completely; however, the size of the dark spots increased, and
new gaps appeared between the fingerprint ridges.

The subsequent closing restored most of the gaps, but the ridges were
thickened. The net result: the noise has been eliminated with minimal
distortion to the fingerprint.
342
HIT-OR-MISS TRANSFORMATION

Set A is the union of 3 disjoint sets. W is a small window, and
(W − X) is the local background of X w.r.t. W.

A ⊖ X: the set of all locations of the origin of X at which X found a
match (hit) in A.

Ac ⊖ (W − X): the set of all locations of the origin of (W − X) at
which (W − X) found a match (hit) in Ac.

Generalized notation: B = (B1, B2), with B1 = X and B2 = (W − X).

A ∗ B = (A ⊖ B1) ∩ (Ac ⊖ B2)

A ∗ B contains all the origin points at which B1 found a match in A
and B2 found a match in Ac.

343
SOME BASIC MORPHOLOGICAL ALGORITHMS

‰ Boundary extraction

‰ Region filling

‰ Extraction of connected components

‰ Convex hull

‰ Thinning

‰ Thickening

‰ Skeletons

‰ Pruning

344
BOUNDARY EXTRACTION

β(A) = A − (A ⊖ B)

The 3x3 structuring element of all 1's is commonly used but not
unique.

When the origin of B is on the edge of the set A, part of B may be
outside the image. How do we handle this situation? Normal treatment:
assume that values outside the borders of the image are 0.

345
AN APPLICATION OF BOUNDARY EXTRACTION

The boundary is 1 pixel thick.

Structuring element: 3x3, all 1's.

346
REGION FILLING

A: a set whose elements are the 8-connected boundary points of a
region. The objective is to fill the entire region with 1's.

X0 = p,  Xk = (Xk-1 ⊕ B) ∩ Ac,  k = 1, 2, 3, ...

The algorithm terminates at step k when Xk = Xk-1.

The union of Xk and A contains the filled region and its boundary.
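A direct SciPy sketch of this iteration (the square test boundary and
the seed location are illustrative):

import numpy as np
from scipy.ndimage import binary_dilation

def region_fill(A, seed):
    # X_k = (X_{k-1} dilated by B) intersected with A^c, until X_k = X_{k-1}
    B = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=bool)  # 4-connected cross
    Ac = ~A.astype(bool)
    X = np.zeros_like(Ac); X[seed] = True
    while True:
        X_next = binary_dilation(X, structure=B) & Ac
        if (X_next == X).all():
            return X | A.astype(bool)   # the filled region together with its boundary
        X = X_next

A = np.zeros((7, 7), dtype=int)
A[1, 1:6] = A[5, 1:6] = 1; A[1:6, 1] = A[1:6, 5] = 1   # a square 8-connected boundary
print(region_fill(A, (3, 3)).sum())                    # 25: the 5x5 region is now solid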

347
AN APPLICATION OF REGION FILLING

Such an image may result from thresholding a scene containing
polished spheres into 2 levels. We will eliminate the reflections by
region filling.

348
EXTRACTION OF CONNECTED COMPONENTS

Extraction of connected components in a binary image is central to
many automated image analysis applications.

Y is a connected component in A, and p is a known point of Y. The
shape of B assumes 8-connectivity between pixels.

X0 = p,  Xk = (Xk-1 ⊕ B) ∩ A,  k = 1, 2, 3, ...

The algorithm terminates at step k when Xk = Xk-1, and then Y = Xk.
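In practice the same result comes from a labeling routine; a sketch
using scipy.ndimage.label under 8-connectivity (the tiny test array
is illustrative):

import numpy as np
from scipy.ndimage import label

A = np.array([[1, 1, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 0]], dtype=int)
eight = np.ones((3, 3), dtype=int)        # structuring element defining 8-connectivity
labels, n = label(A, structure=eight)     # every connected component gets its own id
print(n)                                  # 2 components under 8-connectivity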

349
AN APPLICATION OF
CONNECTED COMPONENT EXTRACTION

Connected components are frequently used for automated inspection.

Analysis of the size of the remaining objects shows that 4 of the
connected components are dominant in size.

350
CONVEX HULL

‰ A set A is said to be convex if the straight line segment


joining any 2 points lies entirely within A.
‰ The convex hull H of an arbitrary set S is the smallest
convex set containing S.
‰ The set difference H-S is called the convex deficiency of
S.
‰ The convex hull and convex deficiency are useful for
object description.
‰ The procedure for obtaining the convex hull, C(A), of a set A:
 Starting with X0i = A, iteratively apply the hit-or-miss transform
with Bi. When the iteration converges, perform the union with A, and
call the result Di.
 C(A) = union of all Di's.

351
CONVEX HULL: AN EXAMPLE

Algorithm for obtaining the convex hull, C(A), of a set A:

Bi, i = 1, 2, 3, 4: the 4 structuring elements

X0i = A, i = 1, 2, 3, 4 (the initial points)

Xki = (Xk-1i ∗ Bi) ∪ A,  i = 1, 2, 3, 4, and k = 1, 2, 3, ...

Di = Xconvi, where "conv" indicates convergence: Xki = Xk-1i

C(A) = D1 ∪ D2 ∪ D3 ∪ D4

352
SHORTCOMING OF THE ALGORITHM FOR
OBTAINING THE CONVEX HULL

Shortcoming of the algorithm: the convex hull can grow beyond the
min dimensions required to guarantee convexity.

A simple approach to reduce this effect: limit growth so that it does
not extend past the vertical and horizontal dimensions of the
original set of points.

Boundaries of greater complexity can be used to limit growth even
further in images with more detail, e.g., the max dimensions of the
original set of points along the horizontal, vertical, and diagonal
directions can be used.

353
EXTENSIONS TO GRAY SCALE IMAGES

‰ The following basic operations will be extended to gray


scale images:
 Dilation
 Erosion
 Opening
 Closing
‰ These operations will be used to develop several
morphological algorithms.
 Boundary extraction
 Region partitioning
 Smoothing
 Sharpening
‰ f(x,y): input image, b(x,y): a structuring element

354
GRAY SCALE DILATION

( f ⊕ b)( s, t ) = max{ f ( s − x, t − y ) + b( x, y ) | ( s − x), (t − y ) ∈ D f ; ( x, y ) ∈ Db }


D f : domain of f
Db : domain of b

f and b are functions, and not sets.

The condition (s − x), (t − y) ∈ Df and (x, y) ∈ Db is analogous to
the binary requirement that the 2 sets overlap by at least 1 element.

The general effect of performing dilation on a gray scale image:

• If the values of b are > 0, the output image tends to be brighter.

• Dark details are either reduced or eliminated, depending on how
their values and shapes relate to the b used for dilation.

355
1-D EXAMPLE OF GRAY SCALE DILATION

( f ⊕ b)( s ) = max{ f ( s − x) + b( x) | ( s − x) ∈ D f ; x ∈ Db }

Unlike the binary case, f, not b, is shifted. Conceptually, f sliding
past b is no different from b sliding past f.

The actual mechanics of gray scale dilation are easier to visualize
if b is the function that slides past f: at each position of b, the
value of dilation at that point is the max of f + b in the interval
spanned by b.

356
GRAY SCALE EROSION

(f ⊖ b)(s, t) = min{f(s + x, t + y) − b(x, y) | (s + x), (t + y) ∈ Df; (x, y) ∈ Db}

Df: domain of f
Db: domain of b

The condition (s + x), (t + y) ∈ Df and (x, y) ∈ Db is analogous to
the binary requirement that b be completely contained in A.

The general effect of performing erosion on a gray scale image:

• If the values of b are > 0, the output image tends to be darker.

• Bright details that are smaller in area than the structuring
element are reduced, with the degree of reduction determined by the
gray level values surrounding the bright detail, and by the shape and
values of b.
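A small SciPy sketch of both operations with a flat (all-zero)
structuring element, so they reduce to a local max and a local min
(the test array is illustrative):

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

f = np.array([[3, 3, 1, 9],
              [3, 7, 1, 1],
              [2, 2, 2, 8]], dtype=float)
b = np.zeros((3, 3))                      # flat structuring element

print(grey_dilation(f, structure=b))      # local max of f + b: brightens, shrinks dark spots
print(grey_erosion(f, structure=b))       # local min of f - b: darkens, shrinks bright spots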

357
1-D EXAMPLE OF GRAY SCALE EROSION

(f ⊖ b)(s) = min{f(s + x) − b(x) | (s + x) ∈ Df; x ∈ Db}

Unlike the binary case, f, not b, is shifted. Conceptually, f sliding
past b is no different from b sliding past f.

At each position of b, the value of erosion at that point is the min
of f − b in the interval spanned by b.

(f ⊖ b)c(s, t) = (f c ⊕ b̂)(s, t)

Gray scale dilation and erosion are duals w.r.t. function
complementation and reflection, where f c = −f(x, y) and
b̂ = b(−x, −y).

358
AN EXAMPLE OF GRAY SCALE
DILATION & EROSION

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.

The dilated image is brighter, and the sizes of the dark features are
reduced. The eroded image is darker, and the sizes of the small,
bright features are reduced.

359
OPENING AND CLOSING FOR
GRAY SCALE IMAGES

f ∘ b = (f ⊖ b) ⊕ b : the opening of image f by subimage b

f • b = (f ⊕ b) ⊖ b : the closing of image f by subimage b

(f • b)c = f c ∘ b̂ : the opening and closing of gray scale images
are duals w.r.t. function complementation and reflection.

Opening and closing of images have a simple geometric interpretation.
Suppose we view an image function f(x,y) in 3D perspective; the image
appears as a discrete surface of height f at (x,y).

Opening: rolling a ball against the underside of the surface.
Closing: rolling a ball on top of the surface.

360
GEOMETRIC INTERPRETATION OF
OPENING & CLOSING

To simplify the illustration, a scan line of a gray scale image is
shown as a continuous function.

Opening: rolling a ball against the underside of the surface. All the
peaks that were narrow w.r.t. the diameter of the ball were reduced
in amplitude and sharpness.

Closing: rolling a ball on top of the surface. The peaks are left in
their original form.

361
OPENING AND CLOSING OF
A GRAY SCALE IMAGE

f ∘ b = (f ⊖ b) ⊕ b: the initial erosion removes the small details,
but it also darkens the image; the subsequent dilation increases the
overall intensity without reintroducing the details removed by
erosion. Opening is generally used to remove small light details from
an image while leaving the overall gray levels and larger bright
features relatively undisturbed.

f • b = (f ⊕ b) ⊖ b: the initial dilation removes the dark details,
but it also brightens the image; the subsequent erosion darkens the
image without reintroducing the details removed by dilation. Closing
is generally used to remove dark details from an image while leaving
bright features relatively undisturbed.

362
MORPHOLOGICAL SMOOTHING

One way to achieve smoothing is to perform a morphological opening
followed by a closing. The net result of these 2 operations is to
remove or attenuate both bright and dark artifacts or noise.

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.

363
MORPHOLOGICAL GRADIENT

g = (f ⊕ b) − (f ⊖ b)    (dilation minus erosion)

The morphological gradient highlights sharp gray level transitions in
the input image.

The structuring element: a flat-top parallelepiped of unit height,
5x5 pixels.
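A one-function sketch of g = (f ⊕ b) − (f ⊖ b) with a flat
structuring element (the step-edge test image is illustrative):

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def morph_gradient(f, size=(3, 3)):
    # dilation minus erosion, flat structuring element of the given size
    return grey_dilation(f, size=size) - grey_erosion(f, size=size)

f = np.zeros((5, 5)); f[:, 2:] = 100.0    # a vertical step edge
print(morph_gradient(f)[2])               # [0. 100. 100. 0. 0.]: response only at the edge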

364
TOP-HAT TRANSFORMATION

h = f − (f ∘ b)

Note the enhancement of detail in the background region below the
lower part of the horse's head.

365
TEXTURAL SEGMENTATION

Closing tends to remove dark details from an image.

Procedure:

1. Close the image by using successively larger structuring elements.
2. When the size of the structuring element corresponds to that of the small blobs, the small blobs are
removed from the image, leaving a light background in the area previously occupied by them.
3. At this point, only the larger blobs and the light background on the left and between the large blobs
themselves remain.
4. A single opening is performed with a structuring element that is large in relation to the separation between
the large blobs.
5. This operation removes the light patches between the large blobs, leaving a large region on the right
consisting of the large dark blobs and the now equally dark patches between these blobs.
6. The process has produced a light region on the left and a dark region on the right.
7. A simple threshold yields the boundary between the 2 textural regions, which can be superimposed
on the original image.

366
GRANULOMETRY

Granulometry: a field that deals principally with determining the size distribution of particles in an image.

Procedure:

1. Opening operations with structuring elements of increasing size are performed on the original image.
2. The difference between the original image and its opening is computed after each pass with a
different structuring element.
3. At the end of the process, these differences are normalized, and then used to construct a histogram of
the particle size distribution.

The approach is based on the idea that opening operations of a particular size have the most effect on
regions of the input image that contain particles of similar size. Thus, a measure of the relative number of
such particles is obtained by computing the difference between the input and output images.

In the example, the image contains light objects of 3 different sizes. The objects are overlapping and
too cluttered to enable detection of individual particles. The resulting histogram indicates the presence
of 3 predominant particle sizes.

367
CHAPTER 9: IMAGE SEGMENTATION

‰ Segmentation subdivides an image into its constituent


regions or objects.
‰ Image segmentation algorithms are generally based on
one of 2 basic properties of intensity values:
 Discontinuity
 Similarity
‰ Algorithms based on discontinuity: the approach is to
partition an image based on abrupt changes in intensity
(e.g., edges).
‰ Algorithms based on similarity: the approach is to
partition an image into regions that are similar according
to a predefined set of criteria.
‰ Segmentation bridges the gap between low-level image
processing and high-level image processing.

368
DETECTION OF DISCONTINUITIES

‰ 3 basic types of gray-level discontinuities:


 Points
 Lines
 Edges
‰ The most common way to look for discontinuities is to use
a mask.

A 3x3 mask with coefficients w1, …, w9 is overlaid on a 3x3 region of
the image with gray levels z1, …, z9:

w1 w2 w3        z1 z2 z3
w4 w5 w6        z4 z5 z6
w7 w8 w9        z7 z8 z9

‰ Response of the mask at any point in the image: R = Σ wi zi, where
zi is the gray level of the pixel associated with mask coefficient wi.

369
POINT DETECTION

A point is detected at the location on which the mask is centered if |R| ≥ T, T > 0.

The sum of the mask coefficients is 0.

Example: a single black pixel embedded within the porosity is found
by applying the mask and thresholding with T = 90% of the highest
absolute pixel value in the mask response image.
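A sketch of the whole slide in SciPy: correlate the image with the
point mask, then threshold |R| at 90% of its maximum (the synthetic
image is illustrative):

import numpy as np
from scipy.ndimage import correlate

mask = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]], dtype=float)   # point mask; coefficients sum to 0

img = np.zeros((7, 7)); img[3, 3] = 10.0       # an isolated bright point
R = correlate(img, mask)                       # R = sum of w_i * z_i at each pixel
T = 0.9 * np.abs(R).max()                      # T = 90% of the highest |R|
print(np.argwhere(np.abs(R) >= T))             # [[3 3]]: the point is detected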
370
LINE DETECTION

Mask responses

R1 R2 R3 R4

At a certain point, |Ri| > |Rj|, for all j ≠ i: the point is more
likely associated with a line in the direction of mask i.

371
AN EXAMPLE OF LINE DETECTION

We are interested in finding all the lines that are 1 pixel thick and
are oriented at -45°.

The absolute value of the mask response is thresholded with T = its
max value.

372
EDGE DETECTION

‰ Edge detection is the most common approach for


detecting meaningful discontinuities in gray level.
‰ Intuitively, an edge is a set of connected pixels that lie on
the boundary of 2 regions.
‰ We will use first and second digital derivatives for edge
detection in images.
‰ Difference between an edge and a boundary:
 Edge is local.
 Boundary is global.
‰ In practice, edges are blurred.
 The degree of blurring is determined by a number of factors.
„ Quality of the acquisition system
„ Sampling rate
„ Illumination conditions
‰ Hence, edges are modeled as having a ramplike profile.

373
MODELING OF AN EDGE

The thickness of the edge is determined by the length of the ramp.
Sharp edges tend to be thin; blurred edges tend to be thick.

374
DETAIL NEAR AN EDGE

First derivatives: the magnitude can be used to detect the presence
of an edge at a point.

Second derivatives: the sign can be used to determine whether an edge
pixel lies on the dark or light side of an edge. The second
derivative produces 2 values for every edge (an undesirable feature),
and its zero crossing occurs near the midpoint of the edge.
375
SENSITIVITY OF DERIVATIVES TO NOISE

Image segments (first column) with their 1st derivatives (second
column) and 2nd derivatives (third column):

Noise free
Gaussian noise: mean = 0, σ = 0.1
Gaussian noise: mean = 0, σ = 1.0
Gaussian noise: mean = 0, σ = 10.0

Although the noise is almost invisible in the images, both
derivatives are sensitive to it.

376
DEFINITION OF AN EDGE POINT

‰ To be classified as a meaningful edge point, the transition


in gray level associated with that point has to be
significantly stronger than the background at that point.
‰ Two definitions of edge point
 Its 1st order derivative > T.
 Zero crossing of its 2nd order derivative.
‰ Edge: a set of edge points that are connected according
to a predefined criterion of connectedness.
‰ Edge segment: the edge is short relative to the
dimensions of the image.
‰ The edge definitions do NOT guarantee success in finding
edges in an image. They simply give us a formalism to
look for them.

377
THE GRADIENT

∇f = [Gx^2 + Gy^2]^(1/2)   (not preferred: more expensive to compute)

∇f ≈ |Gx| + |Gy|           (more attractive)

Direction: α(x,y) = tan^-1(Gy/Gx)

2x2 filters (Roberts): Gx = (z9 − z5), Gy = (z8 − z6)

3x3 filters (Prewitt): Gx = (z7+z8+z9) − (z1+z2+z3), Gy = (z3+z6+z9) − (z1+z4+z7)

3x3 filters (Sobel): Gx = (z7+2z8+z9) − (z1+2z2+z3), Gy = (z3+2z6+z9) − (z1+2z4+z7)
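A sketch of the Sobel filters exactly as written above, using
correlation so the masks map one-to-one onto the z1..z9 sums (the
gradient magnitude uses the cheaper |Gx| + |Gy| form):

import numpy as np
from scipy.ndimage import correlate

Kx = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], float)   # Gx = (z7+2z8+z9) - (z1+2z2+z3)
Ky = np.array([[-1,  0,  1],
               [-2,  0,  2],
               [-1,  0,  1]], float)   # Gy = (z3+2z6+z9) - (z1+2z4+z7)

def sobel_edges(f):
    Gx, Gy = correlate(f, Kx), correlate(f, Ky)
    mag = np.abs(Gx) + np.abs(Gy)      # approximate gradient magnitude
    alpha = np.arctan2(Gy, Gx)         # edge direction alpha(x, y)
    return mag, alpha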

378
DIAGONAL EDGE DETECTION

379
AN EXAMPLE OF HORIZONTAL & VERTICAL
EDGE DETECTION

1200x1600 pixels. One gradient component shows the roof tile,
horizontal brick joints, and horizontal segments of the windows; the
other shows the corner near the wall, the lamp post, vertical brick
joints, and vertical segments of the windows. The directionality of
the 2 components is evident.

380
PRINCIPAL EDGE DETECTION

The contribution made to image detail by the wall bricks is
significant. This level of detail is often not desirable; one way to
reduce it is to smooth the image.

After smoothing, the response of each mask shows almost no
contribution due to the bricks, and the principal edges dominate the
result. Note that averaging caused the response of all edges to be
weaker.

381
EMPHASIS ON DIAGONAL EDGE DETECTION

If it is important to emphasize the diagonal edges, a diagonal mask
can be used. The stronger diagonal response is evident, and the
response in the horizontal and vertical directions is weaker.

382
THE LAPLACIAN

z1 z2 z3
z4 z5 z6    3x3 region of the image
z7 z8 z9

 0 -1  0
-1  4 -1    ∇2f = 4z5 − (z2 + z4 + z6 + z8)
 0 -1  0

-1 -1 -1
-1  8 -1    ∇2f = 8z5 − (z1 + z2 + z3 + z4 + z6 + z7 + z8 + z9)
-1 -1 -1    (diagonal neighbors included)

383
ROLE OF THE LAPLACIAN IN SEGMENTATION

‰ In general, the Laplacian is not used in its original form for


edge detection for several reasons:
 It is unacceptably sensitive to noise.
 Its magnitude produces double edges (an undesirable effect).
 It is unable to detect edge direction.
‰ Hence, the role of the Laplacian in segmentation consists of:
 Using its zero-crossing property for edge location
 Using it to check whether a pixel is on the dark or light side of an
edge
‰ Consider the smoothing function h(r) = −exp(−r^2/2σ^2), where
r^2 = x^2 + y^2 and σ is the standard deviation.

‰ The Laplacian of h (the Laplacian of a Gaussian):

∇2h(r) = −[(r^2 − σ^2)/σ^4] exp(−r^2/2σ^2)

384
THE LAPLACIAN OF A GAUSSIAN (LoG)

The second derivative is a linear operation: convolving an image with ∇2h is the
same as convolving the image with h first and then computing the Laplacian of the
result.

A mask that
approximates ∇2h.

The coefficients sum to zero.
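A sketch of LoG edge finding along these lines: gaussian_laplace
fuses the Gaussian smoothing and the Laplacian, and sign changes
between neighbors mark the zero crossings (sigma and the neighbor
test are illustrative choices):

import numpy as np
from scipy.ndimage import gaussian_laplace

def log_zero_crossings(f, sigma=2.0):
    g = gaussian_laplace(f.astype(float), sigma)   # LoG response
    s = np.sign(g)
    zc = np.zeros(g.shape, dtype=bool)
    zc[:-1, :] |= (s[:-1, :] * s[1:, :]) < 0       # vertical sign change
    zc[:, :-1] |= (s[:, :-1] * s[:, 1:]) < 0       # horizontal sign change
    return zc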

385
COMPARISON OF 2 APPROACHES
FOR EDGE FINDING

Comparison between the Sobel gradient (b) and the zero-crossings (g)
of the LoG, obtained via a spatial smoothing function and a Laplacian
mask, followed by thresholding of the LoG response:

1. Edges in the zero-crossing image are thinner.
2. The edges determined by zero-crossings form closed loops (a
serious drawback of the method).
3. The computation of zero-crossings presents a challenge in general.
More sophisticated techniques are often required to obtain acceptable
results.
4. Zero-crossing methods are of interest because of their noise
reduction capabilities and potential for rugged performance.
5. Because of the noted limitations, gradient-based edge finding
techniques are still used more frequently.
386
EDGE LINKING & BOUNDARY DETECTION

‰ The edge detection methods normally yield pixels lying only


on edges.
‰ In practice, this set of pixels seldom characterizes an edge
completely for several reasons:
 Noise
 Breaks in the edge from non-uniform illumination
 Other reasons that introduce spurious intensity discontinuities.
‰ Edge detection algorithms are followed by linking
procedures to assemble edge pixels into meaningful edges.
‰ We’ll look at several basic approaches.
 Local processing
 Global processing via the Hough Transform
 Global processing via graph-theoretic techniques

387
LOCAL PROCESSING

‰ A simple approach for linking edge points is to analyze the


characteristics of pixels in a small neighborhood (3x3, 5x5)
about every point (x,y) labeled an edge point.
‰ All points that are similar according to a predefined set of
criteria are linked.
‰ This forms an edge of pixels that share the specified criteria.
‰ 2 principal properties for establishing similarity:
 The strength of the response of the gradient operator
 The direction of the gradient operator
‰ An edge pixel at (x0,y0) in the predefined neighborhood of (x,y)
is similar in magnitude to the pixel at (x,y) if
| ∇f(x,y) - ∇f(x0,y0)| ≤ E, E >0.
‰ An edge pixel at (x0,y0) in the predefined neighborhood of (x,y)
has an angle similar to the pixel at (x,y) if
|α(x,y) - α(x0,y0)| ≤ A, A >0.
‰ The point at (x0,y0) is linked to the pixel at (x,y) if both criteria
are satisfied.
388
AN EXAMPLE OF LOCAL PROCESSING

The objective is to find rectangles whose sizes make them candidates
for license plates (in the US, the width-to-height ratio is 2:1).

Gx and Gy are computed, and edge points are linked with E = 25 and
A = 15. The license plate is one of the few rectangles detected.
389
THRESHOLDING

Image thresholding has intuitive properties and a simple
implementation, so it enjoys a central position in applications of
image segmentation.

We now introduce thresholding in a more formal way.

Single threshold (an image composed of light objects on a dark
background):
f(x,y) > T: an object point
f(x,y) ≤ T: a background point

Multiple thresholds (an image composed of 2 types of light objects on
a dark background):
T1 < f(x,y) ≤ T2: (x,y) ∈ object class 1
f(x,y) > T2: (x,y) ∈ object class 2
f(x,y) ≤ T1: (x,y) ∈ the background

In general, segmentation problems requiring multiple thresholds are
best solved using region growing methods.

390
THRESHOLDING FUNCTION

‰ Thresholding may be viewed as an operation that involves tests
against a function T of the form

T = T[x, y, p(x,y), f(x,y)]

where p(x,y) denotes some local property of (x,y), e.g., the average
gray level of a neighborhood centered at (x,y).

‰ A thresholded image is defined as

g(x,y) = 1 if f(x,y) > T, and g(x,y) = 0 if f(x,y) ≤ T
‰ T depends only on f: the threshold is global.
‰ T depends on both f and p: the threshold is local.
‰ T depends on f, p, and (x,y): the threshold is dynamic or
adaptive.

391
THE ROLE OF ILLUMINATION

Simple image formation model: f(x,y) = i(x,y) r(x,y), where i is the
illumination (0 < i(x,y) < ∞) and r is the reflectance
(0 < r(x,y) < 1).

The histogram of r(x,y) is easy to segment. Under nonuniform
illumination i(x,y), the histogram of the product i(x,y)r(x,y) can be
difficult to segment.
392
BASIC GLOBAL THRESHOLDING

The chosen threshold T achieved a clean segmentation by eliminating
the shadows and leaving only the objects themselves.

This type of global thresholding is expected to be successful in
highly controlled environments. For example, in industrial
applications, illumination can be controlled.

393
AUTOMATIC DETERMINATION OF T

‰ In the previous example, the value of T was chosen by a


visual inspection of the histogram.
‰ Algorithm to determine T automatically
1. Select an initial estimate for T.
2. Segment the image using T.
„ G1: all pixel values > T
„ G2: all pixel values ≤ T
3. Compute µ 1 and µ 2 in regions G1 and G2.
4. Compute a new threshold value: T = 0.5(µ1+ µ 2)
5. Repeat steps (2)-(4) until ∆T ≤ T0.
‰ Selecting the initial estimate of T (step 1)
 Background and objects occupy comparable areas: use the average
gray level of the image.
 Objects are small compared to the area occupied by the background
(or vice versa): use a value midway between the max and min gray
levels.
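A sketch of the iteration (here T0 is the convergence tolerance of
step 5, and the bimodal test data are synthetic):

import numpy as np

def auto_threshold(f, T0=0.5):
    T = f.mean()                          # initial estimate: average gray level
    while True:
        mu1, mu2 = f[f > T].mean(), f[f <= T].mean()   # means of G1 and G2
        T_new = 0.5 * (mu1 + mu2)
        if abs(T_new - T) <= T0:          # stop when the change in T is small
            return T_new
        T = T_new

rng = np.random.default_rng(0)
f = np.concatenate([rng.normal(60, 10, 500), rng.normal(180, 10, 500)])
print(round(auto_threshold(f), 1))        # lands near 120, between the two modes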

394
AN EXAMPLE OF AUTOMATIC
DETERMINATION OF T

Note the clear valley in the histogram. With T0 = 0, the algorithm
converged after 3 iterations to T = 125.4; T = 125 was used to
segment the image.

395
BASIC ADAPTIVE THRESHOLDING

‰ Imaging factors (uneven illumination, etc.) can transform


a perfectly segmentable histogram into a histogram that
cannot be partitioned effectively by a single global
threshold.
‰ How do we handle such situations?
‰ One approach is to divide the image into subimages, and
utilize a different threshold for each subimage.
‰ Key issues
 How do we subdivide the image?
 How do we estimate the threshold for each subimage?
‰ The threshold used for each pixel depends on the
location of the pixel.
‰ Hence, this type of thresholding is adaptive.

396
AN EXAMPLE OF BASIC
ADAPTIVE THRESHOLDING

The image is divided into 16 subimages. The global threshold is
manually placed in the valley of the histogram.

Subimages not containing a boundary: σ2 < 75. Subimages containing a
boundary: σ2 > 100.

For each subimage with σ2 > 100, T is automatically determined, with
T0 midway between the min and max gray levels. All subimages with
σ2 < 100 are treated as one composite image, and a single T is
estimated for them using the same algorithm.
397
SUBDIVISION OF A PROBLEM SUBIMAGE

The problem subimage is subdivided further: one part has a clearly
bimodal histogram, while the other is almost unimodal.

398
THRESHOLDS BASED ON SEVERAL VARIABLES

‰ Multispectral thresholding: generalization of gray level


thresholding
‰ In color images, each pixel is characterized by 3 RGB
values.
‰ 3D histogram:
 An image with 3 RGB components, each having 16 possible
levels.
 16x16x16 cube
 Each cell of the cube contains the # of pixels whose RGB
components have values corresponding to the coordinates
defining the location of the cell.
 Each entry is divided by the total # of pixels in the image to
form a normalized histogram.
‰ Color segmentation can be based on any color model.
 HSI model: segmentation using the H and S components is
attractive because it involves 2D data clusters.

399
AN EXAMPLE OF COLOR SEGMENTATION

The color image has 3 16-level RGB components. One result is obtained
by thresholding about the histogram cluster corresponding to facial
tones; the other by thresholding about a cluster close to the red
axis.

T: a distance of one cell. Pixels with components outside the cell
are shown black; pixels with components inside the cell are shown
white.
400
REGION-BASED SEGMENTATION

‰ The objective of segmentation is to partition an image into


regions.
‰ Previous approaches
 Finding boundaries between regions based on discontinuities in
gray levels
 Thresholding based on the distribution of pixel properties
‰ We will now discuss techniques that are based on finding the
regions directly.
‰ Basic formulation
 R: the entire image region; n subregions: R1, R2, …, Rn
 ∪ Ri = R
 Ri is a connected region, i = 1, 2, …, n
 Ri ∩ Rj = ∅ for all i and j, i ≠ j
 P(Ri) = TRUE, i = 1, 2, …, n, where P(Ri) is a logical predicate
defined over the points in Ri
 P(Ri ∪ Rj) = FALSE for i ≠ j

401
REGION GROWING

‰ Region growing: a procedure that groups pixels or


subregions into larger regions based on predefined criteria.
‰ Basic approach
 Start with a set of “seed” points
 Grow regions by appending to each seed those neighboring
pixels with similar properties
‰ Selection of the seed points
 Often depends on the nature of the problem
 No information is available:
„ Compute at every pixel the same set of properties that will be used to
assign pixels to regions during the growing process.
„ If the result of the computations shows clusters of values, the pixels
whose properties place them near the centroid of these clusters can
be used as seeds.
‰ Selection of similarity criteria depends on (1) problem under
consideration, and (2) type of image data available.
‰ Connectivity or adjacency information is often needed.
‰ Formulation of a stopping rule
402
AN EXAMPLE OF REGION GROWING

The weld image shows several cracks and porosities; the histogram of
(a) guides the choices below.

Selection of initial seeds: all pixels with a value of 255.

Selection of criteria for region growing:

1. |any pixel − seed| < 65

2. To be included in a region, the pixel has to be 8-connected to at
least one pixel in that region.

No stopping rules were needed: the criteria for region growing were
sufficient to isolate the features of interest.
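A compact sketch of these two criteria (seeding at 255, tolerance 65,
8-connectivity), implemented by labeling the candidate mask and
keeping the components that contain a seed; the tiny test image is
illustrative:

import numpy as np
from scipy.ndimage import label

def region_grow(f, seed_value=255, tol=65):
    candidates = np.abs(f.astype(int) - seed_value) < tol   # criterion 1
    labels, _ = label(candidates, structure=np.ones((3, 3), int))  # 8-connectivity
    seed_labels = np.unique(labels[f == seed_value])
    return np.isin(labels, seed_labels[seed_labels > 0])    # criterion 2

f = np.zeros((6, 6), dtype=int)
f[1:3, 1:3] = 255          # seed pixels
f[3, 3] = 200              # similar and 8-connected: joins the region
f[5, 5] = 200              # similar but disconnected: excluded
print(region_grow(f)[3, 3], region_grow(f)[5, 5])   # True False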
403
REGION SPLITTING AND MERGING

‰ The previous method grows regions from a set of seed


points.
‰ An alternative is to subdivide an image initially into a set of
arbitrary, disjoint regions, and then merge and/or split the regions
in an attempt to satisfy the conditions stated earlier.
‰ Split/merge algorithm
 R: the entire region, P: the selected predicate
 If P(R) = FALSE, divide the image into 4 quadrants.
 At any step
„ Split into 4 disjoint quadrants any region Ri for which
P(Ri ) = FALSE.
„ Merge any adjacent regions Rj and Rk for which
P(RjURk) = TRUE.
„ Stop when no further merging or splitting is possible.
‰ Several variations of the algorithm are possible.

404
QUADTREE REPRESENTATION

A quadtree: a tree in which each node has exactly 4 descendants. The
root corresponds to the entire image.

405
AN EXAMPLE OF REGION
SPLITTING AND MERGING

P(Ri) = TRUE if at least 80% of the pixels in Ri have the property
|zj − mi| ≤ 2σi, where zj is the gray level of the jth pixel in Ri
and mi is the mean gray level of Ri.

If P(Ri) = TRUE, the values of all pixels in Ri are set to mi.

The split-and-merge segmentation is perfect. With simple thresholding
(T placed midway between the two principal peaks of the histogram),
the shading and the stem were erroneously eliminated.
406
SEGMENTATION BY MORPHOLOGICAL
WATERSHEDS

‰ We have discussed segmentation based on 3 principal concepts:


 Detection of discontinuities
 Thresholding
 Region processing
‰ Each approach has advantages and disadvantages
 Speed is an advantage of global thresholding
 The need for postprocessing is a disadvantage in methods based on detecting
discontinuities in gray levels
‰ Segmentation by watersheds embodies many of the concepts of the other 3
approaches, and often produces more stable segmentation results.
‰ Topographic and hydrology concepts have proven useful in the development
of region segmentation methods.
‰ A monochrome image is considered to be an altitude surface.
 High-amplitude pixels correspond to ridge points.
 Low-amplitude pixels correspond to valley points.
‰ If a drop of water were to fall on any point of the altitude surface, it would
move to a lower altitude until it reached a local minimum.
‰ The accumulation of water in the vicinity of a local minimum is called a
catchment basin.
‰ All points that drain into a common catchment basin are part of the same
watershed.

407
CHAPTER 10: DIGITAL IMAGE WATERMARKING

‰ Data Hiding: process of embedding data


(watermark) into multimedia such as image, video,
and audio.
 Invisibility: degree of distortion introduced by the
watermark.
 Robustness: resistance against attacks and normal
A/V processes.
„ noise, filtering, resampling, scaling, rotation, cropping
„ lossy compression
„ A-D-A & D-A-D conversions
 Capacity: amount of data that can be represented by
the embedded watermark
‰ Typical use of watermarks
 identification of the origin of content
 tracing illegally distributed copies of the content
 disabling unauthorized access to the content
408
WATERMARKING SYSTEM

The watermark W is embedded into the image I, under control of the
key k, to produce the watermarked image Iw. After distribution and a
possible attack, the received image Iw* is the input to detection or
extraction, which also uses the key k.

409
CLASSIFICATION

‰ Domain type
 Pixel: pixel values are modified to embed the watermark.
 Transform: transform coefficients are modified to embed the
watermark. Recent popular transforms are the Discrete Cosine
Transform (DCT), the Discrete Wavelet Transform (DWT), and the
Discrete Fourier Transform (DFT).
‰ Watermark type
 Pseudo random number (PRN) sequence (having a normal distribution
with zero mean and unit variance): allows the detector to
statistically check the presence or absence of a watermark. A PRN
sequence is generated by feeding the generator with a secret seed.
 Visual watermark: the watermark is actually reconstructed, and its
visual quality is evaluated.
‰ Information type (what extraction requires)
 Non-blind: both the original image and the secret key(s).
 Semi-blind: the watermark and the secret key(s).
 Blind: only the secret key(s).

410
POPULAR TRANSFORMS

‰ In transform coding, the signal is mapped from the


spatial domain into the transform domain.
‰ Each transform
 decorrelates the pixels
 results in energy compaction
 concentrates the energy in the low-frequency zone of
the transform domain.
‰ Discrete Cosine Transform (DCT)
 Used in JPEG
‰ Discrete Wavelet Transform (DWT)
 Used in JPEG 2000
‰ Discrete Fourier Transform (DFT)
‰ DCT, DWT & DFT are commonly used in digital
watermarking.

411
DCT DOMAIN WATERMARKING

‰ Watermark embedding
 W: watermark to be embedded.
 X: sequence of pixel values.
 XD and YD: row-concatenated DCT coefficients of X and Y.
 a = scaling factor: determines the intensity of the watermark.

YD(i) = XD(i)(1 + aW(i))

‰ Watermark extraction
 W*: extracted version of the watermark.
 ZD: DCT coefficients of the possibly forged watermarked image.

W*(i) = (1/a)(ZD(i)/XD(i) − 1)  ==>  S(W, W*) = (W · W*)/sqrt(W* · W*)

 T = user-defined threshold.
 If S > T, the image is authentic.
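A 1-D sketch of the scheme with SciPy's DCT (the random pixel
sequence, the orthonormal DCT, and the seed are illustrative; the
similarity is the usual normalized correlation):

import numpy as np
from scipy.fft import dct, idct

def embed(x, w, a=0.1):
    XD = dct(x, norm='ortho')
    return idct(XD * (1 + a * w), norm='ortho')   # Y_D(i) = X_D(i)(1 + a W(i))

def extract(z, x, a=0.1):
    ZD, XD = dct(z, norm='ortho'), dct(x, norm='ortho')
    return (ZD / XD - 1) / a                      # W*(i) = (Z_D/X_D - 1)/a

rng = np.random.default_rng(1)
x = rng.uniform(50, 200, 256)                     # stand-in pixel sequence
w = rng.standard_normal(256)                      # PRN watermark, N(0, 1)
w_star = extract(embed(x, w), x)
S = w @ w_star / np.sqrt(w_star @ w_star)         # compare S against the threshold T
print(round(float(S), 2))                         # about sqrt(256) = 16 here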

412
DCT DOMAIN WATERMARKING

S>T

413
SCALING FACTOR a = 0.1, 0.5, 1.0, 5.0

Original image

a = 0.1 a = 0.5 a = 1.0 a = 5.0

414
CONCLUSIONS

‰ The scaling factor a is a critical system parameter.


 If a is too small, the image is not distorted but the robustness
of the scheme is low.
 If a is too large, the image is distorted but the robustness of
the scheme is high.
‰ Modification of low-frequency coefficients
 Distorts the image
 Gives the hacker a clue about where the watermark is
embedded
‰ Modification of high-frequency coefficients
 No distortion
 The watermark cannot be detected after attacks like JPEG
compression.

415
DWT DOMAIN WATERMARKING

‰ Each level decomposition produces 4 bands.


 LL, HL, LH, HH
‰ The LL band can further be decomposed for another level
of decomposition.
‰ This process is continued until the desired number of
levels determined by the application is reached.
‰ Depending on the watermarking scheme, some of the
DWT coefficients are modified to embed the watermark.

DWT decomposition with 2 levels: the first level produces the HL1,
LH1, and HH1 bands, and decomposing the remaining LL1 band again
produces LL2, HL2, LH2, and HH2.

416
MULTIPLE WATERMARKING
IN THE DWT DOMAIN - FIRST PAPER

‰ A recent paper on DWT-based multiple watermarking


 R. Mehul and R. Priti, “Discrete Wavelet Transform
Based Multiple Watermarking Scheme,” Proceedings
of IEEE Region 10 Technical Conference on Convergent
Technologies for the Asia-Pacific, Bangalore, India,
October 14-17, 2003.
 Embedding in low frequencies
„ robust with respect to attacks that have low pass
characteristics (filtering, lossy compression)
 Embedding in middle and high frequencies
„ robust with respect to nonlinear deformations of the gray
scale (contrast/brightness adjustment, gamma correction,
and histogram equalization)
 Experiments
„ Binary watermark “CO” is embedded in the second level LL
„ Binary watermark “EP” is embedded in the second level HH
 Idea is good but implementation has a serious flaw!
417
MULTIPLE WATERMARKING IN THE DWT
DOMAIN - GENERALIZATION

‰ P. Tao and A. M. Eskicioglu, “A Robust Multiple


Watermarking Scheme in the DWT Domain,” Optics East
2004 Symposium, Internet Multimedia Management
Systems V Conference, Philadelphia, PA, October 25-28,
2004.
‰ A generalization of Mehul & Priti paper
‰ Where is the watermark embedded?
 First level decomposition
 Second level decomposition
‰ What is the visual watermark?
 BC (embedded in all 4 quadrants)
 A (for rewatermarking)
‰ Comparison of extraction from first and second
decomposition levels

418
ALGORITHM

‰ Input: cover image I = (aij, 0 ≤ i,j ≤ 2n) and binary visual
watermark W = (wij ∈ {0,1}, 0 ≤ i,j ≤ n).
‰ Embedding:
 Using the two-dimensional separable dyadic DWT, obtain the desired
level of decomposition of the cover image I.
 Modify the DWT coefficients Vij^k in the LL, HL, LH, and HH bands:
Vw,ij^k = Vij^k + αk Wij, i,j = 1, …, n, and k = 1, 2, 3, 4.
 Apply the inverse DWT to obtain the watermarked cover image Iw.
‰ Output: watermarked cover image Iw.

‰ Input: watermarked (and possibly attacked) cover image Iw*.
‰ Extraction:
 Using the two-dimensional separable dyadic DWT, obtain the desired
level of decomposition of Iw*.
 Extract the binary visual watermark from the LL, HL, LH, and HH
bands: Wij* = (Vw,ij*^k − Vij^k)/αk, i,j = 1, …, n, and k = 1, 2, 3, 4.
 If Wij* > 0.5 then Wij* = 1, else Wij* = 0.
‰ Output: binary visual watermark.
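A one-level sketch of the algorithm with PyWavelets (the Haar
wavelet, a single α for all bands, and the toy sizes are assumptions;
pywt.dwt2 returns the approximation band plus three detail bands):

import numpy as np
import pywt

def embed(I, W, alpha=2.0):
    cA, (cH, cV, cD) = pywt.dwt2(I.astype(float), 'haar')
    marked = [V + alpha * W for V in (cA, cH, cV, cD)]   # V_w = V + alpha * W per band
    return pywt.idwt2((marked[0], tuple(marked[1:])), 'haar')

def extract(Iw, I, alpha=2.0):
    o = pywt.dwt2(I.astype(float), 'haar'); m = pywt.dwt2(Iw.astype(float), 'haar')
    pairs = zip((m[0], *m[1]), (o[0], *o[1]))
    return [((mb - ob) / alpha > 0.5).astype(int) for mb, ob in pairs]  # binarize W*

rng = np.random.default_rng(0)
I = rng.uniform(0, 255, (8, 8))                  # stand-in cover image (2n x 2n)
W = rng.integers(0, 2, (4, 4))                   # n x n binary visual watermark
print(all((w == W).all() for w in extract(embed(I, W), I)))  # True before any attack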
419
COVER IMAGE AND VISUAL WATERMARKS

(a) Cover image Goldhill
(b) Watermarked Goldhill (PSNR = 42.40)
(c) Watermark 1, used in 13 attacks
(d) Watermark 2, used in the rewatermarking and collusion attacks
(e)-(h) The watermarks extracted from the LL, HL, LH, and HH bands,
respectively, before the attacks; the SR values shown are 1.000,
0.977, 0.985, and 0.986. The number below each watermark is the
Similarity Ratio (SR) value.

SR = S/(S+D)
S: # of matching pixel values in the compared images
D: # of different pixel values in the compared images

420
ATTACKS – 1ST LEVEL

Attacked images and PSNR (dB), 1st level:

JPEG 75 (Matlab): 34.96; JPEG 50 (Matlab): 33.11; JPEG 25 (Matlab): 31.27;
Blur (3,3) (Matlab): 29.63; Gaussian Noise [0 0.001] (Matlab): 29.74;
Rescale 512 → 256 → 512 (Matlab): 19.81;
Histogram Equalization (Matlab): 17.42; Intensity Adj. ([0 0.8],[0 1]) (Matlab): 18.87;
Gamma Correction 1.5 (Matlab): 17.90;
Rotate 20° (Matlab): 11.44; Crop (Matlab): 11.88; Pixelate 2 (mosaic) (Photoshop): 30.13;
Sharpen (Photoshop): 31.21; Rewatermark (Matlab): 38.51; Collusion (Matlab): 45.35
421
RECOVERED WATERMARKS – 1ST LEVEL

Recovered watermarks after each attack (1st level): JPEG Quality
75/50/25, Blur (3,3), Gaussian Noise [0 0.001], Histogram
Equalization, Intensity Adj. ([0 0.8],[0 1]), Gamma correction 1.5,
Rotate 20° (Matlab), Rescale 512 → 256 → 512, Crop, Pixelate 2
(mosaic), Sharpen (Photoshop), Rewatermark, Collusion.
422
ATTACKS – 2nd LEVEL

Attacked images and PSNR (dB), 2nd level:

JPEG 75 (Matlab): 34.85; JPEG 50 (Matlab): 33.04; JPEG 25 (Matlab): 31.22;
Blur (3,3) (Matlab): 29.60; Gaussian Noise [0 0.001] (Matlab): 29.73;
Rescale 512 → 256 → 512 (Matlab): 19.80;
Histogram Equalization (Matlab): 17.42; Intensity Adj. ([0 0.8],[0 1]) (Matlab): 18.88;
Gamma Correction 1.5 (Matlab): 17.89;
Rotate 20° (Matlab): 11.45; Crop (Matlab): 11.88; Pixelate 2 (mosaic) (Photoshop): 30.08;
Sharpen (Photoshop): 31.41; Rewatermark (Matlab): 37.76; Collusion (Matlab): 44.08

423
RECOVERED WATERMARKS – 2nd LEVEL

Recovered watermarks after each attack (2nd level): JPEG Quality
75/50/25, Blur (3,3), Gaussian Noise [0 0.001], Histogram
Equalization, Intensity Adj. ([0 0.8],[0 1]), Gamma correction 1.5,
Rotate 20°, Rescale 512 → 256 → 512, Crop, Pixelate 2 (mosaic),
Sharpen, Rewatermark, Collusion.

424
CONCLUSIONS

‰ Watermarks embedded in the lowest frequencies are more robust


against one group of attacks.
 JPEG compression, blurring, adding Gaussian noise, rescaling, rotation,
cropping, pixelation, and sharpening
‰ Watermarks embedded in the highest frequencies are more robust
against another group of attacks.
 histogram equalization, intensity adjustment, and gamma correction
‰ The watermarks extracted from the four bands are identical for the
remaining attacks.
 rewatermarking and collusion
‰ Embedding in the mid frequencies is also resistant to certain attacks.
 histogram equalization, intensity adjustment, gamma correction, and
sharpening
‰ Watermark embedding at the first level of decomposition appears
advantageous:
 All the frequency bands are modified.
 The quality of the corresponding images is almost identical.
 The extracted watermarks are more textured with better subjective visual
quality.

425
DWT-SVD DOMAIN WATERMARKING

E. Ganic and A. M. Eskicioglu, “Robust DWT-SVD Domain Image


Watermarking: Embedding Data in All Frequencies,” ACM Multimedia and
Security Workshop 2004, Magdeburg, Germany, September 20-21, 2004.

2-D Discrete Wavelet Transform (DWT)


1. Each level decomposition produces 4 bands – LL, HL, LH, and HH.
2. The LL band can further be decomposed for another level of decomposition.

LL2 HL2

HL1

LH2 HH2

LH1 HH1

Singular Value Decomposition (SVD)

1. Every matrix A can be decomposed into a product of 3 matrices: A = UΣV^T.
2. Columns of U: left singular vectors of A.
3. Columns of V: right singular vectors of A.
4. Diagonal entries of Σ: the singular values λi of A.

A = λ1 u1 v1^T + λ2 u2 v2^T + … + λr ur vr^T

426


EMBEDDING AND EXTRACTION

Watermark embedding

1. Using the DWT, decompose the cover image into 4 subbands: LL, HL, LH, and HH.
2. Apply SVD to each subband image: A^k = Ua^k Σ^k Va^kT, k = 1, 2, 3, 4.
3. Apply SVD to the visual watermark: W = Uw Σw Vw^T.
4. Modify the singular values in each subband: λi*^k = λi^k + αk λwi, i = 1, …, n.
5. Construct the watermarked image: A*^k = Ua^k Σ*^k Va^kT.

Watermark extraction

1. Decompose the watermarked cover image into 4 subbands: LL, HL, LH, and HH.
2. Apply SVD to each subband image: A*^k = Ua^k Σ*^k Va^kT.
3. Extract the singular values from each subband: λwi^k = (λi*^k − λi^k)/αk, i = 1, …, n.
4. Construct the four visual watermarks using the singular vectors of W: W^k = Uw Σw^k Vw^T.
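A sketch of the embedding half with NumPy's SVD and PyWavelets (a
single α for all four bands and the Haar wavelet are simplifying
assumptions; square subbands are assumed so U Σ V^T recomposes
directly):

import numpy as np
import pywt

def embed_svd(I, W, alpha=0.1):
    cA, (cH, cV, cD) = pywt.dwt2(I.astype(float), 'haar')
    lw = np.linalg.svd(np.asarray(W, float), compute_uv=False)  # watermark singular values
    bands = []
    for A in (cA, cH, cV, cD):
        U, s, Vt = np.linalg.svd(A)          # A = U diag(s) V^T
        s[:lw.size] += alpha * lw            # lambda* = lambda + alpha * lambda_W
        bands.append(U @ np.diag(s) @ Vt)
    return pywt.idwt2((bands[0], tuple(bands[1:])), 'haar')

rng = np.random.default_rng(0)
I = rng.uniform(0, 255, (64, 64))
W = rng.uniform(0, 255, (32, 32))            # watermark sized like one subband
print(round(float(np.abs(embed_svd(I, W) - I).max()), 2))   # max change in the cover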

427
COVER IMAGE AND VISUAL WATERMARK

Cover image Lena Cameraman Watermarked Lena Constructed Watermarks

428
ATTACKS

Blur 5x5 (xnview), Noise 0.3 (xnview), Pixelate 2 (mosaic)
(Photoshop), JPEG 30:1 (xnview), JPEG2000 50:1 (xnview), Sharpen 80
(xnview), Rescale 256 (xnview), Rotate 20° (Photoshop), Crop on both
sides (Photoshop), Contrast -25 (Photoshop), Histogram Equalization
(Photoshop), Gamma correction 0.60 (ImageReady)

429
EXTRACTED WATERMARKS

Values under the constructed watermarks, per subband (LL, HL / LH, HH):

Gaussian Blur 5x5: 0.885, -0.235 / -0.184, -0.420
Gaussian Noise 0.3: 0.865, 0.207 / 0.271, 0.277
Pixelate 2 (mosaic): 1.000, -0.448 / -0.380, -0.424
JPEG 30:1: 0.993, 0.003 / 0.141, -0.381
JPEG2000 50:1: 0.989, 0.065 / 0.064, -0.402
Sharpen 80: 0.528, 0.553 / 0.631, 0.699

430
EXTRACTED WATERMARKS

Values under the constructed watermarks, per subband (LL, HL / LH, HH):

Rescale 512 → 256 → 512: 0.940, -0.256 / -0.211, -0.437
Rotate 20°: -0.353, -0.003 / 0.963, -0.335
Crop on both sides: -0.979, 0.982 / 0.945, 0.985
Contrast -20: 0.158, 0.017 / 0.376, 0.738
Histogram Equalization: 0.586, 0.657 / 0.716, 0.823
Gamma Correction 0.60: -0.942, 0.946 / 0.987, 0.997

431
BEST EXTRACTIONS

Gaussian Blur 5x5, Gaussian Noise 0.3, Pixelate 2 (mosaic), JPEG
30:1, JPEG2000 50:1, Sharpen 80, Rescale 512 → 256 → 512, Rotate 20°,
Crop on both sides, Contrast -20, Histogram Equalization, Gamma
Correction 0.60

432
PURE SVD DOMAIN WATERMARKING

Gaussian Blur 5x5, Gaussian Noise 0.3, Pixelate 2 (mosaic), JPEG
30:1, JPEG2000 50:1, Sharpen 80, Rescale 512 → 256 → 512, Rotate 20°,
Crop on both sides, Contrast -20, Histogram Equalization, Gamma
Correction 0.60

433
CONCLUSIONS

‰ Watermark embedding in the LL subband is resistant to


one group of attacks.
 Gaussian blur, Gaussian noise, pixelation, JPEG
compression, JPEG2000 compression, and rescaling
‰ Watermark embedding in the HH subband is resistant to
another group of attacks.
 sharpening, cropping, contrast adjustment, histogram
equalization, and gamma correction
‰ Watermark embedding in the LH subband is resistant to
the rotation attack.
‰ Observers can evaluate the quality of constructed
watermarks either subjectively or objectively.
 Subjective evaluation: the reference watermark is
compared with the watermark constructed after an attack.
 Objective evaluation: statistical measures like Pearson’s
correlation coefficient can be used, not requiring the
singular vectors of the watermark image.

434
DFT DOMAIN WATERMARKING

‰ DWT or DCT domain semi-blind or blind watermarking


schemes are not robust against geometric attacks (rotation,
translation, scaling, etc.).
‰ Because of the properties of the DFT, recent research has
focused on DFT-based watermarking.
‰ Three approaches
 Both a template and a watermark are embedded (Pereira & Pun, 2000).
„ The template contains no information but is used to detect the
transformation applied to the watermarked image.
 No template, the watermark is resistant to most geometric attacks
(Caldelli, Barni, Bartolini & Piva, 2000)
 Circular watermark (Solachidis & Pitas, 1999; Licks & Jordan,
2000)
„ To guarantee that the inverse DFT is real, the watermark must satisfy
the following property: W(u,v)=W(N-u,N-v)

435
EMBEDDING MULTIPLE CIRCULAR WATERMARKS
IN THE DFT DOMAIN

‰ In two papers, a circularly symmetric watermark is


embedded in the DFT domain
 Solachidis and Pitas [1999]: a 2D circularly symmetric
sequence in a ring covering the middle frequencies
 Licks and Jordan [2000]: use a watermark in the form of a
circle with a radius that corresponds to higher frequencies of
the image
‰ E. Ganic and A. M. Eskicioglu, “A DFT-Based Semi-Blind
Multiple Watermarking Scheme for Images,” 4th New York
Metro Area Networking Workshop, The Graduate Center,
CUNY, September 10, 2004.
 We extend the multiple watermarking idea by inserting two
circular watermarks in the DFT domain.
 Test image: 256x256 Lena
 Two radii:
„ 32 (corresponds to lower frequencies)
„ 96 (corresponds to higher frequencies)

436
MAJOR DISADVANTAGE OF CIRCULAR WATERMARKS

Watermarked Lena Magnitudes of DFT coefficients

437
TEST IMAGE

Original Lena Watermarked Lena

438
ATTACKS

JPEG Gaussian noise blurring Gamma correction

Resizing Histogram equalization Contrast adjustment Rotation

Cropping

Scaling

439
DETECTION

‰ Presence of the watermark is detected using the correlation

c = Σu Σv W(u, v) Mw*(u, v),  u, v = 1, …, N

‰ Decision rule
 H0: the image is watermarked with W if c > T
 H1: the image is not watermarked with W if c < T
‰ Threshold T = (µ0 + µ1)/2
 µ0: the expected value of the Gaussian probability density function
(pdf) associated with hypothesis H0
 µ1: the expected value of the Gaussian pdf associated with
hypothesis H1
‰ Detection anomalies
 False positives: detecting the watermark in an unmarked image
 False negatives: not detecting the watermark in a marked image

440
EXPERIMENTAL RESULTS:
THRESHOLDS AND FALSE NEGATIVES

Radius = 96 Radius = 32
T % T %
JPEG 0.086 48 0.228 12
Gaussian noise 0.110 37 0.206 18
blurring 0.120 51 0.228 13
resizing 0.093 55 0.227 13
histogram equalization 0.272 1 0.267 14
contrast adjustment 0.273 0 0.232 11
gamma correction 0.271 0 0.231 11
scaling 0.251 1 0.233 11
rotation 0.142 35 0.174 42
cropping 0.154 21 0.150 34

441
EXPERIMENTAL RESULTS:
THRESHOLDS AND FALSE POSITIVES

Radius = 96 Radius = 32
T % T %
JPEG 0.086 40 0.228 7
Gaussian noise 0.110 24 0.206 13
blurring 0.120 41 0.228 8
resizing 0.093 45 0.227 8
histogram equalization 0.272 0 0.267 4
contrast adjustment 0.273 0 0.232 6
gamma correction 0.271 0 0.231 6
scaling 0.251 0 0.233 7
rotation 0.142 23 0.174 26
cropping 0.154 8 0.150 31

442
CONCLUSIONS

‰ Embedded in higher frequencies


 the percentage of false negatives or positives is higher for one
group of attacks
„ JPEG, Gaussian noise, blurring, and resizing
 the percentage of false negatives or positives is lower for
another group of attacks
„ histogram equalization, contrast adjustment, gamma
correction, scaling, rotation, and cropping
‰ Embedding in lower frequencies
 the percentage of false negatives or positives is lower for one
group of attacks
„ JPEG, Gaussian noise, blurring, and resizing
 the percentage of false negatives or positives is higher for
another group of attacks
„ histogram equalization, contrast adjustment, gamma
correction, scaling, rotation, and cropping
‰ For all attacks, the percentages of false positives are lower
than false negatives.

443
AN EFFECTIVE BLIND WATERMARKING SYSTEM
FOR COLOR IMAGES

‰ J. Kusyk and A.M. Eskicioglu, “An Effective Blind Watermarking


Scheme for Color Images through Comparison of DFT
Coefficients,” ACM Multimedia and Security Workshop 2005, New
York, NY, August 1-2, 2005.
‰ Proposed algorithm
‰ Attacks
 Cropping
 Histogram equalization
 Low pass filtering
 Gamma correction
 Gaussian noise
 Intensity adjustment
 JPEG compression
 Resizing
 Rotation
 Rescaling
 Fragmentation
‰ Experiments
‰ Conclusions
444
EMBEDDING

‰ Compute the DFT of the NxN cover image.


‰ Move the origin to the center.
‰ Obtain the magnitudes of DFT coefficients.
‰ Divide the NxN matrix of magnitudes into four (N/2)x(N/2) matrices Mul, Mur, Mll,
Mlr. ul: upper left, ur: upper right, ll: lower left, lr: lower right.
‰ Define three frequency bands: low, middle, and high.
‰ Embed a visual binary watermark in these three bands by determining the
embedding locations.
‰ In each band:
 Choose a magnitude a in matrix Mul, and the corresponding magnitude b in matrix Mur.
 Compute the mean m = (a+b)/2, and choose the value of the parameter p.
 Embedding bit 1: if a < m − (p/2)m, do not modify a and b;
otherwise set a = m − (p/2)m and b = m + (p/2)m.
 Embedding bit 0: if a > m + (p/2)m, do not modify a and b;
otherwise set a = m + (p/2)m and b = m − (p/2)m.
(A sketch of this pairwise rule follows the procedure.)
‰ Copy the modified magnitudes in matrix Mul to matrix Mlr.
‰ Copy the modified magnitudes in matrix Mur to matrix Mll.
‰ Obtain the DFT coefficients of the entire image using the modified magnitudes.
‰ Compute the inverse DFT.
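A sketch of the per-bit magnitude adjustment described above (pure
Python; a and b stand for one symmetric magnitude pair, and p for the
band's strength parameter):

def embed_bit(a, b, bit, p=0.6):
    # push the pair apart around its mean m by (p/2)*m, per the rule above
    m = (a + b) / 2.0
    d = (p / 2.0) * m
    if bit == 1:                              # want a < b at extraction time
        return (a, b) if a < m - d else (m - d, m + d)
    return (a, b) if a > m + d else (m + d, m - d)   # bit 0: want a > b

def extract_bit(a, b):
    return 0 if a > b else 1

a2, b2 = embed_bit(120.0, 100.0, bit=1)
print(a2, b2, extract_bit(a2, b2))            # 77.0 143.0 1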

445
EXTRACTION

‰ Compute the DFT of the NxN watermarked (and possibly attacked) image.
‰ Move the origin to the center.
‰ Obtain the magnitudes of DFT coefficients.
‰ Divide the NxN matrix of magnitudes into four (N/2)x(N/2) matrices Mul, Mur, Mll,
Mlr.
‰ Use the three frequency bands and the embedding locations defined in the
embedding process: low, middle, and high.
‰ In each band, if a > b then bit = 0 else bit = 1.

446
EMBEDDING THE WATERMARK
INTO LUMINANCE LAYER

A 12x12 binary watermark is embedded in the luminance layer of a
512x512 cover image; every 4th magnitude in the chosen locations is
modified. Three frequency areas are marked in Mul at (20,20),
(100,100), and (180,180), with p ranging over 10-60%, 60-90%, and
120% from the high frequency area toward the low frequency area.

The NxN magnitude matrix is split into quadrants:

Mul Mur
Mll Mlr

The modified magnitudes in Mul are copied to Mlr, and the modified
magnitudes in Mur are copied to Mll.
447
WATERMARKING A FULL COLOR IMAGE
USING MATLAB

Cover image: 512x512 Lena Watermarked Lena: PSNR=46.83dB

Cover image DFT coefficient magnitudes Watermarked image DFT coefficient magnitudes
448
EXTRACTIONS FROM UNATTACKED
LUMINANCE LAYER

Extractions from the unattacked luminance layer:

High frequency band: SR = 0.986
Mid frequency band: SR = 1.000
Low frequency band: SR = 1.000

Similarity Ratio (SR) = S/(S+D)
S: # of matching pixel values in the compared images
D: # of different pixel values in the compared images

449
EXTRACTIONS FROM UNWATERMARKED
LUMINANCE LAYER

Extractions from an unwatermarked luminance layer (no watermark
present, so SR is near chance):

High frequency band: SR = 0.521
Mid frequency band: SR = 0.458
Low frequency band: SR = 0.486

450
EXTRACTIONS FROM CROPPED
LUMINANCE LAYER

Attack: cropping.

High frequency band: SR = 0.910
Mid frequency band: SR = 0.944
Low frequency band: SR = 0.826

451
EXTRACTIONS FROM HISTOGRAM EQUALIZED
LUMINANCE LAYER

Attack: histogram equalization.

High frequency band: SR = 0.896
Mid frequency band: SR = 0.979
Low frequency band: SR = 0.972

452
EXTRACTIONS FROM LOW PASS FILTERED
LUMINANCE LAYER

Attack: low pass filtering (3x3).

High frequency band: SR = 0.465
Mid frequency band: SR = 0.625
Low frequency band: SR = 0.993

453
EXTRACTIONS FROM GAMMA CORRECTED
LUMINANCE LAYER

Attack: gamma correction (1.5).

High frequency band: SR = 0.958
Mid frequency band: SR = 0.972
Low frequency band: SR = 0.951

454
EXTRACTIONS FROM GAUSSIAN NOISY
LUMINANCE LAYER

Attack: Gaussian noise (0, 0.001).

High frequency band: SR = 0.535
Mid frequency band: SR = 0.806
Low frequency band: SR = 0.958

455
EXTRACTIONS FROM INTENSITY ADJUSTED
LUMINANCE LAYER

Attack: intensity adjustment ([0 0.8], [0 1]).

High frequency band: SR = 0.958
Mid frequency band: SR = 0.993
Low frequency band: SR = 0.986

456
EXTRACTIONS FROM JPEG COMPRESSED
LUMINANCE LAYER

Attack: JPEG compression (Q = 50).

High frequency band: SR = 0.458
Mid frequency band: SR = 0.486
Low frequency band: SR = 0.979

457
EXTRACTIONS FROM RESIZED
LUMINANCE LAYER

Attack: resizing (512 → 256 → 512).

High frequency band: SR = 0.549
Mid frequency band: SR = 0.667
Low frequency band: SR = 0.993

458
EXTRACTIONS FROM ROTATED
LUMINANCE LAYER

Attack: rotation (5°).

High frequency band: SR = 0.618
Mid frequency band: SR = 0.924
Low frequency band: SR = 0.931

459
EXTRACTIONS FROM SCALED
LUMINANCE LAYER

Attack: scaling (512 → 1024).

High frequency band: SR = 0.819
Mid frequency band: SR = 0.958
Low frequency band: SR = 0.972
460
EXTRACTIONS FROM FRAGMENTED &
TRANSLATED LUMINANCE LAYER

Attack: fragmentation and translation.

High frequency band: SR = 0.986
Mid frequency band: SR = 1.000
Low frequency band: SR = 1.000

461
EXTRACTIONS FROM COLLUDED
LUMINANCE LAYER

Attack: collusion (3 images added and scaled; the inputs include
copies watermarked with "X" at 46.83dB and with "I" at 47.43dB).

High frequency band: SR = 0.736
Mid frequency band: SR = 0.708
Low frequency band: SR = 0.701
462
EXTRACTIONS FROM REWATERMARKED
LUMINANCE LAYER

Attack: rewatermarking (the image watermarked with "X" at 46.83dB is
rewatermarked with "I" at 43.41dB). Two extractions per band:

High frequency band: SR = 0.938 and SR = 1.000
Mid frequency band: SR = 0.993 and SR = 0.993
Low frequency band: SR = 1.000 and SR = 1.000
463
BEST EXTRACTIONS

Cropping: SR = 0.944; H. equalization: SR = 0.979; Low pass
filtering: SR = 0.993; Gamma correction: SR = 0.972; Gaussian noise:
SR = 0.958; Int. adjustment: SR = 0.993; JPEG compression:
SR = 0.979; Resizing: SR = 0.993; Rotation (4°): SR = 0.951;
Scaling: SR = 0.972; Fragmentation: SR = 1.000
464
CONCLUSIONS

‰ A DFT-based blind watermarking scheme is presented.


 Magnitudes of DFT coefficients are compared.
 A binary logo is embedded in the luminance layer.
‰ The same watermark is embedded in three frequency bands.
‰ The proposed blind scheme is robust and secure.
 Robustness
„ For each attack, there are three extractions.
„ The extraction with the best visual quality is used.
 Security
„ The hacker does not know the embedding location or what is embedded.
„ In a binary watermark, each pixel value may be either “0” or “1”.
„ For a 512x512 cover image, a brute force attack needs 2^(255x255) trials.
‰ In our implementation, the value of p is the smallest for embedding in the
location closest to the origin. As the embedding location moves away from the
origin, p is assigned a larger value.
‰ Extractions from the lowest frequency band are best for one group of attacks.
 low pass filtering, adding Gaussian noise, JPEG compression, resizing, rotation, and
scaling
‰ Extractions from the middle frequency band are best for another group of
attacks.
 Cropping, histogram equalization, gamma correction, intensity adjustment
‰ Extractions from the fragmented image are identical to extractions from the
unattacked image.

465
