
INTRODUCTION TO DIGITAL IMAGE PROCESSING
Lecture 01

Outline
❑ Basic Concepts and Terminology
❑ Examples of Typical Image Processing Operations
❑ Components of a Digital Image Processing System
❑ Machine Vision Systems
❑ Light, Color, and Electromagnetic Spectrum
❑ Image Acquisition
❑ Image Digitization
❑ Digital Image Representation
❑ Image File Formats
❑ Basic Terminology
❑ Overview of Image Processing Operations

Basic Concepts and Terminology


What Is an Image?
❑ An image is a 2-D visual representation of an object, a person, or
a scene produced by an optical device such as a mirror, a lens, or a
camera.
What Is a Digital Image?
❑ A digital image is a representation of a two-dimensional image
using a finite number of points, usually referred to as picture
elements, pels or pixels.
What Is Digital Image Processing?
❑ Digital image processing can be defined as the science of
modifying digital images by means of a digital computer.

Examples of Typical Image Processing Operations


❑ Sharpening: A technique by which the edges and fine details of
an image are enhanced for human viewing.

Image sharpening: (a) original image; (b) after sharpening
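A minimal MATLAB sketch of sharpening, assuming the Image Processing Toolbox is available ('moon.tif' is a sample image that ships with the toolbox):

I = imread('moon.tif');                      % grayscale sample image
S = imsharpen(I, 'Radius', 2, 'Amount', 1);  % sharpen via unsharp masking
imshowpair(I, S, 'montage')                  % (a) original vs. (b) sharpened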



Examples of Typical Image Processing Operations


❑ Noise Removal: Image processing filters can be used to reduce
the amount of noise in an image before processing it any further.

Noise removal: (a) original (noisy) image; (b) after removing noise
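As an illustrative sketch, salt-and-pepper noise can be simulated and then suppressed with a median filter (MATLAB with the Image Processing Toolbox assumed):

I = imread('eight.tif');                % grayscale sample image
J = imnoise(I, 'salt & pepper', 0.02);  % simulate salt-and-pepper noise
K = medfilt2(J, [3 3]);                 % 3-by-3 median filter suppresses the noise
imshowpair(J, K, 'montage')             % noisy vs. denoised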

Examples of Typical Image Processing Operations


❑ Deblurring: An image may appear blurred for many reasons,
ranging from improper focusing of the lens to an insufficient
shutter speed for a fast-moving object.

Deblurring: (a) original (blurry) image; (b) after removing the (motion) blur.
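A hedged sketch of motion-blur removal: the point spread function (PSF) below is assumed to be known, which is rarely true in practice (blind deconvolution would otherwise be needed):

I = im2double(imread('cameraman.tif'));
PSF = fspecial('motion', 21, 11);               % assumed linear motion blur
blurred = imfilter(I, PSF, 'conv', 'circular'); % simulate the blur
restored = deconvwnr(blurred, PSF, 0.01);       % Wiener deconvolution; 0.01 is a guessed noise-to-signal ratio
imshowpair(blurred, restored, 'montage')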

Examples of Typical Image Processing Operations


❑ Edge Extraction: Extracting edges from an image is a
fundamental preprocessing step used to separate objects from one
another before identifying their contents.

Edge extraction: (a) original image; (b) after extracting its most relevant edges
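A minimal sketch using the Canny detector, one of several edge operators available in MATLAB's Image Processing Toolbox:

I = imread('coins.png');      % grayscale sample image
E = edge(I, 'canny');         % binary map of the most relevant edges
imshowpair(I, E, 'montage')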

Examples of Typical Image Processing Operations


❑ Binarization: In many image analysis applications, it is often
necessary to reduce the number of gray levels in a monochrome
image to simplify and speed up its interpretation. Reducing a
grayscale image to only two levels of gray (black and white) is
usually referred to as binarization.

Binarization: (a) original grayscale image; (b) after conversion to a black-and-white version
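A minimal sketch of binarization, using Otsu's method to pick the global threshold (Image Processing Toolbox assumed):

I = imread('coins.png');
level = graythresh(I);        % Otsu's global threshold in [0, 1]
BW = imbinarize(I, level);    % reduce 256 gray levels to 2 (black and white)
imshowpair(I, BW, 'montage')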

Examples of Typical Image Processing Operations


❑ Blurring: It is sometimes necessary to blur an image in order to
minimize the importance of texture and fine detail in a scene, for
instance, in cases where objects can be better recognized by their
shape.

Blurring: (a) original image; (b) after blurring to remove unnecessary details
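A minimal sketch, using a Gaussian low-pass filter as one common way to blur (Image Processing Toolbox assumed):

I = imread('cameraman.tif');
B = imgaussfilt(I, 3);        % Gaussian blur with standard deviation sigma = 3
imshowpair(I, B, 'montage')   % fine detail is suppressed; coarse shape remains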

Examples of Typical Image Processing Operations


❑ Contrast Enhancement: In order to improve an image for
human viewing as well as make other image processing tasks (e.g.,
edge extraction) easier, it is often necessary to enhance the
contrast of an image.

Contrast enhancement: (a) original image; (b) after histogram equalization to improve contrast
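A minimal sketch of histogram equalization ('pout.tif' is a low-contrast sample image that ships with the toolbox):

I = imread('pout.tif');       % low-contrast grayscale image
J = histeq(I);                % spread the histogram to improve contrast
imshowpair(I, J, 'montage')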

Examples of Typical Image Processing Operations


❑ Object Segmentation and Labeling: The task of segmenting
and labeling objects within a scene is a prerequisite for most
object recognition and classification systems.

Object segmentation and labeling: (a) original image; (b) after segmenting and labeling individual objects
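A minimal sketch of segmenting and labeling the objects in a binary image (Image Processing Toolbox assumed):

I = imread('coins.png');
BW = imfill(imbinarize(I), 'holes');  % segment foreground and fill holes
[L, n] = bwlabel(BW);                 % assign a distinct label to each of the n objects
imshow(label2rgb(L))                  % display each labeled object in its own color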

Components of a Digital Image Processing System


❑ The system is built around a computer in which most image
processing tasks are carried out, but also includes hardware and
software for image acquisition, storage, and display.

Components of a digital image processing system



Components of a Digital Image Processing System


Hardware
❑ The hardware components of a digital image processing system
typically include the following:
❑ Acquisition Devices: Responsible for capturing and digitizing images or
video sequences. Examples include scanners, cameras, and camcorders, which
can be interfaced with the main computer in a number of ways (USB, FireWire,
Camera Link, or Ethernet). In cases where the camera produces analog video
output, an image digitizer, usually known as a frame grabber, is used to
convert it to digital format.
❑ Processing Equipment: The main computer itself, in whatever size, shape,
or configuration. Responsible for running software that allows the
processing and analysis of acquired images.
❑ Display and Hardcopy Devices: Responsible for showing the image
contents for human viewing. Examples include color monitors and printers.

Components of a Digital Image Processing System


❑ Storage Devices: Magnetic or optical disks responsible for long-term
storage of the images.
Software
❑ There are many image processing tools; MATLAB is a notable example.
❑ MATLAB has become very popular with engineers, scientists,
and researchers in both industry and academia, due to many
factors, such as the availability of toolboxes containing specialized
functions for many application areas, ranging from data
acquisition to image processing.

Machine Vision Systems


❑ Here is a practical example application of a machine vision
system: recognizing license plates at a highway toll booth.

Diagram of a machine vision system



Machine Vision Systems


❑ The goal is to be able to extract the alphanumeric contents of
the license plate of a vehicle passing through the toll booth in an
automated and unsupervised way, that is, without the need for human
intervention (this is called the problem domain).
❑ Additional requirements could include 24/7 operation (under
artificial lighting), all-weather operation, minimal acceptable
success rate, and minimum and maximum vehicle speed.
❑ The acquisition block is in charge of acquiring one or more
images containing a front or rear view of the vehicle that includes
its license plate (using a CCD camera and controlling the lighting
conditions)
❑ The goal of the preprocessing stage is to improve the quality
of the acquired image (contrast improvement, brightness
correction, and noise removal)

Machine Vision Systems


❑ The segmentation block is responsible for partitioning an
image into its main components: relevant foreground objects and
background.
❑ (1) extracting the license plate from the rest of the original image; and
❑ (2) segmenting characters within the plate area
❑ The feature extraction block (also known as representation
and description) consists of algorithms responsible for encoding
the image contents in a concise and descriptive way (color or
intensity distribution, texture, shape, etc.).
❑ The resulting feature vector is used as an input to the pattern
classification (also known as recognition and interpretation)
stage (for example, minimum distance classifiers, probabilistic
classifiers, neural networks).

Machine Vision Systems


❑ All modules are connected to a large block called knowledge
base.
❑ The role of such a knowledge base in the last stages is quite
evident (e.g., the knowledge that the first character must be a digit
may help disambiguate between a “0” and an “O” in the pattern
classification stage).
❑ The human visual system (HVS) and a machine vision system
(MVS) have different strengths and limitations, and the designer of
an MVS must be aware of them.
❑ A careful analysis of these differences provides insight into why
it is so hard to emulate the performance of the human visual
system using cameras and computers. Three of the biggest
challenges stand out:

Machine Vision Systems


1. The HVS can rely on a very large database of images and associated
concepts that have been captured, processed, and recorded during a lifetime.
Although the storage of the images themselves is no longer an expensive
task, mapping them to high-level semantic concepts and putting them all in
context is a very hard task for an MVS, for which there is no solution
available.
2. The very high speed at which the HVS makes decisions based on visual
input. Although several image processing and machine vision tasks can be
implemented at increasingly higher speeds (often using dedicated hardware
or fast supercomputers), many implementations of useful algorithms still
cannot match the speed of their human counterpart and cannot, therefore,
meet the demands of real-time systems.
3. The remarkable ability of the HVS to work under a wide range of
conditions, from deficient lighting to less-than-ideal perspectives for viewing
a 3D object. This is perhaps the biggest obstacle in the design of machine
vision systems, widely acknowledged by everyone in the field. In order to
circumvent this limitation, most MVS must impose numerous constraints on
the operating conditions of the scene, from carefully controlled lighting,
to removing irrelevant distractors that may mislead the system, to careful
placing of objects in order to minimize the problems of shadows and
occlusion.

Light, Color, and Electromagnetic Spectrum


❑ The existence of light—or other forms of electromagnetic (EM)
radiation—is an essential requirement for an image to be created,
captured, and perceived.
Light and Electromagnetic Spectrum

Electromagnetic spectrum

Light, Color, and Electromagnetic Spectrum


❑ Light can be described in terms
of electromagnetic waves or
particles, called photons.
❑ The human visual system (HVS) is sensitive to photons of wavelengths between 400 and 700 nm, where 1 nm = 10⁻⁹ m.
Types of Images
❑ Images can be classified into three categories according to the type of interaction between the source of radiation, the properties of the objects involved, and the relative positioning of the image sensor.

Recording the various types of interaction of radiation with objects and surfaces

Light, Color, and Electromagnetic Spectrum


❑ Reflection Images: These are the result of radiation that has
been reflected from the surfaces of objects. The radiation may be
ambient or artificial. Most of the images we perceive in our daily
experiences are reflection images. The type of information that
can be extracted from reflection images is primarily about surfaces
of objects, for example, their shapes, colors, and textures.
❑ Emission Images: These are the result of objects that are self-
luminous, such as stars and light bulbs (both within the visible
light range); beyond the visible range, examples include thermal
and infrared images.
❑ Absorption Images: These are the result of radiation that
passes through an object and results in an image that provides
information about the object’s internal structure. The most
common example is the X-ray image.

Light, Color, and Electromagnetic Spectrum


Light and Color Perception
❑ Light is a particular type of EM radiation that can be sensed by the human eye.
❑ Colors perceived by humans are determined by the nature of the light
reflected by the object, which is a function of the spectral properties of the
light source as well as the absorption and reflectance properties of the object.
❑ Each of these components produces a
different color experience, ranging from what
we call red at one end to violet at the other.
❑ The radiance (physical power) of a light
source is expressed in terms of its spectral power
distribution (SPD)
❑ The human perception of each of these
light sources will vary—from the yellowish
nature of light produced by tungsten light
bulbs to the extremely bright and pure red
laser beam.

Light, Color, and Electromagnetic Spectrum

Spectral power distributions of common physical light sources



Light, Color, and Electromagnetic Spectrum


Color Encoding and Representation
❑ Color can be encoded using three numerical components and appropriate
spectral weighting functions.
❑ The simplest way to encode color in cameras and displays is by using the red
(R), green (G), and blue (B) values of each pixel.
❑ Human perception of light—and, consequently, color—is commonly
described in terms of three parameters:
1. Brightness: The subjective perception of (achromatic) luminous intensity, or
“the attribute of a visual sensation according to which an area appears to
emit more or less light”
2. Hue: “The attribute of a visual sensation according to which an area appears
to be similar to one of the perceived colors, red, yellow, green and blue, or a
combination of two of them”
3. Saturation: “The colorfulness of an area judged in proportion to its brightness”, which usually translates into a description of the whiteness of the light source

Image Acquisition

Image acquisition, formation, and digitization


Image Sensors
❑ The main goal of an image sensor is to convert EM energy into
electrical signals that can be processed, displayed, and interpreted
as images.

Image Acquisition
❑ In single-CCD cameras, colors are obtained by using a tricolor imager with different photo sensors for each primary color of light (red, green, and blue), usually arranged in a Bayer pattern.
❑ In those cases, each pixel actually records only one of the three primary colors; to obtain a full-color image, a demosaicing algorithm—which can run inside the actual camera, before recording the image in JPEG format, or in a separate computer, working on the raw output from the camera—is used to interpolate a set of complete R, G, and B values for each pixel.

The Bayer pattern for single-CCD cameras
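A sketch of the demosaicing step, assuming the raw sensor output is available as a single-channel uint8 array; the file name and the 'rggb' alignment are illustrative and depend on the camera:

raw = imread('bayer_frame.png');  % hypothetical raw Bayer-mosaic frame
RGB = demosaic(raw, 'rggb');      % interpolate complete R, G, B values for each pixel
imshow(RGB)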

Image Acquisition
❑More expensive cameras use three CCDs, one for each color, and
an optical beam splitter.
❑ Beam splitters have been around since the days of Plumbicon
tubes. They are made of prisms with dichroic surfaces, that is,
capable of reflecting light in one region of the spectrum and
transmitting light that falls elsewhere.

The beam splitter for three-CCD color cameras



Image Acquisition
❑ An alternative technology to CCDs is CMOS. CMOS chips have the advantages of being cheaper to produce and requiring less power to operate than comparable CCD chips. Their main disadvantage is the increased susceptibility to noise, which limits their performance at low illumination levels.

X3 color sensor

Image Digitization
❑ Digitization involves two processes: sampling (in time or space) and quantization (in amplitude).
❑ The result of the digitization process is a pixel array, which is a rectangular matrix of picture elements whose values correspond to their intensities (for monochrome images) or color components (for color images).

Image Digitization

Pixel arrays of several imaging standards


Sampling
❑ Sampling is the process of measuring the value of a 2D function
at discrete intervals along the 𝑥 and 𝑦 dimensions.

Image Digitization
❑ A system that has equal horizontal and vertical sampling densities
is said to have square sampling.
❑ Several imaging and video systems use sampling lattices where
the horizontal and the vertical sample pitch are unequal, that is,
nonsquare sampling.
❑ Two parameters must be taken into account when sampling
images:
1. The sampling rate, that is, the number of samples across the
height and width of the image.
2. The sampling pattern, that is, the physical arrangement of the
samples. A rectangular pattern, in which pixels are aligned
horizontally and vertically into rows and columns, is by far the
most common form, but other arrangements are possible, for
example, the hexagonal and log-polar sampling patterns

Image Digitization
❑ If sampling takes place at a rate lower than twice the highest
frequency component of the signal (the Nyquist criterion), there
will not be enough points to ensure proper reconstruction of the
original signal, which is referred to as undersampling or aliasing.

1D aliasing explanation
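The 1D case is easy to reproduce numerically: a 9 Hz cosine sampled at only 10 samples/s (below the 18 samples/s the Nyquist criterion demands) produces exactly the same samples as a 1 Hz cosine. A minimal MATLAB sketch:

t  = 0:0.001:1;                 % quasi-continuous time axis
ts = 0:0.1:1;                   % sampling instants, 10 samples/s
plot(t, cos(2*pi*9*t), '-', ts, cos(2*pi*9*ts), 'o'); hold on
plot(t, cos(2*pi*1*t), '--');   % the 1 Hz alias passes through the same samples
hold off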
Quantization
❑ Quantization is the process of replacing a continuously varying
function with a discrete set of quantization levels.

Image Digitization
❑ In the case of images, the function is 𝑓(𝑥, 𝑦) and the quantization levels are
also known as gray levels.
❑ It is common to adopt 𝑁 quantization levels for image digitization, where 𝑁 is
usually an integral power of 2, that is, 𝑁 = 2ⁿ, where 𝑛 is the number of bits
needed to encode each pixel value.
❑ The case where 𝑛 = 8 (𝑁 = 2⁸ = 256) produces images where each pixel is
represented by an unsigned byte, with values ranging from 0 (black) to 255
(white).

A mapping function for uniform quantization (𝑁 = 4).
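Uniform quantization to 𝑁 = 4 levels can be emulated with simple integer arithmetic; a minimal MATLAB sketch (the image name is illustrative):

I = imread('cameraman.tif');     % 8-bit image, 256 gray levels
step = 256 / 4;                  % width of each quantization bin for N = 4
Q = uint8(floor(double(I) / step) * step + step / 2);  % map each bin to its midpoint
imshow(Q)                        % the image now contains only 4 gray levels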

Image Digitization
Spatial and Gray-Level Resolution

Effects of sampling resolution on image quality:


(a) A 1944×2592 image, 256 gray levels, at a 1250 dpi resolution.
(b) The same image resampled at 300 dpi; (c) 150 dpi; (d) 72 dpi.

Image Digitization

(a) A 480×640 image, 256 gray levels; (b–h) image requantized to 128, 64,
32, 16, 8, 4, and 2 gray levels.

Image Digitization
grayslice is a MATLAB function that creates an indexed image from an intensity image by (multilevel) thresholding.
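A minimal usage sketch (the image name is illustrative):

I = imread('cameraman.tif');  % 8-bit intensity image
X = grayslice(I, 16);         % indexed image obtained by thresholding into 16 levels
imshow(X, jet(16))            % display with a 16-entry colormap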

Digital Image Representation


❑ A digital image, whether it was obtained as a result of sampling
and quantization of an analog image or created already in digital
form, can be represented as a two-dimensional (2D) matrix of real
numbers.

a monochrome image

Digital Image Representation


❑ We use 𝑓(𝑥, 𝑦) to refer to monochrome images of size 𝑀 × 𝑁, where 𝑥
denotes the row number (from 0 to 𝑀 − 1) and 𝑦 represents the
column number (from 0 to 𝑁 − 1).

❑ The value of the 2D function 𝑓(𝑥, 𝑦) at any given pixel of coordinates (𝑥₀, 𝑦₀), denoted by 𝑓(𝑥₀, 𝑦₀), is called the intensity or gray level of the image at that pixel.
❑ The maximum and minimum values that a pixel intensity can
assume will vary depending on the data type and convention used.

Digital Image Representation


❑ 0.0 (black) to 1.0 (white) for double data type
❑ 0 (black) to 255 (white) for uint8 (unsigned integer, 8 bits) representation
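The two conventions are interchangeable; a minimal MATLAB sketch of converting between them (the image name is illustrative):

I8 = imread('cameraman.tif');  % uint8 image, values 0 to 255
D  = im2double(I8);            % double image, values 0.0 to 1.0 (scales by 1/255)
I2 = im2uint8(D);              % back to uint8 (scales by 255 and rounds)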
❑ Images are represented in digital format in a variety of ways. At
the most basic level, there are two different ways of encoding the
contents of a 2D image in digital format
❑ raster (also known as bitmap): uses one or more two-dimensional arrays of
pixels. The greatest advantages of bitmap graphics are their quality and
display speed; their main disadvantages include larger memory storage
requirements and size dependence (e.g., enlarging a bitmap image may lead
to noticeable artifacts).
❑ vector: uses a series of drawing commands to represent an image. Vector
representations require less memory and allow resizing and geometric
manipulations without introducing artifacts, but they need to be rasterized
for most presentation devices.

Binary (1-Bit) Images


❑ Binary images are encoded as a 2D array, typically using 1 bit per
pixel, where a 0 usually means “black” and a 1 means “white”.
❑ Their main advantage, which makes them usually suitable for images
containing simple graphics, text, or line art, is their small size.

A binary image and the pixel values in a 6×6 neighborhood



Gray-Level (8-Bit) Images


❑ Gray-level (also referred to as monochrome) images are also
encoded as a 2D array of pixels, usually with 8 bits per pixel,
where a pixel value of 0 corresponds to “black,” a pixel value of
255 means “white,” and intermediate values indicate varying
shades of gray.

A gray scale image and the pixel values in a 6×6 neighborhood



Color Images
❑ Representation of color images is more complex and varied. The
two most common ways of storing color image contents are
❑ RGB representation: each pixel is usually represented by a 24-bit number
containing the amount of its red (R), green (G), and blue (B) components.
Each array element contains an 8-bit value.

Color image (a) and its R (b), G (c), and B (d) components
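A minimal sketch of accessing the R, G, and B components of a truecolor image ('peppers.png' ships with MATLAB):

RGB = imread('peppers.png');  % M-by-N-by-3 uint8 array
R = RGB(:, :, 1);             % red component, an M-by-N array of 8-bit values
G = RGB(:, :, 2);             % green component
B = RGB(:, :, 3);             % blue component
imshow(R)                     % each component displays as a grayscale image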

Color Images
❑ Indexed representation: a 2D array contains indices to a color palette or
lookup table (LUT).
❑ This representation exists partly because of compatibility problems with older
hardware that may not be able to display the 16 million colors simultaneously.

An indexed color image and the indices in a 4×4 neighborhood
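A minimal sketch of converting a truecolor image to an indexed one and back (Image Processing Toolbox assumed):

RGB = imread('peppers.png');
[X, map] = rgb2ind(RGB, 256);  % indexed image plus a 256-entry palette (LUT)
imshow(X, map)                 % the palette maps each index to an RGB triple
RGB2 = ind2rgb(X, map);        % reconstruct an approximate truecolor image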



Compression
❑ Since raw image representations usually require a large amount
of storage space (and proportionally long transmission times in
the case of file uploads/downloads), most image file formats
employ some type of compression.
❑ Compression methods can be
❑ lossy: when a tolerable degree of deterioration in the visual quality of the
resulting image is acceptable (for general-purpose photographic images)
❑ lossless: when the image is encoded in its full quality (when dealing with
line art, drawings, facsimiles, or images in which no loss of detail may be
tolerable, most notably space images and medical images).

Image File Formats


❑ Most of the image file formats used to represent bitmap images
consist of a file header followed by (often compressed) pixel
data.
❑ The image file header stores information about the image, such
as image height and width, number of bands, number of bits per
pixel, and some signature bytes indicating the file type. In more
complex file formats, the header may also contain information
about the type of compression used and other parameters that are
necessary to decode (i.e., decompress) the image.
❑ The BIN format simply consists of the raw pixel data, without
any header (the user of a BIN file must know the relevant image
parameters, such as height and width, beforehand).

Image File Formats


❑ The PPM format and its variants (PBM for binary images, PGM
for grayscale images, PPM for color images, and PNM for any of
them) are widely used in image processing research and many free
tools for format conversion include them. The headers for these
image formats include a 2-byte signature, or “magic number,” that
identifies the file type, the image width and height, the number of
bands, and the maximum intensity value (which determines the
number of bpp per band).
❑ The Microsoft Windows bitmap (BMP) format is another
widely used and fairly simple format, consisting of a header
followed by raw pixel data.
❑ The JPEG format is the most popular file format for
photographic quality image representation. It is capable of high
degrees of compression with minimal perceptual loss of quality.

Image File Formats


❑ Graphics Interchange Format (GIF) uses an indexed
representation for color images (with a palette of a maximum of
256 colors), the LZW (Lempel–Ziv–Welch) compression
algorithm, and a 13-byte header.
❑ Tagged Image File Format (TIFF) is a more sophisticated
format with many options and capabilities, including the ability to
represent truecolor (24 bpp) and support for five different
compression schemes.
❑ Portable Network Graphics (PNG) is an increasingly popular
file format that supports both indexed and truecolor images.

Basic Terminology
❑ Image Topology: It involves the investigation of fundamental
image properties, usually done on binary images and with the help
of morphological operators, such as number of occurrences of a
particular object, number of separate (not connected) regions, and
number of holes in an object, to mention but a few.
❑ Neighborhood: The pixels surrounding a given pixel constitute
its neighborhood, which can be interpreted as a smaller matrix
containing (and usually centered around) the reference pixel.

Concept of neighborhood of pixel p (from an image topology perspective): (a) 4-neighborhood; (b) diagonal neighborhood; (c) 8-neighborhood.

Basic Terminology
❑ Adjacency: In the context of image topology, two pixels 𝑝 and 𝑞
are 4-adjacent if they are 4-neighbors of each other and 8-adjacent
if they are 8-neighbors of one another. A third type of
adjacency, known as mixed adjacency (or simply m-adjacency), is
sometimes used to eliminate ambiguities (i.e., redundant paths)
that may arise when 8-adjacency is used.
❑ Paths: In the context of image topology, a 4-path between two
pixels 𝑝 and 𝑞 is a sequence of pixels starting with 𝑝 and ending
with 𝑞 such that each pixel in the sequence is 4-adjacent to its
predecessor in the sequence. Similarly, an 8-path indicates that
each pixel in the sequence is 8-adjacent to its predecessor.

Basic Terminology
❑ Connectivity: If there is a 4-path between pixels 𝑝 and 𝑞, they
are said to be 4-connected. Similarly, the existence of an 8-path
between them means that they are 8-connected.
❑ Components: A set of pixels that are connected to each other is
called a component. If the pixels are 4-connected, the expression
4-component is used; if the pixels are 8-connected, the set is called
an 8-component.

Connected components: (a) original (binary) image; (b) results for 8-connectivity; (c) results for 4-connectivity.
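The effect of the connectivity choice can be verified directly; a minimal MATLAB sketch ('coins.png' ships with the Image Processing Toolbox):

BW = imbinarize(imread('coins.png'));  % binary image
cc8 = bwconncomp(BW, 8);               % connected components under 8-connectivity
cc4 = bwconncomp(BW, 4);               % connected components under 4-connectivity
[cc8.NumObjects, cc4.NumObjects]       % 4-connectivity can only yield as many or more components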

Basic Terminology
❑ Distances Between Pixels: The most common distance measures between two pixels 𝑝 and 𝑞, of coordinates (𝑥₀, 𝑦₀) and (𝑥₁, 𝑦₁), respectively, are as follows:
❑ Euclidean distance:
$D_e(p, q) = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}$
❑ 𝐷₄ (also known as Manhattan or city block) distance:
$D_4(p, q) = |x_1 - x_0| + |y_1 - y_0|$
❑ 𝐷₈ (also known as chessboard) distance:
$D_8(p, q) = \max(|x_1 - x_0|, |y_1 - y_0|)$
❑ It is important to note that the distance between two pixels depends only
on their coordinates, not their values. The only exception is the 𝐷𝑚 distance,
defined as “the shortest 𝑚-path between two 𝑚-connected pixels.”
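As a worked example, for 𝑝 = (2, 3) and 𝑞 = (5, 7) the three distances are 5, 7, and 4, respectively; a minimal MATLAB sketch:

x0 = 2; y0 = 3;                        % coordinates of pixel p
x1 = 5; y1 = 7;                        % coordinates of pixel q
De = sqrt((x1 - x0)^2 + (y1 - y0)^2);  % Euclidean distance: 5
D4 = abs(x1 - x0) + abs(y1 - y0);      % city block distance: 7
D8 = max(abs(x1 - x0), abs(y1 - y0));  % chessboard distance: 4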

Overview of Image Processing Operations


❑ Operations in the Spatial Domain: Here, arithmetic
calculations and/or logical operations are performed on the
original pixel values. They can be further divided into three types:
❑ Global Operations: also known as point operations; for example, contrast
adjustment.
❑ Neighborhood-Oriented Operations: known as local or area
operations, for example, spatial-domain filters.
❑ Operations Combining Multiple Images: Here, two or more images are
used as an input and the result is obtained by applying a (series of) arithmetic
or logical operator(s) to them.
❑ Operations in a Transform Domain: Here, the image
undergoes a mathematical transformation, such as Fourier
transform (FT) or discrete cosine transform (DCT), and the image
processing algorithm works in the transform domain. For
example, frequency-domain filtering techniques.
