Digital Image Processing
Preface
Book: Image Processing with MATLAB. Tutor: Prof. Mingyan Jiang (江铭炎)
Address: Shandong University, School of Information building, 2nd floor (room 203)
Email: jiangmingyan@sdu.edu.cn
Course page: School of Information => Featured Courses => Digital Image Processing
Readers have access to a more detailed treatment of all the image processing concepts discussed
here, which makes it possible to present theoretical material in a simple manner and to maintain a focus
on the software implementation aspects of image processing problem solutions. The
Image Processing Toolbox offers some significant advantages, not only in the
breadth of its computational tools, but also because it is supported under most
operating systems in use today. An important feature of this book is its emphasis on showing how to develop
new code to enhance existing MATLAB and IPT functionality. This is important
in image processing, a field characterized by the need for extensive algorithm
development and experimental work.
After an introduction to the fundamentals of MATLAB functions and
programming, the book proceeds to address the mainstream areas of image processing.
The major areas covered include intensity transformations, linear and nonlinear
spatial filtering, filtering in the frequency domain, image restoration and registration,
color image processing, wavelets, image data compression, morphological image
processing, image segmentation, region and boundary representation and description,
and object recognition. This material is complemented by numerous illustrations of
how to solve image processing problems using MATLAB and IPT functions. If a
function does not exist, a new function is written and documented as part of the
instructional focus of the book. Over 60 new functions are included in the following
chapters; these functions increase the scope of IPT by approximately 35 percent,
providing new image processing software solutions.
The material is presented in textbook format, not as a software manual. Although
the book is self-contained, we have established a companion Web site (see Section 1.5)
designed to provide support in a number of areas. For students following a course of
study, or for individuals engaged in self-study, the site contains tutorials and reviews of
background material, as well as projects and image databases, including all the images
in the book. For instructors, the site contains classroom presentation materials that
include the images and graphics used in the book. For readers already familiar with
image processing and IPT fundamentals, the site is a useful place for up-to-date
references, new implementation techniques, and related materials, as well as
downloadable executable files of all the new functions developed in the text.
Significant effort was devoted to selecting material that is fundamental and whose
value is likely to remain applicable in a rapidly evolving body of knowledge.
Readers of the book will benefit from this effort and thus find the material timely
and useful in their work.
Chapter 1 Introduction
Preview
1.1 Background
The reader is assumed to have some familiarity with MATLAB, as well as rudimentary
knowledge of the basics of computer programming in a technically oriented language.
Because MATLAB is an array-oriented language, basic knowledge of matrix
analysis also is helpful.
The book is organized and presented in a textbook format; basic ideas of both
theory and software are explained prior to the development of any new programming
concepts.
The material is illustrated and clarified further by numerous examples ranging
from medicine and industrial inspection to remote sensing and astronomy.
The toolbox functions, as well as the functions developed in the book, run under
most operating systems. Consult the book Web site (see Section 1.5) for a complete
list.
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is
called the intensity or gray level of the image at that point. When x, y, and the
amplitude values of f are all finite, discrete quantities, we call the image a digital
image.
The field of digital image processing refers to processing digital images by
means of a digital computer.
Note that a digital image is composed of a finite number of elements, each of
which has a particular location and value. These elements are referred to as picture
elements, image elements, pels, and pixels. Pixel is the term most widely used to
denote the elements of a digital image. We consider these definitions formally in
Chapter 2.
Vision is the most advanced of our senses, so it is not surprising that images play
the single most important role in human perception. Imaging machines cover almost
the entire electromagnetic (EM) spectrum, ranging from gamma to radio waves. These
include ultrasound, electron microscopy, and computer-generated images. Thus,
digital image processing encompasses a wide and varied field of applications.
There is no general agreement among authors regarding where image processing
stops and other related areas, such as image analysis and computer vision, start.
Sometimes a distinction is made by defining image processing as a discipline in which
both the input and output of a process are images. We believe this to be a limiting and
somewhat artificial boundary. For example, under this
definition, even the trivial task of computing the average intensity of an image would
not be considered an image processing operation. On the other hand, there are fields
such as computer vision whose ultimate goal is to use computers to emulate human
vision, including learning and being able to make inferences and take actions based on
visual inputs. This area itself is a branch of artificial intelligence (AI), whose
objective is to emulate human intelligence. The area of image analysis (also called
image understanding) is in between image processing and computer vision.
There are no clear-cut boundaries in the continuum from image processing at one
end to computer vision at the other. However, one useful paradigm is to consider three
types of computerized processes in this continuum: low-, mid-, and high-level
processes.
In this book, digital image processing encompasses processes whose inputs and
outputs are images and, in addition, processes that extract attributes from images, up
to and including the recognition of individual objects. As a simple illustration to
clarify these concepts, consider the area of automated analysis of text. The processes
of acquiring an image of the area containing the text, preprocessing that image,
extracting (segmenting) the individual characters, describing the characters in a form
suitable for computer processing, and recognizing those individual characters are in
the scope of what we call digital image processing in this book. Making sense of the
content of the page may be viewed as being in the domain of image analysis and even
computer vision. Digital image processing, as we have defined it, is used successfully
in a broad range of areas of exceptional social and economic value.
Every chapter contains the MATLAB and IPT material needed to implement the
image processing methods discussed. When a MATLAB or IPT function does not
exist to implement a specific method, a new function is developed and documented. A
complete listing of every new function is included in the book. The remaining eleven
chapters cover material in the following areas.
Chapter 2: Fundamentals. This chapter covers the fundamentals of MATLAB
notation, indexing, and programming concepts. This material serves as foundation for
the rest of the book.
Chapter 3: Intensity Transformations and Spatial Filtering. This chapter
covers in detail how to use MATLAB and IPT to implement intensity transformation
functions. Linear and nonlinear spatial filters are covered and illustrated in detail.
Chapter 4: Processing in the Frequency Domain. The material in this chapter
shows how to use IPT functions for computing the forward and inverse fast Fourier
transforms (FFTs), how to visualize the Fourier spectrum, and how to implement
filtering in the frequency domain. Shown also is a method for generating frequency
domain filters from specified spatial filters.
Chapter 5: Image Restoration. Traditional linear restoration methods, such as
the Wiener filter, are covered in this chapter. Iterative, nonlinear methods, such as the
Richardson-Lucy method and maximum-likelihood estimation for blind
deconvolution, are discussed and illustrated. Geometric corrections and image
registration also are covered.
Chapter 6: Color Image Processing. This chapter deals with pseudocolor and
full-color image processing. Color models applicable to digital image processing are
discussed, and IPT functionality in color processing is extended via implementation of
additional color models. The chapter also covers applications of color to edge
detection and region segmentation.
Chapter 7: Wavelets. In its current form, IPT does not have any wavelet
transforms. A set of wavelet-related functions compatible with the Wavelet Toolbox is
developed in this chapter that will allow the reader to implement all the
wavelet-transform concepts discussed in the Gonzalez-Woods book.
Chapter 8: Image Compression. The toolbox does not have any data
compression functions. In this chapter, we develop a set of functions that can be used
for this purpose.
Chapter 9: Morphological Image Processing. The broad spectrum of functions
available in IPT for morphological image processing is explained and illustrated in
this chapter using both binary and gray-scale images.
Chapter 10: Image Segmentation. The set of IPT functions available for image
segmentation is explained and illustrated in this chapter. New functions for Hough
transform processing and region growing also are developed.
Chapter 11: Representation and Description. Several new functions for object
representation and description, including chain-code and polygonal representations,
are developed in this chapter. New functions are included also for object description,
including Fourier descriptors, texture, and moment invariants. These functions
complement an extensive set of region property functions available in IPT.
Chapter 12: Object Recognition. One of the important features of this chapter
is the efficient implementation of functions for computing the Euclidean and
Mahalanobis distances. These functions play a central role in pattern matching. The
chapter also contains a comprehensive discussion on how to manipulate strings of
symbols in MATLAB. String manipulation and matching are important in structural
pattern recognition.
In addition to the preceding material, the book contains three appendices.
Appendix A: Contains a summary of all IPT and new image-processing
functions developed in the book. Relevant MATLAB functions also are included. This
is a useful reference that provides a global overview of all the functions in the toolbox
and the book.
Appendix B: Contains a discussion on how to implement graphical user
interfaces (GUIs) in MATLAB. GUIs are a useful complement to the material in the
book because they simplify and make more intuitive the control of interactive
functions.
Appendix C: New function listings are included in the body of a chapter when a
new concept is explained. Otherwise the listing is included in Appendix C. This is true
also for listings of functions that are lengthy. Deferring the listing of some functions
to this appendix was done primarily to avoid breaking the flow of explanations in text
material.
An important feature of this book is the support contained in the book Web site.
The site address is www.prenhall.com/gonzalezwoodseddins.
This site provides support to the book in the following areas: downloadable
M-files, including all M-files in the book; tutorials; projects; teaching materials;
links to databases, including all images in the book; book updates; and background
publications.
The site is integrated with the Web site of the Gonzalez-Woods book,
www.prenhall.com/gonzalezwoods, which offers additional support on instructional
and research topics. See also www.imageprocessingplace.com.
1.6 Notation
Equations in the book are typeset using familiar italic and Greek symbols, as in
f(x, y) and φ(u, v). All MATLAB function names and symbols are typeset in
monospace font, as in fft2(f), logical(A), and roipoly(f, c, r).
The first occurrence of a MATLAB or IPT function is highlighted by use of the
following icon on the page margin: function name.
Similarly, the first occurrence of a new function developed in the book is
highlighted by use of the following icon on the page margin: function name.
A special symbol in the page margin is used as a visual cue to denote the end of a function listing.
When referring to keyboard keys, we use bold letters, such as Return and Tab.
We also use bold letters when referring to items on a computer screen or menu, such
as File and Edit.
The MATLAB desktop is the main MATLAB application window. As Fig. 1.1
shows, the desktop contains five subwindows: the Command Window, the Workspace
Browser, the Current Directory Window, the Command History Window, and one or
more Figure Windows, which are shown only when the user displays a graphic.
The Command Window is where the user types MATLAB commands and
expressions at the prompt (>>) and where the outputs of those commands are
displayed. MATLAB defines the workspace as the set of variables that the user creates
in a work session. The Workspace Browser shows these variables and some
information about them. Double-clicking on a variable in the Workspace Browser
launches the Array Editor, which can be used to obtain information and in some
instances edit certain properties of the variable.
The Current Directory tab above the Workspace tab shows the contents of the
current directory, whose path is shown in the Current Directory Window. For example,
in the Windows operating system the path might be as follows: C:\MATLAB\Work,
indicating that directory “Work” is a subdirectory of the main directory “MATLAB,”
which is installed in drive C. Clicking on the arrow in the Current Directory Window
shows a list of recently used paths. Clicking on the button to the right of the window
allows the user to change the current directory.
MATLAB uses a search path to find M-files and other MATLAB-related files,
which are organized in directories in the computer file system. Any file run in
MATLAB must reside in the current directory or in a directory that is on the search
path. By default, the files supplied with MATLAB and MathWorks toolboxes are
included in the search path. The easiest way to see which directories are on the search
path, or to add or modify a search path, is to select Set Path from the File menu on
the desktop, and then use the Set Path dialog box. It is good practice to add any
commonly used directories to the search path to avoid having to change the current
directory repeatedly.
The Command History Window contains a record of the commands a user has
entered in the Command Window, including both current and previous MATLAB
sessions. Previously entered MATLAB commands can be selected and re-executed
from the Command History Window by right-clicking on a command or sequence of
commands. This action launches a menu from which to select various options in
addition to executing the commands. This is a useful feature when experimenting with
various commands in a work session.
The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The editor can appear in a window by itself, or it can
be a subwindow in the desktop. M-files are denoted by the extension .m, as in
pixeldup.m. The MATLAB editor window has numerous pull-down menus for tasks
such as saving, viewing, and debugging files. Because it performs some simple checks
and also uses color to differentiate between various elements of code, this text editor
is recommended as the tool of choice for writing and editing M-functions. To open the
editor, type edit at the prompt in the Command Window. Similarly, typing edit
filename at the prompt opens the M-file filename.m in an editor window, ready for
editing. As noted earlier, the file must be in the current directory, or in a directory in
the search path.
The principal way to get help online is to use the MATLAB Help Browser,
opened as a separate window either by clicking on the question mark symbol (?) on the
desktop toolbar, or by typing helpbrowser at the prompt in the Command Window.
The Help Browser is a Web browser integrated into the MATLAB desktop that
displays Hypertext Markup Language (HTML) documents. The Help Browser
consists of two panes, the help navigator pane, used to find information, and the
display pane, used to view the information.
Self-explanatory tabs on the navigator pane are used to perform a search. For
example, help on a specific function is obtained by selecting the Search tab, selecting
Function Name as the Search Type, and then typing in the function name in the
Search for field.
There are several ways to save and load an entire work session (the contents of
the Workspace Browser) or selected workspace variables in MATLAB. The simplest
is as follows.
To save the entire workspace, simply right-click on any blank space in the
Workspace Browser window and select Save Workspace As from the menu that
appears.
To load saved workspaces and/or variables, left-click on the folder icon on the
toolbar of the Workspace Browser window.
It is possible to achieve the same results described in the preceding paragraphs
by typing save and load at the prompt, with the appropriate file names and path
information.
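For example, a minimal sketch of the command-line approach (the file name and variable names here are hypothetical):
>> save('mysession.mat', 'f', 'g')   % save variables f and g to mysession.mat
>> clear                             % remove all variables from the workspace
>> load('mysession.mat')             % restore f and g into the workspace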
All references in the book are listed in the Bibliography by author and date, as in
Soille [2003]. Most of the background references for the theoretical content of the
book are from Gonzalez and Woods [2002]. References that are applicable to all
chapters, such as MATLAB manuals and other general MATLAB references, are so
identified in the Bibliography.
Summary
Chapter 2 Fundamentals
Preview
2.2 Reading Images
2.6 Image Types
Intensity images
Binary images
Indexed images
RGB images
2.9 Introduction to M-Function Programming
Table 2.5 The image arithmetic functions supported by IPT.
Logical Operators and Functions
Summary
Image representation
Some functions for image processing
Image types
MATLAB operations on images
Chapter 3 Intensity Transformations and Spatial Filtering
Preview
The term spatial domain refers to the image plane itself; methods in this category are
based on direct manipulation of the pixels in an image. This chapter covers two important
categories of spatial domain processing: intensity (or gray-level) transformations and spatial
filtering (also called neighborhood processing, or spatial convolution).
We will develop and illustrate MATLAB formulations representative of
processing techniques in these two categories.
In order to carry a consistent theme, most of the examples are related to image
enhancement. This is a good way to introduce spatial processing because
enhancement is highly intuitive, especially to beginners in the field. These techniques
are general in scope and have uses in numerous other branches of digital image
processing.
3.1 Background
Spatial domain techniques operate directly on the pixels of an image. The spatial
domain processes are denoted by the expression:
g(x,y)=T [ f(x,y) ]
where f(x, y) is the input image, g(x, y) is the output (processed) image, and T is an
operator on f, defined over a specified neighborhood about point (x, y). T can also operate
on a set of images, such as performing the addition of K images for noise reduction.
The principal approach for defining spatial neighborhoods about a point (x, y) is to use
a square or rectangular region centered at (x, y), as Fig. 3.1 shows.
Operator T is applied at each location (x,y) to yield the output, g, at that location.
Only the pixels in the neighborhood are used in computing the value of g at (x,y) .
Its computational implementation in MATLAB requires that careful attention be
paid to data classes and value ranges.
The simplest form of the transformation T is when the neighborhood in Fig. 3.1
is of size 1*1 (a single pixel). In this case, the value of g at (x,y) depends only on the
intensity of f at that point, and T becomes an intensity or gray-level transformation
function.
Because they depend only on intensity values, and not explicitly on (x, y), intensity
transformation functions frequently are written in the simplified form
s = T(r)
where r denotes the intensity of f and s the intensity of g, both at the
corresponding point (x, y) in the images.
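As a simple illustration (a sketch, not from the book; the test image name is just an example), the negative transformation s = L - 1 - r can be computed with IPT function imcomplement:
>> f = imread('pout.tif');        % any 8-bit gray-scale image
>> g = imcomplement(f);           % computes s = 255 - r for uint8 inputs
>> imshow(f), figure, imshow(g)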
3.2.2 Logarithmic and Contrast-Stretching Transformations
The function shown in Fig. 3.4(a) is called a contrast-stretching transformation
function because it compresses the input levels lower than m into a narrow range of
dark levels in the output image. Similarly, it compresses the values above m into a
narrow band of light levels in the output. The result is an image of higher contrast. In
fact, in the limiting case shown in Fig. 3.4(b), the output is a binary image. This
limiting function is called a thresholding function, which, as we discuss in Chapter 10,
is a simple tool used for image segmentation.
The function in Fig. 3.4(a) has the form
s = T(r) = 1 / (1 + (m/r)^E)
where r represents the intensities of the input image, s the corresponding
intensity values in the output image, and E controls the slope of the function. This
equation is implemented in MATLAB for an entire image as
g = 1./(1 + (m./(double(f) + eps)).^E)
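A possible way to apply this expression to an image, with illustrative choices of m and E (not the book's example values):
>> f = im2double(imread('pout.tif'));   % image name is illustrative
>> m = mean2(f);                        % use the mean intensity as the midpoint m
>> E = 4;                               % slope parameter
>> g = 1./(1 + (m./(f + eps)).^E);
>> imshow(f), figure, imshow(g)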
Figure 3.5(a) is a Fourier spectrum with values in the range 0 to 1.5×10^6,
displayed on a linearly scaled, 8-bit system. Figure 3.5(b) shows the result obtained
using the commands
>> g = im2uint8(mat2gray(log(1 + double(f))));
>> imshow(g)
The visual improvement of g over the original image is quite evident.
The histogram of a digital image with L total possible intensity levels in the
range [0, G] is defined as the discrete function
h(rk) = nk
where rk is the kth intensity level in the interval [0, G] and nk is the number of
pixels in the image whose intensity level is rk. The value of G is 255 for images of
class uint8, 65535 for images of class uint16, and 1.0 for images of class double.
Often, it is useful to work with normalized histograms, obtained simply by
dividing all elements of h(rk) by the total number of pixels in the image, which we
denote by n:
p(rk) = h(rk)/n = nk/n
The core function in the toolbox for dealing with image histograms is imhist,
which has the following basic syntax:
h = imhist(f, b)
where f is the input image, h is its histogram h(rk), and b is the number of bins
used in forming the histogram (the default is b = 256).
We obtain the normalized histogram simply by using the expression:
p = imhist(f, b)/numel(f)
Recall from Section 2.10.3 that function numel(f) gives the number of elements
in array f (i.e., the number of pixels in the image).
>>imhist(f);
Figure 3.7(a) shows the result.
Histograms often are plotted using bar graphs. For this purpose we can use the function
bar(horz, v, width)
where v is a row vector containing the points to be plotted, horz is a vector of the
same dimension as v that contains the increments of the horizontal scale, and width is
a number between 0 and 1.
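For instance, a sketch of plotting a histogram with a reduced number of bins as a bar graph (the bin count and axis limits are illustrative):
>> h = imhist(f, 32);              % histogram of f with 32 bins
>> horz = linspace(0, 255, 32);    % horizontal scale for a uint8 image
>> bar(horz, h, 0.8)               % width of 0.8 leaves small gaps between bars
>> axis([0 255 0 max(h)])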
Some related functions: set, gca, xtick, ytick, axis, xlabel, ylabel, text, title, stem, plot, ylim, xlim, hold on
Figure 3.10(a) shows an image, f, of the Mars moon, Fig. 3.10(b) shows its
histogram, obtained using imhist(f). The image is dominated by large, dark areas,
resulting in a histogram characterized by a large concentration of pixels in the dark
end of the gray scale.
At first glance, one might conclude that histogram equalization would be a good
approach to enhance this image, so that details in the dark areas become more visible.
The result, shown in Fig. 3.10(c), was obtained using the command
>> f1 = histeq(f, 256);
The following M-functions compute a two-mode Gaussian function normalized to unit area,
so it can be used as a specified histogram for histogram matching (a usage sketch follows the list):
• function p = twomodegauss(m1, sig1, m2, sig2, A1, A2, k)
• function p = manualhist
• g = histeq(f, p);
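A possible call sequence for the complete histogram-matching workflow, assuming the book's function twomodegauss is on the path (the parameter values below are illustrative, not the book's exact choices):
>> p = twomodegauss(0.15, 0.05, 0.75, 0.05, 1, 0.07, 0.002); % bimodal Gaussian histogram
>> g = histeq(f, p);                                         % match the histogram of f to p
>> figure, imshow(g)
>> figure, imhist(g)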
Figure 3.11(b) shows the result. The improvement over the histogram equalized
result in Fig. 3.10(c) is evident by comparing the two images. It is of interest to note
that the specified histogram represents a rather modest change from the original
histogram to obtain a significant improvement in enhancement. The histogram of Fig.
3.11(b) is shown in Fig. 3.11(c).The most distinguishing feature of this histogram is
how its low end has been moved closer to the lighter region of the gray scale, and thus
closer to the specified shape. Note, however, that the shift to the right was not as extreme as
the shift in the histogram shown in Fig. 3.10(d), which corresponds to the poorly
enhanced image of Fig. 3.10(c).
The concept of linear filtering has its roots in the use of the Fourier transform for
signal processing in the frequency domain, a topic discussed in detail in Chapter 4. In
this chapter, filtering operations are performed directly on the pixels of an image. Use of
the term linear spatial filtering differentiates this type of process from frequency
domain filtering.
The linear operations of interest consist of multiplying each pixel in the
neighborhood by a corresponding coefficient and summing the results to obtain the
response at each point (x,y). If the neighborhood is of size m*n ,mn coefficients are
required.
The coefficients are arranged as a matrix, called a filter, mask, filter mask, kernel,
template, or window, with the first three terms being the most prevalent. For reasons
that will become obvious shortly, the terms convolution filter, mask, or kernel, also are
used.
There are two closely related concepts that must be understood clearly when
performing linear spatial filtering. One is correlation; the other is convolution.
Correlation is the process of passing the mask w by the image array f in the manner
described in Fig. 3.12. Mechanically, convolution is the same process, except that w is
rotated by 180° prior to passing it by f. These two concepts are best explained by
some simple examples.
For convolution, we simply rotate w(x,y) by 180° and proceed in the same
manner as in correlation [Figs. 3.14(f) through (h)]. As in the one-dimensional
example discussed earlier, convolution yields the same result regardless of which of
the two functions undergoes translation. In correlation the order does matter, a fact
that is made clear in the toolbox by assuming that the filter mask is always the
function that undergoes translation. Note also the important fact in Figs. 3.14(e) and
(h) that the results of spatial correlation and convolution are rotated by 180° with
respect to each other. This is expected because convolution is nothing more than
correlation with a rotated filter mask.
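The difference is easy to verify at the command line; the following sketch (mask values are illustrative) correlates and convolves an impulse with an asymmetric mask:
>> f = zeros(5); f(3, 3) = 1;          % image containing a single impulse
>> w = [1 2 3; 4 5 6; 7 8 9];          % asymmetric 3*3 mask
>> gcorr = imfilter(f, w, 'corr')      % correlation: a copy of w rotated by 180 degrees
>> gconv = imfilter(f, w, 'conv')      % convolution: an unrotated copy of w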
The toolbox implements linear spatial filtering using function imfilter:
g = imfilter(f, w, filtering_mode, boundary_options, size_options)
where f is the input image, w is the filter mask, g is the filtered result, and the
other parameters are summarized in Table 3.2.
The filtering_mode specifies whether to filter using correlation ('corr') or
convolution ('conv'). The boundary_options deal with the border-padding issue, with
the size of the border being determined by the size of the filter. These options are
explained further in Example 3.7. The size_options are either 'same' or 'full', as
explained in Figs. 3.13 and 3.14.
The most common syntax for imfilter is
g = imfilter(f, w, 'replicate')
This syntax is used when implementing IPT standard linear spatial filters.
When working with filters that are neither pre-rotated nor symmetric, and we
wish to perform convolution, we have two options. One is to use the syntax:
g = imfilter(f, w, 'conv', 'replicate')
The other approach is to preprocess w by using the function rot90(w, 2) to rotate
it 180°, and then use imfilter(f, w, 'replicate'). Of course these two steps can be
combined into one statement. The preceding syntax produces an image g that is of the
same size as the input.
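One way to write the two steps as a single statement (a sketch, equivalent to pre-rotating the mask):
>> g = imfilter(f, rot90(w, 2), 'replicate');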
Figure 3.15(a) is a class double image, f, of size 512*512 pixels. Consider the
simple 31*31 filter
>> w = ones(31);
which is proportional to an averaging filter.
Figure 3.15(b) shows the result of performing the following filtering operation:
>> gd = imfilter(f, w);
imshow(gd, [ ])
>> gr = imfilter(f, w, 'replicate');
figure, imshow(gr, [ ])
>> gs = imfilter(f, w, 'symmetric');
figure, imshow(gs, [ ])
>> gc = imfilter(f, w, 'circular');
figure, imshow(gc, [ ])
>> f8 = im2uint8(f);
g8r = imfilter(f8, w, 'replicate');
figure, imshow(g8r, [ ])
In this section we discuss linear and nonlinear spatial filters supported by IPT.
Additional nonlinear filters are implemented in Section 5.3.
The toolbox supports a number of predefined 2-D linear spatial filters, obtained
by using function fspecial, which generates a filter mask, w, using the syntax
w = fspecial('type', parameters)
where 'type' specifies the filter type, and parameters further define the specified
filter. The spatial filters supported by fspecial are summarized in Table 3.4, including
applicable parameters for each filter.
We illustrate the use of fspecial and imfilter by enhancing an image with a
Laplacian filter. Function fspecial('laplacian', alpha) implements a more general
Laplacian mask. The objective is to enhance the image in Fig. 3.16(a), a mildly
blurred image of the North Pole of the moon. Enhancement in this case consists of
sharpening the image. First we generate and display the Laplacian filter:
>> w = fspecial('laplacian', 0)
w =
0.0000 1.0000 0.0000
1.0000 -4.0000 1.0000
0.0000 1.0000 0.0000
Note that the filter is of class double, and that its shape with alpha = 0 is the
Laplacian filter discussed earlier. We could just as easily have specified this shape manually as
>> w = [0 1 0; 1 -4 1; 0 1 0];
Next we apply w to the input image, f, which is of class uint8:
>> g1 = imfilter(f, w, 'replicate');
>> imshow(g1, [ ])
Figure 3.16(b) shows the resulting image.
>> f2 = im2double(f);
>> g2 = imfilter(f2, w, 'replicate');
>> imshow(g2, [ ])
The result, shown in Fig. 3.16(c), is more what a properly processed Laplacian
image should look like. Finally, we restore the gray tones lost by using the Laplacian
by subtracting (because the center coefficient is negative) the Laplacian image from
the original image:
>> g = f2 - g2;
>> imshow(g)
The result, shown in Fig. 3.16(d), is sharper than the original image.
Example 3.10:
Enhancement problems often require the specification of filters beyond those
available in the toolbox. The Laplacian is a good example. Usually, sharper
enhancement is obtained by using the 3*3 Laplacian filter that has a -8 in the center
and is surrounded by 1s, as discussed earlier.
>> f = imread('moon.tif');
>> w4 = fspecial('laplacian', 0); % Same as w in Example 3.9.
>> w8 = [1 1 1; 1 -8 1; 1 1 1];
>> f = im2double(f);
>> g4 = f - imfilter(f, w4, 'replicate');
>> g8 = f - imfilter(f, w8, 'replicate');
>> imshow(f)
>> figure, imshow(g4)
>> figure, imshow(g8)
Summary
In addition to dealing with image enhancement, the material is the foundation for
numerous topics in subsequent chapters.
We will encounter spatial processing again in Chapter 5 in connection with
image restoration, where we also take a closer look at noise reduction and
noise-generating functions in MATLAB.
Some of the spatial masks that were mentioned briefly here are used extensively
in Chapter 10 for edge detection in segmentation applications. The concept of
convolution and correlation is explained again in Chapter 4 from the perspective of
the frequency domain.
Mask processing and the implementation of spatial filters will surface in various
discussions throughout the book.
We extend the discussion begun here and introduce additional aspects of how
spatial filters can be implemented efficiently in MATLAB.
Chapter 4 Frequency Domain Processing
Preview
The Fourier spectrum and phase angle are
|F(u, v)| = [R^2(u, v) + I^2(u, v)]^(1/2)
φ(u, v) = arctan[ I(u, v) / R(u, v) ]
Symmetry property: if f(x, y) is real, that is, f*(x, y) = f(x, y), then
F*(u, v) = F(-u, -v)
Translation property:
f(x, y) exp(j2π(u0*x + v0*y)/N) <=> F(u - u0, v - v0)
With u0 = v0 = N/2,
exp(j2π(u0*x + v0*y)/N) = exp(jπ(x + y)) = (-1)^(x+y)
so that
f(x, y)(-1)^(x+y) <=> F(u - N/2, v - N/2)
The preceding discussion for centering the transform by multiplying f(x,y) by
(-1).^(x+y) is an important concept that is included here for completeness.
When working in MATLAB, the approach is to compute the transform without
multiplication by (-1).^(x+y) and then to rearrange the data afterwards using function
fftshift. This function and its use are discussed in the following section.
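A typical sketch of that workflow for visualizing a centered spectrum (the log transform is used only to compress the display range):
>> F = fft2(f);              % transform with the origin at the top left
>> Fc = fftshift(F);         % move the origin to the center of the frequency rectangle
>> S = log(1 + abs(Fc));     % log-transformed spectrum for display
>> imshow(S, [ ])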
Related functions: floor, ceil.
The inverse FFT is computed with f = ifft2(F); if the input used to generate F is real, the
result should be real as well, so in practice we use f = real(ifft2(F)) to suppress small
imaginary components due to round-off error.
The foundation for linear filtering in both the spatial and frequency domains is
the convolution theorem, which may be written as
f(x, y) * h(x, y) <=> H(u, v)F(u, v)
f(x, y)h(x, y) <=> H(u, v) * F(u, v)
where * indicates convolution of the two functions, h is the filter (spatial mask), and H is
the filter transfer function.
Based on the convolution theorem, to obtain the corresponding filtered image in
the spatial domain, we simply compute the inverse Fourier transform of the product
H(u, v)F(u, v). It is important to keep in mind that the process just described is identical
to what we could obtain by using convolution in the spatial domain, as long as the
filter mask, h(x, y), is the inverse Fourier transform of H(u, v).
In practice, spatial convolution generally is simplified by using small masks that
attempt to capture the features of their frequency domain counterparts.
Wraparound error:
Images and their transforms are automatically considered periodic if we elect to
work with DFTs to implement filtering.
It is not difficult to visualize that convolving periodic functions can cause
interference between adjacent periods if the periods are close with respect to the
duration of the nonzero parts of the functions.
This interference called wraparound error, can be avoided by padding the
functions with zeros in the following manner.
Function paddedsize( ):
Assume that functions f(x, y) and h(x, y) are of size A*B and C*D. Then we form
two extended (padded) functions, both of size P*Q, by appending zeros to f and h, where
P >= A + C - 1
Q >= B + D - 1
Function PQ = paddedsize(AB, CD, PARAM)
F = fft2(f, PQ(1), PQ(2))
Example 4.1: Filtering with and without padding.
[M, N] = size(f);
F = fft2(f);
sig = 10;
H = lpfilter('gaussian', M, N, sig);
G = H.*F;
g = real(ifft2(G));
imshow(g, [ ])
4.3.3 M-function
Function g = dftfilt(f, H) carries out frequency domain filtering using the basic sequence of steps sketched below.
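An outline of those steps under the usual assumptions (not the book's exact listing of dftfilt; paddedsize is the book's function, and H is assumed to be a filter of the padded size with its origin at the top left):
% 1. Obtain the padded size and compute the FFT of the padded input image.
PQ = paddedsize(size(f));
F = fft2(f, PQ(1), PQ(2));
% 2. Multiply the transform by the filter transfer function H.
G = H.*F;
% 3. Obtain the real part of the inverse FFT.
g = real(ifft2(G));
% 4. Crop the result back to the original image size.
g = g(1:size(f, 1), 1:size(f, 2));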
Generate the spatial filter using function fspecial:
h = fspecial('sobel')
h =
 1  2  1
 0  0  0
-1 -2 -1
gs = imfilter(double(f), h);
imshow(gs, [ ])
PQ = paddedsize(size(f));
H = freqz2(h, PQ(1), PQ(2));
H1 = ifftshift(H);
gf = dftfilt(f, H1);
imshow(gf, [ ])
imshow(abs(gs), [ ])
imshow(abs(gf), [ ])
imshow(abs(gs) > 0.2*abs(max(gs(:))))
imshow(abs(gf) > 0.2*abs(max(gf(:))))
Because FFT computations in MATLAB assume that the origin of the transform
is at the top, left of the frequency rectangle, our distance computations are with
respect to that point. The data can be rearranged for visualization purposes by using
function fftshift.
Function dftuv:
dftuv provides the necessary meshgrid arrays for use in distance computations
and other similar applications. The meshgrid arrays generated by dftuv are in the order
required for processing with fft2 or ifft2, so no rearranging of the data (e.g., with fftshift)
is required.
Example 4.3
[U, V] = dftuv(8, 5);
D = U.^2 + V.^2;
fftshift(D)
Function lpfilter (type: 'ideal', 'btw', or 'gaussian'):
function [H, D] = lpfilter(type, M, N, D0, n)
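A usage sketch analogous to the highpass example below, assuming the book's functions paddedsize, lpfilter, and dftfilt are available (the cutoff D0 is illustrative):
PQ = paddedsize(size(f));
D0 = 0.05*PQ(1);
H = lpfilter('gaussian', PQ(1), PQ(2), D0);
g = dftfilt(f, H);
imshow(g, [ ])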
Hhp(u, v) = 1 - Hlp(u, v)
Function hpfilter (type: 'ideal', 'btw', or 'gaussian'):
function H = hpfilter(type, M, N, D0, n)
Example 4.7: Highpass filtering.
PQ = paddedsize(size(f));
D0 = 0.05*PQ(1);
H = hpfilter('gaussian', PQ(1), PQ(2), D0);
g = dftfilt(f, H);
imshow(g, [ ])
Hhfe(u, v) = a + b*Hhp(u, v)
where a is the offset, b is the multiplier, and Hhp is the transfer function of a
highpass filter.
Example 4.8
Combining high frequency emphasis and histogram equalization.
PQ = paddedsize(size(f));
D0 = 0.05*PQ(1);
HBW = hpfilter('btw', PQ(1), PQ(2), D0, 2);
gbw = dftfilt(f, HBW);
gbw = gscale(gbw);
H = 0.5 + 2*HBW;
ghf = dftfilt(f, H);
ghf = gscale(ghf);
ghe = histeq(ghf, 256);
Summary
Chapter 5 Image Restoration
1. Objective of restoration
- improve a given image
- an objective process
2. Comparison with enhancement
- enhancement is a subjective process
Restoration attempts to reconstruct or recover an image that has been degraded by
using a priori knowledge of the degradation phenomenon. Restoration techniques are
oriented toward modeling the degradation and applying the inverse process in order to
recover the original image.
3. Explore how to use MATLAB and IPT capabilities to model degradation
phenomena and to formulate restoration solutions.
The degradation function H(u,v) is called the optical transfer function (OTF)
from the Fourier analysis of optical systems.
In spatial domain, h(x,y) is the point spread function (PSF) from a point of light
to obtain the characteristics of the degradation for any type of input.
The OTF and PSF are a Fourier transform pair.
Two basic types of noise models: noise in the spatial domain and noise in the
frequency domain, described by various Fourier properties of the noise.
Example 5.1: Using uniform random numbers to generate random numbers with
a specified distribution.
>> R = a + sqrt(-b*log(1 - rand(M, N)));
Table 5.1 lists the random variables of interest in the present discussion, along
with their PDFs, CDFs, and random number generator equations.
A=randn(M, N)
I = find(A) % The three syntax forms of function find
[r,c] = find(A)
[r, c, v] = find(A)
>> I = find(A < 128); % To find and set to 0 all pixels in an image
>> A( I ) = 0; % whose values are less than 128
>> I = find( A >=64 & A <= 192); % To set to 128 all pixels in the closed
>> A( I ) = 128; % interval [64, 192]
function R = imnoise2(type, M, N, a, b)
Unlike imnoise, which outputs a noisy image, imnoise2 (1) generates an M*N noise array
and (2) produces the noise pattern itself.
Example 5.2: Histograms of data generated using the function imnoise2.
p=hist(r,bins)
Figure 5.2 shows histograms of all the random number types in Table 5.1.
>> r = imnoise2('gaussian', 100000, 1, 0, 1); % to generate the data for Fig. 5.2(a)
>> p = hist(r, bins); % with bins = 50, to generate the histograms in Fig. 5.2
r(x, y) = A sin[2π u0 (x + Bx)/M + 2π v0 (y + By)/N]
R(u, v) = j(A/2)[exp(j2π u0 Bx/M) δ(u + u0, v + v0) - exp(j2π v0 By/N) δ(u - u0, v - v0)]
function [r, R, S] = imnoise3(M, N, C, A, B)
C is a k-by-2 matrix containing k pairs of frequency domain coordinates(u,v)
indicating the locations of impulses in the frequency domain.
Example 5.3: Using function imnoise3
>> C = [0 64; 0 128; 32 32; 64,0; 128 0; -32 32];
>> [r, R, S] = imnoise3(512, 512, C);
>>imshow(S, [ ]) % Fig. 5.3(a)
>> figure, imshow(r, [ ]) % Fig.5.3(b)
>> C = [0 32; 0 64; 16 16; 32 0; 64 0; -16 16] ; % Fig.5.3(c,d)
>> C = [6 32; -2 2] ; % Fig.5.3(e)
>> A = [1 5];
>> [r, R, S] = imnoise3( 512, 512, C, A);
5.3 Restoration in the Presence of Noise Only—Spatial Filtering
Once the filter H(u, v) has been obtained, filtering is done using function dftfilt,
explained in Section 4.3.3.
PSF stands for point spread function, a term that arises from letting h(x, y) operate on a
point of light to obtain the characteristics of the degradation for any type of input.
When no information is available about the PSF, we can resort to "blind
deconvolution" to infer it.
One of the principal degradations encountered in image restoration problems is
image blur.
Blur that occurs with the scene and sensor at rest with respect to each other can
be modeled by spatial or frequency domain lowpass filters.
Another important degradation model is image blur due to uniform linear motion
between the sensor and scene during image acquisition
Image blur can be modeled using IPT function fspecial
PSF = fspecial('motion', len, theta)
where len is the length of the motion in pixels and theta is the motion angle in degrees.
To create a degraded image with a given PSF:
>> g = imfilter(f, PSF, 'circular');
>> g = g + noise;
It is useful to use the same image or test pattern so that comparisons are
meaningful. A test pattern can be generated by
C = checkerboard(NP, M, N)
To generate a checkerboard in which all light squares are white we use
>> K = im2double(checkerboard(NP, M, N)) > 0.5;
To zoom an image by pixel replication (see Appendix C for the code):
B = pixeldup(A, m, n)
Example 5.7: Modeling a blurred, noisy image.
>> f = checkerboard(8); % Figure 5.7(a)
>> PSF = fspecial('motion', 7, 45); % Fig. 5.7(b)
>> gb = imfilter(f, PSF, 'circular');
>> noise = imnoise(zeros(size(f)), 'gaussian', 0, 0.001); % Fig. 5.7(c)
>> g = gb + noise; % Fig. 5.7(d)
>> imshow(pixeldup(f, 8), [ ])
5.6 Direct Inverse Filtering
The Wiener filter seeks an estimate fhat of f that minimizes the statistical error function
e^2 = E{(f - fhat)^2}
In the frequency domain, the estimate is
Fhat(u, v) = [ (1/H(u, v)) * |H(u, v)|^2 / (|H(u, v)|^2 + Sn(u, v)/Sf(u, v)) ] G(u, v)
where Sn(u, v) and Sf(u, v) are the power spectra of the noise and of the undegraded
image, respectively.
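IPT function deconvwnr implements Wiener restoration. A sketch that applies it to the blurred, noisy checkerboard of Example 5.7, using a scalar noise-to-signal power ratio as an estimate of Sn/Sf (the estimate shown assumes the original f and the noise array are available, which is true only in a controlled experiment):
Sn = abs(fft2(noise)).^2;        % noise power spectrum
Sf = abs(fft2(f)).^2;            % power spectrum of the undegraded image
NSR = sum(Sn(:))/sum(Sf(:));     % scalar noise-to-signal ratio
fr = deconvwnr(g, PSF, NSR);     % g and PSF are those of Example 5.7
imshow(fr, [ ])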
(x, y) = T{(w, z)}
For example, if (x, y) = T{(w, z)} = (w/2, z/2), the "distortion" is simply a
shrinking of f by half in both spatial dimensions, as illustrated in Fig. 5.12.
One of the most commonly used forms of spatial transformations is the affine
transform
Most computational methods for spatially transforming an image fall into one
of two categories:
Methods that use forward mapping
Methods that use inverse mapping
One problem with the forward mapping procedure is that two or more different
pixels in the input image could be transformed into the same pixel in the output image,
raising the question of how to combine multiple input pixel values into a single output
pixel value.
Another potential problem is that some output pixels may not be assigned a value
at all.
IPT uses g = imtransform(f, tform, interp), which performs inverse mapping instead.
A linear conformal transformation is a type of affine transformation that
preserves shapes and angles.
Example 5.13: Spatially transforming images.
>> f = checkerboard(50);
>> s = 0.8;
>> theta = pi/6;
>> T = [s*cos(theta) s*sin(theta) 0
        -s*sin(theta) s*cos(theta) 0
        0 0 1];
>> tform = maketform('affine', T);
>> g = imtransform(f, tform);
>> g2 = imtransform(f, tform, 'nearest');
>> g3 = imtransform(f, tform, 'FillValue', 0.5);
>> T2 = [1 0 0; 0 1 0; 50 50 1];
>> tform2 = maketform('affine', T2);
>> g4 = imtransform(f, tform2);
>> g5 = imtransform(f, tform2, ...
        'XData', [1 400], 'YData', [1 400], 'FillValue', 0.5);
Image registration methods seek to align two images of the same scene, the
figure illustrates the idea of control points using a test pattern and a version of the test
pattern that has undergone projective distortion.
IPT function cp2tform can be used to fit a specified type of spatial
transformation to the control points (it supports the transformation types listed in Table 5.4).
Example:
>> basepoints = [83 81; 450 56; 43 293; 249 392; 436 442];
>> inputpoints = [68 66; 375 47; 42 286; 275 434; 523 532];
>> tform = cp2tform(inputpoints, basepoints, 'projective');
>> gp = imtransform(g, tform, 'XData', [1 502], 'YData', [1 502]);
The toolbox includes a graphical user interface designed for the interactive
selection of control points on a pair of images.
Figure 5.16 shows a screen capture of this tool, which is invoked by the
command cpselect.
Summary
This chapter provides a good overview of how MATLAB and IPT functions are used for image
restoration and as the basis for generating models that help explain the degradation to which an
image has been subjected.
• imnoise, imnoise2, imnoise3
• spfilt for denoising
• Adaptive and blind filtering (PSF and noise)
• Geometric transformations and image registration
Chapter 6 Color Image Processing
Preview
In this chapter we discuss the fundamentals of color image processing using the
Image Processing Toolbox and extend some of its functionality by developing
additional color generation and transformation functions.
The discussion in this chapter assumes familiarity on the part of the reader with
the principles and terminology of color image processing at an introductory level.
Let fR, fG, and fB represent three RGB component images. An RGB image is formed
from these images by using the cat (concatenate) operator to stack them along the third dimension:
rgb_image = cat(3, fR, fG, fB)
Function rgbcube(vx, vy, vz) is used to view the color cube from any perspective.
function rgbcube(vx, vy, vz)
vertices_matrix = [0 0 0; 0 0 1; 0 1 0; 0 1 1; 1 0 0; 1 0 1; 1 1 0; 1 1 1];
faces_matrix = [1 5 6 2; 1 3 7 5; 1 2 4 3; 2 4 8 6; 3 7 8 4; 5 6 8 7];
colors = vertices_matrix;
patch('Vertices', vertices_matrix, 'Faces', faces_matrix, ...
      'FaceVertexCData', colors, 'FaceColor', 'interp', ...
      'EdgeAlpha', 0)
if nargin == 0
   vx = 10; vy = 10; vz = 4;
elseif nargin ~= 3
   error('Wrong number of inputs.')
end
axis off
view([vx, vy, vz])
axis square
To display an indexed image we write
>> imshow(x, map)
or, alternatively,
>> image(x)
>> colormap(map)
A colormap is stored with an indexed image and is automatically loaded with
the image when function imread is used to load the image.
Sometimes we use function imapprox to approximate an indexed image with fewer colors;
its syntax is
[Y, newmap] = imapprox(x, map, n)
It returns an array Y with colormap newmap, which has at most n colors.
One approach to specifying a colormap is row by row:
>> map(k, :) = [r(k) g(k) b(k)]
% r(k), g(k), b(k) are RGB values that specify one row of the colormap
Table 6.1 lists the RGB values for some basic colors. Any of the three formats shown
in the table can be used to specify colors.
For example: the background color of a figure can be changed to green by using
any of the following three statements:
>> whitebg('g')
>> whitebg('green')
>> whitebg([0 1 0])
MATLAB provides several predefined color maps, accessed using the command:
colormap(map_name)
>>colormap(copper)
>>imshow(x , copper)
Table 6.2 lists some of the colormaps available in MATLAB. The
length (number of colors) of these colormaps can be specified by enclosing the
number in parentheses. For example, gray(16) generates a colormap with 16
shades of gray.
6.1.3 IPT functions for Manipulating RGB and Indexed Images
Function dither is applicable both to gray-scale and to color images. Dithering gives
the visual impression of shade variations on a printed page that consists of dots.
The syntax used by function dither for gray-scale images is
bw = dither(gray_image)
where bw is the dithered result (a binary image).
x = grayslice(gray_image, n)
This function produces an indexed image by thresholding gray_image with
threshold values
1/n, 2/n, ..., (n-1)/n
An alternate syntax is
x = grayslice(gray_image, v)
where v is a vector whose values are used to threshold gray_image.
Function gray2ind, with syntax
[x, map] = gray2ind(gray_image, n)
Function ind2gray, with syntax
gray_image = ind2gray(x, map)
converts an indexed image, composed of x and map, to a gray-scale image.
Function rgb2ind:
[x, map] = rgb2ind(rgb_image, n, dither_option)
Function ind2rgb (converts the matrix x and corresponding colormap map to RGB format):
rgb_image = ind2rgb(x, map)
Function rgb2gray (converts an RGB image to a gray-scale image):
gray_image = rgb2gray(rgb_image)
Example 6.1: Illustration of some of the functions in Table 6.3
Function rgb2ind is quite useful for reducing the number of colors in an RGB
image.
Figure 6.4(a) is a 24-bit RGB image, f. Figures 6.4(b) and 6.4(c) show the
results of using the commands
>> [x1, map1] = rgb2ind(f, 8, 'nodither');
>> imshow(x1, map1)
and
>> [x2, map2] = rgb2ind(f, 8, 'dither');
>> figure, imshow(x2, map2)
The effects of dithering are usually better illustrated with gray-scale
images. Figures 6.4(d) and 6.4(e) were obtained using the commands
>> g = rgb2gray(f);
>> g1 = dither(g);
>> figure, imshow(g); figure, imshow(g1)
The NTSC color space is used in television in the United States. In the NTSC format,
image data consist of three components: luminance (Y), hue (I), and saturation (Q). The YIQ
components are obtained from the RGB components of an image using the transformation
yiq_image = rgb2ntsc(rgb_image)
where the input RGB image can be of class uint8, uint16, or double. The output
image is an M×N×3 array of class double.
The RGB components are obtained from the YIQ components using the transformation
rgb_image = ntsc2rgb(yiq_image)
Similarly, the YCbCr components are obtained with
ycbcr_image = rgb2ycbcr(rgb_image)
and the transformation that converts from YCbCr back to RGB is
rgb_image = ycbcr2rgb(ycbcr_image)
The CMY components are obtained from the RGB components using the following transformation.
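The conversion cmy = 1 - rgb can be carried out with IPT function imcomplement; a sketch:
cmy_image = imcomplement(rgb_image);   % RGB to CMY
rgb_image = imcomplement(cmy_image);   % CMY back to RGB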
Fig. 6.8 shows the HSI model based on color triangles and also on circles.
As the HSI plane moves up and down the intensity axis, the boundary defined by
the intersection of the plane with the faces of the cube has either a triangular or a
hexagonal shape. This can be visualized more readily by looking at the cube down its
gray-scale axis, as shown in Fig. 6.7(a). Fig. 6.7(b) shows the hexagonal shape and an
arbitrary color point (shown as a dot). Figs. 6.7(c) and (d) describe it as a triangle and as a circle.
Example 6.2: Converting from RGB to HSI.
Fig. 6.9 shows the hue, saturation, and intensity components of an image of an RGB
cube on a white background. Fig. 6.9(a) is the hue image. Fig. 6.9(b) is the saturation
image; it shows progressively darker values toward the white vertex of the RGB cube.
Fig. 6.9(c) is the intensity image; it shows the average of the RGB values at the
corresponding pixel in Fig. 6.2(b).
Fig. 6.10 shows spatial neighborhood processing of gray-scale and full-color
images. In Fig. 6.10(a), averaging would be accomplished by summing the gray levels of all
the pixels in the neighborhood and dividing by the total number of pixels in the
neighborhood. In Fig. 6.10(b), averaging would be done by summing all the vectors in the
neighborhood and dividing each component by the total number of vectors in the
neighborhood.
The techniques in this section are based on processing the color components of a
color image or the intensity component of a monochrome image within the context of a
single color image.
A color image can be viewed as a vector field c = [cR; cG; cB] = [R; G; B], where at each point
c(x, y) = [cR(x, y); cG(x, y); cB(x, y)] = [R(x, y); G(x, y); B(x, y)]
The transformations have the form
si = Ti(r), i = 1, 2, ..., n
where r denotes input gray-level or color-component values, si and Ti are the output values
and the transformation functions, and n is the dimension of the color space (the number of
color components).
Fig. 6.11 shows a simple but powerful way to specify mapping functions
graphically. Fig. 6.11(a) shows a transformation that is formed by linearly
interpolating three control points (the circled coordinates in the figure); Fig. 6.11(b)
shows the transformation that results from a cubic spline interpolation of the same
three points. Figs. 6.11(c) and (d) provide more complex linear and cubic spline
interpolations, respectively.
Linear interpolation is implemented by using:
z = interp1q(x , y , xi)
It returns a column vector containing the values of linearly interpolated 1-D
function z at points xi. Column vectors x and y specify the horizontal and vertical
coordinate pairs of the underlying control points. The elements of x must increase
monotonically. The length of z is equal to the length of xi. For example:
>> z = interp1q([0 255]', [0 255]', [0:255]')
Cubic spline interpolation is implemented using the spline function
z = spline(x , y , xi)
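A sketch of building a piecewise-linear intensity mapping from three control points and applying it to a gray-scale image f of class uint8 (the control-point coordinates are illustrative):
x = [0; 0.25; 1];                   % control-point inputs (must increase monotonically)
y = [0; 0.75; 1];                   % control-point outputs
z = interp1q(x, y, (0:255)'/255);   % 256-entry lookup table
g = z(double(f) + 1);               % map each pixel of f through the table
imshow(g)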
The development of function ice, given in Appendix B, is a comprehensive
illustration of how to design a graphical user interface (GUI) in MATLAB. Its syntax
is
g = ice('Property Name', 'Property Value', ...)
Table 6.4 lists the valid pairs for use in function ice.
To obtain the properties of an image with handle g we use the get function.
h = get(g)
This function returns all properties and applicable current values of the graphics
objects identified by the handle g.
Examples of the syntax of function ice:
ice
g = ice('image', f);
g = ice('image', f, 'wait', 'off');
g = ice('image', f, 'space', 'off');
Control points are manipulated with the mouse, as summarized in Table 6.5.
Table 6.6 lists the functions of the other GUI components.
The following examples show typical applications of function ice.
Example 6.3: Inverse mappings: monochrome negatives and color complements.
Figure 6.13(a) shows the ice interface after the default RGB curve of Fig. 6.12 has been
modified to produce an inverse or negative mapping function. Figure 6.13(b) shows the
monochrome negative that results from the inverse mapping.
Inverse or negative mapping functions also are useful in color processing. As can
be seen in Figs. 6.14(a) and (b), the result of the mapping is reminiscent of
conventional color film negatives. For instance, the red in Fig. 6.14(a) is transformed to
cyan in Fig. 6.14(b), the color complement of red.
Consider next the use of function ice for monochrome and color contrast
manipulation. Figures 6.15(a) through (c) demonstrate the effectiveness of ice in
processing monochrome images. Figures 6.15(d) through (f) show similar effectiveness
for color inputs.
Example 6.4: Monochrome and color contrast enhancement
Example 6.5: Pseudocolor mappings.
Figure 6.16(a) is an X-ray image of a weld (the horizontal dark region) containing
several cracks and porosities (the bright white streaks running through the middle of
the image). Figure 6.16(b) shows a pseudocolor version of the image; it was generated
by mapping the green and blue components of the RGB-converted input using the
mapping functions in Figs. 6.16(c) and (d). The interactively specified mapping functions
transform the black-to-white gray scale to hues between blue and red, with yellow
reserved for white.
Example 6.6: Color balancing.
Figure 6.17 shows an application involving a full-color image, in which it is
advantageous to map an image's color components independently. Figure 6.17(a)
shows a CMY scan of a mother and her child with an excess of magenta (keep in
mind that only an RGB version of the image can be displayed by MATLAB). To
interactively modify the CMY components of RGB image f1, for example, the
appropriate ice call is:
>> f2 = ice('image', f1, 'space', 'CMY');
Example 6.7: Histogram-based mappings.
Figure 6.18(a) shows a color image of a caster stand containing cruets and
shakers. The transformed image in Fig. 6.18(b), which was produced using the HSI
transformations in Figs. 6.18(c) and (d), shows a significant improvement. The histograms
of the input and output images' hue, saturation, and intensity components are shown in
Figs. 6.18(e) and (f).
Smoothing an RGB color image, fc, with a linear spatial filter consists of the
following steps:
1. Extract the three component images:
>> fR = fc(:, :, 1); fG = fc(:, :, 2); fB = fc(:, :, 3);
2. Filter each component image individually. Letting w represent a smoothing
filter generated using fspecial, we smooth the red component image as follows:
>> fR_filtered = imfilter(fR, w);
3. Reconstruct the filtered RGB image:
>> fc_filtered = cat(3, fR_filtered, fG_filtered, fB_filtered);
Or we can combine the three steps into one:
>> fc_filtered = imfilter(fc, w);
Example 6.8: Color image smoothing.
Figure 6.19(a) shows an RGB image of size 1197×1197 pixels, and Figs. 6.19(b)
through (d) are its RGB component images, extracted using the procedure described
in the previous paragraph.
Figs. 6.20(a) through (c) show the three HSI component images of Fig. 6.19(a),
obtained using function rgb2hsi.
Fig. 6.21(a) shows the result of smoothing the image in Fig. 6.19(a) using function
imfilter with the 'replicate' option and an 'average' filter of size 25×25 pixels. Fig.
6.21(b) was obtained using the commands:
>> h = rgb2hsi(fc);
>> H = h(:, :, 1); S = h(:, :, 2); I = h(:, :, 3);
>> w = fspecial('average', 25);
>> I_filtered = imfilter(I, w, 'replicate');
>> h = cat(3, H, S, I_filtered);
>> f = hsi2rgb(h);
>> f = min(f, 1);
>> imshow(f)
6.5.2 Color Image Sharpening
Sharpening an RGB color image with a linear spatial filter follows the same
procedure outlined in the previous section, but using a sharpening filter instead.
Example 6.9: Color image sharpening.
To sharpen the image we use the Laplacian filter mask
>> lapmask = [1 1 1; 1 -8 1; 1 1 1];
The enhanced image was computed and displayed using the commands
>> fen = imsubtract(fb, imfilter(fb, lapmask, 'replicate'));
>> imshow(fen)
Figure 6.23(a) shows a 3×3 neighborhood, where the z's indicate pixel values. The
derivatives in the x- and y-directions are computed using the two masks shown in
Figs. 6.23(b) and (c).
The following function implements the color gradient for RGB images:
[VG , A , PPG] = colorgrad(f, T)
where f is an RGB image and T is an optional threshold in the range [0, 1]; VG is the
RGB vector gradient Fθ(x, y); A is the angle image θ(x, y), in radians; and PPG is the
gradient formed by summing the 2-D gradients of the individual color planes
(generated for comparison purposes).
Example 6.10: RGB edge detection using function colorgrad.
Figure 6.24(a) through (c) show three simple monochrome images which, when
used as RGB planes, produce the color image in Fig. 6.24(d). The objectives of this
example are
(1) to illustrate the use of function colorgrad, and
(2) to show that computing the gradient of a color image by combining
the gradients of its individual color planes is quite different from computing the
gradient directly in RGB vector space using the method just explained.
Letting f represent the RGB image in Fig. 6.24(d), the command
>> [VG, A, PPG] = colorgrad(f);
produced the images VG and PPG shown in Fig. 6.24(e) and (f).
As a further example, Fig. 6.25(b) and (c) are analogous to Fig. 6.24(e) and (f). They were
obtained by applying function colorgrad to the image in Fig. 6.25(a). Fig. 6.25(d) is the
difference of the two gradient images, scaled to the range [0, 1].
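As an illustrative sketch (not necessarily the book's exact commands), a difference image of this kind can be produced along the following lines, assuming f holds the image of Fig. 6.25(a) and colorgrad is the custom function described above:
>> [VG, A, PPG] = colorgrad(f);     % vector gradient and per-plane gradient
>> d = mat2gray(abs(VG - PPG));     % absolute difference, scaled to [0, 1]
>> imshow(d)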
6- 28
6- 29
Segmentation in the manner just described is implemented by function
colorseg, which has the syntax (see Appendix C for the code)
S = colorseg(method, f, T, parameters)
where method is either 'euclidean' or 'mahalanobis', f is the RGB image to be
segmented, and T is the threshold described above.
Example 6.11: RGB color image segmentation
6- 30
Letting f denote the color image in Fig. 6.27(a), the region in Fig. 6.27(b) was
obtained using the commands
>> mask = roipoly(f);
>> red = immultiply(mask, f(:, :, 1));
>> green = immultiply(mask, f(:, :, 2));
>> blue = immultiply(mask, f(:, :, 3));
>> g = cat(3, red, green, blue);
>> figure, imshow(g)
where mask is a binary image (the same size as f) with 0s in the background and
1s in the region selected interactively.
Next we compute the mean vector and covariance matrix of the points in the
ROI (region of interest), but first the coordinates of the points in the ROI must be
extracted:
>> [M, N, K] = size(g);
>> I = reshape(g, M*N, 3);
>> idx = find(mask);
>> I = double(I(idx, 1:3));
>> [C, m] = covmatrix(I);
The rows of I are the non-background pixels of the masked image in Fig. 6.27(b). The
main diagonal of C contains the variances of the RGB components, so all we have to do
is extract these elements and compute their square roots:
>> d = diag(C);
>> sd = sqrt(d)
sd = 22.0643   24.2442   16.1806
For the 'euclidean' option with T = 25, we use
>> E25 = colorseg('euclidean', f, 25, m);
Figure 6.28(a) shows the result, and Fig. 6.28(b) through (d) show the
segmentation results with T = 50, 75, and 100.
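The same sequence of results can be generated compactly with a loop; the following is a hedged sketch rather than the book's exact commands, and it assumes f, m, and colorseg as defined above:
>> T = [25 50 75 100];
>> for k = 1:numel(T)
       E = colorseg('euclidean', f, T(k), m);   % segment with threshold T(k)
       figure, imshow(E, [])
   end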
6- 31
Fig. 6.29(a) through (d) show the results obtained using the 'mahalanobis' option
with the same sequence of threshold values.
6- 32
Summary
Preview
7.1 Background
Consider an image f(x, y) of size M×N whose forward, discrete transform, T(u, v, ...),
can be expressed in terms of the general relation (pages 242-243)
T(u, v, ...) = Σ_x Σ_y f(x, y) g_{u,v,...}(x, y)
with inverse
f(x, y) = Σ_{u,v,...} T(u, v, ...) h_{u,v,...}(x, y)
The g_{u,v,...} and h_{u,v,...} in these equations are called the forward and inverse
transformation kernels, respectively.
7-1
In the remainder of the chapter, we introduce a number of wavelet kernels.
Each possesses the following general properties:
Property 1: Separability, Scalability, and Translatability. The kernels can be
represented as three separable 2-D wavelets and one separable 2-D scaling function
(p. 244). Each of these 2-D functions is the product of two 1-D real, square-integrable
scaling and wavelet functions.
Property 2: Multiresolution Compatibility. The 1-D scaling function satisfies the
requirements of multiresolution analysis: 1. It is orthogonal to its integer translates.
2. The set of functions that can be represented by it at low scales is contained within
the set that can be represented at higher scales. 3. The only function that can be
represented at every scale is f(x) = 0. 4. Any function can be represented with
arbitrary precision as j→∞. When these conditions are met, there is a companion
wavelet ψ_{j,k} that, together with its integer translates and binary scalings, spans the
difference between any two adjacent scales.
Property 3: Orthogonality. The expansion functions form an orthonormal or
biorthogonal basis for the set of 1-D measurable, square-integrable functions.
The coefficients in these expansions are called scaling and wavelet vectors,
respectively. They are the filter coefficients of the fast wavelet transform (FWT),
an iterative computational approach to the DWT shown in Fig. 7.2.
7-2
7.2.1 FWTs using the Wavelet Toolbox
7-3
The Haar scaling and wavelet functions shown in Figure 7.3 are
discontinuous and compactly supported. In addition, because the Haar expansion
functions are orthogonal, the forward and inverse transformation kernels are identical.
Given a set of decomposition filters, the simplest way of computing the associated
wavelet transform is through the Wavelet Toolbox's wavedec2 function. It is invoked
using [C, S] = wavedec2(X, N, Lo_D, Hi_D), or via the slightly more efficient syntax
[C, S] = wavedec2(X, N, wname).
EXAMPLE 7.2: A simple FWT using Haar filters.
Consider the following single-scale wavelet transform with respect to Haar
wavelets:
>> f = magic(4)
>> [c1, s1] = wavedec2(f, 1, 'haar')
When the single-scale transform described above is extended to two scales, we
get
>> [c2, s2] = wavedec2(f, 2, 'haar')
To minimize border distortions, the image border must be treated differently. Many
Wavelet Toolbox functions, including wavedec2, extend or pad the image being
processed based on the global parameter dwtmode:
>> dwtmode                   % Examine the active extension mode.
>> st = dwtmode('status')    % Return the active extension mode.
>> dwtmode(STATUS)           % Set the extension mode to STATUS.
>> dwtmode('save', STATUS)   % Make STATUS the default extension mode.
Table 7.2
7-4
7-5
Like its forward counterpart, the inverse fast wavelet transform can be computed
iteratively using digital filters. Figure 7.6 shows the required synthesis or
reconstruction filter bank. At each iteration, four scale-j approximation and detail
subimages are upsampled and convolved with two one-dimensional filters, one
operating on the subimages' columns and the other on their rows.
Addition of the results yields the scale j+1 approximation, and the process is
repeated until the original image is reconstructed.
When using the Wavelet Toolbox, function waverec2 is employed to
compute the inverse FWT of wavelet decomposition structure [C, S]. It is invoked
using
g = waverec2(C, S, wname)
The required reconstruction filters can alternatively be supplied via the syntax
g = waverec2(C, S, Lo_R, Hi_R)
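For instance, a minimal round-trip check (a sketch assuming the Wavelet Toolbox is available) confirms that waverec2 inverts wavedec2 to within floating-point error:
>> f = magic(4);
>> [c, s] = wavedec2(f, 2, 'haar');     % two-scale forward FWT
>> g = waverec2(c, s, 'haar');          % inverse FWT
>> max(abs(f(:) - g(:)))                % should be on the order of eps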
The following custom routine, which we call waveback, is the final function
needed to complete our wavelet-based package for processing images in conjunction
with IPT.
function [varargout] = waveback(c, s, varargin)
7-6
EXAMPLE 7.7: Comparing the execution times of waveback and waverec2.
The following test routine compares the execution times of Wavelet Toolbox
function waverec2 and custom function waveback using a simple modification of the
test function in Example 7.3:
function [ratio, maxdiff] = ifwtcompare(f, n, wname)
For a five-scale transform of the 512 × 512 image in Fig. 7.4 with respect to
4th-order Daubechies' wavelets, we get
>> f = imread('Vase', 'tif');
>> [ratio, maxdifference] = ifwtcompare(f, 5, 'db4')
ratio = 1.0000
maxdifference = 3.6948e-013
7-7
EXAMPLE 7.9: Wavelet-based image smoothing or blurring.
Wavelets are effective instruments for smoothing or blurring images.
Consider again the test image of Fig. 7.7(a). To streamline the smoothing process, we
employ the following utility function:
function [nc, g8] = wavezero(c, s, l, wname)
Using wavezero, a series of increasingly smoothed versions of Fig. 7.8(a) can
be generated with the following sequence of commands:
>> f = imread('A.tif');
>> [c, s] = wavefast(f, 4, 'sym4');
>> wave2gray(c, s, 20);
>> [c, g8] = wavezero(c, s, 1, 'sym4');
>> [c, g8] = wavezero(c, s, 2, 'sym4');
>> [c, g8] = wavezero(c, s, 3, 'sym4');
>> [c, g8] = wavezero(c, s, 4, 'sym4');
7-8
EXAMPLE 7.10: Progressive reconstruction.
Here, we deviate from the three-step procedure described at the beginning
of this section and consider an application without a Fourier domain counterpart.
Each image in the database is stored as a multiscale wavelet decomposition.
This structure is well suited to progressive reconstruction applications. For
the four-scale transform of this example, the decomposition vector is
[A4(:)’ H4(:)’ … Hi(:)’ Vi(:)’ Di(:)’ … V1(:)’ D1(:)’]
This progressive reconstruction process is easily simulated using the
following MATLAB command sequence:
>> f = imread('Strawberries.tif');             % Generate transform
>> [c, s] = wavefast(f, 4, 'jpeg9.7');
>> wave2gray(c, s, 8);
>> f = wavecopy('a', c, s);                    % Approximation 1
>> figure; imshow(mat2gray(f));
>> [c, s] = waveback(c, s, 'jpeg9.7', 2);      % Approximation 2
>> f = wavecopy('a', c, s);
>> figure; imshow(mat2gray(f));
>> [c, s] = waveback(c, s, 'jpeg9.7', 3);      % Approximation 3
>> f = wavecopy('a', c, s);
>> figure; imshow(mat2gray(f));
>> [c, s] = waveback(c, s, 'jpeg9.7', 4);      % Approximation 4
>> f = wavecopy('a', c, s);
>> figure; imshow(mat2gray(f));
>> [c, s] = waveback(c, s, 'jpeg9.7', 1);      % Final image
>> f = wavecopy('a', c, s);
>> figure; imshow(mat2gray(f));
7-9
Summary
1. Introduces the wavelet transform and its use in image processing. Like the
Fourier transform, wavelet transforms can be used in tasks ranging from edge
detection to image smoothing.
2. Because they provide significant insight into both an image’s spatial and
frequency characteristics, wavelets can also be used in applications in which Fourier
methods are not well suited, like progressive image reconstruction.
3. Because the Image Processing Toolbox does not include routines for
computing or using wavelet transforms, a significant portion of this chapter is devoted
to the development of a set of functions that extend the Image Processing Toolbox to
wavelet-based imaging.
4. The functions developed were designed to be fully compatible with
MATLAB’s Wavelet Toolbox, which is introduced in this chapter but is not a part of
the Image Processing Toolbox.
Chapter 8 Image Compression
Preview
8.1 Background
8-1
Image compression systems are composed of two distinct structural blocks: an
encoder and a decoder.
compression ratio: CR = n1/n2,
where n1 and n2 denote the number of information-carrying units in the original
and encoded images, respectively.
In the first stage of the encoding process, the mapper transforms the input image
into a format designed to reduce interpixel redundancies. The second stage, or
quantizer block, reduces the accuracy of the mapper's output in accordance with a
predefined fidelity criterion, attempting to eliminate only psychovisually redundant
data. This stage is irreversible. In the third stage of the process, a symbol coder creates a
code for the quantizer output and maps the output in accordance with the code.
function cr = imratio(f1, f2). imratio computes the ratio of the bytes in two
images/variables.
For example, the compression of the JPEG encoded image in Fig. 2.4(c) of
Chapter 2 can be computed via
r = imratio(imread('bubbles25.jpg'), 'bubbles25.jpg')
r = 35.1612
Information preserving or lossless; lossy compression
Define the error e(x, y) between f(x, y) and f^(x, y):
e(x, y) = f^(x, y) - f(x, y)
The root-mean-square error between the two images is
e_rms = [ (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f^(x, y) - f(x, y)]^2 ]^{1/2}
The following M-function computes erms and displays both e(x,y) and its
histogram.
function rmse = compare(f1, f2, scale)
%compare computes and displays the error between two matrices.
Table 8.1
The average number of bits required to represent each symbol is
L_avg = Σ_k l(r_k) p_r(r_k) = 3(0.1875) + 1(0.5) + 3(0.125) + 2(0.1875) = 1.8125
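The same arithmetic is easy to check in MATLAB; the following is an illustrative sketch using the probabilities and code lengths of Table 8.1 (the variable names are ours, not the book's):
>> p = [0.1875 0.5 0.125 0.1875];   % symbol probabilities
>> len = [3 1 3 2];                 % Huffman code-word lengths
>> Lavg = sum(len .* p)
Lavg = 1.8125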
Huffman codes contain the smallest possible number of code symbols per source
symbol, subject to the constraint that the source symbols are coded one at a time (the
first step creates a series of source reductions; the second step assigns codes, starting
with the smallest reduced source).
8-2
The source reduction and code assignment procedures just described are
implemented by the following M-function, which we call huffman:
function CODE = huffman(p)
huffman builds a variable-length Huffman code for a symbol source.
The following command line sequence uses huffman to generate the code in
Fig.8.2:
>>p = [0.1875 0.5 0.125 0.1875]
>>c = huffman(p)
c= ‘011’
‘1’
‘010’
‘00’
global X Y Z: This statement makes variables X, Y, and Z available to the
functions in which they are declared. In huffman, CODE is initialized using the cell
function, whose syntax is
x = cell(m, n)
It creates an m×n array of empty matrices that can be referenced by cell or by
content. In each iteration of the main loop, vector p is sorted in ascending order of
probability. This is done by the sort function, whose syntax is
[y, i] = sort(x)
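For illustration (our own numbers, reusing the probabilities of Fig. 8.2), sort returns both the sorted values and their original positions:
>> [y, i] = sort([0.1875 0.5 0.125 0.1875])
y = 0.1250    0.1875    0.1875    0.5000
i = 3     1     4     2       % i(k) is the original index of y(k)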
8-3
Figure 8.3 shows the final output of the process for the symbol probabilities in
Table 8.1 and Fig. 8.2(a). Figures 8.3(b) and (c) were generated by inserting cell-array
display statements between the last two statements of the huffman main routine.
(1)Each two-way branch in the tree corresponds to a two-element cell array in s;
(2)Each two-element cell array contains the indices of the symbols that were
merged in the corresponding source reduction.
Table 8.2 details the sequence of makecode calls that results for the source
reduction cell array in fig.8.3.
Table 8.2
8-5
A MEX-file is a MATLAB external function produced from C or Fortran source code.
It has a platform-dependent extension. For example, if we type
>> mex unravel.c
a MEX-file named unravel with a platform-dependent extension (e.g., .dll on older
Windows systems) is created.
Like all C MEX-files, C MEX-file unravel.c consists of two distinct parts: a
computational routine and a gateway routine. The computational routine, also named
unravel, contains the C code that implements the link-based decoding process of
Fig. 8.5. The gateway routine, which must always be named mexFunction, interfaces the C
computational routine unravel to the M-file calling function, huff2mat.
Example 8.4: Decoding with huff2mat.
The Huffman encoded image of Example 8.3 can be decoded with the following
sequence of commands:
>> load SqueezeTracy;
>> g = huff2mat(c);
>> f = imread('Tracy.tif');
>> rmse = compare(f, g)
rmse = 0
Function load file reads MATLAB variables from disk file 'file.mat' and loads
them into the workspace. The variable names are maintained through a save/load
sequence.
8-6
As Fig. 8.7(b) and (d) show, they have virtually identical histograms.
>> f1 = imread('Random Matches.tif');
>> c1 = mat2huff(f1);
>> entropy(f1)
ans = 7.4253
>> imratio(f1, c1)
ans = 1.0704
>> f2 = imread('Aligned Matches.tif');
>> c2 = mat2huff(f2);
>> entropy(f2)
ans = 7.3505
>> imratio(f2, c2)
ans = 1.0821
A simple mapping procedure is illustrated in Fig. 8.8.
prediction error: e_n = f_n - f^_n
8-7
8-8
function y = mat2lpc(x, f)
mat2lpc compresses a matrix using 1-D lossless predictive coding.
function x = lpc2mat(y, f)
lpc2mat decompresses a 1-D lossless predictive encoded matrix.
Example 8.5: Lossless predictive coding.
Consider encoding the image of Fig. 8.7(c) using the simple first-order linear
predictor
f^(x, y) = round[a f(x, y - 1)]
Fig. 8.9(a) shows the prediction error image that results with a = 1.
>> f = imread('Aligned Matches.tif');
>> e = mat2lpc(f);
>> imshow(mat2gray(e));
>> entropy(e)
ans = 5.9727
>> c = mat2huff(e);
>> cr = imratio(f, c)
cr = 1.3311
>> [h, x] = hist(e(:) * 512, 512);
>> figure; bar(x, h, 'k');
>> g = lpc2mat(huff2mat(c));
>> compare(f, g)
ans = 0
8-9
8.4 Psychovisual Redundancy
8-10
quantize
function y = quantize(x, b, type)
% quantize quantizes the elements of a UINT8 matrix.
Example 8.7: Combining IGS quantization with lossless predictive and
Huffman coding. The following sequence of commands combines IGS quantization,
lossless predictive coding, and Huffman coding to compress the image of Fig. 8.10(a)
to less than a quarter of its original size:
>> f = imread('Brushes.tif');
>> q = quantize(f, 4, 'igs');
>> qs = double(q) / 16;
>> e = mat2lpc(qs);
>> c = mat2huff(e);
>> imratio(f, c)
ans = 4.1420
>> ne = huff2mat(c);
>> nqs = lpc2mat(ne);
>> nq = 16 * nqs;
>> compare(q, nq)
ans = 0
>> rmse = compare(f, nq)
rmse = 6.8382
The JPEG baseline coding system is based on the discrete cosine transform; the
input and output images are limited to 8 bits, while the quantized DCT coefficient
values are restricted to 11 bits.
8-11
T^(u, v) = round[T(u, v)/Z(u, v)]
where T^(u, v) for u, v = 0, 1, ..., 7 are the resulting normalized and quantized
coefficients, T(u, v) is the DCT of an 8×8 block of image f(x, y), and Z(u, v) is a
transform normalization array like that of Fig. 8.12(a).
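As an illustrative sketch of this step (not the internals of im2jpeg), a single level-shifted 8×8 block could be normalized and quantized as follows, assuming block is an 8×8 double array and Z is a normalization array like that of Fig. 8.12(a):
>> T = dct2(block - 128);     % level-shifted 2-D DCT of the 8x8 block
>> That = round(T ./ Z);      % T^(u,v) = round[T(u,v)/Z(u,v)]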
8-12
function y = im2jpeg(x, quality)
im2jpeg compresses an image using a JPEG approximation.
8-13
Example 8.8: JPEG compression.
Fig. 8.13(a) and (b) show two JPEG coded and subsequently decoded
approximations of the monochrome image in Fig. 8.4(a).
>> f = imread('Tracy.tif');
>> c1 = im2jpeg(f);
>> f1 = jpeg2im(c1);
>> imratio(f, c1)
ans = 18.2450
>> compare(f, f1, 3)
ans = 2.4675
Figure 8.14 shows a simplified JPEG 2000 coding system (absent several optional
operations).
8-14
function y = im2jpeg2k(x, n, q)
im2jpeg2k compresses an image using a JPEG 2000 approximation.
Example 8.9: JPEG 2000 compression.
>> f = imread('Tracy.tif');
>> c1 = im2jpeg2k(f, 5, [8 8.5]);
>> f1 = jpeg2k2im(c1);
>> rms1 = compare(f, f1)
rms1 = 3.6931
>> cr1 = imratio(f, c1)
cr1 = 42.1589
>> c2 = im2jpeg2k(f, 5, [8 7]);
>> f2 = jpeg2k2im(c2);
>> rms2 = compare(f, f2)
rms2 = 5.9172
>> cr2 = imratio(f, c2)
cr2 = 87.7323
>> c3 = im2jpeg2k(f, 1, [1 1 1 1]);
>> f3 = jpeg2k2im(c3);
>> rms3 = compare(f, f3)
rms3 = 1.1234
>> cr3 = imratio(f, c3)
cr3 = 1.6350
8-15
8-16
Summary
The word morphology commonly denotes a branch of biology that deals with the
form and structure of animals and plants.
Mathematical morphology: a tool for extracting image components that are
useful in the representation and description of region shape, such as boundaries,
skeletons, and the convex hull.
Morphological techniques are also used for pre- or post-processing, such as morphological
filtering, thinning, and pruning. Morphology is a cornerstone of the mathematical
set of tools underlying the development of techniques that extract "meaning" from an
image.
1. w ∈ A: w is an element of A
2. w ∉ A: w is not an element of A
3. B = {w | condition}
4. A^c = {w | w ∉ A}: complement of A
5. C = A ∪ B: union of two sets
6. C = A ∩ B: intersection of two sets
7. A - B = {w | w ∈ A, w ∉ B}: difference of two sets
9- 2
Morphological theory views a binary image as the set of its foreground (1-valued)
pixels, the elements of which are in Z2(Z is the set of integers). Set operations such as
union and intersection can be applied directly to binary image sets.
The set operations defined in Fig. 9.1 can be performed on binary images using
MATLAB's logical operators OR (|), AND (&), and NOT (~), as Table 9.1 shows.
Table 9.1
As a simple illustration, Fig. 9.3 shows the results of applying several logical
operators to two binary images containing text.
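As a minimal sketch of Table 9.1 (assuming A and B are logical images of the same size), the set operations map directly onto the logical operators:
>> C_union = A | B;      % OR  -> set union
>> C_inter = A & B;      % AND -> set intersection
>> C_comp  = ~A;         % NOT -> complement of A
>> C_diff  = A & ~B;     % set difference A - B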
9- 3
Example: A = {(0,1), (1,1), (2,1), (2,2), (3,0)} and
B = {(0,0), (0,1)}.
Then A ⊕ B =
{(0,1), (1,1), (2,1), (2,2), (3,0), (0,2), (1,2), (2,2), (2,3), (3,1)}.
The meaning of A ⊕ B is A dilated by B, that is, the set of all displacements z such
that the translate B_z hits A (their intersection is nonempty).
9- 4
Figure 9.4 illustrates how dilation works.
Figure 9.4(a) shows a simple binary image containing a rectangular object.
Figure 9.4(b) is a structuring element, a five-pixel-long diagonal line in this case.
Figure 9.4(b) shows the origin of the structuring element using a black outline.
Figure 9.4(c) graphically depicts dilation as a process that translates the origin of
the structuring element throughout the domain of the image and checks to see where it
overlaps with 1-valued pixels.
Fig. 9.4(d) is 1 at each location of the origin such that the structuring element
overlaps at least one 1-valued pixel in the input image.
Dilation is commutative; that is,
A ⊕ B = B ⊕ A.
It is a convention in image processing to let the first operand of A ⊕ B be the
image and the second operand be the structuring element, which usually is much
smaller than the image. We follow this convention from this point on.
IPT function imdilate performs dilation. Its basic calling syntax is
A2 = imdilate(A, B)
A = imread('broken_text.tif');
B = [0 1 0; 1 1 1; 0 1 0];
A2 = imdilate(A, B);
imshow(A2);
9- 6
Dilation is associative; that is,
A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C
Suppose a structuring element B can be decomposed as B = B1 ⊕ B2; then
A ⊕ B = A ⊕ (B1 ⊕ B2) = (A ⊕ B1) ⊕ B2.
The associative property is important because the time required to compute
dilation is proportional to the number of nonzero pixels in the structuring element,
and the gain in speed with the decomposed implementation is significant.
IPT function strel constructs structuring elements with a variety of shapes and
sizes. Its basic syntax is: se = strel(shape, parameters), where shape is a string
specifying the desired shape, and parameters is a list of parameters that specify
information about the shape.
In addition to simplifying the generation of common structuring element shapes,
function strel also has the important property of producing structuring elements in
decomposed form.
Function imdilate automatically uses the decomposition information to speed up
the dilation process.
Examples:
Table 9.2
Erosion "shrinks" or "thins" objects in a binary image. As in dilation, the
manner and extent of shrinking is controlled by a structuring element. Figure 9.7
illustrates the erosion process.
9-7
Erosion
Definition: The erosion of A by structuring element B, denoted A ⊖ B, is
A ⊖ B = {c ∈ E^N | c + b ∈ A for every b ∈ B}, or equivalently
A ⊖ B = {c | B_c ⊆ A}
Example: A = {(1,0), (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (3,1), (4,1), (5,1)} and
B = {(0,0), (0,1)}.
9- 8
A = imread('wirebond_mask.tif');
se = strel('disk', 10);
A2 = imerode(A, se); imshow(A2)
se = strel('disk', 5);
A3 = imerode(A, se);
imshow(A3)
A4 = imerode(A, strel('disk', 20));
imshow(A4)
9.3 Combining Dilation and Erosion
9- 9
The morphological opening of A by B, denoted A ∘ B, is simply erosion of A
by B, followed by dilation of the result by B:
A ∘ B = (A ⊖ B) ⊕ B
Equivalently, A ∘ B = ∪{(B)_z | (B)_z ⊆ A},
the union of all translations of B that fit entirely within A. Figure 9.9(a), (b), and (c)
illustrate the morphological opening of A by B.
The morphological closing of A by B, denoted A • B, is a dilation followed by
an erosion:
A • B = (A ⊕ B) ⊖ B
Geometrically, A • B is the complement of the union of all translations of B that
do not overlap A. Figure 9.9(d) and (e) illustrate the morphological closing. Both
operations tend to smooth the contours of objects.
9-10
f = imread('shapes.tif');
se = strel('square', 20);
fo = imopen(f, se);
imshow(fo)
fc = imclose(f, se);
imshow(fc)
foc = imclose(fo, se);
imshow(foc)
Figure 9.11 further illustrates the usefulness of closing and opening by applying these
operations to a noisy fingerprint.
9-11
f = imread('fingerprint.tif');
se = strel('square', 3);
fo = imopen(f, se);
imshow(fo);
foc = imclose(fo, se);
imshow(foc);
Morphological opening removes completely regions of an object that cannot
contain the structuring element, smoothes object contours, breaks thin connections,
and removes thin protrusions.
C=imopen(A, B)
Morphological closing tends to smooth the contours of objects, joins narrow
breaks, fills long thin gulfs, and fills holes smaller than the structuring element.
C=imclose(A, B)
9-13
9-14
Thinning means reducing binary objects or shapes in an image to strokes that are
a single pixel wide.
9-15
f = imread('fingerprint_cleaned.tif');
g1 = bwmorph(f, 'thin', 1);
g2 = bwmorph(f, 'thin', 2); imshow(g1), figure, imshow(g2)
ginf = bwmorph(f, 'thin', Inf);
imshow(ginf)
Skeletonization is another way to reduce binary image objects to a set of thin
strokes that retain important information about the shapes of the original objects.
Figure 9.16(b) shows the resulting skeleton, which is a reasonable likeness of the
basic shape of the object.
9-16
fs = bwmorph(f, 'skel', Inf);
imshow(f), figure, imshow(fs);
for k = 1:5
    fs = fs & ~endpoints(fs);   % prune spur end points (endpoints is a custom function)
end
The concepts discussed thus far apply mostly to all foreground (or all
background) individual pixels and their immediate neighbors.
From Fig. 9.17, we can see that to develop computer programs that locate and
operate on objects, we need a more precise set of definitions for key terms.
9-17
9-18
A pixel p at coordinates (x, y) has two horizontal and two vertical neighbors
whose coordinates are (x + 1, y), (x - 1, y), (x, y + 1), and (x, y - 1). This set of
4-neighbors of p, denoted N4(p), is shaded in Fig. 9.18(a).
The four diagonal neighbors of p have coordinates (x + 1, y + 1), (x + 1, y - 1),
(x - 1, y + 1), and (x - 1, y - 1). Figure 9.18(b) shows these neighbors, which are
denoted ND(p).
The union of N4(p) and ND(p) in Fig. 9.18(c) is the set of 8-neighbors of p,
denoted N8(p). Two pixels p and q are said to be 4-adjacent if q ∈ N4(p). Similarly, p
and q are said to be 8-adjacent if q ∈ N8(p). Figures 9.18(d) and (e) illustrate these
concepts.
9-20
f = imread('objects.tif');
[L, n] = bwlabel(f);
imshow(f)
hold on   % So later plotting commands plot on top of the image.
for k = 1:n
    [r, c] = find(L == k);
    rbar = mean(r);
    cbar = mean(c);
    plot(cbar, rbar, 'Marker', 'o', 'MarkerEdgeColor', 'k', ...
         'MarkerFaceColor', 'k', 'MarkerSize', 10);
    plot(cbar, rbar, 'Marker', '*', 'MarkerEdgeColor', 'w');
end
9-22
In this section, as in the binary case, we start with dilation and erosion, which for
gray-scale images are defined in terms of minima and maxima of pixel neighborhoods.
All the binary morphological operations , with the exception of the hit-or-miss
transform, have natural extensions to gray-scale images.
9.6.1 Dilation and Erosion
9-23
Flat structuring elements for gray-scale images are created using strel in the same
way as for binary images. For example, the following commands show how to dilate
and erode the image f in Fig. 9.23(a) using a flat 3×3 structuring element (see the
sketch below):
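The commands themselves are not reproduced here, so the following is a hedged sketch of what they would look like with a flat 3×3 structuring element:
>> se = strel('square', 3);    % flat 3x3 structuring element
>> gd = imdilate(f, se);       % gray-scale dilation: local maximum over the neighborhood
>> ge = imerode(f, se);        % gray-scale erosion: local minimum over the neighborhood
>> imshow(gd), figure, imshow(ge)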
9-26
9.6.3 Reconstruction
h-minima transform: the marker image is produced by subtracting h from the mask
image (mask image - h).
Opening-by-reconstruction:
f = imread('plugs.jpg'); se = strel('disk', 5);
fe = imerode(f, se); fobr = imreconstruct(fe, f);
Closing-by-reconstruction:
fobrc = imcomplement(fobr);
fobrce = imerode(fobrc, se);
fobrcbr = imcomplement(imreconstruct(fobrce, fobrc));
9-29
9-30
Summary
The most common way to look for discontinuities is to run a mask through
the image. The response R of the mask at any point in the image is given by
R = w1 z1 + w2 z2 + ... + w9 z9 = Σ_{i=1}^{9} w_i z_i
where z_i is the intensity of the pixel associated with mask coefficient w_i.
10- 3
If the first mask were moved around an image, it would respond to horizontal
lines. The preferred direction of each mask is weighted with a larger coefficient than
the other possible directions, and the coefficients of each mask sum to zero.
Let R1, R2, R3, and R4 denote the responses of the four masks in Figure 10.3.
10- 4
10- 5
Example 10.3
10- 6
Example 10.4
10- 7
10.2 Line Detection Using the Hough Transform
The edge detection methods discussed previously yield pixels that seldom characterize an
edge completely, because of noise and other effects. The Hough transform can be used to
find and link line segments.
Hough transform:
Suppose that many lines pass through a point (x_i, y_i), all of which satisfy the
slope-intercept equation y_i = a x_i + b for some values of a and b. This equation can be
written as b = -x_i a + y_i.
A second point (x_j, y_j) also has a line in parameter space associated with it,
and this line intersects the line associated with (x_i, y_i) at (a', b'). In fact, all points
contained on the line y = a'x + b' have lines in parameter space that intersect at (a', b').
Figure 10-8 illustrates these concepts.
10- 8
A practical difficulty is that a approaches infinity as the line approaches the
vertical direction. One way around this difficulty is to use the normal representation of
a line:
x cos θ + y sin θ = ρ
10- 9
Figure 10.9(a) illustrates the geometric interpretation of the parameters ρ and θ.
Figure 10.9(b) represents the family of lines that pass through a particular point.
Figure 10.9(c) illustrates the accumulator cells.
Initially, the cells at coordinates (i, j) are set to zero. Then, for every
nonbackground point (x_k, y_k), we let θ equal each of the allowed subdivision values
on the θ-axis and solve for the corresponding ρ using the equation
ρ = x_k cos θ + y_k sin θ. The resulting ρ values are then rounded off to the nearest
allowed cell value along the ρ-axis, and the corresponding cell is incremented. A
value of Q in A(i, j) means that Q points in the xy-plane lie on the line
x cos θ_j + y sin θ_j = ρ_i. The number of subdivisions in the ρθ-plane determines the
accuracy of the colinearity of these points.
A function for computing the Hough transformation is given next. This function
makes use of sparse matrices, which provide advantages in both matrix storage space
and computation time.
Function hough.m at p. 396.
The test:
Figure 10.10(a) shows the test image. Figure 10.10(b) shows the test results.
Figure 10.10(c) shows the labeled results.
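As a hedged usage sketch (assuming f is a binary image and hough is the custom function of p. 396, which returns the accumulator array together with its θ and ρ axes):
>> [H, theta, rho] = hough(f);                   % Hough accumulator array
>> imshow(H, [], 'XData', theta, 'YData', rho)   % display accumulator vs. theta and rho
>> axis on, xlabel('\theta'), ylabel('\rho')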
10- 10
10.2.1 Hough Transform Peak Detection
10- 11
10.3 Thresholding
1. Select an initial estimate for T.
2. Segment the image using T, producing two groups of pixels, G1 and G2.
3. Compute the average intensity values μ1 and μ2 of the pixels in G1 and G2.
4. Compute a new threshold value: T = (μ1 + μ2)/2.
5. Repeat steps 2 through 4 until the difference in T in successive iterations is
smaller than a predefined parameter T0.
Example 10.7 shows how to implement this procedure in MATLAB. The toolbox
provides a function called graythresh that computes a threshold using Otsu's method.
Treating the normalized histogram as a discrete probability density function, as in
p_r(r_q) = n_q / n,  q = 0, 1, 2, ..., L-1
Otsu's method chooses the threshold value k that maximizes the between-class
variance
σ_B^2 = ω_0 (μ_0 - μ_T)^2 + ω_1 (μ_1 - μ_T)^2
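A minimal usage sketch (assuming f is an intensity image; im2bw is the book-era IPT function for applying a normalized threshold):
>> T = graythresh(f);     % Otsu threshold, normalized to [0, 1]
>> g = im2bw(f, T);       % segmented (binary) image
>> imshow(g)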
10- 13
The objective of segmentation is to partition an image into regions. Sections 10.1 and 10.2
approached this problem by finding boundaries between regions based on
discontinuities in intensity levels; in Section 10.3 segmentation was accomplished via
thresholds based on the distribution of pixel properties; Section 10.4 discusses
segmentation techniques based on finding the regions directly.
Let R represent the entire image region. Segmentation is a process that partitions
R into n subregions, R1, R2, ..., Rn, such that:
∪_{i=1}^{n} R_i = R
10- 14
10- 15
10- 16
The procedure can be summarized as follows: at any step, 1. split into four disjoint
quadrants any region R_i for which P(R_i) = FALSE; 2. when no further splitting is
possible, merge any adjacent regions R_i and R_j for which P(R_i ∪ R_j) = TRUE.
10- 17
10- 18
Example 10.10: Segmenting a binary image using the distance and
watershed transforms.
In this example, some objects in Fig. 10.20(e) were split improperly. This is
called oversegmentation and is a common problem with watershed-based
segmentation methods. The next two sections discuss different techniques for
overcoming this difficulty.
10- 19
10- 20
Marker selection can range from the simple procedures just described to
considerably more complex methods involving size, shape, location, relative
distances, texture content, and so on. The point is that using markers brings a priori
knowledge to bear on the segmentation problem.
Summary
Preview
11.1 Background
Cell Arrays
Cell arrays provide a way to combine a mixed set of objects under one
variable name.
A single variable C:
C = {f, b, char_array}
Here f is a uint8 image of size 512×512; b is a sequence of 2-D coordinates in the
form of rows of a 188×2 array; char_array is a cell array containing two character strings.
Structures
Structures are similar to cell arrays in the sense that they allow grouping of a
collection of dissimilar data into a single variable. However, whereas cells are addressed by
numbers, the elements of structures are addressed by names called fields.
function s = image_stats(f)
s.dim = size(f);
s.AI = mean2(f);
s.AIrows = mean(f, 2);
s.AIcols = mean(f, 1);
Here s is a structure with fields dim (a 1×2 vector), AI (a scalar), AIrows (an M×1
vector), and AIcols (a 1×N vector).
Note the use of a dot to separate the structure name from its various fields. The field
names are arbitrary, but they must begin with a nonnumeric character.
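A short usage sketch (our own example, assuming f is an intensity image and image_stats is defined as above):
>> s = image_stats(f);
>> s.dim          % image dimensions, a 1x2 vector
>> s.AI           % average intensity of the whole image (a scalar)
>> s.AIrows(1)    % average intensity of the first row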
fB and fI represent binary and intensity images; gB is the binary output image.
gB = imfill(fB, locations, conn)
gB = imfill(fB, conn, 'holes')
g = imfill(fI, conn, 'holes')
[r, c] = find(g == 2)
z = sortrows(S)
[z, m, n] = unique(S, 'rows')
z = circshift(S, [ud lr])
z = circshift(S, ud)
Suppose that we want to find the boundary of the object with the longest
boundary in image f:
>> B = boundaries(f);
>> d = cellfun('length', B);
>> [max_d, k] = max(d);
>> v = B{k(1)};
b8 = bound2eight(b)
b4 = bound2four(b)
g = bound2im(b, M, N, x0, y0)
b = cat(1, B{:})
[s, su] = bsubsamp(b, gridsep)
z = connectpoly(s(:, 1), s(:, 2))
[x, y] = intline(x1, x2, y1, y2)
11.2 Representation
Standard practice is to use schemes that compact the data into representations
that are considerably more useful in the computation of descriptors.
11- 1
11- 2
11- 3
11- 4
An algorithm for finding MPPs: (1) Obtain the cellular complex. (2) Obtain
the region internal to the cellular complex. (3) Use function boundaries to obtain the
boundary of the region in step 2 as a 4-connected, clockwise sequence of coordinates.
(4) Obtain the Freeman chain code of this 4-connected sequence using function
fchcode. (5) Obtain the convex (black dots) and concave (white dots) vertices from the
chain code. (6) Form an initial polygon using the black dots as vertices, and delete
from further analysis any white dots that are outside this polygon. (7) Form a polygon
with the remaining black and white dots as vertices. (8) Delete all black dots that are
concave vertices. (9) Repeat steps 7 and 8 until all changes cease, at which time all
vertices with angles of 180 degrees are deleted. The remaining dots are the vertices of
the MPP.
Some of the M-functions used in implementing the MPP algorithm:
Function qtdecomp
Q = qtdecomp(B, threshold, [mindim maxdim])
Function qtgetblk
[vals, r, c] = qtgetblk(B, Q, mindim)
Example 11.4: Obtaining the cellular wall of the boundary of a region.
>> B = bwperim(B, 8);
>> Q = qtdecomp(B, 0, 2);
>> R = imfill(BF, 'holes') & ~BF;
>> b = boundaries(R, 4, 'cw');
>> b = b{1};
11- 5
11- 6
Example 11.5: Using function minperpoly.
>> b = boundaries(B, 4, 'cw');
>> b = b{1};
>> [M, N] = size(B);
>> xmin = min(b(:, 1));
>> ymin = min(b(:, 2));
>> bim = bound2im(b, M, N, xmin, ymin);
>> imshow(bim)
>> [x, y] = minperpoly(B, 2);
>> b2 = connectpoly(x, y);
>> B2 = bound2im(b2, M, N, xmin, ymin);
>> imshow(B2)
11- 7
11- 8
11.2.3 Signatures
One of the simplest ways to generate a signature is to plot the distance from
an interior point (e.g., the centroid) to the boundary as a function of angle, as
illustrated in Fig. 11.9.
11- 9
Function signature finds the signature of a given boundary:
[st, angle, x0, y0] = signature(b, x0, y0)
Function cart2pol converts Cartesian to polar coordinates:
[THETA, RHO] = cart2pol(X, Y)
Function pol2cart converts back to Cartesian coordinates:
[X, Y] = pol2cart(THETA, RHO)
11- 10
11- 11
Example 11.6 Signatures
>>[st, angle, x0, y0]=signature(bs);
>>plot(angle, st)
The region boundary can be partitioned by following the contour of the arbitrary
set S and marking the points at which a transition is made into or out of a component
of the convex deficiency.
11- 12
11.2.5 Skeletons
11- 13
Instead of all the Fourier coefficients, only the first P coefficients are used. The
result is the following approximation to s(k):
ŝ(k) = (1/P) Σ_{u=0}^{P-1} a(u) e^{j2πuk/K}
11- 15
Example 11.8: Fourier descriptors.
>> b = boundaries(f);
>> b = b{1};
>> bim = bound2im(b, 344, 270);
>> z = frdescp(b);
>> z546 = ifrdescp(z, 546);
>> z546im = bound2im(z546, 344, 270);
11- 16
11- 17
The mean of the boundary-amplitude values, treated as a discrete random variable r
with histogram g(r_i), is
m = Σ_{i=0}^{K-1} r_i g(r_i)
11- 18
Function regionprops
D=regionprops(L, properties)
Table 11.1
11.4.2 Texture
Table 11.2
Table 11.3
Example 11.10 Statistical texture measures
11- 19
11- 20
2.Spectral Measures of Texture
Spectral measures of texture are based on the Fourier spectrum, which is ideally
suited for describing the directionality of periodic or almost periodic 2-D patterns in
an image.
These descriptors are obtained by summing the spectrum over angle and over radius:
S(r) = Σ_{θ=0}^{π} S_θ(r)
S(θ) = Σ_{r=1}^{R_0} S_r(θ)
Function specxture
[srad, sang, S] = specxture(f), where srad is S(r), sang is S(θ), and S is the
spectrum image.
Example Computing spectral texture.
>>[srad, sang, S]=specxture(f);
>>axis([horzmin horzmax vertmin vertmax]);
11- 21
11- 22
The central moment of order (p + q) of an image f(x, y) is
μ_pq = Σ_x Σ_y (x - x̄)^p (y - ȳ)^q f(x, y)
where
x̄ = m_10 / m_00,  ȳ = m_01 / m_00
The normalized central moment of order (p + q):
η_pq = μ_pq / μ_00^γ,  where γ = (p + q)/2 + 1 for p + q = 2, 3, ...
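A small sketch of these formulas (our own code, not a replacement for the book's moment functions; it assumes f is a gray-scale image stored as a double array and uses x for columns and y for rows):
>> [M, N] = size(f);
>> [x, y] = meshgrid(1:N, 1:M);                    % coordinate grids
>> m00 = sum(f(:));
>> xbar = sum(x(:).*f(:))/m00;  ybar = sum(y(:).*f(:))/m00;
>> p = 2; q = 0;                                   % example order
>> mu = sum(((x(:)-xbar).^p).*((y(:)-ybar).^q).*f(:));
>> eta = mu / m00^((p+q)/2 + 1)                    % normalized central moment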
11- 23
Table 11.4
11- 24
11- 25
Table 11.5
11- 26
11- 27
Summary
Preview
12.1 Background
The Euclidean distance between two n-dimensional vectors x and y is
d(x, y) = ||x - y|| = [(x1 - y1)^2 + ... + (xn - yn)^2]^{1/2}
If X is a population of p vectors (one vector per row) and y is a 1×n vector, the vector of
distances can be computed as
d = sqrt(sum(abs(X - repmat(y, p, 1)).^2, 2))
where d(i) is the Euclidean distance between y and the ith row of X [i.e., X(i, :)].
Suppose we have two vector populations X, of dimension p×n, and Y, of
dimension q×n.
The matrix containing the distances between the rows of these two populations can be
obtained using the expression
D = sqrt(sum(abs(repmat(permute(X, [1 3 2]), [1 q 1]) - repmat(permute(Y, [3 1 2]), [p 1 1])).^2, 3))
where D(i, j) is the Euclidean distance between the ith and jth rows of the
populations; that is, the distance between X(i, :) and Y(j, :).
The syntax for function permute is
B=permute (A, order)
This function reorders the dimensions of A according to the elements of the
vector order (the elements of this vector must be unique).
In addition to the expressions just discussed, we use in this chapter a distance measure
weighted by the covariance matrix, Cx, of the population. This metric is called the Mahalanobis
distance.
It is common practice to express the decision boundary between two classes by the single
function d_ij(x) = d_i(x) - d_j(x) = 0. Thus d_ij(x) > 0 for patterns of class ω_i and
d_ij(x) < 0 for patterns of class ω_j.
The mean vector of each class is estimated from the training set, where N_j is the number of
training pattern vectors from class ω_j and the summation is taken over these vectors.
Suppose that all the mean vectors are organized as rows of a matrix M. The
distance from an arbitrary pattern x to all the mean vectors is then
d = sqrt(sum(abs(M - repmat(x, W, 1)).^2, 2))
The minimum of d determines the class membership of pattern vector x:
class = find(d == min(d));
Selecting the smallest distance is equivalent to evaluating the functions
d_j(x) = x^T m_j - (1/2) m_j^T m_j,  j = 1, 2, ..., W
The decision boundary between classes ω_i and ω_j for a minimum distance
classifier is
d_ij(x) = d_i(x) - d_j(x) = x^T (m_i - m_j) - (1/2)(m_i + m_j)^T (m_i - m_j) = 0
The surface given by this equation is the perpendicular bisector of the line
segment joining m_i and m_j.
12.3.3 Matching by Correlation
Given an image f(x,y), the correlation problem is to find all places in the image
that match a given subimage (also called a mask or template) w(x,y).
One approach for finding matches is to treat w(x,y) as a spatial filter and
compute the sum of products for each location of w in f. The best matches of w(x,y)
in f(x,y) are the locations of the maximum values in the resulting correlation image.
Practical implementations of spatial correlation typically rely on
hardware-oriented solutions.
For prototyping, an alternative approach is to implement correlation in the
frequency domain, making use of the correlation theorem, which relates spatial
correlation to the product of the image transforms.
The correlation theorem:
f(x, y) ∘ w(x, y) ⇔ F(u, v) H*(u, v)
f(x, y) w*(x, y) ⇔ F(u, v) ∘ H(u, v)
where ∘ denotes correlation, * denotes the complex conjugate, and F and H are the
Fourier transforms of f and w, respectively.
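A hedged sketch of frequency-domain correlation along these lines (our own code; it assumes f and w are gray-scale arrays of class double, zero-pads w to the size of f, and ignores the padding and centering refinements a careful implementation would add):
>> F = fft2(f);
>> H = fft2(w, size(f, 1), size(f, 2));    % template transform, zero-padded to size(f)
>> g = real(ifft2(F .* conj(H)));          % correlation image of f with w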
12- 1
To find the location of the best match, we find the location(s) of the
highest value in the correlation image:
>> [I, J] = find(g == max(g(:)))
I = 554
J = 203
Another approach is to threshold the correlation image near its maximum, or to
threshold its scaled version, gs, whose highest value is known to be 255. Fig. 12.1(d)
was obtained using the command:
>> imshow(gs > 254)
The well-known Bayes classifier for a 0-1 loss function has decision functions of the form
d_j(x) = p(x/ω_j) P(ω_j),  j = 1, 2, ..., W
where p(x/ω_j) is the probability density function (PDF) of the pattern vectors of class ω_j
and P(ω_j) is the probability of occurrence of class ω_j. For Gaussian pattern classes,
p(x/ω_j) = exp[-(1/2)(x - m_j)^T C_j^{-1} (x - m_j)] / [(2π)^{n/2} |C_j|^{1/2}]
where C_j and m_j are the covariance matrix and mean vector of the patterns of class ω_j.
Substituting this expression for the Gaussian PDF and taking logarithms gives the
equivalent decision function
d_j(x) = ln P(ω_j) - (n/2) ln 2π - (1/2) ln|C_j| - (1/2)[(x - m_j)^T C_j^{-1} (x - m_j)]
Table 12.1 summarizes the recognition results obtained with the training and
independent pattern sets.
The principal approach in use today for this type of classification is based on
neural networks.
Table 12.3
12.4.2 String Matching
and b1b2 ...bn , respectively. Let denote the number of matches between these two
strings, where a match is said to occur in the kth position if ak bk . The number of
symbols that do not match is
max( a , b )
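A simple similarity measure built from these quantities is the ratio of matches to mismatches, R = α/β; the following hedged sketch (our own variable names and example strings, not the book's strsimilarity code) computes it directly:
>> a = 'aabbccb';  b = 'aabccbb';               % example symbol strings (ours)
>> L = min(length(a), length(b));
>> alpha = sum(a(1:L) == b(1:L));               % number of position-wise matches
>> beta  = max(length(a), length(b)) - alpha;   % number of mismatches
>> R = alpha / beta
R = 2.5000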
12- 3
[xn,yn]=randvertex(x,y,npix)
Angles=polyangles(x,y)
S=floor(angles/45)+1
S=int2str(s)
R=strsimilarity(sl1,sl2)
The results obtained using five typical strings are summarized in Table 12.4.
Table 12.5 shows the same type of computation involving five strings of class 2
against themselves. Table 12.6 shows values of the similarity measure between the
strings of class 1 and class 2.
Summary
Although the material in the present chapter is introductory in nature, the topics
covered are fundamental to understanding the state of the art in object recognition.
Having finished study of the material in the preceding twelve chapters, the reader
is now in the position of being able to master the fundamentals of how to prototype
software solutions of image-processing problems using MATLAB and Image
Processing Toolbox functions.