Color Imaging
Color Imaging
Fundamentals and Applications
Erik Reinhard
Erum Arif Khan
Ahmet Oğuz Akyüz
Garrett Johnson
A K Peters, Ltd.
Wellesley, Massachusetts
A K Peters, Ltd.
888 Worcester Street, Suite 230
Wellesley, MA 02482
www.akpeters.com
All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.
TA1634.R45 2007
621.36’7--dc22
2007015704
Printed in India
12 11 10 09 08 10 9 8 7 6 5 4 3 2 1
Contents
Preface

I Principles

1 Introduction
1.1 Color in Nature
1.2 Color in Society
1.3 In this Book
1.4 Further Reading

2 Physics of Light
2.1 Electromagnetic Theory
2.2 Waves
2.3 Polarization
2.4 Spectral Irradiance
2.5 Reflection and Refraction
2.6 Birefringence
2.7 Interference and Diffraction
2.8 Scattering
2.9 Geometrical Optics
2.10 Application: Image Synthesis
2.11 Application: Modeling the Atmosphere
2.12 Summary
2.13 Further Reading
5 Perception
5.1 Lightness, Brightness, and Related Definitions
5.2 Reflectance and Illumination
5.3 Models of Color Processing
5.4 Visual Illusions
5.5 Adaptation and Sensitivity
5.6 Visual Acuity
5.7 Simultaneous Contrast
5.8 Lightness Constancy
5.9 Color Constancy
5.10 Category-Based Processing
5.11 Color Anomalies
5.12 Application: Shadow Removal from Images
5.13 Application: Graphical Design
5.14 Application: Telling Humans and Computers Apart
5.15 Further Reading
7 Colorimetry
7.1 Grassmann’s Laws
7.2 Visual Color Matching
7.3 Color-Matching Functions
7.4 CIE 1931 and 1964 Standard Observers
7.5 Calculating Tristimulus Values and Chromaticities
7.6 Practical Applications of Colorimetry
7.7 Application: Iso-Luminant Color Maps
7.8 Further Reading

9 Illuminants
9.1 CIE Standard Illuminants and Sources
9.2 Color Temperature
9.3 Color-Rendering Index
9.4 CIE Metamerism Index
9.5 Dominant Wavelength
9.6 Excitation Purity
9.7 Colorimetric Purity
IV Appendices

B Trigonometry
B.1 Sum and Difference Formulae
B.2 Product Identities
B.3 Double-Angle Formulae
B.4 Half-Angle Formulae
B.5 Sum Identities
B.6 Solid Angle
Bibliography

Index
Preface
Color is one of the most fascinating areas to study. Color forms an integral part
of nature, and we humans are exposed to it every day. We all have an intuitive
understanding of what color is, but by studying the underlying physics, chemistry,
optics, and human visual perception, the true beauty and complexity of color can
be appreciated—at least to some extent. Such understanding is not just important
in these areas of research, but also for fields such as color reproduction, vision
science, atmospheric modeling, image archiving, art, photography, and the like.
Many of these application areas are served very well by several specifically targeted books. These books do an excellent job of explaining in detail some aspect of color that happens to be most important for the target audience. This is understandable, as our knowledge of color spans many disciplines and can therefore be difficult to fathom.
It is our opinion that in application areas of computer science and computer engineering, including such exciting fields as computer graphics, computer vision, high dynamic range imaging, image processing, and game development, the role of color is not yet fully appreciated. We have come across several applications as well as research papers where color is added as an afterthought, and frequently incorrectly at that. The dreaded RGB color space, which is really a collection of loosely similar color spaces, is one of the culprits.
With this book, we hope to give a deep understanding of what color is, and where color comes from. We also aim to show how color can be used correctly in many different applications. Where appropriate, we include at the end of each chapter sections on applications that exploit the material covered. While the book is primarily aimed at computer-science and computer-engineering related areas, as mentioned above, it is suitable for any technically minded reader with an interest in color. In addition, the book can also be used as a textbook serving a graduate-level course on color theory. In any case, we believe that to be useful in any engineering-related discipline, the theories should be presented in an intuitive manner, while also presenting all of the mathematics in a form that allows both a deeper understanding and its implementation.
Most of the behavior of light and color can be demonstrated with simple experiments that can be replicated at home. To add to the appeal of this book, where possible, we show how to set up such experiments, which frequently require no more than ordinary household objects. For instance, the wave-like behavior of light is easily demonstrated with a laser pointer and a knife. Also, several visual illusions can be replicated at home. We have shied away from such simple experiments only when unavoidable.
The life cycle of images starts with either photography or rendering, and in-
volves image processing, storage, and display. After the introduction of digital
imaging, the imaging pipeline has remained essentially the same for more than
two decades. The phosphors of conventional CRT devices are such that in the
operating range of the human visual system only a small number of discernible
intensity levels can be reproduced. As a result, there was never a need to capture
and store images with a fidelity greater than can be displayed. Hence the immense
legacy of eight-bit images.
High dynamic range display devices have effectively lifted this restriction, and this has caused a rethinking of the imaging pipeline. Image capturing techniques can and should record the full dynamic range of the scene, rather than just the restricted range that can be reproduced on older display devices. In this book, the vast majority of the photography was done in high dynamic range (HDR), with each photograph tone-mapped for reproduction on paper. In addition, high dynamic range imaging (HDRI) is integral to the writing of the text, with exceptions only made in specific places to highlight the differences between conventional imaging and HDRI. Thus, the book is as future-proof as we could possibly make it.
Acknowledgments
Numerous people have contributed to this book with their expertise and help.
In particular, we would like to thank Eric van Stryland, Dean and Director of
CREOL, who has given access to many optics labs, introduced us to his colleagues, and allowed us to photograph some of the exciting research undertaken
at the School of Optics, University of Central Florida. Karen Louden, Curator
and Director of Education of the Albin Polasek Museum, Winter Park, Florida,
has given us free access to photograph in the Albin Polasek collection.
We have sourced many images from various researchers. In particular, we
are grateful for the spectacular renderings given to us by Diego Gutierrez and
his colleagues from the University of Zaragoza. The professional photographs
donated by Kirt Witte (Savannah College of Art and Design) grace several pages,
and we gratefully acknowledge his help. Several interesting weather phenomena
were photographed by Timo Kunkel, and he has kindly allowed us to reproduce
some of them. We also thank him for carefully proofreading an early draft of the
manuscript.
We have had stimulating discussions with Karol Myszkowski, Grzegorz
Krawczyk, Rafał Mantiuk, Kaleigh Smith, Edward Adelson and Yuanzhen Li,
the results of which have become part of the chapter on tone reproduction. This
chapter also benefitted from the source code of Yuanzhen Li’s tone-reproduction
operator, made available by Li and her colleagues, Edward Adelson and Lavanya
Sharan. We are extremely grateful for the feedback received from Charles Poynton, which helped improve the manuscript throughout in both form and substance.
We have received a lot of help in various ways, both direct and indirect, from
many people. In no particular order, we gratefully acknowledge the help from
Janet Milliez, Vasile Rotar, Eric G Johnson, Claudiu Cirloganu, Kadi Bouatouch,
Dani Lischinski, Ranaan Fattal, Alice Peters, Franz and Ineke Reinhard, Gordon
Kindlmann, Sarah Creem-Regehr, Charles Hughes, Mark Colbert, Jared Johnson,
Jaakko Konttinen, Veronica Sundstedt, Greg Ward, Mashhuda Glencross, Helge
Seetzen, Mahdi Nezamabadi, Paul Debevec, Tim Cox, Jessie Evans, Michelle
Ward, Denise Penrose, Tiffany Gasbarrini, Aaron Hertzmann, Kevin Suffern,
Guoping Qiu, Graham Finlayson, Peter Shirley, Michael Ashikhmin, Wolfgang
Heidrich, Karol Myszkowski, Grzegorz Krawczyk, Rafal Mantiuk, Kaleigh
Smith, Majid Mirmehdi, Louis Silverstein, Mark Fairchild, Nan Schaller, Walt
Bankes, Tom Troscianko, Heinrich Bülthoff, Roland Fleming, Bernard Riecke,
Kate Devlin, David Ebert, Francisco Seron, Drew Hess, Gary McTaggart, Habib
Zargarpour, Peter Hall, Maureen Stone, Holly Rushmeier, Narantuja Bujantogtoch, Margarita Bratkova, Tania Pouli, Ben Long, Native Visions Art Gallery (Winter Park, Florida), the faculty, staff, and students of the Munsell Color Science Laboratory, Lawrence Taplin, Ethan Montag, Roy Berns, Val Helmink, Colleen Desimone, Sheila Brady, Angus Taggart, Ron Brinkmann, Melissa Answeeney, Bryant Johnson, and Paul and Linda Johnson.
Part I
Principles
Chapter 1
Introduction
at the atomic level, such that the origins of color can be appreciated. We find that the intimate relationship between energy levels, orbital states, and electromagnetic waves helps to understand why diamonds shimmer, rubies are red, and the feathers of the blue jay are blue. Even before light enters the eye, a lot has already happened.
The complexities of color multiply when perception is taken into account. The
human eye is not a simple light detector by any stretch of the imagination. Human
vision is able to solve an inherently under-constrained problem: it tries to make
sense out of a 3D world using optical projections that are two-dimensional. To
reconstruct a three-dimensional world, the human visual system needs to make
a great many assumptions about the structure of the world. It is quite remarkable how well this system works, given how difficult it is to find a computational
solution that only partially replicates these achievements.
When these assumptions are violated, the human visual system can be fooled into perceiving the wrong thing. For instance, if a human face is lit from above, it is instantly recognizable. If the same face is lit from below, it is almost impossible to determine whose face it is. It can be argued that whenever an assumption is broken, a visual illusion emerges. Visual illusions are therefore important for learning how the human visual system operates. At the same time, they are important, for instance in computer graphics, for understanding which image features need to be rendered correctly and which ones can be approximated while maintaining realism.
Color theory is at the heart of this book. All other topics serve to underpin
the importance of using color correctly in engineering applications. We find that
too often color is taken for granted, and engineering solutions, particularly in
computer graphics and computer vision, therefore appear suboptimal. To redress
the balance, we provide chapters detailing all important issues governing color
and its perception, along with many examples of applications.
We begin this book with a brief assessment of the roles color plays in different
contexts, including nature and society.
Figure 1.1. The chlorophyll in leaves causes most plants to be colored green.

…(Figure 1.1), which is due to chlorophyll, a pigment that plays a role in photosynthesis (see Section 3.4.1).
• Color has evolved in many species in conjunction with the color vision
of the same or other species, for instance for camouflage (Figure 1.2), for
attracting partners (Figure 1.3), for attracting pollinators (Figure 1.4), or for
appearing unappetizing to potential predators (Figure 1.5).
Figure 1.2. Many animals are colored similar to their environment to evade predators.
Figure 1.3. This peacock uses bright colors to attract a mate; Paignton Zoo, Devon, UK.
(Photo by Brett Burridge (www.brettb.com).)
Figure 1.4. Many plant species grow brightly colored flowers to attract pollinators such
as insects and bees; Rennes, France, June 2005.
Figure 1.5. These beetles have a metallic color, presumably to discourage predators.
Figure 1.6. This desert rose is colored to reflect light and thereby better control its
temperature.
Color in plants also aids other functions such as the regulation of temperature.
In arid climates plants frequently reflect light of all colors, thus appearing lightly
colored such as the desert rose shown in Figure 1.6.
In humans, color vision is said to have co-evolved with the color of fruit [787,
944]. In industrialized Caucasians, color deficiencies occur relatively often, while
color vision is better developed in people who work the land. Thus, on average,
color vision is diminished in people who do not depend on it for survival [216].
Human skin color is largely due to pigments such as eumelanin and phaeomelanin. The former is brown to black, whereas the latter is yellow to reddish-brown. Deeper layers contain yellow carotenoids. Some color in human skin is derived from scattering, as well as the occurrence of blood vessels [520].

Light scattering in combination with melanin pigmentation is also the mechanism that determines eye color in humans and mammals. There appears to be
a correlation between eye color and reactive skills. In the animal world, hunters
who stalk their prey tend to have light eye colors, whereas hunters who obtain
their prey in a reactive manner, such as birds that catch insects in flight, have dark
eyes. This difference, as yet to be fully explained, extends to humankind where
dark-eyed people tend to have faster reaction times to both visual and auditory
stimuli than light-eyed people [968].
Figure 1.7. Color plays an important role in art. An example is this photograph of the Tybee Light House, taken by Kirt Witte, which won the 2006 International Color Award’s Masters of Color Photography award in the abstract category for professional photographers (see also www.theothersavannah.com).
10 1. Introduction
Figure 1.8. Statue by Albin Polasek; Albin Polasek Museum, Winter Park, FL, 2004.
Figure 1.9. Color in different church interiors; Rennes, France, June 2005 (top); Mainau,
Germany, July 2005 (bottom).
With modern technology, new uses of color have come into fashion. Important areas include the reproduction of color [509], lighting design [941], photography [697], and television and video [923].
and other engineering disciplines, color also plays a crucial role, but perhaps less
so than it should. Throughout this book, examples of color use in these fields are
provided.
book. However, it is clear that with the advent of more sophisticated techniques, the once relatively straightforward theories of color processing in the visual cortex have progressed to be significantly less straightforward. This trend is continuing to this day. The early inferences made on the basis of single-cell recordings have been replaced with a vast amount of knowledge that is often contradictory, and every new study that becomes available poses intriguing new questions. On the whole, however, it appears that color is not processed as a separate image attribute, but is processed together with other attributes such as position, size, frequency, direction, and orientation.
Color can also be studied from a perceptual point of view. Here, the human visual system is treated as a black box, with outputs that can be measured. In psychophysical tests, participants are set a task which must be completed in response to the presentation of visual stimuli. By correlating the task response to the stimuli that are presented, important conclusions regarding the human visual system may be drawn. Chapter 5 describes some of the findings from this field of study, as it pertains to theories of color. This includes visual illusions, adaptation, visual acuity, contrast sensitivity, and constancy.
In the following chapters, we build upon the fundamentals underlying color theory. Chapter 6 deals with radiometry and photometry, whereas Chapter 7 discusses colorimetry. Much research has been devoted to color spaces that are designed for different purposes. Chapter 8 introduces many of the currently-used color spaces and explains the strengths and weaknesses of each color space. The purpose of this chapter is to give transformations between existing color spaces and to enable the selection of an appropriate color space for specific tasks, realizing that each task may require a different color space.
Light sources, and their theoretical formulations (called illuminants), are discussed in Chapter 9. Chapter 10 introduces chromatic adaptation, showing that the perception of a colored object does not only depend on the object’s reflectance, but also on its illumination and the state of adaptation of the observer. While colorimetry is sufficient for describing colors, an extended model is required to account for the environment in which the color is observed. Color appearance models take as input the color of a patch, as well as a parameterized description of the environment. These models then compute appearance correlates that describe perceived attributes of the color, given the environment. Color appearance
models are presented in Chapter 11.
In Part III, the focus is on images, and in particular their capture and display.
Much of this part of the book deals with the capture of high dynamic range images, as we feel that such images are gaining importance and may well become
the de-facto norm in all applications that deal with images.
Chapter 12 deals with the capture of images and includes an in-depth description of the optical processes involved in image formation, as well as issues related to digital sensors. This chapter also includes sections on camera characterization, and more specialized capture techniques such as holography and light field data. Techniques for the capture of high dynamic range images are discussed in Chapter 13. The emphasis is on multiple exposure techniques, as these are currently
most cost effective, requiring only a standard camera and appropriate software.
Display hardware is discussed in Chapter 14, including conventional and
emerging display hardware. Here, the focus is on liquid crystal display devices, as
these currently form the dominant display technology. Further, display calibration
techniques are discussed.
Chapter 15 is devoted to a discussion on natural image statistics, a field important as a tool both to help understand the human visual system, and to help structure and improve image processing algorithms. This chapter also includes sections on techniques to measure the dynamic range of images, and discusses cross-media display technology, gamut mapping, gamma correction, and algorithms for correcting for light reflected off display devices. Color management for images is treated in Chapter 16, with a strong emphasis on ICC profiles. Finally, Chapter 17 presents current issues in tone reproduction, a collection of algorithms required to prepare a high dynamic range image for display on a conventional display device.
For each of the topics presented in the third part of the book, the emphasis
is on color management, rather than spatial processing. As such, these chapters
augment, rather than replace, current books on image processing.
The book concludes with a set of appendices, which are designed to help clarify the mathematics used throughout the book (vectors and matrices, trigonometry, and complex numbers), and to provide tables of units and constants for easy reference. We also refer to the DVD-ROM included with the book, which contains a large collection of images in high dynamic range format, as well as tonemapped versions of these images (in JPEG-HDR format for backward compatibility), included for experimentation. The DVD-ROM also contains a range of spectral functions, a metameric spectral image, as well as links to various resources on the Internet.
overview of dyes and pigments is available in Colors: The Story of Dyes and Pigments [245]. An overview of color in art and science is presented in a collection of papers edited by Lamb and Bourriau [642]. Finally, a history of color order, including practical applications, is collected in Rolf Kuehni’s Color Space and its Divisions: Color Order from Antiquity to Present [631].
Chapter 2
Physics of Light
the theory of light. It gives rise to various applications, including ray tracing in
optics. It is also the foundation for all image synthesis as practiced in the field of
computer graphics. We show this by example in Section 2.10.
Thus, the purpose of this chapter is to present the theory of electromagnetic
waves and to show how light propagates through different media and behaves near
boundaries and obstacles. This behavior by itself gives rise to color. In Chapter 3,
we explain how light interacts with matter at the atomic level, which gives rise to
several further causes for color, including dispersion and absorption.
• Faraday’s law;
We present each of these laws in integral form first, followed by their equiv-
alent differential form. The integral form has a more intuitive meaning but is
restricted to simple geometric cases, whereas the differential form is valid for any
point in space where the vector fields are continuous.
There are several systems of units and dimensions used in Maxwell’s equations, including Gaussian units, Heaviside-Lorentz units, electrostatic units, electromagnetic units, and SI units [379]. There is no specific reason to prefer one system over another. Since the SI system is favored in engineering-oriented disciplines, we present all equations in this system. In the SI system, the basic quantities are the meter (m) for length, the kilogram (kg) for mass, the second (s) for time, the ampere (A) for electric current, the kelvin (K) for thermodynamic temperature, the mole (mol) for amount of substance, and the candela (cd) for luminous intensity (see Table D.3 in Appendix D).
Figure 2.1. Two charges Q and q exert equal, but opposite force F upon each other
(assuming the two charges have equal sign).
\mathbf{F} = \frac{1}{4 \pi \varepsilon_0} \frac{Qq}{R^2} \mathbf{e}_Q.   (2.1)

In this equation, R is the distance between the charges, and the constant ε0 = 1/(36π) × 10^−9 (in farad per meter) is called the permittivity of vacuum. The vector e_Q is a unit vector pointing from the position of one charge to the position of the other.
If we assume that Q is an arbitrary charge and that q is a unit charge, then
we can compute the electric field intensity E by dividing the left- and right-hand
Figure 2.2. The electric field intensity E at the position of charge q is due to multiple
charges located in space.
sides of (2.1) by q:

\mathbf{E} = \frac{\mathbf{F}}{q} = \frac{Q}{4 \pi \varepsilon_0 R^2} \mathbf{e}_Q.   (2.2)
If there are multiple charges present, then the electric field intensity at the position of the test charge is given by the sum of the individual field intensities (Figure 2.2):

\mathbf{E} = \sum_{i=1}^{N} \frac{Q_i}{4 \pi \varepsilon_0 R_i^2} \mathbf{e}_{Q_i}.   (2.3)
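The superposition in (2.3) is straightforward to evaluate numerically. The following Python sketch is an illustration, not taken from the text; the charge values and positions are chosen arbitrarily:

```python
import numpy as np

EPS0 = 8.854e-12  # permittivity of vacuum, epsilon_0 (F/m)

def e_field(charges, positions, at):
    """Electric field intensity E at point 'at', superposing the Coulomb
    fields of point charges as in Equation (2.3)."""
    at = np.asarray(at, dtype=float)
    E = np.zeros(3)
    for Q, p in zip(charges, positions):
        r = at - np.asarray(p, dtype=float)  # vector from charge to field point
        R = np.linalg.norm(r)                # distance R_i
        E += Q / (4.0 * np.pi * EPS0 * R**2) * (r / R)  # Q_i/(4 pi eps0 R_i^2) e_Qi
    return E

# Two opposite 1 nC charges straddling the origin along the x-axis.
E = e_field([1e-9, -1e-9], [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]], [0.0, 0.0, 0.0])
```

At the origin both charges contribute a field in the same (+x) direction, so the two terms of the sum reinforce rather than cancel.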
For media other than vacuum, the permittivity will generally have a different
value. In that case, we drop the subscript 0, so that, in general, we have
D = ε E. (2.5)
This equation may be seen as relating the electric field intensity to the electric
flux density, whereby the difference between the two vectors is in most cases
determined by a constant unique to the specific medium. This constant, ε , is
called the material’s permittivity or dielectric constant. Equation (2.5) is one of
three so-called material equations. The remaining two material equations will be
discussed in Sections 2.1.3 and 2.1.8.
If, instead of a finite number of separate charges, we have a distribution of charges over space, the electric flux density is governed by Gauss’ law for electric fields:

\oint_s \mathbf{D} \cdot \mathbf{n} \, ds = Q.   (2.6)

The integral is over the closed surface s, and n denotes the outward-facing surface normal. If the charge Q is distributed over the volume v according to a charge distribution function ρ (also known as electric charge density), we may rewrite Gauss’ law as follows:

\oint_s \mathbf{D} \cdot \mathbf{n} \, ds = \int_v \rho \, dv.   (2.7)
Thus, a distribution of charges over a volume gives rise to an electric field that may
be measured over a surface that bounds that volume. In other words, the electric
flux emanating from an enclosing surface is related to the charge contained by
that surface.
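This relation can be checked numerically. The sketch below is illustrative (the charge value, its position, and the sphere radius are chosen arbitrarily): it places a point charge off-center inside a sphere and estimates the closed-surface integral of D · n by Monte Carlo sampling; by Gauss’ law the result should equal the enclosed charge Q:

```python
import numpy as np

rng = np.random.default_rng(0)
Q, R = 1e-9, 2.0                  # enclosed charge (C) and sphere radius (m)
p = np.array([0.5, 0.0, 0.0])     # charge position: off-center, but inside

# Uniformly sampled points x on the sphere |x| = R; the outward normal is x / R.
n = rng.normal(size=(200_000, 3))
n /= np.linalg.norm(n, axis=1, keepdims=True)
x = R * n

# Electric flux density of a point charge: D = Q / (4 pi r^2) e_r, with r
# measured from the charge (note that epsilon_0 does not appear in D).
r = x - p
dist = np.linalg.norm(r, axis=1, keepdims=True)
D = Q / (4.0 * np.pi) * r / dist**3

# Monte Carlo estimate of the surface integral of D . n over the sphere.
flux = np.mean(np.sum(D * n, axis=1)) * 4.0 * np.pi * R**2
```

Even though D · n varies over the surface (the charge is off-center), the total flux comes out as Q.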
F = Q v × B. (2.8)
1 See also Appendix A which provides the fundamentals of vector algebra and includes further
detail about the relationship between integrals over contours, surfaces, and volumes.
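Equation (2.8) is a direct cross product and is easy to evaluate; the following Python sketch (with arbitrarily chosen charge, velocity, and field values, not taken from the text) shows that the resulting force is perpendicular to both v and B:

```python
import numpy as np

Q = 1.602e-19                     # charge of a proton (C)
v = np.array([1.0e6, 0.0, 0.0])   # particle velocity (m/s), along x
B = np.array([0.0, 0.0, 1.5e-3])  # magnetic flux density (T), along z

F = Q * np.cross(v, B)            # Equation (2.8): F = Q v x B
```

The force comes out along −y: the moving charge is deflected sideways, and since F · v = 0 the magnetic field does no work and leaves the particle’s speed unchanged.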
Figure 2.3. The Lorentz force equation: F is the sum of E and the cross product of the
particle’s velocity v and the magnetic flux density B at the position of the particle.
B = μ H. (2.10)
This equation is the second of three material equations and will be discussed
further in Section 2.1.8.
enclosed by that surface), magnetic fields behave somewhat differently. The total
magnetic flux emanating from a closed surface bounding a magnetic field is equal
to zero; this constitutes Gauss’ law for magnetic fields:
\oint_s \mathbf{B} \cdot \mathbf{n} \, ds = 0.   (2.11)
This equation implies that no free magnetic poles exist. As an example, the mag-
netic flux emanating from a magnetic dipole at its north pole is matched by the
flux directed inward towards its south pole.
\oint_c \mathbf{E} \cdot \mathbf{n} \, dc = -\frac{d}{dt} \int_s \mathbf{B} \cdot \mathbf{n} \, ds.   (2.12)

Here, the left-hand side is an integral over the contour c that encloses an open surface s. The quantity integrated is the component of the electric field intensity E normal to the contour. The right-hand side integrates the normal component of B over the surface s. Note that the right-hand side integrates over an open surface, whereas the integral in (2.11) integrates over a closed surface.
j = ρ v. (2.13)
If the current flux density j is integrated over surface area, then we find the total
charge passing through this surface per second (coulomb/second = ampere; C/s =
A). Thus, the current resulting from a flow of charges is given by
\int_s \mathbf{j} \cdot \mathbf{n} \, ds.   (2.14)
The units in (2.14) and (2.16) are now both in coulomb per second and are thus
measures of current. Both types of current are related to the magnetic flux density
according to Ampere’s circuital law:
\oint_c \mathbf{H} \cdot \mathbf{n} \, dc = \int_s \mathbf{j} \cdot \mathbf{n} \, ds + \frac{d}{dt} \int_s \mathbf{D} \cdot \mathbf{n} \, ds.   (2.17)
This law states that a time-varying magnetic field can be produced by the flow of
charges (a current), as well as by a displacement current.
The above equations are given in integral form. They may be rewritten in differential form, after which these equations hold for points in space where both electric and magnetic fields are continuous. This facilitates solving these four simultaneous equations.
Starting with Gauss’ law for electric fields, we see that the left-hand side
of (2.18a) is an integral over a surface, whereas the right-hand side is an integral
and, therefore, in the limit when the volume v goes to zero, we have
∇·D = ρ . (2.21)
A similar set of steps may be applied to Gauss’ law for magnetic fields, and this
will result in a similar differential form:
∇·B = 0. (2.22)
Ampere’s law, given in (2.18d), is stated in terms of a contour integral on the left-hand side and a surface integral on the right-hand side. Here, it is appropriate to apply Stokes’ theorem to bring both sides into the same domain (see Appendix A):

\oint_c \mathbf{H} \cdot \mathbf{n} \, dc = \int_s (\nabla \times \mathbf{H}) \cdot \mathbf{n} \, ds.   (2.23)
Substituting this result into (2.18d) yields a form where all integrals are over the
same surface. In the limit when this surface area s goes to zero, Equation (2.18d)
becomes
\nabla \times \mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t},   (2.24)
which is the desired differential form. Finally, Faraday’s law may also be rewritten by applying Stokes’ theorem to the left-hand side of (2.18c):

\oint_c \mathbf{E} \cdot \mathbf{n} \, dc = \int_s (\nabla \times \mathbf{E}) \cdot \mathbf{n} \, ds = -\int_s \frac{\partial \mathbf{B}}{\partial t} \cdot \mathbf{n} \, ds.   (2.25)
Under the assumption that the area s becomes vanishingly small, this equation
yields
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}.   (2.26)
Faraday’s law and Ampere’s law indicate that a time-varying magnetic field
has the ability to generate an electric field, and that a time-varying electric field
generates a magnetic field. Thus, time-varying electric and magnetic fields can
generate each other. This property forms the basis for wave propagation, allow-
ing electric and magnetic fields to propagate away from their source. As light can
be considered to consist of waves propagating through space, Maxwell’s equa-
tions are fundamental to all disciplines involved with the analysis, modeling, and
synthesis of light.
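As an entirely illustrative numerical check (the field definitions, constants, and finite-difference scheme below are our own, not the book's), one can verify that a plane wave E = E0 cos(ωt − βz) along x, paired with B = (βE0/ω) cos(ωt − βz) along y, satisfies the y-component of Faraday's law, ∂Ex/∂z = −∂By/∂t:

```python
import math

E0, w, b = 1.0, 2.0, 0.5  # illustrative amplitude, angular frequency, wavenumber

def Ex(z, t):
    # electric field component: E0 cos(wt - bz) along x
    return E0 * math.cos(w * t - b * z)

def By(z, t):
    # accompanying magnetic field component along y
    return (b * E0 / w) * math.cos(w * t - b * z)

# central finite differences approximate the partial derivatives
h = 1e-6
z0, t0 = 0.3, 0.7
dEx_dz = (Ex(z0 + h, t0) - Ex(z0 - h, t0)) / (2 * h)
dBy_dt = (By(z0, t0 + h) - By(z0, t0 - h)) / (2 * h)

# Faraday's law for this geometry: dEx/dz = -dBy/dt, so the residual vanishes
residual = dEx_dz + dBy_dt
```

The residual is zero up to finite-difference error, confirming that the two fields sustain each other as the text describes.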
The first two material equations relate D and B to the field intensities E and H
through the permittivity ε and the permeability μ:
D = ε E; (2.27a)
B = μ H. (2.27b)
The third material equation relates the current flux density j to the electric field
intensity E according to a constant σ and is known as Ohm’s law. It is given here
in differential form:
j = σ E. (2.27c)
Material          σ (S/m)       Material        σ (S/m)
Good conductors
Silver            6.17 × 10⁷    Tungsten        1.82 × 10⁷
Copper            5.8 × 10⁷     Brass           1.5 × 10⁷
Gold              4.1 × 10⁷     Bronze          1.0 × 10⁷
Aluminium         3.82 × 10⁷    Iron            1.0 × 10⁷
Poor conductors
Water (fresh)     1.0 × 10⁻³    Earth (dry)     1.0 × 10⁻³
Water (sea)       4.0 × 10⁰     Earth (wet)     3.0 × 10⁻²
Insulators
Diamond           2.0 × 10⁻¹³   Porcelain       1.0 × 10⁻¹⁰
Glass             1.0 × 10⁻¹²   Quartz          1.0 × 10⁻¹⁷
Polystyrene       1.0 × 10⁻¹⁶   Rubber          1.0 × 10⁻¹⁵
ε = ε0 εr ; (2.30a)
μ = μ0 μr . (2.30b)
Values of εr and μr are given for several materials in Tables 2.3 and 2.4.
Normally, the three material constants, σ , ε , and μ , are independent of the
field strengths. However, this is not always the case. For some materials these
values also depend on past values of E or B. In this book, we will not consider
such effects of hysteresis. Similarly, unless indicated otherwise, the material con-
stants are considered to be isotropic, which means that their values do not change
Table 2.2. The speed of light c, permittivity ε0 , and permeability μ0 (all in vacuum).
Table 2.3. Relative permittivity εr for several materials.
Material       εr        Material             εr
Air            1.0006    Paper                2–4
Alcohol        25        Polystyrene          2.56
Earth (dry)    7         Porcelain            6
Earth (wet)    30        Quartz               3.8
Glass          4–10      Snow                 3.3
Ice            4.2       Water (distilled)    81
Nylon          4         Water (sea)          70
Table 2.4. Relative permeability μr for several materials.
Material     μr          Material    μr
Aluminium    1.000021    Nickel      600.0
Cobalt       250.0       Platinum    1.0003
Copper       0.99999     Silver      0.9999976
Gold         0.99996     Tungsten    1.00008
Iron         5000.0      Water       0.9999901
2.2 Waves
Maxwell’s equations form a set of simultaneous equations that are normally diffi-
cult to solve. In this section, we are concerned with finding solutions to Maxwell’s
equations, and to accomplish this, we may apply simplifying assumptions. The
assumptions outlined in the previous section, that material constants are isotropic
and independent of time, are the first simplifications. To get closer to an appro-
priate solution with which we can model light and optical phenomena, further
simplifications are necessary.
In particular, a reasonable class of models that are solutions to Maxwell’s
equations is formed by time-harmonic plane waves. With these waves, we can ex-
plain optical phenomena such as polarization, reflection, and refraction. We first
derive the wave equation, which enables the decoupling of Maxwell’s equations,
and therefore simplifies the solution. We then discuss plane waves, followed by
time-harmonic fields, and time-harmonic plane waves. Each of these steps con-
stitutes a further specialization of Maxwell’s equations.
In source-free regions of space, where the charge and current densities ρ and j
are zero, Maxwell's equations reduce to
∇·D = 0; (2.31a)
∇·B = 0; (2.31b)
∇× E = −∂B/∂t; (2.31c)
∇× H = ∂D/∂t. (2.31d)
We may apply the material equations for D and H to yield a set of Maxwell’s
equations in E and B only:
ε ∇·E = 0; (2.32a)
∇·B = 0; (2.32b)
∇× E = −∂B/∂t; (2.32c)
(1/μ) ∇× B = ε ∂E/∂t. (2.32d)
This set of equations still expresses E in terms of B and B in terms of E. Never-
theless, this result may be decoupled by applying the curl operator to (2.32c) (see
Appendix A):
∇× ∇× E = −(∂/∂t) ∇× B. (2.33)
Substituting (2.32d) into this equation then yields
∇× ∇× E = −μ ε ∂²E/∂t². (2.34)
To simplify this equation, we may apply identity (A.23) from Appendix A:
∇(∇·E) − ∇²E = −μ ε ∂²E/∂t². (2.35)
From (2.32a), we know that we may set ∇·E to zero in (2.35) to yield the standard
equation for wave motion of an electric field,
∇²E − μ ε ∂²E/∂t² = 0. (2.36)
A similar derivation yields the wave equation for the magnetic field:
∇²B − μ ε ∂²B/∂t² = 0. (2.37)
The two wave equations do not depend on each other, thereby simplifying the
solution of Maxwell’s equations. This result is possible, because the charge and
current distributions are zero in source-free regions, and, therefore, we could sub-
stitute ∇·E = 0 in (2.35) to produce (2.36) (and similarly for (2.37)). Thus, both
wave equations are valid for wave propagation problems in regions of space that
do not generate radiation (i.e., they are source-free). Alternatively, we may as-
sume that the source of the wave is sufficiently far away.
Under these conditions, we are therefore looking for solutions to the wave
equations, rather than solutions to Maxwell’s equations. One such solution is
afforded by plane waves, which we discuss next.
For convenience, we choose a coordinate system such that one of the axes, say z,
is aligned with the direction of propagation, i.e., the surface normal of the plane is
s = (0, 0, 1). As a result, we have r · s = z. Thus, if we consider a wave modeled by
an infinitely large plane propagating through space in the direction of its normal
s, we are looking for a solution whereby the spatial derivatives in the plane
are zero and the spatial derivatives along the surface normal are non-zero. The
Laplacian operators in (2.36) and (2.37) then simplify to2
∇²E = ∂²E/∂z². (2.39)
∂ z2
Substituting into (2.36) yields
∂²E/∂z² − μ ε ∂²E/∂t² = 0. (2.40)
2 In this section we show results for E and note that similar results may be derived for B.
To show that plane waves are transversal, i.e., both E and H are perpendicular
to the direction of propagation s, we take the dot product between E and s and H
and s:
E · s = −√(μ/ε) (s × H) · s, (2.47a)
H · s = √(ε/μ) (s × E) · s. (2.47b)
A time-harmonic field varies sinusoidally with time, i.e., as
a cos(ω t + ϕ), (2.48)
where a is the amplitude and ω t + ϕ is the phase. The current and charge sources
vary with time t as well as space r and are therefore written as j(r,t) and ρ(r,t).
Using the results from Appendix C.4, we may separate these fields into a
spatial component and a time-varying component:
Since Maxwell’s equations are linear, this results in the following set of equations:
Note that all the underlined quantities are (complex) functions of space r only.
In addition, the time-dependent quantity eiω t cancels everywhere. For a homoge-
neous field, we may set j and ρ to zero. For a homogeneous field in steady state
we therefore obtain the following set of equations:
∇·D = 0, (2.51a)
∇·B = 0, (2.51b)
∇× E = − iω B, (2.51c)
∇× H = iω D. (2.51d)
By a procedure similar to the one shown in Section 2.2.1, wave equations for the
electric field E and the magnetic field B may be derived:
∇²E + ω² μ ε E = 0, (2.52a)
∇²B + ω² μ ε B = 0. (2.52b)
For a uniform plane wave propagating in the z direction, the spatial derivatives
in the x and y directions vanish:
∂E/∂x = 0, ∂B/∂x = 0, (2.53a)
∂E/∂y = 0, ∂B/∂y = 0. (2.53b)
From Faraday’s law, we find identities for the partial derivatives in the z directions
for E:
∂ Ey
− = −i ω Bx , (2.54a)
∂z
∂ Ex
= −i ω By , (2.54b)
∂z
0 = −i ω Bz . (2.54c)
∂ By
− = i ω ε μ Ex , (2.54d)
∂z
∂ Bx
= i ω ε μ Ey , (2.54e)
∂z
0 = i ω ε μ Ez . (2.54f)
From these equations, we find that the components of E and B in the direction of
propagation (z) are zero. Differentiation of (2.54b) and substitution from (2.54d)
leads to
∂²Ex/∂z² = −i ω ∂By/∂z = −ω² μ ε Ex. (2.55)
As we assume that the field is uniform, Ex is a function of z only, and we may
therefore replace the partial derivatives with ordinary derivatives, yielding the
wave equation for harmonic plane waves,
d²Ex/dz² + ω² μ ε Ex = 0. (2.56)
Similar equations may be set up for Ey , Bx , and By . By letting
β² = ω² μ ε, (2.57)
a general solution for (2.56) is given by
Ex = E+m e^(−i β z) + E−m e^(i β z). (2.58)
For argument's sake, we will assume that the newly introduced constants E+m
and E−m are real. The solution of the wave equation is then the real part of
Ex e^(i ω t):
Ex(z,t) = Re{Ex e^(i ω t)} (2.59a)
= Re{E+m e^(i(ω t − β z)) + E−m e^(i(ω t + β z))} (2.59b)
= E+m cos(ω t − β z) + E−m cos(ω t + β z). (2.59c)
Thus, the solution to the wave equation for harmonic plane waves in homogeneous
media may be modeled by a pair of waves, one propagating in the +z direction
and the other traveling in the −z direction.
Turning our attention to only one of the waves, for instance E+m cos(ω t − β z),
it is clear that by keeping to a single position z, the wave produces an oscillation
with angular frequency ω. The frequency f of the wave and its period T may be
derived from ω as follows:
f = ω / (2π) = 1 / T. (2.60)
At the same time, the wave E+m cos(ω t − β z) travels through space in the positive
z direction, which follows from the −β z component of the wave's phase. The
value of the cosine does not alter if we add or subtract multiples of 2π to the
phase. Hence, we have
ω t − β z = ω t − β (z + λ) + 2π. (2.61)
Solving for λ , which is called wavelength, and combining with (2.57) we get
λ = 2π / β = 2π / (ω √(μ ε)). (2.62)
With the help of Equations (2.29) and (2.60), we find the well-known result that
the wavelength λ of a harmonic plane wave relates to its frequency by means of
the speed of light:
λ = vp / f. (2.63)
The phase velocity v p may be viewed as the speed with which the wave prop-
agates. This velocity may be derived by setting the phase value to a constant:
ω t − β z = C ⇒ z = ω t / β − C / β. (2.64)
For a wave traveling in the z direction through vacuum, the velocity of the wave
equals the time derivative of z:
vp = dz/dt = ω / β = 1 / √(μ ε). (2.65)
Thus, we have derived the result of (2.28). In vacuum, the phase velocity is v p = c.
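As a small numerical sketch of Equations (2.60) through (2.65) (the frequency chosen below is an assumption for illustration), the phase velocity in vacuum follows from ε0 and μ0, and the wavelength from the frequency:

```python
import math

eps0 = 8.854e-12      # vacuum permittivity (F/m)
mu0 = 4e-7 * math.pi  # vacuum permeability (H/m)

v_p = 1.0 / math.sqrt(mu0 * eps0)  # phase velocity, Eq. (2.65); in vacuum v_p = c
f = 5.4e14                          # frequency (Hz) of green light, illustrative
lam = v_p / f                       # wavelength, Eq. (2.63), about 555 nm
beta = 2 * math.pi / lam            # wavenumber, from Eq. (2.62)
```

The computed v_p reproduces the speed of light to the precision of the constants used.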
For conductive media, the derivation is somewhat different because the current
density j = σ E is now nonzero. Thus, Ampere's law is given by
∇× H = σ E + iωε E. (2.66)
(Figure: the electromagnetic spectrum, relating wavelength λ (from 10⁻¹³ m to
10⁵ m, with 1 Å and 1 nm marked) to frequency f (from 10²² Hz down to 10³ Hz,
with 1 THz, 1 GHz, 1 MHz, and 1 kHz marked).)
Figure 2.5. The relative contribution of each wavelength to the light emitted by a tungsten
radiator at a temperature of 2000 K [653].
2.3 Polarization
We have already shown that harmonic waves are transversal: both E and H lie
in a plane perpendicular to the direction of propagation. This still leaves some
degrees of freedom. First, both vectors may be oriented in any direction in this
plane (albeit with the caveat that they are orthogonal to one another). Second,
the orientation of these vectors may change with time. Third, their magnitude
may vary with time. In all, the time-dependent variation of E and H leads to
polarization, as we will discuss in this section.
We continue to assume without loss of generality that a harmonic plane wave
is traveling along the positive z-axis. This means that the vectors E and H may be
decomposed into constituent components in the x- and y-directions:
E = (E x ex + E y ey) e^(−i β z). (2.74)
Here, ex and ey are unit normal vectors along the x- and y-axes. The complex
amplitudes E x and E y are defined as
E x = |E x| e^(i ϕx), (2.75a)
E y = |E y| e^(i ϕy). (2.75b)
The phase angles are therefore given by ϕx and ϕy . For a given point in space z =
r · s, as time progresses the orientation and magnitude of the electric field intensity
vector E will generally vary. This can be seen by writing Equations (2.75) in their
real form:
Ex / |E x| = cos(ω t − β z + ϕx), (2.76a)
Ey / |E y| = cos(ω t − β z + ϕy). (2.76b)
It is now possible to eliminate the component of the phase that is common to both
of these equations, i.e., ω t − β z, by rewriting them in the following form (using
identity (B.7a); see Appendix B):
Ex / |E x| = cos(ω t − β z) cos(ϕx) − sin(ω t − β z) sin(ϕx), (2.77a)
Ey / |E y| = cos(ω t − β z) cos(ϕy) − sin(ω t − β z) sin(ϕy). (2.77b)
If we solve both equations for cos(ω t − β z) and equate them, we get
(Ex / |E x|) cos(ϕy) − (Ey / |E y|) cos(ϕx)
= sin(ω t − β z) (sin(ϕy) cos(ϕx) − cos(ϕy) sin(ϕx)) (2.78a)
= sin(ω t − β z) sin(ϕy − ϕx). (2.78b)
Repeating this, but now solving for sin(ω t − β z) and equating the results, we find
(Ex / |E x|) sin(ϕy) − (Ey / |E y|) sin(ϕx) = cos(ω t − β z) sin(ϕy − ϕx). (2.79)
By squaring and adding these equations, we obtain
(Ex / |E x|)² + (Ey / |E y|)² − 2 (Ex / |E x|) (Ey / |E y|) cos(ϕy − ϕx)
= sin²(ϕy − ϕx). (2.80)
This equation shows that the vector E rotates around the z-axis describing an
ellipse. The wave is therefore elliptically polarized. The axes of the ellipse do not
need to be aligned with the x- and y-axes, but could be oriented at an angle.
Two special cases exist; the first is when the phase angles ϕx and ϕy are sepa-
rated by multiples of π :
ϕy − ϕx = m π (m = 0, ±1, ±2, . . .). (2.81)
For integer values of m, the sine term is 0 and the cosine term is either +1
or −1 depending on whether m is even or odd. Therefore, Equation (2.80) reduces
to
(Ex / |E x|)² + (Ey / |E y|)² = 2 (−1)^m (Ex Ey) / (|E x| |E y|). (2.82)
Figure 2.6. A linearly polarized wave: the electric vector E and the magnetic vector B
oscillate along fixed, mutually orthogonal directions as the wave propagates.
The general form of this equation is either x² + y² = 2xy or x² + y² = −2xy. We are
interested in the ratio between x and y, as this determines the level of eccentricity
of the ellipse. We find this as follows:
x² + y² = 2xy, (2.83a)
x²/y² + 1 = 2 x/y, (2.83b)
x/y = 1. (2.83c)
y
Ex / |E x| = (−1)^m Ey / |E y|. (2.84)
As such, the ratio between the x- and y-components of E is constant for fixed m.
This means that instead of inscribing an ellipse, this vector oscillates along a line.
Thus, when the phase angles ϕx and ϕy are in phase, the electric field intensity
vector is linearly polarized, as shown in Figure 2.6. The same is then true for the
magnetic vector H.
Figure 2.7. For a circularly polarized wave, the electric vector E rotates around the Poynt-
ing vector while propagating. Not shown is the magnetic vector, which also rotates around
the Poynting vector while remaining orthogonal to E.
The second special case occurs when the amplitudes |E x| and |E y| are equal
and the phase angles differ by either π/2 ± 2mπ or −π/2 ± 2mπ. In this case,
(2.80) reduces to
(Ex / |E x|)² + (Ey / |E y|)² = 1. (2.85)
This is the equation of a circle, and this type of polarization is therefore called
circular polarization. If ϕy − ϕx = π /2 ± 2 m π the wave is called a right-handed
circularly polarized wave. Conversely, if ϕy − ϕx = −π /2 ± 2 m π the wave is
called left-handed circularly polarized. In either case, the field vectors inscribe a
circle, as shown for E in Figure 2.7.
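The case analysis of this section can be summarized in a small helper function (a sketch; the function name and numerical tolerance are our own):

```python
import math

def polarization_state(ax, ay, phi_x, phi_y, tol=1e-9):
    """Classify the polarization of a harmonic plane wave from its x/y
    amplitudes ax, ay and phase angles, following Section 2.3:
    linear when the phase difference is a multiple of pi (Eq. 2.81),
    circular when the amplitudes are equal and the difference is an odd
    multiple of pi/2 (Eq. 2.85), and elliptical otherwise (Eq. 2.80)."""
    d = (phi_y - phi_x) % (2 * math.pi)
    if min(abs(d), abs(d - math.pi), abs(d - 2 * math.pi)) < tol:
        return "linear"
    if abs(ax - ay) < tol and min(abs(d - math.pi / 2),
                                  abs(d - 3 * math.pi / 2)) < tol:
        return "circular"
    return "elliptical"
```

For instance, equal amplitudes with a quarter-period phase offset classify as circular, while a phase difference of π gives linear polarization regardless of the amplitudes.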
The causes of polarization include reflection of waves off surfaces or scat-
tering by particles suspended in a medium. For instance, sunlight entering the
Earth’s atmosphere undergoes scattering by small particles, which causes the sky
to be polarized.
Polarization can also be induced by employing polarization filters. These fil-
ters are frequently used in photography to reduce glare from reflecting surfaces.
Such filters create linearly polarized light. As a consequence, a pair of such filters
can be stacked such that together they block all light.
This is achieved if the two filters polarize light in orthogonal directions, as
shown in the overlapping region of the two sheets in Figure 2.8. If the two filters
are aligned, then linearly polarized light will emerge, as if only one filter were
present. The amount of light Ee transmitted through the pair of polarizing filters
Figure 2.8. Two sheets of polarizing material are oriented such that together they block
light, whereas each single sheet transmits light.
Figure 2.9. A polarizing sheet in front of an LCD screen can be oriented such that all light
is blocked.
Figure 2.10. A polarizing sheet is oriented such that polarized laser light is transmitted.
is a function of the angle θ between the two polarizers and the amount of incident
light Ee,0 :
Ee = Ee,0 cos²(θ). (2.86)
This relation is known as the Law of Malus [447, 730, 1128].
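A quick numerical sketch of Equation (2.86) (the function name is ours):

```python
import math

def malus(flux_in, theta):
    # transmitted flux through a second polarizer at relative angle theta,
    # Eq. (2.86): Ee = Ee,0 cos^2(theta)
    return flux_in * math.cos(theta) ** 2

aligned = malus(1.0, 0.0)          # parallel filters transmit everything
crossed = malus(1.0, math.pi / 2)  # crossed filters block all light
halfway = malus(1.0, math.pi / 4)  # 45 degrees transmits half the flux
```

This reproduces the behavior of the stacked filters in Figure 2.8: aligned sheets pass the polarized light, while orthogonal sheets extinguish it.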
The same effect is achieved by placing a single polarizing filter in front of an
LCD screen, as shown in Figure 2.9 (see also Section 14.2). Here, a backlight
emits non-polarized light, which is first linearly polarized in one direction. Then
the intensity of each pixel is adjusted by means of a second variable polarization in
the orthogonal direction. Thus, the light that is transmitted through this sequence
of filters is linearly polarized. As demonstrated in Figure 2.9, placing one further
polarizing filter in front of the screen thus blocks the remainder of the light.
In addition, laser light is polarized. This can be shown by using a single
polarization filter to block laser light. In Figure 2.10, a sheet of polarizing material
is placed in the path of a laser. The sheet is oriented such that most of the light is
transmitted. Some of the light is also reflected. By changing the orientation of the
sheet, the light can be blocked, as shown in Figure 2.11. Here, only the reflecting
component remains.
Polarization is extensively used in photography in the form of filters that can
be attached to the camera. This procedure allows unwanted reflections to be
removed from scenes. For instance, the glint induced by reflections off water can
be removed. It can also be used to darken the sky and improve its contrast.
Figure 2.11. A polarizing sheet is oriented such that polarized laser light is blocked.
Figure 2.12. The LCD screen emits polarized white light, which undergoes further polar-
ization upon reflection dependent on the amount of stress in the reflective object. Thus, the
colorization of the polarized reflected light follows the stress patterns in the material.
In computer vision, polarization can be used to infer material properties us-
ing a sequence of images taken with a polarization filter oriented at different an-
gles [1250]. This technique is based on the fact that specular materials partially
polarize light.
Polarization also finds uses in material science, where analysis of the polarizing
properties of a material provides information about its internal stresses. An
example is shown in Figure 2.12 where the colored patterns on the CD case are
due to stresses in the material induced during fabrication. They become visible
by using the polarized light of an LCD screen.
E = E0 cos(r · s − ω t), (2.87a)
H = H0 cos(r · s − ω t). (2.87b)
The Poynting vector S = E × H is then
S = E0 × H0 cos²(r · s − ω t). (2.88)
The instantaneous energy density thus varies with double the angular frequency
(because the cosine is squared). At this rate of change, the Poynting vector does
not constitute a practical measure of energy flow.
However, it is possible to average the magnitude of the Poynting vector over
time. The time average ⟨f(t)⟩T of a function f(t) over a time interval T is given
by
⟨f(t)⟩T = (1/T) ∫_{t−T/2}^{t+T/2} f(t) dt. (2.89)
By solving both expressions for s and equating them, we find the following
relation:
√μ H = √ε E. (2.94)
Using Equation (A.7) once more, the magnitude of the time-averaged Poynting
vector follows from (2.92):
⟨S⟩T = (1/2) √(ε0/μ0) E0² (2.95a)
= (ε0 c / 2) E0². (2.95b)
Ee = (ε v / 2) ⟨E0²⟩T. (2.97)
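As an illustrative calculation based on Equation (2.95b) (the field amplitude below is an assumption for the example), a peak amplitude of 100 V/m in vacuum corresponds to a modest flux density:

```python
eps0 = 8.854e-12  # vacuum permittivity (F/m)
c = 2.998e8       # speed of light in vacuum (m/s)

E0 = 100.0                      # peak electric field amplitude (V/m), illustrative
E_e = 0.5 * eps0 * c * E0 ** 2  # time-averaged flux density (W/m^2), Eq. (2.95b)
```

This gives roughly 13 W/m², showing how the rapidly oscillating Poynting vector reduces to a single practical number once time-averaged.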
Figure 2.13. Geometry of refraction at the boundary z = 0 between Medium 1 and
Medium 2, with incident direction s(i) and transmitted direction s(t); from the construc-
tion, sin(θi)/v1 = sin(θt)/v2.
As both the incident and the reflected wave propagate through the same medium,
their velocities will be identical. However, the refracted wave will have a different
velocity, since the medium has different permittivity and permeability constants.
The above equalities should hold for any point on the boundary. In particular, the
equalities hold for locations r1 = (1 0 0) and r2 = (0 1 0). These two locations
give us the following set of equalities:
Figure 2.14. Geometry of reflection and refraction at the boundary z = 0, with incident
direction s(i), reflected direction s(r), and transmitted direction s(t); sin(θi)/v1 =
sin(θr)/v1 and sin(θi)/v1 = sin(θt)/v2.
If we assume that the Poynting vectors for the incident, reflected, and trans-
mitted waves lie in the x-z plane, then by referring to Figures 2.13 and 2.14, we
have
S(i) = (sin(θi), 0, cos(θi)), (2.100a)
S(r) = (sin(θr), 0, cos(θr)), (2.100b)
S(t) = (sin(θt), 0, cos(θt)). (2.100c)
The z-coordinates are positive for S(i) and S(t) and negative for S(r) . By combin-
ing (2.99) and (2.100) we find that
Since sin(θi ) = sin(θr ) and the z-coordinates of S(i) and S(r) are of opposite sign,
the angle of incidence and the angle of reflection are related by θr = π − θi . This
relation is known as the law of reflection.
Since the speed of light in a given medium is related to the permittivity and
permeability of the material according to (2.28), we may rewrite (2.101) as
sin(θi) / sin(θt) = v1 / v2 = √(μ2 ε2) / √(μ1 ε1) = n2 / n1 = n. (2.102)
Figure 2.15. Refraction demonstrated by means of a laser beam making the transition
from smoke-filled air to water.
Figure 2.16. Laser light reflecting and refracting off an air-to-water boundary.
Figure 2.17. Laser light interacting with a magnifying lens. Note the displacement of the
transmitted light, as well as the secondary reflections of front and back surfaces of the lens.
The values n1 = √(μ1 ε1) and n2 = √(μ2 ε2) are called the absolute refractive indices
of the two media, whereas n is the relative refractive index for refraction from the
first into the second medium. The relations given in (2.102) constitute Snell's
law.5
An example of refraction is shown in Figure 2.15, where a laser was aimed
at a tank filled with water. The laser is a typical consumer-grade device normally
used as part of a light show in discotheques and clubs. A smoke machine was used
to produce smoke and allow the laser light to scatter towards the camera. For the
same reason a few drops of milk were added to the tank. Figure 2.16 shows a
close-up with a shallower angle of incidence. This figure shows that light is both
reflected and refracted.
A second example of Snell’s law at work is shown in Figure 2.17 where a
laser beam was aimed at a magnifying lens. As the beam was aimed at the center
of the lens, the transmitted light is parallel to the incident light, albeit displaced.
The displacement is due to the double refraction at either boundary of the lens.
Similarly, the figure shows light being reflected on the left side of the lens. Two
beams are visible here: one reflected off the front surface of the lens and the
5 Named after the Dutch scientist Willebrord van Roijen Snell (1580–1626).
second beam reflected off of the back surface. These beams are predicted by a
straightforward repeated application of Snell’s law.
The refraction diagram in Figure 2.13 shows a wave traveling from an
optically less dense medium into a more dense medium, and hence θt < θi . If
we increase the angle of incidence until, in the limit, we reach θi = π /2, then the
wave is traveling parallel to the boundary and S(i) · n = 0. For this situation, there
will be a maximum angle of refraction, θt = θc . This angle is called the critical
angle.
For a wave traveling in the opposite direction, i.e., approaching the boundary
from the denser medium towards the less dense medium, refraction into the less
dense medium will only occur if the angle of incidence is smaller than the crit-
ical angle. For angles of incidence larger than θc , the wave will not refract but
will only reflect. This phenomenon is called total reflection, or sometimes total
internal reflection, and is illustrated in Figure 2.18. It can be demonstrated by
punching a hole in a plastic bottle and draining water through it. By aligning a
laser with the opening in the bottle and adding a drop of milk (to make the laser
beam visible), total internal reflection such as that seen in Figure 2.19 can be
obtained. The light stays within the water stream through multiple reflections. The
transmittance of light through strands of fiber, known as fiberoptics, works on the
same principle (Figure 2.20).
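Snell's law (2.102) and the critical angle can be sketched numerically as follows (the helper name and the refractive index used for water are illustrative assumptions):

```python
import math

def refraction_angle(theta_i, n1, n2):
    """Angle of refraction from Snell's law, Eq. (2.102); returns None when
    theta_i exceeds the critical angle (total internal reflection)."""
    s = (n1 / n2) * math.sin(theta_i)
    if abs(s) > 1.0:
        return None  # no refracted wave; the light is totally reflected
    return math.asin(s)

n_water = 1.33                      # illustrative refractive index for water
theta_c = math.asin(1.0 / n_water)  # critical angle for a water-to-air boundary

inside = refraction_angle(math.radians(30.0), 1.0, n_water)   # air into water
blocked = refraction_angle(math.radians(60.0), n_water, 1.0)  # beyond theta_c
```

For water to air the critical angle comes out near 49 degrees, which is why the laser beam in Figure 2.19 stays trapped inside the stream.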
While Snell’s law predicts the angles of reflection and transmission, it does
not give insight into the amplitude of the reflected and refracted fields, E and B,
respectively. To evaluate these amplitudes, we first need to look at the boundary
conditions related to the interface between the two media, which are most easily
evaluated by means of Maxwell’s equations in integral form. The components
of D and B normal to the boundary are assessed first, followed by the tangential
components of these fields.6
Figure 2.18. Total internal reflection at the boundary z = 0: for θi < θc the incident
wave both reflects and refracts, whereas for θi > θc it is totally reflected.
Figure 2.19. A stream of water exits a plastic bottle. Coming from the right, a laser beam
enters the bottle and exits through the same hole as the water. Multiple reflections inside
the stream of water are visible, essentially keeping the light trapped inside the stream.
Figure 2.20. Fiberoptic strands transmitting light with very little loss of energy due to
total internal reflection inside the strands.
Figure 2.21. Pill box intersecting the boundary between two media.
If a small cylindrical volume (a pill box) intersects the boundary between two
dielectric materials, as shown in Figure 2.21, then the fields gradually change
from the top of the volume to the bottom, which is located on the other side of
the boundary.
By Gauss’ theorem (see Appendix A.5), the divergence of B integrated over
the volume of the pill box is related to the integral of the normal component of B,
when integrated over the surface of the pill box:
∇·B dv = B · n ds = 0. (2.103)
v s
In the limit that the height of the pill box goes to zero, the contribution of the
cylindrical side of the pill box also goes to zero. If the areas of the top and bottom
of the pill box (δA1 and δA2) are small, it follows that the normal component of B
is continuous across the boundary.
In the limit that the pill box is shrunk to zero height, the charge density ρ, i.e., the
charge per unit of volume, goes to infinity. For this reason, the concept of surface
charge density ρ̂ is introduced.
Figure 2.22. A small rectangular contour of height δh and tangential widths δs1 and δs2
intersecting the boundary between Medium 1 and Medium 2; n is the boundary normal, b
the normal of the intersecting plane, and t = b × n.
This means that upon crossing a boundary between two dielectrics, the surface
charge density causes the normal component of D to be discontinuous and jump
by ρ̂ .
To analyze the tangential components of E and H, we can make use of Stokes’
theorem (Appendix A.7), which relates line integrals to surface integrals. We
place a plane perpendicular to the surface boundary that intersects the bound-
ary, as shown in Figure 2.22. Note also that the incident, reflected and refracted
Poynting vectors are assumed to lie in this plane, as shown in Figure 2.23.
In Figure 2.22, n is the normal of the boundary between the two dielectrics, b
is the normal of the newly created plane, and t = b × n is a vector that lies both
Figure 2.23. Plane (green) intersecting the boundary between two media. The Poynting
vectors of incident, reflected, and refracted TEM waves all lie in this plane.
in the new plane and in the plane of the surface boundary. All three vectors are
assumed to be unit length. Stokes’ theorem then gives
∫s ∇× E · b ds = ∮c E · b dc = − ∫s (∂B/∂t) · b ds, (2.108)
where the middle integral is a line integral over the contour c that surrounds the
plane. The other two integrals are surface integrals over the area of the plane. If
δ s1 and δ s2 are small, (2.108) simplifies as follows:
E(2) · t δs2 − E(1) · t δs1 = −(∂B/∂t) · b δs δh. (2.109)
In the limit that the height δh goes to zero, we get
(E(2) − E(1)) · t δs = 0. (2.110)
This result indicates that at the boundary between two dielectrics, the tangential
component of the electric vector E is continuous. A similar derivation can be
made for the magnetic vector, which yields
n × (H(2) − H(1)) = ĵ, (2.112)
where ĵ is the surface current density (introduced for the same reason ρ̂ was
above). Thus, the tangential component of the magnetic vector H is discontin-
uous across a surface and jumps by ĵ. Of course, in the absence of a surface
current density, the tangential component of vector H becomes continuous.
The above results for the continuity and discontinuity of the various fields will
now be used to help derive the Fresnel equations, which predict the amplitudes
of the transmitted and reflected waves. To facilitate the derivation, the vector E at
the boundary between the two dielectrics may be decomposed into a component
parallel to the plane of incidence, and a component perpendicular to the plane of
incidence, as indicated in Figure 2.24. The magnitudes of these components are
termed A∥ and A⊥. The x, y, and z components of the incident electric field are then
given by
Ex(i) = −A∥ cos(Θi) e^(−i τi), (2.113a)
Ey(i) = A⊥ e^(−i τi), (2.113b)
Ez(i) = A∥ sin(Θi) e^(−i τi). (2.113c)
Figure 2.24. The electric vector E can be projected onto the plane of incidence P as well
as onto a perpendicular plane Q. The magnitudes of the projected vectors are indicated by
A∥ and A⊥.
With τr and τt the corresponding phases, and R and T the complex amplitudes of
the reflected and refracted waves, expressions for the reflected and transmitted
fields may be derived analogously:
Ex(r) = −R∥ cos(Θr) e^(−i τr),   Hx(r) = −R⊥ cos(Θr) √ε1 e^(−i τr), (2.118a)
Ey(r) = R⊥ e^(−i τr),            Hy(r) = −R∥ √ε1 e^(−i τr), (2.118b)
Ez(r) = R∥ sin(Θr) e^(−i τr),    Hz(r) = R⊥ sin(Θr) √ε1 e^(−i τr), (2.118c)
Ex(t) = −T∥ cos(Θt) e^(−i τt),   Hx(t) = −T⊥ cos(Θt) √ε2 e^(−i τt), (2.119a)
Ey(t) = T⊥ e^(−i τt),            Hy(t) = −T∥ √ε2 e^(−i τt), (2.119b)
Ez(t) = T∥ sin(Θt) e^(−i τt),    Hz(t) = T⊥ sin(Θt) √ε2 e^(−i τt). (2.119c)
According to (2.111) and (2.112), and under the assumption that the surface cur-
rent density is zero, the tangential components of E and H should be continuous
at the surface boundary. We may therefore write
By substitution of this equation, and using the fact that cos(Θr) = cos(π − Θi) =
−cos(Θi), we get
cos(Θi) (A∥ − R∥) = cos(Θt) T∥, (2.121a)
A∥ + R∥ = √(ε2/ε1) T∥, (2.121b)
cos(Θi) (A⊥ − R⊥) = √(ε2/ε1) cos(Θt) T⊥, (2.121c)
A⊥ + R⊥ = T⊥. (2.121d)
We use Maxwell's relation n = √(μ ε) = √ε and solve the above set of equations
for the components of the reflected and transmitted waves, yielding
R∥ = A∥ (n2 cos(Θi) − n1 cos(Θt)) / (n2 cos(Θi) + n1 cos(Θt)), (2.122a)
R⊥ = A⊥ (n1 cos(Θi) − n2 cos(Θt)) / (n1 cos(Θi) + n2 cos(Θt)), (2.122b)
T∥ = A∥ (2 n1 cos(Θi)) / (n2 cos(Θi) + n1 cos(Θt)), (2.122c)
T⊥ = A⊥ (2 n1 cos(Θi)) / (n1 cos(Θi) + n2 cos(Θt)). (2.122d)
These equations are known as the Fresnel equations; they allow us to compute
the transmission coefficients t∥ and t⊥ as follows:
t∥ = T∥ / A∥ = 2 cos(Θi) / ((n2/n1) cos(Θi) + cos(Θt)), (2.123a)
t⊥ = T⊥ / A⊥ = 2 cos(Θi) / (cos(Θi) + (n2/n1) cos(Θt)). (2.123b)
The reflection coefficients r∥ and r⊥ are computed similarly:
r∥ = R∥ / A∥ = (cos(Θi) − (n1/n2) cos(Θt)) / (cos(Θi) + (n1/n2) cos(Θt)), (2.124a)
r⊥ = R⊥ / A⊥ = (cos(Θi) − (n2/n1) cos(Θt)) / (cos(Θi) + (n2/n1) cos(Θt)). (2.124b)
We may rewrite both the reflection and the transmission coefficients in terms of
the angle of incidence alone. This can be achieved by applying Snell’s law (2.102),
so that
sin(Θt) = (n1/n2) sin(Θi), (2.125)
and therefore, using Equation (B.11), we have
cos(Θt) = √(1 − (n1/n2)² sin²(Θi)). (2.126)
By substitution into (2.123) and (2.124), expressions in n1, n2, and cos(Θi) can
be obtained. The reflection and transmission coefficients give the ratio of the
reflected and transmitted field amplitudes to the incident amplitude:
r = E0,r / E0,i, (2.127a)
t = E0,t / E0,i. (2.127b)
In the case of a wave incident upon a conducting (metal) surface, light is both
reflected and refracted as above, although the refracted wave is strongly attenuated
by the metal. This attenuation causes the material to appear opaque.
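As a numerical illustration, the amplitude coefficients of Equations (2.122)–(2.126) can be evaluated directly from the two indices of refraction and the angle of incidence. The following is only a sketch: the function name and the air-to-glass values are illustrative and do not appear in the text.

```python
import math

def fresnel(n1, n2, theta_i):
    """Amplitude reflection and transmission coefficients for the parallel
    and perpendicular polarizations, per Equations (2.122)-(2.126)."""
    s = (n1 / n2) * math.sin(theta_i)          # Snell's law, Equation (2.125)
    if s > 1.0:
        raise ValueError("total internal reflection: no transmitted wave")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - s * s)             # Equation (2.126)
    r_par  = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    r_perp = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    t_par  = 2.0 * n1 * cos_i / (n2 * cos_i + n1 * cos_t)
    t_perp = 2.0 * n1 * cos_i / (n1 * cos_i + n2 * cos_t)
    return r_par, r_perp, t_par, t_perp

# Air-to-glass boundary at normal incidence: both reflection coefficients
# have magnitude (n2 - n1) / (n2 + n1) = 0.2.
r_par, r_perp, t_par, t_perp = fresnel(1.0, 1.5, 0.0)
```

A quick sanity check on such an implementation is the Brewster angle, \arctan(n_2/n_1), at which the parallel coefficient r_\parallel vanishes.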
A water surface is an example of a reflecting and refracting boundary that occurs in nature. While all water reflects and refracts, on a wind-still day the boundary between air and water becomes smooth on the macroscopic scale, so that the effects discussed in this section may be observed with the naked eye, as shown in Figure 2.25.

Figure 2.25. On a wind-still morning, the water surface is smooth enough to reflect light specularly on the macroscopic scale; Konstanz, Bodensee, Germany, June 2005.
where Ee,i is the radiant flux density of Equation (2.96). Energy per unit of time
is known as radiant flux and is discussed further in Section 6.2.2. Similarly, we
can compute the reflected and transmitted flux (Pe,r and Pe,t ) with
R_\parallel = r_\parallel^2, (2.134a)
R_\perp = r_\perp^2. (2.134b)
Figure 2.26. A calcite crystal causes birefringence which depends on the orientation of
the crystal.
2.6 Birefringence
Some materials have an index of refraction that depends on the polarization of the
light. This means that unpolarized light can be split into two or more polarized
light paths. This effect is called birefringence or double refraction and is demonstrated in Figure 2.26 by means of a calcite crystal (also known as Iceland spar).
A second example of a birefringent material is the cornea of the human eye (see
Section 4.2.2) [755].
To see that light becomes linearly polarized into two orthogonal directions,
Figure 2.27 contains photographs of the same calcite crystal placed upon a detail
of Figure 1.7. The light is linearly polarized, which is demonstrated by placing
a linear polarization filter on top of the crystal. This allows one of the two linearly polarized light directions to pass through, while blocking the other. As a consequence, slowly rotating either the filter or the crystal will suddenly change the path selected to pass through the polarizer. The combination of linear polarizers and liquid birefringent materials, which have polarizing behavior that can be controlled by applying an electric field, forms the basis of liquid crystal displays (LCDs), which are discussed further in Section 14.2.
Molecules with an elongated shape tend to give rise to birefringence. Such
rod-like molecules are called calamitic. To create birefringence in bulk, these
molecules must be aligned, so that the material itself will exhibit birefringence.
The vector pointing along the long axis of the molecules is called the director.
In that case, the relative permittivity (i.e., the dielectric constant) of the material
Figure 2.27. The block of birefringent calcite is placed on a print of Figure 1.7 (top),
showing double refraction. The bottom two images are photographs of the same configu-
ration, except that now a linear polarizing filter is placed on top of the crystal. Dependent
on the orientation (which is the difference between the bottom two photographs), one or
the other refraction remains. A polarizer used in this manner is called an analyzer.
\Delta n = n_\parallel - n_\perp. (2.141)
We now consider the amplitude of the electric field intensity in two orthogonal
directions x and y, as given in (2.76), with x aligned with the director. Recall that
the distance z in this equation is only a convenient notation representing a point r
in space under a plane wave traveling in direction S. In other words, in the more
general case we have β z = β r · S. With β given by
\beta = \frac{n\,\omega}{c} = \frac{2\pi n}{\lambda}, (2.142)
with

k_\parallel = \frac{2\pi n_\parallel}{\lambda}\, S, (2.145a)
k_\perp = \frac{2\pi n_\perp}{\lambda}\, S. (2.145b)
In the case that the director makes an angle, say ψ , with our choice of coordinate
system, a further scaling of the electric field magnitudes is required:
\frac{E_x}{|E_x|} = \cos(\psi) \cos(\omega t - k_\parallel \cdot r + \varphi_x), (2.146a)
\frac{E_y}{|E_y|} = \sin(\psi) \cos(\omega t - k_\perp \cdot r + \varphi_y), (2.146b)
whereby we momentarily ignore the factors \exp(i\omega t) and \exp(i\varphi). Absorbing the scaling according to \psi into the constants E_x and E_y and placing the two phasors in a column vector, we obtain the so-called Jones vector J:

J = \begin{pmatrix} J_x \\ J_y \end{pmatrix} = \begin{pmatrix} E_{x,\psi} \exp(-i k_\parallel \cdot r) \\ E_{y,\psi} \exp(-i k_\perp \cdot r) \end{pmatrix}. (2.148)
The use of the Jones vector will return in the analysis of liquid crystals in
Section 14.2.
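The phase retardation that a birefringent layer introduces between the two components of the Jones vector (2.148) follows directly from (2.141) and (2.145). The sketch below assumes plane-wave propagation over a distance r; the calcite indices (n_\parallel ≈ 1.658, n_\perp ≈ 1.486 at visible wavelengths) and the quarter-wave example are illustrative values, not from the text.

```python
import cmath
import math

def jones_vector(E_x, E_y, psi, n_par, n_perp, r, lam):
    """Jones vector of Equation (2.148) after a propagation distance r,
    with the director at angle psi as in Equation (2.146)."""
    k_par  = 2.0 * math.pi * n_par  / lam      # magnitude of (2.145a)
    k_perp = 2.0 * math.pi * n_perp / lam      # magnitude of (2.145b)
    jx = E_x * math.cos(psi) * cmath.exp(-1j * k_par  * r)
    jy = E_y * math.sin(psi) * cmath.exp(-1j * k_perp * r)
    return jx, jy

def retardation(n_par, n_perp, r, lam):
    """Relative phase shift 2*pi*(n_par - n_perp)*r/lam between the two
    components; the index difference is the birefringence of (2.141)."""
    return 2.0 * math.pi * (n_par - n_perp) * r / lam
```

For calcite at 550 nm, a layer of thickness \lambda / (4 \Delta n) retards one component by a quarter wave (\pi/2), which is the configuration used in wave plates.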
2.7 Interference and Diffraction
2.7.1 Interference
If the phase function of each of the waves is correlated, we speak of coherent
light. When such waves are superposed, the result shows peaks and troughs as a
result of the sinusoidal variation of both electric and magnetic vectors, as shown
in Figure 2.28. If the two waves are out of phase by π , then the two waves exactly
cancel. Thus, dependent on the phase difference at a given point in space, the
two waves may reinforce each other or cancel each other. It is said that coherent
waves interfere, and the resulting pattern is called an interference pattern.
The irradiance at a point in space is given by (2.96), where \langle E_0^2 \rangle_T is the time average of the squared magnitude of the electric field vector. Applying the superposition principle for a pair of coherent monochromatic time-harmonic plane waves with
[Plot for Figure 2.28: the waves cos(x) and cos(x + 0.5), and their sum cos(x) + cos(x + 0.5).]
Figure 2.28. Two waves superposed forming a new wave, in this case with a larger ampli-
tude than either of the constituent waves.
Since the electric vector is now the sum of two components, the square of its
magnitude is given by
The implications for the irradiance are that this quantity is composed of three terms, namely the irradiance due to either wave as well as a cross term:

E_e = \varepsilon_0\, c \left( \langle E_1^2 \rangle_T + \langle E_2^2 \rangle_T + 2 \langle E_1 \cdot E_2 \rangle_T \right). (2.151)

The cross term 2 \langle E_1 \cdot E_2 \rangle_T, known as the interference term, explains the patterns that may be observed when two coherent bundles of light are superposed.
We assume that the electric vectors of the two waves are modeled by
where ai are the amplitudes and ϕi are the initial phases of the two (linearly po-
larized) waves. The interference term is then
It can be shown that the time-averaged quantity of the interference term becomes
The interference term can then be rewritten in the following form to highlight the
fact that its oscillations depend on the phase difference between the two waves:
Figure 2.29. The phase at point r is determined by the phases at the origins of the two
waves, as well as the distance traveled.
Here, r is the distance between the point of interest and the source of the wavefront, i.e., the radius. A pair of spherical wavefronts may show interference just as planar waves would. The phase difference \delta of two spherical wavefronts E_1 and E_2 is then

\delta = \frac{2\pi}{\lambda} (r_1 - r_2) + (\varphi_1 - \varphi_2). (2.159)
If for simplicity we assume the amplitudes a1 and a2 to be identical and constant
over the region of interest, it can be shown that the irradiance depends on the
phase difference as follows [447]:
E_e = 4 E_{e,0} \cos^2\!\left(\frac{\delta}{2}\right), (2.160)
where Ee,0 is the irradiance due to either source (which in this case are identical).
Maxima in the irradiance E_e occur at intervals

r_2 - r_1 = m\lambda. (2.165)
This result indicates that the location of peaks and troughs in the interference pattern depends on the wavelength of the two sources, which in our simplified analysis are assumed to be in phase and emit light of the same wavelength. Thus, if the phase difference between two bundles of light can be coherently manipulated by fractions of a wavelength, we can expect white light to be broken up into its constituent parts.
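Equations (2.159), (2.160), and (2.165) combine into a small calculation. The helper below is a sketch (its name and arguments are not from the text); it reproduces the maxima at path differences of m\lambda and the complete cancellation at half-wavelength offsets.

```python
import math

def irradiance(r1, r2, phi1, phi2, lam, E0):
    """Irradiance of two equal-amplitude coherent spherical waves:
    phase difference per (2.159), irradiance per (2.160)."""
    delta = 2.0 * math.pi / lam * (r1 - r2) + (phi1 - phi2)
    return 4.0 * E0 * math.cos(delta / 2.0) ** 2

lam = 550e-9  # a mid-spectrum wavelength, in meters
peak = irradiance(lam, 0.0, 0.0, 0.0, lam, 1.0)        # path difference of one wavelength
null = irradiance(lam / 2.0, 0.0, 0.0, 0.0, lam, 1.0)  # half-wavelength offset
```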
Further, a thin layer that reflects off both its front surface and its back surface can create interference patterns. This occurs because the path length from the light source to the human observer is slightly shorter for the light reflected off the front surface than it is for the light reflected off the back surface. According to (2.164), this results in small phase differences. If the spacing of the reflecting layers causes a phase difference on the order of a visible wavelength, this is seen as interference color. As shown in Figure 2.31, the viewing
Figure 2.31. Interference colors, which depend on viewing angle, are caused by a semi-
transparent layer creating two thinly spaced reflective layers. The path lengths of light
reflected off each surface are slightly different, causing interference colors.
angle has an impact on the relative path lengths, and therefore one can expect
the color of the surface to vary with viewing angle. The dark substrate serves to
create brighter and more brilliant colors. In practice, this effect is not limited to a
single transparent layer, but further reflections can be created by a stack of layers.
Figure 2.32. This Morpho zephyritis shows iridescent blue colors as a result of interfer-
ence in its wings.
Figure 2.33. Iridescence in sea shells. Shown here is a close-up of the inside of an abalone
shell.
Recently, display devices have been proposed which are based on this principle,
as discussed in Section 14.10.
Interference colors occur in nature, for instance, in the wings of certain species
of butterfly such as the Morpho zephyritis shown in Figure 2.32. Colors resulting
from interference at wavelength scales are usually termed iridescent. Multiple
colors may be present and normally vary with viewing angle. In elytra, the wing-
cases of beetles (see Figure 1.5), as well as in butterfly wings, the layers causing
interference are backed by a dark layer, usually made of melanin, which strength-
ens the color. Iridescence is also present in many birds, including the peacock
(Figure 1.3) and hummingbirds, as well as sea shells such as abalones and oysters
(Figure 2.33).
Figure 2.34. The number 20 from a US $20 bill photographed under different viewing
angles. The particles in the ink are aligned to produce a distinct change in color with
viewing angle.
Figure 2.35. An interference pattern is etched into a small piece of plastic, which only
reveals itself when hit by a laser beam. (Etching courtesy of Eric G. Johnson, School of
Optics, University of Central Florida.)
2.7.2 Diffraction
We have thus far discussed interference in the form of a pair of hypothetical light
sources that emit light of the same wavelength and are in phase. In this section
we discuss diffraction, which is the behavior of light in the presence of obstacles.
Diffraction may cause interference patterns.
In Section 2.5 the behavior of light near a boundary was discussed. In the case
of a boundary between a dielectric and a metal, the wave will not propagate very
far into the metal. The theory of diffraction deals with opaque boundaries with
edges and holes in different configurations.
If we assume that we have a thin opaque sheet of material with a hole in it,
then the interaction of the wave with the sheet away from the hole will be as
discussed in Section 2.5. In the center of the hole, and assuming that the hole
is large enough, the wave will propagate as if there were no opaque sheet there.
However, near the edge of the hole, diffraction occurs. Here the wave interacts
with the material, causing new wavefronts. A detailed treatment of this interaction
would involve atomic theory (touched upon in Chapter 3) and is beyond the scope
Figure 2.36. Light passing through a slit is projected onto a wall. The distribution of
irradiance values is observed for points P along the wall.
of this book. We briefly outline the behavior of light near edges using a simplified
wave theory of light.
For a sheet of material with a hole, the visible manifestation of diffraction
depends on the size of the hole in relation to the wavelength of the incident light.
If the hole is very large, the sheet casts a shadow and the result is much like
everyday experience predicts. When the hole becomes small, the edge of the hole
will diffract light and cause interference patterns.
For a rectangular hole, i.e., a slit, we are interested in the irradiance distribution as a result of diffraction. If the distance R between the slit and the line of observation on a wall is large in comparison with the size of the slit (R \gg D, see Figure 2.36), we speak of Fraunhofer diffraction. It can be shown that the irradiance distribution along the wall as a function of angle \theta is given by [447]

E_e(\theta) = E_e(0)\, \mathrm{sinc}^2\!\left(\frac{D\pi}{\lambda} \sin(\theta)\right). (2.166)
Figure 2.37. Irradiance as function of eccentricity angle θ at a distance R from the slit.
The size of the slit was chosen to be D = 10λ /π .
A plot of this function for D = 10λ /π is shown in Figure 2.37. The equation
shows that the diffraction pattern will become very weak as the size of the slit
becomes large relative to the wavelength λ . Diffraction therefore occurs most
strongly when the slit is narrow.
The irradiance of the center spot is given by E_e(0) and may be expressed in terms of the strength of the electric field at the slit \Xi_0:

E_e(0) = \frac{1}{2} \left( \frac{\Xi_0 D}{R} \right)^{\!2}. (2.167)
Fraunhofer diffraction can also be shown for other configurations. An often-seen demonstration is that of a double slit. Here, two parallel slits each cause an interference pattern; these patterns are superposed. The irradiance is then

E_e(\theta) = E_e(0) \left( \frac{\sin\!\left(\frac{D\pi}{\lambda} \sin(\theta)\right)}{\frac{D\pi}{\lambda} \sin(\theta)} \right)^{\!2} \cos^2\!\left(\frac{a\pi}{\lambda} \sin(\theta)\right), (2.168)

with a the separation distance between the two slits, as shown in Figure 2.38.
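Equations (2.166) and (2.168) are easy to tabulate numerically, for instance to reproduce plots such as Figure 2.37. The sketch below uses the unnormalized sinc(x) = sin(x)/x; the function names are illustrative, not from the text.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def single_slit(theta, D, lam):
    """Relative irradiance Ee(theta)/Ee(0) for a slit of width D,
    Equation (2.166)."""
    return sinc(D * math.pi / lam * math.sin(theta)) ** 2

def double_slit(theta, D, a, lam):
    """Relative irradiance for two slits of width D separated by a,
    Equation (2.168)."""
    u = math.sin(theta)
    return (sinc(D * math.pi / lam * u) ** 2
            * math.cos(a * math.pi / lam * u) ** 2)
```

With D = 10\lambda/\pi, as used for Figure 2.37, the first zeros of the single-slit pattern fall at \sin(\theta) = \pm\lambda/D = \pm\pi/10.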
Thus, a diffraction effect may be obtained by aiming a coherent bundle of light
at a single or double slit. Such an experiment can be readily devised from a laser
pointer, a comb, and pieces of cardboard. By using the cardboard to mask off all
the gaps in the comb bar, a single or double slit can be fabricated (Figure 2.39).
The projection of the light on a wall is as shown in Figure 2.40. The horizontal
bright-and-dark pattern is caused by interference of the light passing through the
slits. For comparison, the cross section of the laser beam is shown in the same
figure. A good quality beam should have a Gaussian radial fall-off. The cheap
laser used here has a much more erratic pattern.
To show diffraction occurring near an edge, a (near-) monochromatic laser is
a good source of light, since it emits a coherent bundle of light. By shining this
Figure 2.38. Light passing through a double slit is projected onto a wall.
Figure 2.39. With a source of laser light, a comb, and two bits of cardboard, a setup may
be created to demonstrate interference patterns. The cardboard is used to mask off most of
the comb such that only one or two slits are available to let light through. This is the setup
used to demonstrate single- and double-slit diffraction in Figure 2.40.
Figure 2.40. Interference patterns created by passing laser light through a single slit
(middle) and a double slit (bottom), as shown in Figure 2.39. The laser’s cross section is
shown at the top.
Figure 2.41. A laser is shone just over a knife’s edge (left). This causes an interference
pattern (right).
source over the edge of a knife, as shown in Figure 2.41, a diffraction pattern can
be created. The right panel in this figure shows the pattern of light created on a
wall several meters away from the laser and the knife. Below the center spot there
is a rapid fall-off, whereas above the center spot a series of additional spots are
visible.
This interference pattern is due to small changes in phase, which themselves
are due to the light interacting with the knife’s edge in a manner dependent on its
minimum distance to the edge.
While pointing a laser at the edge of a knife is not a common everyday oc-
currence, light streaking past an edge is very common. The above experiment
shows that shadow edges in practice are never completely sharp. This is in part
because light sources are never infinitely far away, nor infinitely small, although
even in such an idealized case, the diffraction around the edge will cause some
interference and therefore softening of the edge.
Figure 2.42. A diffraction grating lit by a D65 daylight simulator (left). The rainbow
patterns on both the reflected (middle) and transmitted sides (right) are shown.
Figure 2.43. A DVD lit by a D65 daylight simulator. The small pits encoding the data are
of the order of a wavelength, and therefore cause a diffraction pattern resulting in a set of
rainbow colors.
2.8 Scattering
Many optical phenomena may be described as scattering, although it is often more
convenient to use specialized theories. For instance, we have described reflection
and refraction of electro-magnetic waves at a boundary between two dielectrics
with different indices of refraction in Section 2.5. Similarly diffraction and inter-
ference can be seen as special cases of scattering [111].
Scattering is an interaction of light with particles such as single atoms,
molecules, or clusters of molecules. Light can be absorbed by such particles and
almost instantaneously be re-emitted. This repeated absorption and re-emission
of light is called scattering. The direction of scattering may be different from the
direction of the incident light, leading for instance to the translucent appearance
of some materials as shown in Figure 2.44. Scattering events are called elastic if the wavelength of the re-emitted light is the same as the wavelength of the incident light. If the wavelength alters significantly, the scattering is called inelastic.
Figure 2.44. This statue scatters light internally, resulting in a translucent appearance.
While a full account of scattering theory would be too involved for a book on
color theory, in the following subsections we offer a simplified explanation of dif-
ferent types of scattering. More detailed accounts of light scattering can be found
in “Light Scattering by Small Particles” [499], “Dynamic Light Scattering” [88],
“Scattering Theory of Waves and Particles” [832] and “Electromagnetic Scatter-
ing in Disperse Media” [54].
where d is the initial maximum distance between the positive and negative charges.
The oscillation of this dipole gives rise to an electro-magnetic field with a rela-
tively complicated structure near the dipole. At some distance away from the
dipole, the field has a simpler structure whereby the E and H fields are transverse,
perpendicular, and in phase [447]. The magnitude of the electric vector is given
by
E = \frac{q\, d\, \pi \sin(\theta) \cos(r \cdot s - \omega t)}{\lambda^2\, \varepsilon_0\, r}, (2.170)
where position r is a distance r away from the dipole. The irradiance Ee associated
with this field is obtained by applying (2.96) and noting that the squared cosine
term becomes 1/2 after time-averaging (see Section 2.4):
E_e = \frac{q^2 d^2 \pi^2 c \sin^2(\theta)}{2\, \varepsilon_0\, \lambda^4 r^2}. (2.171)
This equation shows that a dipole oscillating with an angular frequency \omega generates an electro-magnetic field with an irradiance that falls off with the square of the distance and, as such, obeys the inverse-square law. We therefore speak of a dipole radiator.
In addition, the irradiance depends inversely on the fourth power of wavelength. Thus, the higher the frequency of the oscillations (smaller \lambda), the stronger the resulting field.
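The \lambda^{-4} dependence of Equation (2.171) is worth quantifying. In the sketch below, the 450 nm and 650 nm values are representative blue and red wavelengths chosen for illustration, not values from the text.

```python
def rayleigh_ratio(lam_a, lam_b):
    """Relative strength of Rayleigh scattering at two wavelengths,
    using only the 1/lambda^4 factor of Equation (2.171)."""
    return (lam_b / lam_a) ** 4

# Blue light (450 nm) is scattered roughly 4.4 times more strongly than
# red light (650 nm): scattered skylight looks blue, while directly
# transmitted sunlight is shifted toward orange and red.
ratio = rayleigh_ratio(450e-9, 650e-9)
```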
Dipole radiation can be induced in small particles in the presence of an electro-
magnetic field. Such particles thus scatter light at the blue end of the visible spec-
trum more than light with longer wavelengths. If these scattering particles are
spaced in a medium at distances of at least one wavelength (in the visible region),
then the scattered waves are essentially independent. Light passing through such
Figure 2.45. Rayleigh scattering demonstrated with a tank of water and milk. On the left,
a D65 daylight simulator is placed behind the tank. The scattering causes a color that tends
towards orange. The image on the right shows the same tank and contents lit from the side
by the same light source. The substance now has a bluish tinge.
a medium will be absorbed and re-emitted in all directions. As a result, waves are
independent from each other in all directions, except forward. Light scattered in
the forward direction will generally maintain its phase coherence. This type of
scattering is called Rayleigh scattering.
Rayleigh scattering can be demonstrated by shining light through a tank of
water which contains a small amount of milk. The milk solution acts as a scatter-
ing medium. The light passing through in the forward direction will take on an
orange tint, since longer wavelengths are scattered less than shorter wavelengths.
Light scattered towards the side will have a blue tinge, as shown in Figure 2.45.
Another example of Rayleigh scattering is the blue color of some people’s
eyes (Figure 2.46). Both blue skies and orange and red sunrises and sunsets are
explained by Rayleigh scattering as well. The latter example is discussed further
Figure 2.46. The blue color in the iris shown on the left is due to Rayleigh scattering.
Green eyes, as shown on the right, occur due to a combination of Rayleigh scattering and
staining due to melanin. (Photographs adapted from [667].)
Figure 2.47. The blue coloring of the feathers of this blue jay is caused by Rayleigh scattering; Yosemite National Park, California, 1999.
in Section 2.11. Rayleigh scattering is rare in the plant world, but surprisingly
common among animals. It is for instance the cause of the blue color in blue jays
(Figure 2.47).
Figure 2.48. Large particles, such as water droplets, cause light to be scattered in a largely
wavelength-independent manner. Hence, clouds appear a neutral gray or white; A Coruña,
Spain, August 2005.
Figure 2.49. Early morning fog causing Mie scattering; Casselberry, FL, 2004.
Figure 2.50. Smoke machines used for visual effect at concerts and in clubs produce fine
droplets that remain suspended in air, causing Mie scattering.
Mie scattering is only weakly dependent on wavelength and may scatter red
and green wavelengths in specific directions [815]. This is called polychroism and
occurs when the particles are larger than those causing Rayleigh scattering, but
still relatively small. For large particles, the scattering is largely independent of
wavelength, causing white light to remain white after scattering. As an example,
Mie scattering causes clouds to appear white or gray, as shown in Figure 2.48.
Mist and fog are other examples of Mie scattering (Figure 2.49), including artifi-
cially created fog such as that used at concerts (Figure 2.50).
A further natural diffraction-related phenomenon explained by Mie's theory is the back scattering that occurs in clouds near the anti-solar point. A viewer observing clouds with the sun directly behind, most commonly from an airplane, may see a circular pattern called a glory. Dependent on the distance between the observer and the clouds, the center of the glory will be occupied by
the shadow of the airplane. Figure 2.51 shows an example of a glory. Here the
distance between the clouds and the plane is large, and therefore the shadow is
negligibly small.
Figure 2.51. A glory; New Zealand, 2002. (Photograph courtesy of Timo Kunkel
(www.timo-kunkel.com).)
where S(r) is a real scalar function of space r. The function S(r) is known as
the optical path or the eikonal. Vectors e(r) and h(r) are complex vector valued
functions of space. Maxwell’s equations for this particular solution can then be
(\nabla S) \times h + \varepsilon\, e = -\frac{1}{i\beta} \nabla \times h, (2.173a)
(\nabla S) \times e - \mu\, h = -\frac{1}{i\beta} \nabla \times e, (2.173b)
e \cdot (\nabla S) = -\frac{1}{i\beta} \left( e \cdot \nabla \ln \varepsilon + \nabla \cdot e \right), (2.173c)
h \cdot (\nabla S) = -\frac{1}{i\beta} \left( h \cdot \nabla \ln \mu + \nabla \cdot h \right). (2.173d)
In the limit that the wavelength goes to zero, the value of \beta goes to infinity (because \beta = 2\pi/\lambda). As a result, Maxwell's equations simplify to

(\nabla S) \times h + \varepsilon\, e = 0, (2.174a)
(\nabla S) \times e - \mu\, h = 0, (2.174b)
e \cdot (\nabla S) = 0, (2.174c)
h \cdot (\nabla S) = 0. (2.174d)

Substituting h from (2.174b) into (2.174a) yields

\frac{1}{\mu} \left[ (e \cdot (\nabla S))\, \nabla S - e\, (\nabla S)^2 \right] + \varepsilon\, e = 0. (2.175)

Since e \cdot (\nabla S) = 0, the first term vanishes, so that

(\nabla S)^2 = \mu\, \varepsilon, (2.176a)
(\nabla S)^2 = n^2, (2.176b)
or equivalently

\left(\frac{\partial S}{\partial x}\right)^{\!2} + \left(\frac{\partial S}{\partial y}\right)^{\!2} + \left(\frac{\partial S}{\partial z}\right)^{\!2} = n^2(x, y, z). (2.177)
This equation is known as the eikonal equation. The gradient of S can be seen as the normal vector of the wavefront. It can be shown that the average Poynting vector \langle S \rangle is in the direction of the wavefront:

\langle S \rangle = \frac{\nabla S}{n}. (2.178)
This equation is known as the ray equation [466, 617, 650, 1081] and is valid
for inhomogeneous isotropic media that are stationary over time. This equation
expresses the fact that at every point in the medium, the tangent and the normal
vector associated with the ray path span a plane called the osculating plane, and
the gradient of the refractive index must lie in this plane.
A consequence of Fermat’s principle is that in a homogeneous medium where
n(r) is constant, light travels in a straight line. In inhomogeneous media, however,
light may travel along curved arcs according to the above ray equation. This
happens, for instance, in the atmosphere, as shown in Section 2.10.4. Finally,
Snell’s law for reflection and refraction can be derived from Fermat’s principle
(Section 2.5.1).
\cos(\theta) = N \cdot L. (2.181)

A' = A\, (N \cdot L) = A \cos(\theta), (2.182)
as illustrated in Figure 2.52. In this figure, a large surface is angled away from the
light source, whereas a smaller surface is aimed directly at the light source. The
Figure 2.52. A surface can only receive radiation proportional to its projected area in the
direction of the light source.
projected area of the larger surface in the direction of the light source is identical
to the projected area of the smaller surface, and therefore both surfaces receive
the same amount of radiation.
A similar law called the cosine law of emission states that radiation emitted from iso-radiant or Lambertian surfaces8 decreases with a factor of \cos(\theta), where \theta is the angle of exitance. An example is shown in Figure 2.53, where the radiation emitted from a point decreases as the angle of exitance increases.

Figure 2.53. The radiation emitted from a uniform diffuser decreases as the angle of exitance \theta increases.
Surfaces that obey the cosine law of emission appear equally light regardless
of the viewing direction. Although no perfect Lambertian surfaces exist in prac-
tice, many surfaces are well approximated by this law. These are called matte or
diffuse surfaces.
point individually, there may be spatial variation in the amount that is emitted. Iso-radiance requires
uniformity across the entire surface [839].
[Figure 2.54: spherical wavefronts emanating from a point source.]
of this relation lies in a more general law which stems from geometrical consid-
erations; this more general law is called the inverse square law9 .
Consider a point source that emits light. This light propagates through space
as spherical wavefronts. The farther the wave has traveled, the greater the surface
area spanned by the wave. The energy carried by the wave is thus distributed over
increasingly large areas, and therefore the energy density is reduced (Figure 2.54).
Figure 2.55. The objects farther from the light source (a 25W omni-directional incandescent light) are dimmer than the objects closer to it due to the inverse square law.
9 This law applies to other modalities as well, including gravity and sound.
The inverse square law states that the surface density of radiant energy emitted by a point source decreases with the squared distance d between the surface and the source:

E_e \propto \frac{1}{d^2}. (2.183)
An example is shown in Figure 2.55 where the mugs farther from the light source
are dimmer than the objects nearer to it.
As a general rule of thumb, if the distance of an observer to a source is greater than 10 times the largest dimension of the source, it can be approximated as a point source with respect to the observer [397]. The error associated with this approximation is important in radiometry and will be further discussed in Section 6.2.11.
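For an omni-directional point source, the proportionality in (2.183) can be made explicit by noting that the radiant flux spreads over a sphere of area 4\pi d^2. The helper names and the rule-of-thumb check below are a sketch, not from the text.

```python
import math

def point_source_irradiance(flux, d):
    """Irradiance at distance d from an omni-directional point source:
    the flux spreads over a sphere of area 4*pi*d^2, which gives the
    1/d^2 fall-off of Equation (2.183)."""
    return flux / (4.0 * math.pi * d * d)

def is_point_source(distance, largest_dimension):
    """Rule of thumb from the text: a source may be treated as a point
    source when the observer is more than ten times the source's largest
    dimension away [397]."""
    return distance > 10.0 * largest_dimension

# Doubling the distance quarters the irradiance.
near = point_source_irradiance(25.0, 1.0)
far = point_source_irradiance(25.0, 2.0)
```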
A = -\ln\!\left(\frac{E_e(s)}{E_e(0)}\right) (2.188a)
= \sigma_a\, s. (2.188b)

T = 10^{-\sigma_a s / \ln(10)}, (2.189a)
A = \sigma_a s / \ln(10). (2.189b)

A = \varepsilon\, c\, s, (2.190)
This relation is known as Beer’s law. Here, we have merged the ln(10) factor
with the constant ε . As both concentration c and pathlength s have the same
effect on absorbance and transmittance, it is clear that a given percent change
in absorbance or transmittance can be effected by a change in pathlength, or a
change in concentration, or a combination of both.
Beer’s law is valid for filters and other materials where the concentration c
is low to moderate. For higher concentrations, the attenuation will deviate from
that predicted by Beer’s law. In addition, if scattering particles are present in the
medium, Beer’s law is not appropriate. This is further discussed in Section 3.4.6.
Finally, if the pathlength is changed by a factor of k_1 from s to k_1 s, then the transmittance changes from T to T' as follows:

T' = T^{k_1}. (2.192)

Similarly, changing the concentration by a factor of k_2 changes the transmittance to

T' = T^{k_2}. (2.193)
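Combining (2.189)–(2.193): since A = \varepsilon c s and T = 10^{-A}, scaling either the pathlength or the concentration by a factor k raises the transmittance to the power k. A sketch, with illustrative function names and numbers:

```python
def transmittance(eps, c, s):
    """Transmittance of a non-scattering medium: absorbance A = eps*c*s
    per Beer's law (2.190), and T = 10**(-A) per (2.189)."""
    return 10.0 ** (-eps * c * s)

def scale_pathlength(T, k):
    """Changing the pathlength (or the concentration) by a factor k
    changes the transmittance from T to T**k, Equations (2.192)-(2.193)."""
    return T ** k

# A filter transmitting about 79% over pathlength s transmits about 63%
# over pathlength 2s; doubling the concentration instead has the same effect.
T1 = transmittance(1.0, 0.1, 1.0)
T2 = scale_pathlength(T1, 2.0)
```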
Figure 2.56. A perfect specular reflector (left) and a rough surface (right). Light incident
from a single direction is reflected into a single direction for specular surfaces, whereas
rough surfaces reflect light into different directions.
Figure 2.57. Self-occlusion occurs when reflected rays are incident upon the same surface
and cause a secondary reflection.
2.9.6 Micro-Facets
The most direct way to model the macroscopic reflectance behavior of surfaces
is to treat a surface as a large collection of very small flat surfaces, called micro-
facets. It is generally assumed that each facet is perfectly specular, as outlined in
the preceding section. The orientation of each facet could be explicitly modeled,
but it is more efficient to use a statistical model to describe the distribution of
orientations. An example is the first micro-facet model, now named the Torrance-
Sparrow model [1140].
Blinn proposed to model the distribution of orientations D(\Theta_i, \Theta_o) as an exponential function of the cosine between the half-angle and the aggregated surface normal [105]. The half-angle vector \Theta_h is taken between the angle of incidence \Theta_i and the angle of reflection \Theta_o:

\Theta_h = \frac{\Theta_i + \Theta_o}{\|\Theta_i + \Theta_o\|}. (2.194)
B = \frac{0.45\, \sigma^2}{\sigma^2 + 0.09}, (2.195c)
\alpha = \max(\theta_i, \theta_o), (2.195d)
\beta = \min(\theta_i, \theta_o). (2.195e)
In this set of equations, σ is a free parameter modeling the roughness of the sur-
face. The factor ρ is the (diffuse) reflectance factor of each facet. The directions
of incidence and reflection (Θi and Θo ) are decomposed into polar coordinates,
where polar angle θ is the elevation above the horizon and azimuth angle φ is in
the plane of the facet:
\Theta = (\theta, \varphi). (2.196)
The above micro-facet models are isotropic, i.e., they are radially symmetric.
This is explicitly visible in the Oren-Nayar model where the result depends only
on the difference between φi and φo , but not on the actual values of these two
angles. This precludes modeling of anisotropic materials, where the amount re-
flected depends on the angles themselves. A characteristic feature of anisotropic
materials is, therefore, that rotation around the surface normal may alter the
amount of light reflected. An often-quoted example of an anisotropic material
is brushed aluminium, although it is also seen at a much larger scale for exam-
ple in pressed hay as shown in Figure 2.58. Several micro-facet models that take
anisotropy into account are available [45, 46, 702, 921, 1005, 1212].
Figure 2.58. Pressing hay into rolls, as shown here, produces an anisotropic effect on
the right side due to the circular orientation of each of the grass leaves; Rennes, France,
June 2005.
fr(Θi, Θo) = δ(θo − θi) δ(φo − φr − π) / (sin(θr) cos(θr)), (2.198)

where we have used (2.196), and δ() is the Dirac delta function. At the other extreme, Lambertian surface models assume that light incident from all directions is reflected equally [645]:

fr(Θi, Θo) = 1/π. (2.199)

Although this is not a physically plausible model, it is a reasonable approximation for several materials, including matte paint.
An early model of glossy reflection was presented by Phong [897] and is given here in modified form, with the coordinate system chosen such that x and y are in the plane of the surface and the z-coordinate is along the surface normal [895]:

fr(Θi, Θo) = (Θi · (−Θo,x, −Θo,y, Θo,z)^T)^e. (2.200)
Here, the outgoing direction Θo is first scaled by (−1, −1, 1)^T, thus mirroring this vector in the surface normal. It is now possible to make this scaling a user parameter, allowing the modeling of various types of off-specular reflection:

fr(Θi, Θo, s) = (Θi · s Θo^T)^e. (2.201)

While this BRDF could be used to model glossy materials, a further refinement can be made by realizing that this function specifies a single lobe around the angle of incidence. More complicated BRDFs which are still physically plausible can be created by modeling multiple lobes and summing them:

fr(Θi, Θo) = ρd/π + Σ_{i=1}^n (Θi · si Θo^T)^{ei}. (2.202)

The term ρd/π models a diffuse component. Each of the n lobes is modeled by a different scaling vector si as well as a different exponent ei. The result is called the Lafortune model [637].
An empirical anisotropic reflectance model takes into account the orientation of the surface. The amount of light reflected into Θo depends on the angle of rotation φn around the surface normal. Defining directional vectors for the incoming and outgoing directions (vΘi and vΘo), the half-vector vh is defined by

vh = (vΘi + vΘo) / ‖vΘi + vΘo‖. (2.203)

The angle Θnh is between the half-vector and the surface normal. Ward's anisotropic BRDF is then given by [269, 1200, 1212]

fr(Θi, Θo) = ρd/π + (ρs / √(cos(θi) cos(θo))) (A / (4π σx σy)), (2.204a)

where

A = exp(−tan²(Θnh) (cos²(φn)/σx² + sin²(φn)/σy²)). (2.204b)

The specular and diffuse components are modeled by ρs and ρd; their sum will be less than 1. The function is parameterized in σx and σy, each of which is typically less than 0.2. These two parameters quantify the level of anisotropy in the model. If they are given identical values, i.e., σ = σx = σy, the model simplifies to an isotropic model:

fr(Θi, Θo) = ρd/π + (ρs / √(cos(θi) cos(θo))) (exp(−tan²(Θnh)/σ²) / (4π σ²)). (2.205)
Material                    ρd     ρs     σx     σy
Lightly brushed aluminium   0.15   0.19   0.088  0.13
Rolled aluminium            0.10   0.21   0.04   0.09
Rolled brass                0.10   0.33   0.05   0.16
Enamel finished metal       0.25   0.047  0.080  0.096

Table 2.5. Example parameter settings for Ward's model of anisotropic reflection (after [1212]).
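Ward's model with the table's parameters might be evaluated as follows; the 1/√(cos θi cos θo) normalization is assumed from Ward's formulation [1212]:

```python
import math

def ward_brdf(wi, wo, rho_d, rho_s, sx, sy):
    """Ward's anisotropic BRDF; the surface normal is the z-axis.
    The 1/sqrt(cos t_i cos t_o) normalization is an assumption
    following Ward's paper [1212]."""
    # half-vector between the incoming and outgoing directions, (2.203)
    h = (wi[0] + wo[0], wi[1] + wo[1], wi[2] + wo[2])
    norm = math.sqrt(sum(c * c for c in h))
    h = tuple(c / norm for c in h)
    cos_nh = h[2]
    tan2_nh = (1.0 - cos_nh * cos_nh) / (cos_nh * cos_nh)
    # azimuth of the half-vector around the normal
    r = math.hypot(h[0], h[1])
    cos2_ph = (h[0] / r) ** 2 if r > 0.0 else 1.0
    sin2_ph = 1.0 - cos2_ph
    A = math.exp(-tan2_nh * (cos2_ph / sx**2 + sin2_ph / sy**2))  # (2.204b)
    spec = rho_s * A / (4.0 * math.pi * sx * sy * math.sqrt(wi[2] * wo[2]))
    return rho_d / math.pi + spec

# Rolled aluminium from Table 2.5, viewed and lit along the normal:
f = ward_brdf((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 0.10, 0.21, 0.04, 0.09)
```

With σx ≠ σy the specular lobe is elongated around the normal, so rotating the sample around its normal changes the reflected intensity, which is the defining behavior of anisotropic materials described above.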
“light” here, indicated with the symbol L, since the appropriate radiometric term, radiance, is not
defined yet. Its definition can be found in Section 6.2.7.
where P′ are points on other surfaces in the scene that contribute to the incident light. The total amount of light traveling from point P in direction Θo is then simply the sum of all reflected light into that direction plus the light emitted at surface point P. The latter value is only non-zero for self-luminous surfaces. The equation governing light transport in a scene is known as the rendering equation and is given by [568]

Lo(P, Θo) = Le(P, Θo) + ∫_Ωi fr(P, Θi, Θo) Li(P′, Θi) cos(θi) dωi. (2.207)
serves to specify what will be visible in the image, much like holding a photo
camera in a real environment determines the composition of a photograph.
Rather than start at the light sources and follow the path that photons take
through the scene until they hit the view plane, the process is usually reversed
and light is traced backwards from the view point through the view plane into the
scene. This optimization helps reduce the number of rays that need to be traced,
as there is no guarantee that any ray started at a light source will ever reach the
view plane.
When a ray hits a surface at a point P, called the intersection point, the rendering equation can be evaluated by sampling the environment with further (secondary) rays. This sampling is potentially costly, as most of the environment would have to be sampled to determine which parts of the scene illuminate the intersection point.
To reduce the cost of sampling the environment, it is noted that some parts of
the scene contribute more to the illumination of a point than others. In particular,
nearby light sources that are directly visible from P are likely to contribute the
bulk of illumination. Similarly, unless the material is Lambertian, light arriving
from the reflected direction is more likely to contribute than that from other direc-
tions. By splitting the integral in (2.207) into separate components, an efficient
approximate solution can be found:
+ ρd(P) La(P).

In this equation, the first term signifies the light emitted from point P, as before. The second term samples all the light sources directly. In it, v(P, PS) is the visibility term, which equals 1 if there is a direct line of sight between P and the sample point PS on the light source; the visibility term is 0 otherwise. The third term in (2.208) samples a small set of directions centered around the reflected direction to account for specular and possibly glossy reflection. All other light directions are not sampled but are approximated with a constant, the fourth term in (2.208). This last term is named the ambient term. It should also be noted that the BRDF is split into a specular and a diffuse component (fr^spec and fr^diff) for the sake of convenience.
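The direct-lighting term of this split can be sketched as follows; the point-light list, visibility callback, and inverse-square falloff are illustrative assumptions, not the book's notation:

```python
import math

def direct_light(p, n, rho_d, lights, visible):
    """Direct light-source sampling: each sample point ps on a light
    contributes only when the visibility term v(p, ps) is 1. A
    Lambertian BRDF rho_d/pi is assumed for the shaded point."""
    L = 0.0
    for ps, intensity in lights:
        if not visible(p, ps):      # v(p, ps) = 0: the light is occluded
            continue
        d = tuple(ps[k] - p[k] for k in range(3))
        r2 = sum(x * x for x in d)
        wi = tuple(x / math.sqrt(r2) for x in d)
        cos_t = max(0.0, sum(n[k] * wi[k] for k in range(3)))
        L += (rho_d / math.pi) * intensity * cos_t / r2
    return L

# One unoccluded point light two units above a horizontal surface:
L = direct_light((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.7,
                 [((0.0, 0.0, 2.0), 10.0)], lambda p, ps: True)
```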
Figure 2.59. Diffuse inter-reflection between the rocks creates an intense orange
glow; Bryce Canyon, Southern Utah. (Image courtesy of the National Park Service
(http://www.nps.gov/archive/brca/photogalleryarchive.htm).)
2.10.2 Radiosity
While ray tracing is able to produce serviceable images in many applications, the
approximations are such that many light paths are not sampled. For instance, light
reflected from a Lambertian surface incident upon a second Lambertian surface,
i.e., diffuse inter-reflection, is not accounted for. Instead, in ray tracing this com-
ponent is approximated with a single constant, the fourth term in (2.208). An
example of diffuse inter-reflection in nature is shown in Figure 2.59.
Under the assumption that an environment consists of Lambertian surfaces,
the BRDFs for all surfaces simplify to a constant:

fr(P) = ρd(P)/π. (2.209)

The rendering equation (2.207) then simplifies to become the radiosity equation [200]:
where the integration is over differential surface areas dA′. To make the solution to this equation tractable, the scene is usually subdivided into small patches Ai, where the energy associated with each patch is assumed to be constant. For a patch i, this yields the following equation:

Li = Le,i + (ρd,i/π) Σ_j (1/Ai) ∫_{Ai} ∫_{Aj} Lj (cos(Θi) cos(Θj) / (π r²)) δij dAj dAi. (2.211)
In this equation, r is the distance between the patches i and j, and δij gives the mutual visibility between the differential areas of the patches i and j. This equation can be rewritten as

Li = Le,i + ρd,i Σ_j fi→j Lj. (2.212)
Here, the form factor fi→ j is the fraction of power leaving patch i that arrives at
patch j. Form factors depend solely on the geometry of the environment, i.e., the
size and the shape of the elements and their orientations relative to each other.
Therefore, the radiosity method is inherently view independent. It is normally
used as a preprocessing step to compute the diffuse light distribution over a scene
and is followed by a rendering step that produces the final image (or sequence of
images).
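With form factors precomputed, the resulting linear system can be solved iteratively; a minimal sketch, where the patch emissions, reflectances, and form-factor matrix F are hypothetical inputs:

```python
def solve_radiosity(emission, reflectance, F, iters=64):
    """Gathering iteration for the patch system
    L_i = Le_i + rho_i * sum_j F[i][j] * L_j.
    The iteration converges because reflectances are below 1 and each
    row of the form-factor matrix F sums to at most 1."""
    n = len(emission)
    L = list(emission)
    for _ in range(iters):
        L = [emission[i] + reflectance[i] *
             sum(F[i][j] * L[j] for j in range(n))
             for i in range(n)]
    return L

# Two parallel patches that each see the other with form factor 0.2;
# only the first patch emits:
L = solve_radiosity([1.0, 0.0], [0.5, 0.5], [[0.0, 0.2], [0.2, 0.0]])
```

Because the form factors are purely geometric, this solve is view independent: the same patch radiances can be reused for any camera position in the final rendering step.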
energy whenever such rays hit a Lambertian surface. The data structure used to
store the energy of photons is called a photon map [549].
As light is properly refracted and reflected during this initial pass, the photon
map accurately encodes the caustics caused by transparent and specular objects.
During a second pass, conventional ray tracing may be used in a slightly mod-
ified form. Whenever a Lambertian surface is intersected, rather than continue as
normal, the photon map may be sampled instead. Variations are possible where
rather than storing and sampling energy in a photon map, angles of incidence are
recorded. In the second pass, rather than aiming rays directly at the light sources,
the values in the photon map are read to determine secondary ray directions. In
either solution, caustics can be rendered. In addition, photon mapping may be
extended to simulate the atmosphere as detailed in Section 2.11.
Figure 2.61. The varying density of the atmosphere causes the sun to be in a different
position than where it appears to be.
its temperature—the higher the temperature, the lower its density. The index of
refraction in turn depends on the density of the medium. Within a medium with a
smoothly varying refractive index, rays bend towards the area with a greater index
of refraction [112]. For the earth’s atmosphere, this has the implication that when
the sun is seen just above the horizon, it is in fact just below the horizon, as shown
in Figure 2.61.
In the atmosphere, the density of air is mediated by both pressure and tem-
perature, with both normally being highest near the earth’s surface. However,
local variations in temperature can bend rays in unexpected directions, yielding
effects such as mirages and fata morganas [783]. The inhomogeneous density of
air results in a spatially varying index of refraction.
To bend rays along their path, after advancing each ray by a small distance, its
direction l needs to be recomputed as a function of the local gradient of the refrac-
tive index ∇ n(r). This can be achieved by evaluating the ray equation (2.180).
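A first-order sketch of this procedure, with a hypothetical index field n and its gradient supplied as callbacks (this illustrates the idea rather than fully integrating the ray equation):

```python
import math

def bend_ray(pos, direction, n_of, grad_n, step, steps):
    """March a 2D ray through a medium with a smoothly varying
    refractive index: each step, the direction is nudged toward the
    gradient of n and renormalized. A crude first-order stand-in for
    integrating the ray equation (2.180)."""
    x, y = pos
    dx, dy = direction
    for _ in range(steps):
        x += step * dx
        y += step * dy
        gx, gy = grad_n(x, y)
        n = n_of(x, y)
        dx += step * gx / n        # d/ds (n * direction) ~ grad n
        dy += step * gy / n
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
    return (x, y), (dx, dy)

# An index that increases with height bends a horizontal ray upward,
# i.e., toward the greater index of refraction:
n_of = lambda x, y: 1.0 + 1e-4 * y
grad_n = lambda x, y: (0.0, 1e-4)
(_, _), (dx, dy) = bend_ray((0.0, 0.0), (1.0, 0.0), n_of, grad_n, 1.0, 1000)
```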
If sunlight heats the ground, then close to the ground the index of refraction
changes more rapidly than higher above the ground. If an observer looks at a
distant object, then this may give rise to multiple paths that light rays may take
between the object and the observer. As an example, Figure 2.62 shows how
a curved path close to the ground plane may carry light between the roof and
Figure 2.62. The roof of the building is visible along a direct path, as well as along a
curved path. The latter gives the impression of the building being reflected in the ground
plane.
Figure 2.63. Rendering of an inferior mirage. (Image courtesy of Diego Gutierrez, Adolfo
Munoz, Oscar Anson, and Francisco Seron; Advanced Computer Graphics Group (GIGA),
University of Zaragoza, Spain [411].)
the observer, whereas a straighter path farther from the ground also carries light
between the roof and the observer. The effect is that the building is visible both
upright and inverted. This is known as a mirage.
Dependent on atmospheric conditions, layers of colder and warmer air some-
times alternate, leading to different types of mirages. The configuration shown
in Figure 2.62 leads to an inferior mirage. A rendered example is shown in Fig-
ure 2.63. An inversion layer may cause the temperature at some altitude to be
higher than at ground level, and this gives rise to a superior mirage, as rendered
in Figure 2.64. Finally, if several such inversion layers are present, observers may
see a fata morgana, as shown in Figure 2.65. Further atmospheric phenomena, including rainbows, halos, and scattering, are discussed in Section 2.11.
Figure 2.64. Rendering of a superior mirage. (Image courtesy of Diego Gutierrez, Adolfo
Munoz, Oscar Anson, and Francisco Seron; Advanced Computer Graphics Group (GIGA),
University of Zaragoza, Spain [411].)
Figure 2.65. Rendering of a fata morgana. (Image courtesy of Diego Gutierrez, Adolfo
Munoz, Oscar Anson, and Francisco Seron; Advanced Computer Graphics Group (GIGA),
University of Zaragoza, Spain [411].)
2.10.5 Optimizations
Many optimization techniques have been applied to make rendering algorithms
faster, including spatial subdivision techniques, which essentially sort the objects
in space allowing intersection points to be computed without testing each ray
against all objects [437]. Other strategies include distributed and parallel process-
ing, allowing multiple processing units to partake in the computations [161].
As sampling is a core technique in rendering algorithms, optimizing sampling
strategies has received a great deal of attention. In particular, from (2.207) we
have that each point in the scene contributes some amount of light to be reflected
in a direction of our choosing. Using pre-processing techniques, sampling may
be directed towards those parts of the scene that contribute most to the final re-
sult. Such techniques are collectively called importance sampling. An example is
Ostromoukhov’s technique, which is based on Penrose tiling [863].
At the same time, the BRDF kernel weights these contributions. It is therefore reasonable to sample only directions that will be weighted with a large factor. Sampling techniques that take both the environment and the BRDF into account to generate sampling directions allow accurate results with a minimal number of samples [655].
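As a small illustration of importance sampling, the cosine factor of (2.207) can be absorbed into the sample distribution for a Lambertian BRDF; the mapping below is the standard polar disk-projection construction, not a technique from the text:

```python
import math, random

def cosine_weighted_direction(rng=random):
    """Sample a direction about the z-oriented surface normal with
    density proportional to cos(theta), so that the cosine factor of
    the rendering equation is absorbed into the sampling density."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(u1)                    # radius on the unit disk
    phi = 2.0 * math.pi * u2
    x, y = r * math.cos(phi), r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))    # lift the disk point onto the hemisphere
    return (x, y, z)

# Sampled directions always lie in the hemisphere above the surface:
d = cosine_weighted_direction()
```

Directions near the normal, which carry the largest cosine weight, are drawn most often, so the Monte Carlo estimator spends its samples where they matter most.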
Figure 2.66. The geometry related to computing the position of the sun in the sky.
The atmosphere gives rise to phenomena such as rainbows, halos, and the northern and southern lights. Each of these
phenomena can be explained [716, 783], and most have been modeled and ren-
dered. Examples include simulation of night scenes [1124], rendering of northern
and southern lights [59], and rendering of mirages and fata-morganas [410, 411].
In this section, we begin by briefly explaining several atmospheric phenomena
and then show how some of them may be incorporated into an image synthesis
algorithm.
11 The Julian date is an integer in the range [1, 365] indicating the day of the year.
t = ts + 0.170 sin(4π(J − 80) / 373) (2.213)
    − 0.129 sin(2π(J − 8) / 355) + 12(MS − L) / π. (2.214)
The solar declination δ, also required to compute the position of the sun, is approximated by

δ = 0.4093 sin(2π(J − 81) / 368). (2.215)

With the help of the site latitude l (in radians), the position of the sun in the sky is then given by (see also Figure 2.66)

θs = π/2 − sin⁻¹(sin(l) sin(δ) − cos(l) cos(δ) cos(π t / 12)), (2.216)

φs = tan⁻¹( −cos(δ) sin(π t / 12) / (cos(l) sin(δ) − sin(l) cos(δ) cos(π t / 12)) ). (2.217)
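The equations above translate directly into code; J, ts, MS, and L follow the notation of (2.213)-(2.217), and the equinox check at the end is only a sanity test:

```python
import math

def sun_position(J, ts, lat, MS=0.0, L=0.0):
    """Zenith angle theta_s and azimuth phi_s of the sun: J is the
    Julian date, ts the standard time in decimal hours, lat the site
    latitude in radians, MS and L the standard meridian and site
    longitude in radians."""
    t = (ts + 0.170 * math.sin(4.0 * math.pi * (J - 80) / 373)
            - 0.129 * math.sin(2.0 * math.pi * (J - 8) / 355)
            + 12.0 * (MS - L) / math.pi)                       # solar time
    delta = 0.4093 * math.sin(2.0 * math.pi * (J - 81) / 368)  # declination
    h = math.pi * t / 12.0
    theta_s = math.pi / 2 - math.asin(math.sin(lat) * math.sin(delta)
                                      - math.cos(lat) * math.cos(delta)
                                      * math.cos(h))
    phi_s = math.atan2(-math.cos(delta) * math.sin(h),
                       math.cos(lat) * math.sin(delta)
                       - math.sin(lat) * math.cos(delta) * math.cos(h))
    return theta_s, phi_s

# Near solar noon at the spring equinox (J = 81, so delta ~ 0), the
# sun's zenith angle approximately equals the site latitude:
theta_s, phi_s = sun_position(J=81, ts=12.0, lat=math.radians(45.0))
```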
The second characteristic of the sun required for accurate modeling of the
atmosphere is its spectral distribution. Outside the atmosphere, the spectral dis-
tribution of the solar irradiance is given in Figure 2.67 [403, 404]. At the Earth’s
Figure 2.67. The extraterrestrial solar irradiance [403, 404], as well as two terrestrial conditions (global tilt, and direct + circumsolar), as specified by the American Society for Testing and Materials (ASTM) in standard G173-03 [48].
surface, this distribution is altered due to interaction with the atmosphere [1037].
The American Society for Testing and Materials (ASTM) has defined two refer-
ence spectra using conditions representative of average conditions encountered in
mainland USA [48]; these are plotted in Figure 2.67. These terrestrial standards
make assumptions on the atmosphere, including the use of the 1976 U.S. Stan-
dard Atmosphere [1166] and a surface spectral reflectivity as produced by light
soil [486].
Figure 2.68. A sunset. The distance traveled by sunlight through the thin upper atmo-
sphere causes much of the light at blue wavelengths to be scattered out, leaving the red
part of the spectrum. This is an example of Rayleigh scattering; Clevedon, UK, August
2007.
Figure 2.69. The blue sky is caused by Rayleigh scattering in the thin upper atmosphere;
St. Malo, France, June 2005.
This colors the sun orange and red, as shown in Figure 2.68. For a model for
rendering twilight phenomena the reader is referred to a paper by Haber [413].
The color of the sky at mid-day is also due to Rayleigh scattering. While most
wavelengths travel through the atmosphere undisturbed, it is mostly the blue that
is scattered, as shown in Figure 2.69.
The atmosphere contains many particles that are relatively large. These particles give rise to Mie scattering, leading, for instance, to haze, a whitening of the sky's color. Finally, the scattering of light causes objects in the distance to appear desaturated and shifted towards the blue. This phenomenon is known as aerial perspective.
Haze may be described with a heuristic parameter called turbidity [760]. To determine an appropriate value for turbidity on the basis of a description of the atmosphere or the meteorological distance,12 Table 2.6 may be consulted (after [925]13).
With the position of the sun known, and a value for turbidity specified, it is
now possible to compute aerial perspective. When light is reflected off a distant
object, along its path towards the observer it may be attenuated. In addition, the
12 The meteorological distance is defined as the distance under daylight conditions at which the
apparent contrast between a black target and the horizon is just noticeable. It roughly corresponds to
the most distant feature that can be distinguished.
13 The following derivation to compute the attenuation along a given path through the sky is also
Table 2.6. Turbidity T as function of the type of atmosphere. Values are approximate.
atmosphere scatters light at all positions in space, and in particular along this
path, some light may be scattered into the direction of the observer. This is called
in-scattering. The light arriving at the observer is therefore a function of the in-
scattered light Lin and the attenuated light L0 :
L = τ L0 + Lin , (2.218)
where L is the amount of light arriving at the observer and τ is the extinction fac-
tor. Both the in-scattering term and the extinction factor depend on the scattering
medium and are, therefore, subject to both Rayleigh scattering for molecules and
Mie scattering for larger particles. The extinction factor (as well as Lin ) depends
on the scattering coefficients βm and βh for molecules and haze respectively:
τ(s) = exp(−∫_0^s βm(h(x)) dx) exp(−∫_0^s βh(h(x)) dx). (2.219)
The function h(x) denotes the height of each point x along the path between 0 and
s and is given by h(x) = h0 + x cos(Θ). The extinction factor is thus an integral
over the path taken by the light through the atmosphere. As the scattering function
depends on the density of the particles, and therefore on the height h above the
Earth’s surface, the scattering function for both βm and βh is given by
β (h) = β 0 exp(−α h), (2.220)
where β 0 is the scattering coefficient at the Earth’s surface and α = 1.3 is the
exponential decay constant. The value of β 0 can be computed as a function of
angle θ (see Figure 2.66), or can be integrated over all angles. These expressions
for the Rayleigh scattering part are given by βm0 :
βm0(θ) = (π² (n² − 1)² / (2 N λ⁴)) ((6 + 3 pn) / (6 − 7 pn)) (1 + cos²(θ)), (2.221)

βm0 = (8 π³ (n² − 1)² / (3 N λ⁴)) ((6 + 3 pn) / (6 − 7 pn)). (2.222)
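The total Rayleigh coefficient of (2.222) is a one-line computation. The default refractive index n, molecular number density N, and depolarization factor pn below are typical sky-model values and are assumptions, not values from the text:

```python
import math

def rayleigh_beta0(lam, n=1.0003, N=2.545e25, pn=0.035):
    """Total Rayleigh scattering coefficient at sea level, (2.222).
    lam is the wavelength in meters; N is in molecules per m^3."""
    return (8.0 * math.pi**3 * (n * n - 1.0)**2 / (3.0 * N * lam**4)
            * (6.0 + 3.0 * pn) / (6.0 - 7.0 * pn))

# The 1/lambda^4 dependence: light at 450 nm is scattered roughly
# (650/450)^4 times more strongly than light at 650 nm, which is why
# the clear sky is blue and the low sun is reddened.
ratio = rayleigh_beta0(450e-9) / rayleigh_beta0(650e-9)
```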
Figure 2.70. Values for η given different wavelengths and scattering angles. These values
are valid for a value of 4 for Junge’s constant v.
where v is Junge's exponent, which takes a value of 4 for sky modeling. The concentration factor c depends on the turbidity T. The extinction factor along a path of length s can then be written as

τ(s) = exp(−am(bm − um(s))) exp(−ah(bh − uh(s))), (2.226)
u(x) = exp(−α(h0 + x cos(θ))), (2.227)
[Figure: the factor K as a function of wavelength, varying between about 0.65 and 0.69 over the range 350-800 nm.]
a = −β0 / (α cos(θ)), (2.228)
b = exp(−α h0). (2.229)
This expression thus forms the first component of the computation of atmospheric
attenuation and can be plugged into (2.218). The second component involves the
computation of the light that is scattered into the path. At any given point along
the path, some light is scattered in, and this quantity thus needs to be integrated
over the length of the path.
We first compute the light scattered in at a specific point. This quantity de-
pends on the scattering functions βm0 and βh0 at sea level, as well as altitude h. We
can thus write expressions for the angular scattering functions in a manner similar
to (2.220):
components):

Lin(x) = um(x) ∫_Ω Ls(ω) βm0(ω, θ, φ) dω + uh(x) ∫_Ω Ls(ω) βh0(ω, θ, φ) dω, (2.231)

where we have applied (2.230) and taken u(x) out of the integrals. The total amount of scattering along a path can be computed by integrating the above expression over this path, taking into account the attenuation τ that occurs at the same time (this component can be taken from (2.226)):

Lin = ∫_0^s Lin(x) τ(s) dx. (2.232)
2.11.3 Rainbows
During a rain shower, droplets of water are suspended in air. If a directional light
source, such as the sun, illuminates a volume with water droplets, each droplet
Figure 2.72. A rainbow; Milford Sound, New Zealand, 2001. (Photo courtesy of Timo Kunkel (www.timo-kunkel.com).)
[Figure 2.73: the primary rainbow appears at an angle of 42° and the secondary rainbow at 51°, relative to the direction of sunlight, as seen by the observer.]
reflects and refracts light. The refraction causes dispersion of the light which de-
pends on wavelength. This gives rise to a rainbow (Figure 2.72). The rainbow is
due to a single internal reflection inside the water droplet. If a second reflection
occurs, this gives rise to a weaker secondary rainbow. The location of these rain-
bows with respect to the sun and the observer is diagrammed in Figure 2.73. The
Figure 2.74. Intensity as a function of scattering angle for water droplets of 1000 μm (std. dev. 120 μm) suspended in air and lit by sunlight. These plots were computed for perpendicular polarization. (Data generated using Philip Laven's Mieplot program [654].)
Figure 2.75. Intensity as a function of scattering angle for water droplets of 1000 μm (std. dev. 120 μm) suspended in air and lit by sunlight. These plots were computed for parallel polarization. (Data generated using Philip Laven's Mieplot program [654].)
region in between the primary and secondary rainbows is somewhat darker than
the surrounding sky and is known as Alexander’s dark band.
Droplets of a given size will scatter light over all angles, and this behavior may be modeled with a phase function, which records, for each outgoing angle, how much light is scattered in that direction. Figures 2.74 and 2.75 show the distribution of intensities generated by water droplets with a size distributed around 1000 μm with a log-normal distribution having a standard deviation of 10 μm. These distributions differ in their polarization and were both generated for droplets suspended in air lit by sunlight. The plots show marked peaks in the forward direction (no scattering, i.e., a scattering angle of 0 degrees), as well as peaks for the primary and secondary rainbows. In addition, Alexander's dark band is visible.
2.11.4 Halos
Some clouds, such as cirrus clouds, are formed by ice particles rather than rain
droplets. These ice particles are usually assumed to be shaped like a hexagonal slab with a varying height-to-width ratio. Crystals that are flat are called plates, whereas crystals that are elongated are called pencils. Under windless conditions,
Figure 2.76. Ice crystals formed in cirrus clouds are usually assumed to be either plates
or pencils.
Figure 2.77. Different types of halos (the 22° halo, the 46° halo, parhelia or sun dogs, and the upper tangent arc) and their positions in relation to the observer and the sun (after [377]).
Figure 2.78. Halo around the sun. Toward the lower left a second refraction is visible;
Southern Utah, 2001.
have more complicated shapes than simple hexagonal forms [996]. As a result, the phase function shown in Figure 2.80, which is based on a simplified model, suggests that the 22° and in particular the 46° halos are more common than they are in practice. See also Sassen and Ulanowski for a discussion [997, 1161].
Figure 2.79. The 46◦ halo, photographed with a wide-angle lens; Wanaka, New Zealand,
2001. (Photo courtesy of Timo Kunkel (www.timo-kunkel.com).)
Figure 2.80. Phase function for hexagonal ice crystals (after [963]).
Figure 2.81. On the left, white light is passed through a dispersing prism as well as a tank
of clear water. Normal dispersion is observed. On the right, a few drops of milk are added
to the water. The blue component of the dispersed light is scattered laterally, so that a red
disk with a green rim remains.
light through a container with milky water. The result is a simulation of a sunset
with added dispersion.
∂L(P, Θ)/∂P = α(P) Le(P, Θ) (2.233)
            + α(P) Li(P, Θ)
            − σ(P) L(P, Θ)
            − α(P) L(P, Θ).
With α the absorption coefficient and σ the scattering coefficient, the four terms in
this equation model emission, in-scattering, out-scattering, and absorption. The
term in-scattering refers to light coming from any direction which is scattering
into the direction of the ray path. Out-scattering accounts for the light scattering
out of the ray's path. The in-scattered light Li can come from any direction and is therefore an integral over the full sphere Ω:

Li(P, Θ) = ∫_Ω p(P, Θ, Θ′) L(P, Θ′) dΘ′. (2.234)
Thus, the light coming from each direction is weighted by a phase function p
that models the scattering properties of the volume, i.e., the directionality of the
scattering. As desired, this function can be chosen to account for rainbows [963],
halos [371, 372, 377, 541, 542], as well as clouds. In addition, a specific form of
Figure 2.82. Rendering of participating media. The left image shows distortions caused
by spatially varying temperature. The middle image shows a 3D turbulence field simulating
smoke. An anisotropic medium is modeled in the rightmost image. (Images courtesy of
Diego Gutierrez, Adolfo Munoz, Oscar Anson, and Francisco Seron; Advanced Computer
Graphics Group (GIGA), University of Zaragoza, Spain [411].)
this function that accounts for the coloring of the sky was shown in Section 2.11.2. Section 3.8.2 discusses an alternative phase function for rendering flames.
In the above, it is implied that each function also depends on wavelength. The scattering is elastic, i.e., scattering does not change a particle's wavelength. To account for inelastic scattering, and thus model fluorescence and phosphorescence, an extra term can be added to the radiative transfer equation:

∫_Ω ∫_λi αλi(P) fλi→λ(P) (pλ(P, Θ′, Θ) / (4π)) Lλi(P, Θ′) dΘ′ dλi. (2.235)
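A forward-Euler ray marcher through a homogeneous medium illustrates the balance of gain and loss terms in the radiative transfer equation (2.233); the coefficients and step size below are arbitrary illustrative values:

```python
def march(L0, alpha, sigma, emit, inscatter, ds, steps):
    """Forward-Euler integration of (2.233) along a ray: emission and
    in-scattering add light, while out-scattering and absorption remove
    it. All coefficients are per unit length and constant here."""
    L = L0
    for _ in range(steps):
        dL = alpha * (emit + inscatter) - (sigma + alpha) * L
        L += ds * dL
    return L

# A purely absorbing medium (no scattering, no emission) attenuates
# the ray toward exp(-alpha * distance), here exp(-1):
L = march(1.0, alpha=0.5, sigma=0.0, emit=0.0, inscatter=0.0,
          ds=0.01, steps=200)
```

In a real renderer the coefficients, emission, and in-scattering vary along the ray, and the in-scattered term is itself the phase-function integral of (2.234) evaluated at each step.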
2.12 Summary
Light is modeled as harmonic electromagnetic waves. The behavior of electric
and magnetic fields gives rise to the propagation of light through free space.
The electric and magnetic vectors are orthogonal to the direction of propagation,
which is indicated by the Poynting vector. The magnitude of the Poynting vector
is commensurate with the amount of energy carried by the wave.
While propagating, the electric and magnetic vectors normally rotate around
the Poynting vector, inscribing an ellipse. When the ellipse degenerates either
into a circle or a line, the light is said to be polarized.
Chapter 3
Chemistry of Matter
In the previous chapter, light was treated exclusively as a wave phenomenon. This
model is suitable for explaining reflection, refraction, and polarization. However,
the wave model alone is not able to explain all phenomena of light. It is therefore
necessary to extend the theory to include the notion of particles. This leads to
the concept of wave-particle duality: light has behaviors commensurate with both
wave models as well as particle models.
It was postulated by de Broglie that not only light, but all matter exhibits
both wave-like and particle-like behaviors. This includes protons, neutrons and
electrons that form atomic and molecular structures.
Thus, a model is needed to account for both wave and particle behaviors,
and such a model is afforded by Schrödinger’s equation. This equation can be
used to show that particles can only have specific energies, i.e., energy levels of
particles are generally quantized. The study of particles at the microscopic level
is therefore termed quantum mechanics [179, 683, 1109].
This chapter first briefly reviews the classical physics that forms the basis of quantum mechanics. Then, quantum mechanics and molecular orbital theory are explained. These theories are necessary to understand how light interacts with matter. In particular, we discuss the close relationship between the energy states of electrons orbiting atomic nuclei and the emission, absorption, or reflection of light at particular wavelengths.
The material presented in this chapter, in conjunction with electromagnetic
theory, also affords insight into the behavior of light when it interacts with matter
which itself has structure at the scale of single wavelengths.
121
p = m v. (3.1)
L = m v r. (3.2)
The momentum of a particle will change over time if a force is applied to it. According to Newton's second law of mechanics, a constant force F gives rise to a linear change of momentum with time:

F = d(mv)/dt = dp/dt. (3.3)
dt dt
For a particle with constant mass m, this law may be rewritten to yield

F = m dv/dt = m a, (3.4)
where a is the acceleration of the particle. In addition to mechanical forces, parti-
cles may be subject to forces induced by charges, as predicted by Coulomb’s law
and given in Equation (2.1).
mon symbol used for luminance. In the remainder of this book, L is therefore taken to mean luminance.
E = T + U. (3.5)

T = m v² / 2 = p² / (2m). (3.6)

Thus, the kinetic energy of a particle may be expressed in terms of its linear momentum p. The potential energy of a charged particle in the electrostatic field of another particle a distance r away is given by

U = Q₁ Q₂ / r. (3.7)
In the following, we will assume that, for a given system of particles, the total
energy remains constant. In that case, the Hamiltonian H of the system equals its
total energy:
H = E. (3.8)
If we assume that heavy particles, such as protons, are stationary, then the kinetic
energy is due to the electrons only. For instance, the Hamiltonian of a hydrogen
atom (which consists of one proton and one electron) becomes
H = T +U (3.9a)
p 2 e2
= − . (3.9b)
2m r
In this equation, e is the unit charge, which is positive for the proton and nega-
tive for the electron, and r is the distance between the proton and the electron.
For atoms with multiple protons and electrons, the Hamiltonian becomes more
complex as it will include a term for each pair of particles. As an example, the
Hamiltonian for a helium atom with two protons and two electrons would be
H = p1²/(2m) + p2²/(2m) − 2e²/r1 − 2e²/r2 + e²/r12 ,
where r1 and r2 are the distances of the two electrons to the nucleus, and r12 is the distance between the two electrons.
therefore the kinetic energy must increase. For the electron and the proton to
remain together, this excess kinetic energy must be lost. Otherwise, the kinetic
energy of the electron can become so large as to overcome the attractive force
of the proton. In that case, the electron would escape. Stability of a system
thus implies that the negative potential energy is greater than the positive kinetic
energy. A consequence is that the total energy of the system has to be negative
for the system to be stable.
While many phenomena may be adequately explained by classical physics, at
the microscopic level there are discrepancies between effects that can be measured
and those that are predicted by the theory. This has necessitated a new theory.
p = mc. (3.12)
Here, p is the relativistic momentum of the photon. The relativistic energy of the
photon is then given by
E = pc. (3.13)
For a given particle, such as an electron or a photon, Max Planck realized that
oscillations occur not at all frequencies, but only at specific frequencies. This
implies that the energy carried by a particle can only take on discrete values and
may be expressed in terms of frequency f :
E = nh f , n = 0, 1, 2, 3, . . . . (3.14)
2 This hints at the well-known wave-particle duality, which will be described in Section 3.2.1.
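Equation (3.14) with n = 1 gives the energy carried by a single photon. The following is a small numeric sketch; the 500 nm wavelength is an illustrative choice, not a value from the text:

```python
# Energy of a single photon, E = h f with f = c / lambda (Eq. 3.14, n = 1).
h = 6.62607015e-34  # Planck's constant [J s]
c = 2.99792458e8    # speed of light [m/s]
e = 1.602176634e-19 # elementary charge [C], used to convert joules to eV

def photon_energy_ev(wavelength_m):
    return h * c / wavelength_m / e

print(photon_energy_ev(500e-9))  # green light, ~2.48 eV
```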
Thus, quantum theory derives its name from the quantization of energy levels.
λ = h/p, (3.16)
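Equation (3.16) can be evaluated numerically. A sketch for an electron accelerated through a potential difference V, so that p = sqrt(2 m e V); the 100 V value is an illustrative choice:

```python
import math

# De Broglie wavelength lambda = h / p (Eq. 3.16) of an electron that has been
# accelerated through V volts, giving it kinetic energy eV and p = sqrt(2 m e V).
h = 6.62607015e-34      # Planck's constant [J s]
m_e = 9.1093837015e-31  # electron mass [kg]
e = 1.602176634e-19     # elementary charge [C]

def de_broglie_electron(volts):
    p = math.sqrt(2 * m_e * e * volts)
    return h / p

print(de_broglie_electron(100) * 1e9)  # ~0.12 nm, comparable to atomic spacings
```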
In light nuclei, the number of protons and neutrons is often the same. Atoms of a given element whose nuclei contain different numbers of neutrons are called isotopes.
The number of electrons surrounding the nucleus determines the charge of the
atom. If the number of electrons equals the number of protons in the nucleus, then
the total charge is zero. If the number of electrons and protons is not matched, the
charge is non-zero, and the atom is called an ion. If the charge is positive due to
a lack of electrons, the atom is called a cation. An excess of electrons causes the
atom to be negatively charged, and the atom is then called an anion.
We consider an atom with only a single proton and electron, i.e., a hydrogen
atom. Since the electron is in orbit around the proton, it neither gains nor loses
energy. If this were not the case, the electron would gravitate towards the proton
or spiral away from the proton.
The orbits of electrons are for now assumed to be circular. The angular mo-
mentum L of an electron can then be derived from de Broglie’s postulate. The
length of the orbit is related to its radius r by a factor of 2 π . If an electron is not
gaining or losing energy, then, after each orbit, it will return to the same position.
Ascribing a wave to this electron, the circumference should be an integral number
of wavelengths, i.e.,
2π r = nλ, n = 1, 2, 3, . . . . (3.17)
2πr = nh/p = nh/(mv), n = 1, 2, 3, . . . . (3.18)
mvr = nh/(2π) = L, n = 1, 2, 3, . . . . (3.19)
Thus, the angular momentum is considered quantized and non-zero (because
n > 0). As an aside, the quantity h/(2π) occurs frequently; it has been given its
own symbol ħ = h/(2π). The expression for mvr allows us to solve for r, i.e., we can
compute the radius at which the electron is orbiting. However, the result depends
on its velocity, which is unknown.
To arrive at an expression for the radius, we proceed as follows. For a proton
with an electron in orbit, the Coulomb force is given by
F = e²/r². (3.20)
From Newton’s second law, we know that this equals m a. The acceleration as a
result of the Coulomb force can be computed from the change in velocity after
the electron has rotated around the proton once. The magnitude of the velocity is
given by
v = 2πr/T, (3.21)
where T is the time required for one orbit. Over a single orbit, the accumulated change in velocity has magnitude Δv = 2πv. The acceleration is then simply a = Δv/T = 2πv/T.
Using (3.21), this can be written as
a = v²/r. (3.22)
We now use (3.20) and find the following relation:
e²/r² = mv²/r. (3.23)
To eliminate the unknown velocity, we use the expression for angular momentum
L to solve for v:
v = nh/(2πmr). (3.24)
Substituting into (3.23), we find
r = n²h²/(4π²me²), n = 1, 2, 3, . . . . (3.25)
As a result of quantizing the angular momentum, we thus find that the distance
between the electron and the proton is quantized. Since all quantities are known,
the radius is a direct function of quantization level n. The resulting orbits are
called Bohr orbits.
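Equation (3.25) is written in Gaussian units; substituting e² → e²/(4πε₀) gives the SI form r = n²ε₀h²/(πme²). A quick numeric sketch (the constants and the unit conversion are additions, not from the text):

```python
import math

# Radii of the Bohr orbits, Eq. (3.25) rewritten in SI units:
# r = n^2 * eps0 * h^2 / (pi * m_e * e^2).
h = 6.62607015e-34      # Planck's constant [J s]
m_e = 9.1093837015e-31  # electron mass [kg]
e = 1.602176634e-19     # elementary charge [C]
eps0 = 8.8541878128e-12 # vacuum permittivity [F/m]

def bohr_radius(n):
    return n**2 * eps0 * h**2 / (math.pi * m_e * e**2)

for n in (1, 2, 3):
    print(f"n = {n}: r = {bohr_radius(n) / 1e-10:.3f} Angstrom")
# n = 1 yields ~0.529 Angstrom (the Bohr radius); the radii grow as n^2
```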
Given these orbits, we can now compute the energy of the electron in terms of
its orbit. The total energy of a hydrogen atom is given by
E = T + U = mv²/2 − e²/r (3.26a)
= p²/(2m) − e²/r. (3.26b)
Using the equations above, this may be restated as
E = e²/(2r) − e²/r = −e²/(2r). (3.27)
Substituting (3.25), the total energy associated with a hydrogen atom is given as
E = −(2π²me⁴/h²)(1/n²), n = 1, 2, 3, . . . . (3.28)
The ground state of an atom occurs for n = 1. In this state, the energy is most
negative, and the atom is therefore most stable. For larger values of n, we speak of
excited states. As the energy for excited states is less negative than for the ground
state, these states are also less stable. The energy associated with the ground
state is the negative of the ionization potential, which is the energy required to
remove an electron. Thus, the quantization of the angular momentum has led to
the quantization of possible energy states of the atom.
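Equation (3.28) can be evaluated directly. In SI units (substituting e⁴ → e⁴/(4πε₀)²) it reads Eₙ = −me⁴/(8ε₀²h²n²); a sketch, with constants added for illustration:

```python
# Hydrogen energy levels, Eq. (3.28) rewritten in SI units:
# E_n = -m_e * e^4 / (8 * eps0^2 * h^2 * n^2).
h = 6.62607015e-34      # Planck's constant [J s]
m_e = 9.1093837015e-31  # electron mass [kg]
e = 1.602176634e-19     # elementary charge [C]
eps0 = 8.8541878128e-12 # vacuum permittivity [F/m]

def energy_level(n):
    return -m_e * e**4 / (8 * eps0**2 * h**2 * n**2)  # [J]

# The ground state (n = 1) is the most negative, hence most stable; its
# magnitude is the ionization potential of hydrogen.
print(-energy_level(1) / e)  # ~13.6 eV
```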
Now, an electron may jump from one state to another, thereby changing the
total energy of the atom. If changing from state n1 to n2 , the corresponding change
in total energy is
ΔE = −(2π²me⁴/h²)(1/n2² − 1/n1²). (3.29)
If this energy change is due to a jump to a lower excited state, i.e., n2 < n1 , then
this energy must have been dissipated in some manner, for instance by emission of
a photon. The energy of the photon must equal the change in energy of the atom.
We can now determine the frequency f of the photon, given that its energy equals
ΔE = h f . Solving for f yields an expression for the frequency of the photon:
f = −(2π²me⁴/h³)(1/n2² − 1/n1²). (3.30)
With the frequency, as well as the medium through which the photon is traveling
known, the wavelength is also determined according to (2.63). As a result, the
change in energy state of an electron from a higher excited state to a lower excited
state has caused the emission of a photon with a wavelength determined by the
quantitative change in the atom’s energy. This change is quantized according to
the equations given above, and thus the color of the light emitted is quantized as
well.
Conversely, a photon may be absorbed by an electron. The extra energy im-
parted onto the electron is used to bring it to a higher excited state. This absorption
can only happen if the energy of the photon equals the amount of energy required
to make the electron jump between states of excitation. Thus, a photon of the right
wavelength may be absorbed, whereas photons of longer or shorter wavelengths
may pass through the atomic structure, or alternatively get deflected.
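The emission wavelengths implied by Equations (3.29) and (3.30) are easily tabulated. A sketch using the hydrogen ground-state energy of 13.6057 eV and the conversion λ[nm] = 1239.9/E[eV]; the choice of transitions is illustrative:

```python
# Wavelength of the photon emitted when a hydrogen electron drops from state n1
# to state n2 (n2 < n1), per Eqs. (3.29)-(3.30).
RYDBERG_EV = 13.6057  # magnitude of the hydrogen ground-state energy [eV]

def emission_wavelength_nm(n1, n2):
    delta_e = RYDBERG_EV * (1.0 / n2**2 - 1.0 / n1**2)  # energy released [eV]
    return 1239.9 / delta_e

# Transitions down to n = 2 (the Balmer series) fall in the visible range:
print(emission_wavelength_nm(3, 2))  # ~656 nm, red
print(emission_wavelength_nm(4, 2))  # ~486 nm, blue-green
```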
Figure 3.2. Energy diagram showing four different scales to represent energy.
λ ΔE = 1239.9, (3.31)
where λ is expressed in nanometers and ΔE in electron volts.
H = E = p²/(2m) + U. (3.34)
The equalities remain intact if all terms are multiplied by ψ . This yields the time-
independent Schrödinger equation:
H ψ = E ψ. (3.35)
p²ψ/(2m) = (E − U)ψ. (3.36)
To derive an expression for the Hamiltonian operator, in this case, we begin by
differentiating ψ twice:
∂²ψ/∂z² = −(4π²p²/h²) [A sin(2πpz/h) + B cos(2πpz/h)] (3.37a)
= −(4π²/h²) p²ψ. (3.37b)
3 This expression will be extended to three dimensions later in this section.
p²ψ/(2m) = −(h²/(8π²m)) ∂²ψ/∂z², (3.38)
and, therefore,
−(h²/(8π²m)) ∂²ψ/∂z² = (E − U)ψ. (3.39)
Thus, the Hamiltonian is now given in operator form as follows:
H = −(h²/(8π²m)) ∂²/∂z² + U. (3.40)
This operator may be extended to three dimensions by replacing the partial deriva-
tive in the z-direction with the Laplacian operator, as given in (A.19). The Hamil-
tonian operator is then formulated as
H = −(h²/(8π²m)) ∇² + U. (3.41)
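The operator form of Equation (3.40) lends itself to a simple numerical sketch: discretizing the second derivative on a grid turns H into a matrix whose eigenvalues are the allowed energies. The infinite square well and the dimensionless units below are illustrative assumptions, not a system treated in the text:

```python
import numpy as np

# One-dimensional Hamiltonian H = -(1/2) d^2/dz^2 + U in units hbar = m = 1,
# for an infinite square well of width L = 1 (U = 0 inside; the walls are
# enforced by the grid boundary). The second derivative is approximated by the
# standard second-difference stencil, giving a tridiagonal matrix.
N = 200                       # number of interior grid points
L = 1.0
dz = L / (N + 1)

main = np.full(N, 2.0)
off = np.full(N - 1, -1.0)
H = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / (2 * dz**2)

E = np.linalg.eigvalsh(H)     # eigenvalues = quantized energy levels, ascending
print(E[:3])
# The analytic levels are E_n = n^2 pi^2 / 2, so the ratios E_2/E_1 and E_3/E_1
# approach 4 and 9 — energies quantized as n^2, just as in the discussion above.
```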
The integer l is called the angular momentum quantum number, and its value,
describing the rotation of an electron around a nucleus, is strictly less than n.
This implies that for higher energy states, i.e., larger values of n, there are more
possible angular momentum states. The magnetic quantum number m can have
both positive and negative integral values between −l and l. It describes the
projection of the angular momentum of an electron onto an axis determined by an
external magnetic field [370]. The fourth and final quantum number is the spin quantum number, with values of either +1/2 or −1/2; it is a number describing the rotation of an electron around itself.
An atom has a nucleus surrounded by an electron cloud. The electrons are
arranged in orbits (also called shells), with each orbit containing between zero
l 0 1 2 3 4 5 ...
Symbol s p d f g h ...
Table 3.1. Symbols associated with values of the angular momentum quantum number l.
and two electrons. If there are two electrons in an orbit, then these will have
opposite spin.
By convention, orbital states are indicated with a value for n, followed by a
letter for l. The letter scheme is as indicated in Table 3.1. The first four letters
stand for sharp, principal, diffuse, and fundamental. The letters following these
four are arranged alphabetically [768]. Lowercase letters are used for single elec-
trons, whereas the total angular quantum number L of all the electrons in an atom,
taking values 0, 1, 2, 3, 4, 5, is given the uppercase letters S, P, D, F, G, H.
It is not possible for any two electrons in the same atom to have the same set
of quantum numbers. This is known as Pauli’s exclusion principle [750, 886].
Atoms are formed by filling orbitals in an order dictated by the lowest available
energy states.
The lowest energy orbit has principal quantum number 1 and has a single
atomic orbital, which is referred to as the s-orbital. If this orbital has one elec-
tron, then the electronic configuration corresponds to hydrogen (H), whereas the same orbital filled with two electrons corresponds to helium (He).
n  l  m   s            Orbital  Electrons
1  0   0  +1/2, −1/2   1s       2
2  0   0  +1/2, −1/2   2s       2
2  1  −1  +1/2, −1/2
2  1   0  +1/2, −1/2   2p       6
2  1   1  +1/2, −1/2
3  0   0  +1/2, −1/2   3s       2
3  1  −1  +1/2, −1/2
3  1   0  +1/2, −1/2   3p       6
3  1   1  +1/2, −1/2
3  2  −2  +1/2, −1/2
3  2  −1  +1/2, −1/2
3  2   0  +1/2, −1/2   3d       10
3  2   1  +1/2, −1/2
3  2   2  +1/2, −1/2
Table 3.2. Quantum numbers associated with the lowest electron orbitals.
Atom          Configuration     Shorthand
Lithium Li 1s2 2s1 [He] 2s1
Beryllium Be 1s2 2s2 [He] 2s2
Boron B 1s2 2s2 2p1 [He] 2s2 2p1
Carbon C 1s2 2s2 2p2 [He] 2s2 2p2
Nitrogen N 1s2 2s2 2p3 [He] 2s2 2p3
Oxygen O 1s2 2s2 2p4 [He] 2s2 2p4
Fluorine F 1s2 2s2 2p5 [He] 2s2 2p5
Neon Ne 1s2 2s2 2p6 [He] 2s2 2p6
Table 3.3. Electron configurations for atoms with electrons in the first two shells.
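The filling order described above can be sketched in code using the Madelung (n + l) rule, an assumption beyond the text, which only states that the lowest-energy states fill first. Exceptions such as chromium and copper, whose configurations deviate from this rule, are not reproduced:

```python
# Aufbau sketch: fill subshells in order of increasing n + l, ties broken by
# lower n (the Madelung rule — an assumed ordering, not derived in the text).
L_SYMBOLS = "spdfghi"

def electron_configuration(num_electrons):
    # All subshells (n, l) with l < n, sorted by the Madelung rule.
    subshells = sorted(((n, l) for n in range(1, 8) for l in range(n)),
                       key=lambda nl: (nl[0] + nl[1], nl[0]))
    parts, remaining = [], num_electrons
    for n, l in subshells:
        if remaining == 0:
            break
        capacity = 2 * (2 * l + 1)  # Pauli: 2 electrons per orbital, 2l+1 orbitals
        filled = min(capacity, remaining)
        parts.append(f"{n}{L_SYMBOLS[l]}{filled}")
        remaining -= filled
    return " ".join(parts)

print(electron_configuration(8))  # oxygen: 1s2 2s2 2p4, matching Table 3.3
```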
Table 3.4. Electron configurations for atoms with electrons in the first three shells (ex-
cluding d-orbitals).
Atom           Configuration                           Shorthand
Scandium Sc 1s2 2s2 2p6 3s2 3p6 3d1 4s2 [Ar] 3d1 4s2
Titanium Ti 1s2 2s2 2p6 3s2 3p6 3d2 4s2 [Ar] 3d2 4s2
Vanadium V 1s2 2s2 2p6 3s2 3p6 3d3 4s2 [Ar] 3d3 4s2
Chromium Cr 1s2 2s2 2p6 3s2 3p6 3d5 4s1 [Ar] 3d5 4s1
Manganese Mn 1s2 2s2 2p6 3s2 3p6 3d5 4s2 [Ar] 3d5 4s2
Iron Fe 1s2 2s2 2p6 3s2 3p6 3d6 4s2 [Ar] 3d6 4s2
Cobalt Co 1s2 2s2 2p6 3s2 3p6 3d7 4s2 [Ar] 3d7 4s2
Nickel Ni 1s2 2s2 2p6 3s2 3p6 3d8 4s2 [Ar] 3d8 4s2
Copper Cu 1s2 2s2 2p6 3s2 3p6 3d10 4s1 [Ar] 3d10 4s1
Zinc Zn 1s2 2s2 2p6 3s2 3p6 3d10 4s2 [Ar] 3d10 4s2
Table 3.5. Electron configurations for the transition metals scandium through zinc.
Helium He 1s2
Neon Ne [He] 2s2 2p6
Argon Ar [Ne] 3s2 3p6
Krypton Kr [Ar] 3d10 4s2 4p6
Xenon Xe [Kr] 4d10 5s2 5p6
Figure 3.3. The periodic table of elements.
Figure 3.4. A schematic of s-, p-, and d-orbitals. Note that this is a representation of a
probability density function, and that therefore the boundary is not sharp in reality, nor is
the shape accurate.
Figure 3.5. Overlapping p-orbitals form σ-bonds and π-bonds.
The three 2p-orbitals also form bonds, see Figure 3.5. In particular, the px -
orbitals of two atoms may directly overlap, forming a strong 2pσ -bond or 2pσ ∗ -
anti-bond. The two 2py - and 2pz -orbitals can also bond, forming weaker π -bonds
and π ∗ -anti-bonds, dependent on whether the wave-functions have identical or
opposite signs.
Molecules can bond more than once, i.e., double or triple bonding can occur.
For instance, oxygen (O2 ) produces a double bond—one 2pσ -bond due to the px -
orbital and a 2pπ -bond due to either the py - or pz -orbital. Nitrogen N2 , which has
a 2s2 2p3 configuration, forms a triple bond involving px -, py -, and pz -orbitals, as
shown in Figure 3.6.
Each bond between atoms creates an energy level lower than the sum of the
energies of the orbitals participating in the bond. Similarly, anti-bonding
involves a higher energy level than two unbonded orbitals. We may therefore
expect molecules, whose orbitals bond with orbitals in other atoms, to
absorb photons with energies different from those absorbed by single atoms. Thus, much of the
richness and variety of color in substances is due to the formation of bonds in
molecules, or to the presence of atoms and ions inside a crystal lattice consisting
of different types of atoms.
Figure 3.7. The strong ionic bonding between Na+ and Cl− causes absorption at wavelengths outside the visible range. As a result, kitchen salt is colorless.
of the bonds in this cubic form of carbon, resulting from the pairing of electrons,
causes diamond to be colorless.
A second example is given by emerald. The shape of the crystal lattice is the
same, albeit somewhat larger. The impurities are also formed by Cr3+ ions. This
results in an energy diagram which is essentially the same, although the upper
energy levels are closer to the ground state. This shifts the absorption spectrum
towards red, yielding the green color with a hint of blue typically seen in emeralds.
The impurities that give ruby and emerald their color (Cr2 O3 in this case)
are called transition metal impurities. The resulting colors are called allochromatic
transition metal colors. Minerals may be colored with chromium, cobalt,
iron, manganese, nickel, samarium, and vanadium impurities, and, in general, the
transition metals listed in Table 3.5.
If the transition element is also the main compound, we speak of idiochro-
matic transition metal colors. Elements involved in idiochromatic transition met-
als are chromium, manganese, iron, cobalt, nickel, and copper. Green, for in-
stance, can be formed in several ways in crystals, including chrome green
(Cr2 O3 ), manganese (II) oxide (MnO), melantite (Fe2 O3 ), cobalt (II) oxide
(CoO), bunsenite (NiO), and malachite (Cu2 (CO3 )(OH)2 ). An example of a red
colorization is rhodochrosite (MnCO3 ), whereas blue can be formed for instance
by Thenard’s blue (Al2 CoO4 ), azurite (Cu3 (CO3 )2 (OH)2 ), and turquoise
(CuAl6 (PO4 )(OH)8 ) [815].
Here, an electron has moved from the iron ion to the titanium ion. The difference
in energy between the two states is 2.11 eV. Thus, absorption of photons with en-
ergies of 2.11 eV is possible, resulting in a deep blue color. Corundum containing
both titanium and iron is known as blue sapphire.
The crystal structure of corundum is such that the above charge transfer occurs
for ions that are 2.65 Å apart.4 However, the crystal lattice also allows Ti and Fe
ions at a distance of 2.79 Å, resulting in a weaker second absorption which creates
a blue-green color. These two configurations occur at right angles, which means
4 The Ångstrom is a measure of distance, and equals one tenth of a nanometer.
that the blue and blue-green colors are seen dependent on the orientation of the
crystal. Blue sapphire thus exhibits dichroism.
Other examples of colors caused by charge transfer include many pigments,
such as Prussian blue, as well as yellow, brown, and black colors seen in rocks
and minerals that contain iron.
Figure 3.8. Morse curve showing energy as a function of distance. The inter-atomic
distance is normally measured in Ångstrom, which is one-tenth of a nanometer (see also
Figure 2.4).
Figure 3.8 shows the energy of the ensemble of two hydrogen atoms as func-
tion of the distance between the two atoms. The top curve is for two electrons hav-
ing parallel spin, whereas the bottom curve is for electrons having opposite spin.
These types of plots are called Morse curves or potential energy diagrams. For
hydrogen with electrons of opposite spin, the distance between atoms is 0.74 Å.
The two molecular orbitals of an H2 molecule can accommodate the same total
number of electrons as the two atomic orbitals of two separate hydrogen atoms.
Extending this reasoning to N hydrogen atoms, the number of orbitals and energy
levels grows to N. These energy levels will be very close together (and depend
on the inter-atomic distance between each pair of hydrogen atoms), such that
individual levels cannot be distinguished. Thus, a large number of hydrogen
atoms in close proximity creates an energy band.
Typically, electrons in partially filled outer orbitals form bands, whereas fully
filled inner orbitals are not involved in bands unless their inter-atomic distance
becomes very small. Unfilled shells can form bands at much larger distances, as
shown in Figure 3.9.
For some materials, two or more of the outer orbitals, such as the 2s- and 2p-
orbitals of lithium, each broaden into two bands [781]. For metals, these bands
may overlap, causing interactions between electrons in different bands. This aids
conductivity in metals.
Figure 3.9. Energy as a function of inter-atomic distance d, showing the bands formed by filled inner shells, partly filled valence orbitals, and empty outer orbitals.
The band structure of metals has the effect that light is strongly absorbed at
many different wavelengths. Refracted light usually travels into the metal for
less than a single wavelength (several hundred atoms), before it is completely
absorbed. This absorption creates a current through the metal near the surface,
which in turn produces new photons which appear as reflected light. As a conse-
quence, metals strongly reflect light.
Within the band structure, some transitions in energy levels are favored over
others. While the precise mechanism is currently poorly understood [815], the
limitations of state transitions give rise to a wavelength-dependent reflectivity as
a function of the atomic structure of different metals. For instance, silver reflects
light over a broader range of wavelengths than gold, which reflects less light at
shorter wavelengths.
In non-conducting materials, bands formed by different orbitals may repel,
as in the case of the 2s- and 2p-electrons in diamond. This causes a gap to occur
between the 2s- and 2p-bands, which for diamond are well separated at the carbon
bond distance of 1.54 Å. The lower energy band is called the valence band and
the upper band the conduction band. As the conduction band is at a higher energy
level than the most energetic electrons in the material,5 the conduction band is
empty, and diamond therefore acts as an insulator [815].
The different atomic structures of materials thus give rise to different interac-
tions between bands, as well as different sizes of the energy gap between bands. If
the bands overlap in energy level, then we speak of metals. If there is a large gap
between bands, such as the 5.4 eV band gap of pure diamond, the material is an
insulator. If the gap is relatively small, then we speak of semiconductors—a class
of materials that plays a crucial role in the manufacture of transistors and chips.
The energy, and therefore the wavelength, of light that can be absorbed is
dependent on the size of the band gap. Light needs to have sufficient energy to
excite an electron such that it can jump from the valence band into the conduction
band. For diamond the band gap is so large that no visible light can be absorbed
and, hence, the diamond appears colorless.
For a medium band-gap semiconductor, such as the pigment cadmium yellow
(CdS), the band gap is 2.6 eV, and this is small enough to absorb some light in the
violet and blue range. The pigment vermillion (HgS) has a yet smaller band gap
of 2.0 eV, allowing the absorption of most visible wavelengths except red. Band
gaps below 1.77 eV allow absorption of all visible wavelengths, and materials
with small band gaps therefore appear black.
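The band-gap values quoted above can be converted into cutoff wavelengths with the relation λ ΔE = 1239.9 of Equation (3.31). A sketch using the gaps given in the text:

```python
# Cutoff wavelength for absorption across a band gap, via lambda * E = 1239.9
# (Eq. 3.31, lambda in nm and E in eV). Photons with wavelengths shorter than
# the cutoff carry enough energy to lift an electron across the gap.
band_gaps_ev = {
    "diamond (C)": 5.4,           # cutoff deep in the UV: colorless
    "cadmium yellow (CdS)": 2.6,  # absorbs violet and blue: yellow
    "vermillion (HgS)": 2.0,      # absorbs all but red: red
}

def cutoff_nm(gap_ev):
    return 1239.9 / gap_ev

for name, gap in band_gaps_ev.items():
    print(f"{name}: absorbs wavelengths below {cutoff_nm(gap):.0f} nm")
```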
It is possible to add impurities to pure semiconductors. This involves replac-
ing a small amount of semiconductor material with a different substance. Such
5 The energy of the most energetic electrons in a material is called the Fermi energy or Fermi level.
impurities are called activators or dopants. For instance, adding a small concen-
tration of nitrogen (N) to diamond will introduce extra electrons that form a new
energy level within the band gap, called a donor level. As a result, the band gap is
effectively split into two smaller gaps, and now light in the blue and violet range
can be absorbed. Thus, diamond containing some nitrogen impurities will have a
yellow color.
Similarly, impurities may be added that create an electron deficiency, and
therefore they may insert an acceptor level within the band gap. Wherever
an electron deficiency exists, we speak of a hole. An example is boron,
which, when added to diamond, causes a blue coloration.
3.4 Molecules
Molecules consist of collections of atoms that form bonds. Collections of atoms in
a single molecule can be of a particular form and can occur in different molecules.
Such collections are called groups. Some groups form bonds with energy levels
enabling the absorption of photons within the visible range of wavelengths. In
that case, the group is said to be color-bearing.
Other groups may bring about shifts in intensity and/or wavelength. Groups
that increase or decrease intensity are referred to as hyperchromic and hypo-
chromic, respectively. Substances that aid in color shifts are called bathochromic
if the shift is towards the red end of the spectrum and hypsochromic if the shift is
towards the blue end.
In (organic) molecules, color-bearing groups of atoms are called chromo-
phores. Examples of chromophores include carbon-carbon double bonds as well
as nitrogen double bonds. The molecules containing such chromophores are
called chromogens, although they frequently have other groups as well that are
involved in color formation. These extra groups are called auxochromes [815,
1128]. The auxochromes shift as well as strengthen the color, i.e., they are both
bathochromic and usually hyperchromic. Auxochromes include NH2 , NO2 , and
CH3 structures as well as NHR, and NR2 , where R is a more complex organic
group.
Several important classes of organic molecules exist, and these are briefly
discussed in the following sections.
3.4.1 Porphyrins
Carbon atoms often form double bonds, yielding an absorption spectrum that
peaks outside the visible range. However, chains of carbon atoms that alternate
Table 3.7. Peak absorption wavelength as a function of the number of conjugated double
bonds (after [1099]).
between single and double bonds create absorption spectra that are shifted toward
the visible range. Such chains are called conjugated double bonds. For instance,
a chain of six such double bonds reaches the blue end of the spectrum. Molecules
containing such a group therefore appear yellow. Examples are α- and β-carotene,
which appear in carrots. The
number of conjugated double bonds determines where the peak in the absorption
spectrum lies. The π -electrons in such chains are mobile across the chain. Adding
further double bonds increases their mobility and, thereby, the number of resonant
configurations. This lowers the energy level of the lowest excited level and, thus,
shifts the absorption spectrum further into the visible range [767]. Table 3.7 lists
the peak absorption wavelength for several aldehydes.
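The relation between chain length and absorption peak can be sketched with a free-electron ("particle in a box") model, a textbook approximation that is not developed in this chapter. The 1.39 Å bond length and the box length are assumptions; the model overestimates the wavelengths for polyenes, but it reproduces the trend of Table 3.7: more conjugated double bonds shift the peak toward longer wavelengths.

```python
import math

# Free-electron sketch for a chain of k conjugated double bonds: the 2k mobile
# pi-electrons fill box levels n = 1..k, and the lowest absorption is the
# n = k -> k + 1 jump.
h = 6.62607015e-34      # Planck's constant [J s]
m_e = 9.1093837015e-31  # electron mass [kg]
c = 2.99792458e8        # speed of light [m/s]

def femo_peak_nm(k, bond_length=1.39e-10):
    box = (2 * k + 1) * bond_length  # assumed: box spans 2k+1 bond lengths
    delta_e = h**2 * (2 * k + 1) / (8 * m_e * box**2)  # E_{k+1} - E_k
    return h * c / delta_e / 1e-9    # wavelength in nm

for k in (3, 6, 9):
    print(f"{k} conjugated double bonds: peak near {femo_peak_nm(k):.0f} nm")
```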
Conjugated bonds also appear in more elaborate molecular structures, for in-
stance, in ones where bonds circle a single metal atom. Such structures are called
porphyrins, with chlorophyll being one of the most important naturally occurring
ones, giving plants and algae their green color (Figure 1.1). The chlorophyll
molecule has a magnesium atom at its center.
Iron is the center of a different porphyrin, called heme, which in many species
is part of the hemoglobin molecule. This molecule is responsible for transporting
oxygen through the bloodstream. Heme is responsible for the color of blood. How-
ever, this color is derived from π to π ∗ transitions, rather than from the presence
of an iron atom.
In particular, some molecules turn colorless when illuminated, a process called
bleaching.
The most important photochromic molecules are the ones that transduce light
in the visual system. For instance, the molecules present in the rods, called
rhodopsin, have a red-purple color, changing to colorless under the influence of
light. Under illumination, rhodopsin transforms via several intermediate forms,
to a range of bleached molecules called metarhodopsins. In the dark, these mole-
cules revert back to rhodopsin, through a range of intermediate forms.
The chromophore in rhodopsin is called 11-cis-retinal. It is connected to an
opsin through the amino acid lysine. The opsin acts as an auxochrome and causes
a bathochromic shift, moving the peak of the absorption spectrum of 11-cis-
retinal from 370 nm to 570 nm.
Under medium light intensities, the incidence of a photon causes the 11-cis-
retinal to change to the trans form, called all-trans-retinal. The reaction that ef-
fects this change also causes a chain of events that leads to the rod emitting a
signal, thereby mediating vision. The configuration of all-trans-retinal connected
to the opsin via lysine is called metarhodopsin.
In cones, the retinal is believed to be connected to different opsins, which
cause different bathochromic shifts. This creates different peak sensitivities in the
three cone types, thereby enabling color vision.
Under low light levels, the trans-retinal may become fully dislodged from
its opsin and will be transformed back to its cis form by means of enzymes lo-
cated in the eye. Under yet lower light conditions, the trans-retinal may leave
the eye altogether. It is then carried through the bloodstream to the liver, where
it is regenerated into the cis-form. This complex sequence of events under low
light conditions causes dark adaptation to progress slowly over a long period of
time [1128] (see Section 4.3.2).
Another application of a photochromic material is found in sunglasses that
change to a darker state under UV illumination. Hence, outside in the sun, they
will be darker than indoors.
3.4.3 Colorants
Color may be given to objects by means of colorants, which is the general term
used for substances that produce color. A distinction is normally made between
pigments and dyes, as seen in the definition of pigments offered by the Color
Pigments Manufacturers Association (CPMA) [678]:
Pigments are colored, black, white, or fluorescent particulate organic or inorganic solids, usually insoluble in, and essentially physically and chemically unaffected by, the vehicle or substrate in which they are in-
corporated. They alter appearance by selective absorption and/or by scatter-
ing of light.
Pigments are usually dispersed in vehicles or substrates for application,
as for instance in inks, paints, plastics or other polymeric materials. Pigments
retain a crystal or particulate structure throughout the coloration process.
As a result of the physical and chemical characteristics of pigments, pig-
ments and dyes differ in their application; when a dye is applied, it penetrates
the substrate in a soluble form after which it may or may not become insolu-
ble. When a pigment is used to color or opacify a substrate, the finely divided
insoluble solid remains throughout the coloration process.
If after application the substrate evaporates, leaving only the pigment particles
behind, we speak of an ink; an example is watercolor. In other cases, the substrate
may harden and form a protective coating, as for instance in oil paints.
Dyes, on the other hand, are usually dissolved in a solvent [1247]. The dye
molecules do not stay together, but mix throughout the solvent. These molecules
absorb light selectively, which gives the solution its color. In between dyes and
pigments is a third category of colorants, called lakes. These are essentially pig-
ments in which the color of the particles is created by dyeing. Before dyeing,
the particles, which are typically made of aluminium oxide, are colorless and
translucent.
Almost all colorants can be synthetically produced, with the exception of
chlorophyll, which is used to color, for instance, soap and chewing gum green.
For yellow, orange, and red colorants, a single absorption band in the blue-violet-
green region is required. A single absorption band in the red-orange-yellow region
creates blue and violet colors. However, to produce green, two absorption bands
in the red and violet parts of the spectrum are required, which presents a difficult
problem for designers of organic colorants [815]. Using a mixture of yellow and
blue colorants tends to produce less saturated colors than a single green colorant
and may cause a change in color when one of the two colorants fades faster than
the other.
There are many ways to specify a colorant. One is the color index, which
is produced by the Society of Dyers and Colourists and the American Associ-
ation of Textile Chemists and Colorists, and contains some 8000 dyes and pig-
ments [1067]. In addition, pigments and dyes can have many designations, given
by different associations and agencies. For instance, color index C.I. 14700, which
was formerly used to color food, is also known as D & C Red No 4 by the U.S.
Food and Drug Administration, as well as Food Red 1 and Ponceau SX [815].
Some substances are neither dyes nor pigments, such as the substances used to
color glasses, glazes, and enamels.
3.4.4 Dyes
As dye molecules are fully dissolved, the interaction of light with dyes is rela-
tively straightforward and follows Beer’s Law (see (2.191)). This means that the
transmittance of a material depends directly on its concentration of dye molecules.
Similarly, the amount of light absorbed depends on the thickness of the layer, as-
suming that the dye is applied in layers.
If a dye is constructed by mixing multiple types of dye, then the wavelength-
dependent transmittance of the mixtures can be inferred from the transmittance
functions Ti of each dye i separately by simply multiplying the transmittance func-
tions together for each wavelength:
Tmix(λ) = ∏_{i=1}^{n} Ti(λ). (3.45)
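As a numerical illustration of (3.45), the following sketch multiplies two transmittance spectra per wavelength; it assumes NumPy, and the two dye spectra are invented for illustration rather than measured data.

```python
import numpy as np

# Wavelength samples (nm) and illustrative (not measured) transmittance
# functions Ti(lambda) for two hypothetical dyes.
wavelengths = np.arange(400, 701, 10)
t_yellow = 1.0 - 0.8 * np.exp(-((wavelengths - 450.0) / 40.0) ** 2)  # absorbs blue
t_cyan = 1.0 - 0.7 * np.exp(-((wavelengths - 620.0) / 50.0) ** 2)    # absorbs red

# Equation (3.45): the mixture transmits, at each wavelength, the product
# of the individual transmittances.
t_mix = t_yellow * t_cyan

# A mixture can never transmit more than its least transmissive component.
assert np.all(t_mix <= np.minimum(t_yellow, t_cyan) + 1e-12)
```

This also shows why mixed colorants tend to look darker and less saturated than a single colorant with the same hue: every additional factor can only reduce the transmitted light.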
Changing the concentration of a colorant may alter its perceived color. Several
mechanisms contribute to this, including scattering, which for instance causes
the white head on a glass of beer. A second mechanism is dye dichroism,6 as
illustrated in Figure 3.10. The two curves in this figure show absorption in the
same region, which lies partly in the visible range. At low concentration, the dye
absorbs only a small amount in the violet region, causing a yellow color. As the
concentration is increased, a more significant portion of the visible spectrum is
absorbed, including blue and green, yielding an orange to red color. This effect would not occur
Figure 3.10. Absorption spectra of a dichroic dye at low and high concentrations; the absorption band lies only partly within the visible range (wavelengths 200–650 nm).
6 See Section 3.7.
Figure 3.11. Undiluted yellow food coloring, containing E102, E110, and E124, is red
(left), whereas diluting this substance with water produces a yellow color (right).
if the absorption spectrum were entirely located in the visible region. Some yel-
low food colorings show this dependency of color on concentration, as shown in
Figure 3.11. Dichroic dyes find application in the construction of certain LCD
display devices [537].
A related phenomenon is outlined in Figure 3.12, where all absorption takes
place in the visible spectrum. At low concentrations, the dent around 600 nm
causes the dye to look yellow. Increasing the concentration makes this dent rela-
tively unimportant, as yellow wavelengths are now mostly absorbed. As a result,
the lack of absorption in the violet range causes the material to appear violet.
Assume that a surface is painted with a dye where the chromogens are dis-
solved at a concentration c, and that the layer has a thickness of h. The painted
layer can then be modeled using Beer’s law (Section 2.9.4). Dyes can thus be
characterized by their thickness as well as the concentration of their chromogens.
As dyes do not have a particulate structure, more complex phenomena, such as
scattering, do not need to be taken into account. This is in contrast to pigmented
paints, which are discussed in Section 3.4.6.
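The dependence of a dye layer's transmittance on concentration c and thickness h can be sketched directly from Beer's law. The absorptivity spectrum eps(λ) below is hypothetical; only the functional form follows from the text.

```python
import numpy as np

# Hypothetical molar absorptivity spectrum eps(lambda) of a dye.
wavelengths = np.arange(400, 701, 10)
eps = 2000.0 * np.exp(-((wavelengths - 430.0) / 35.0) ** 2)  # L/(mol cm)

def transmittance(c, h):
    """Beer's law: T(lambda) = 10^(-eps(lambda) * c * h) for a layer of
    thickness h (cm) with dye concentration c (mol/L)."""
    return 10.0 ** (-eps * c * h)

# Concentration and thickness enter only through their product: doubling
# one while halving the other leaves the transmittance unchanged.
t1 = transmittance(c=1e-4, h=2.0)
t2 = transmittance(c=2e-4, h=1.0)
assert np.allclose(t1, t2)
```

This symmetry between c and h is what allows dyes to be characterized by either quantity interchangeably.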
3.4.5 Bleaching
The chromogens of organic dyes can be split by chemical reactions. The active
ingredients involved in these reactions are usually either chlorine or hydrogen
Figure 3.12. Absorption spectra of a dye at low and high concentrations, with all absorption located within the visible range (wavelengths 350–800 nm).
peroxide—substances used to bleach fabrics, paper, and hair. In each case, the
reaction breaks the π component of a double bond, leaving only a single σ-bond,
or it may even break both bonds.
As a result, the reaction causes a hypsochromic shift towards the blue end of
the spectrum and may move the peak of the absorption spectrum into the ultravi-
olet region. As the peak of absorption is then outside the visible region, the result
of chemical bleaching is a loss of color.
3.4.6 Pigments
As pigments are insoluble particles, their index of refraction is usually different
from the material they are suspended in. As a result, pigments can be used both to
give color to a material and to make the material more opaque. To maximize the
hiding power of a pigment, the difference between the index of refraction of the
pigment and its carrier should be as large as possible. This then creates particles
that mostly reflect and scatter light, rather than transmit it. Pigments in white
paints, for instance, usually have a refractive index greater than 2.0.
A paint usually contains a carrier, such as linseed oil in the case of artists’ oil
paints, with pigment particles suspended in it. A full account of light interaction
with a layer of paint would involve scattering, which could be modeled as Mie
scattering. However, this approach would be too complicated [629], and a simpler
theory is required to model how light is absorbed and scattered in a pigmented
paint. In this case, Beer’s law is not appropriate either, as this empirical law does
not assume that scattering occurs in the medium.
Instead, Kubelka-Munk theory is normally applied to describe light behavior
in a layer of paint [628]. Light traveling through such a layer has at every point
Figure 3.13. A layer of paint of thickness x on a substrate with reflectance Rg; light travels through the layer both downward (L↓) and upward (L↑), and each sublayer of thickness dx absorbs and scatters it.
a downward-traveling component L↓ and an upward-traveling component L↑. Both
absorption (coefficient K) and scattering (coefficient S) attenuate the light; for
light traveling downward through a sublayer of thickness dx, the attenuation is

(K + S) L↓ dx. (3.46)

The attenuation induced by the same sublayer for light traveling upward is
similarly

(K + S) L↑ dx. (3.47)

Since we assume that scattered light always continues in the opposite direction,
each direction also gains the light scattered out of the other, so the attenuation in
each direction is modified by −S L dx. Hence, the total change for light traveling
in either direction due to a sublayer of thickness dx is

dL↓ = (K + S) L↓ dx − S L↑ dx, (3.48a)
−dL↑ = (K + S) L↑ dx − S L↓ dx. (3.48b)
This pair of equations can be solved as follows [412]. First, we rearrange terms
and substitute a = 1 + K/S:
dL↓/(S dx) = a L↓ − L↑, (3.49a)
−dL↑/(S dx) = a L↑ − L↓. (3.49b)
Using the quotient rule and making the substitution r = L↑ /L↓ , we get
dr/(S dx) = r² − 2ar + 1. (3.51)
Rearranging and integrating yields
∫ dr/(r² − 2ar + 1) = S ∫ dx. (3.52)
For a thickness of 0, the value of r will be equal to the reflectance of the un-
derlying material, Rg . For thicknesses larger than 0, the reflectance of the paint
plus its substrate is R. Hence, the integration limits of the left-hand side of the
above equation are Rg and R, allowing us to carry out the integration. Using
b = √(a² − 1), we have
∫_{Rg}^{R} dr/(r² − 2ar + 1) = 1/(2b) ∫_{Rg}^{R} dr/(r − (a + b)) − 1/(2b) ∫_{Rg}^{R} dr/(r − (a − b)) (3.53a)
= 1/(2b) ln [ (R − a − b)(Rg − a + b) / ((R − a + b)(Rg − a − b)) ], (3.53b)
= St, (3.53c)
where t is the thickness of the layer of paint. For a layer of paint with a hypo-
thetical thickness of infinity, the hiding of the substrate is complete. In this case,
the value of Rg can be set to any convenient value, such as Rg = 0. A further
rearrangement gives
lim_{t→∞} (R − a − b)(−a + b) / exp(2Stb) = (R − a + b)(−a − b), (3.54a)
0 = (R − a + b)(−a − b). (3.54b)
Hence, for complete hiding, the reflectance of a paint is given by
R = 1/(a + √(a² − 1)) (3.55a)
  = 1/(1 + K/S + √((1 + K/S)² − 1)). (3.55b)
This last equation is the Kubelka-Munk equation, which expresses the reflectance
of a layer of paint given absorption and scattering coefficients, K and S, respec-
tively. Equivalent forms of this equation are
R = 1 + K/S − √((K/S)² + 2(K/S)), (3.56a)
K/S = (1 − R)²/(2R). (3.56b)
These equations are valid for pigmented layers of paint where the pigment is
dispersed homogeneously throughout the carrier and the pigment particles them-
selves all have equal density.
It is assumed here that both absorption and scattering coefficients are wave-
length-dependent functions. Their absolute units are unimportant, as only their
ratio is ever used. This simplifies measuring K and S values substantially. Con-
sider, for instance, that the spectral reflectance function R(λ ) of a paint is given.
It is then possible to simply set S(λ ) = 1, and compute K(λ ) using the Kubelka-
Munk equation (3.56b).
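The forward and inverse forms (3.56a,b) can be checked against each other numerically; a minimal sketch, assuming NumPy:

```python
import numpy as np

def k_over_s(R):
    """Equation (3.56b): the ratio K/S from complete-hiding reflectance R."""
    return (1.0 - R) ** 2 / (2.0 * R)

def reflectance(ks):
    """Equation (3.56a): complete-hiding reflectance from the ratio K/S."""
    return 1.0 + ks - np.sqrt(ks ** 2 + 2.0 * ks)

# The two forms are inverses of each other for reflectances in (0, 1].
R = np.array([0.05, 0.2, 0.5, 0.9])
assert np.allclose(reflectance(k_over_s(R)), R)
```

Applied per wavelength to a measured R(λ) with S(λ) = 1, `k_over_s` is exactly the computation of K(λ) described above.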
A paint containing several different pigments with absorption and scattering
coefficients Ki (λ ) and Si (λ ) yields absorption and scattering coefficients KM and
SM for the mixture as follows:
KM(λ) = ∑_{i=1}^{n} Ki(λ) ci, (3.57a)
SM(λ) = ∑_{i=1}^{n} Si(λ) ci. (3.57b)
Here, ci is the concentration of the ith pigment. Once the K and S functions
for a given pigment are known, the K and S values for paints containing this
pigment can be derived on the basis of the Kubelka-Munk equation and the linear
superposition of the known and unknown pigments.
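Combining the concentration-weighted sums (3.57a,b) with equation (3.56a) gives the complete-hiding reflectance of a mixture. In the sketch below, the pigment spectra and concentrations are invented for illustration, with scattering normalized to S = 1 as the text suggests.

```python
import numpy as np

# Hypothetical K and S spectra for two pigments, sampled at three
# wavelengths; not measured pigment data.
K = np.array([[0.02, 0.05, 0.04],   # a white pigment: low absorption
              [3.00, 0.80, 0.05]])  # a colored pigment
S = np.ones_like(K)
c = np.array([0.9, 0.1])            # pigment concentrations

# Equations (3.57a,b): concentration-weighted sums give the mixture's K, S.
K_mix = (c[:, None] * K).sum(axis=0)
S_mix = (c[:, None] * S).sum(axis=0)

# Equation (3.56a): reflectance of the mixture at complete hiding.
ks = K_mix / S_mix
R_mix = 1.0 + ks - np.sqrt(ks ** 2 + 2.0 * ks)
assert np.all((R_mix > 0.0) & (R_mix < 1.0))
```

Because K and S mix linearly while R depends nonlinearly on K/S, the reflectance of a mixture is not a linear blend of the component reflectances, which is why pigment mixtures must be predicted through K and S rather than through R directly.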
One of the areas where the Kubelka-Munk theory is useful is in predicting the
difference between organic and inorganic paint mixtures. A mixture of white with
an inorganic pigment tends to appear more gray than a mixture of white with an
organic pigment at comparable concentrations. Haase and Meyer show an exam-
ple where a mixture of titanium dioxide (white) with cadmium red (inorganic) is
compared with a mixture of titanium dioxide with naphthol red (organic) [412].
Both reds appear similar when applied directly. However, when mixed with tita-
nium dioxide, the cadmium red looks more gray.
The Kubelka-Munk theory has several applications, including color match-
ing. In this case, color matching refers to the process of taking a given color and
mixing a paint to match that color. It also has uses in computer graphics. The
discussion of both applications is deferred until Section 8.13, because the appli-
cations require knowledge of color spaces and color difference metrics, which are
not explained until Chapter 8.
Figure 3.14. The three normal modes of vibration of water molecules (after [815]).
Figure 3.15. The pale blue color of water can be observed against sand, as seen here in
this aerial shot of a coastline; Grand Bahama, 2004.
Figure 3.16. The silt in this glacier-fed lake colors the water turquoise; Alberta, Canada,
July 2001. (Image courtesy of Greg Ward.)
als, the index of refraction is high, for instance 2.0. The amount of light reflected
depends on the ratio of the two indices of refraction. Going from air to dry porous
material in our example, this ratio would be 1.0 / 2.0. Wet material on the other
hand is coated with a thin layer of water with an index of refraction of around
1.33. The above ratio is then reduced to 1.0/1.33, and therefore less light is re-
flected (Figure 3.18) [716].
For non-porous materials, water forms a thicker layer. Light penetrating this
layer is partially reflected by the underlying material. This reflected light under-
Figure 3.18. Water seeping down this rock changes the appearance of the material.
goes further reflections and refractions at the water-to-air boundary. Some total
internal reflection occurs, so that less light emerges [668]. A computational model
to synthesize imagery of wet materials was described by Jensen et al. [548].
3.4.8 Glass
Glass is formed by melting and cooling. The cooling occurs sufficiently rapidly
so that crystallization does not occur. The material is said to be in a glassy or
vitreous state. Different types of glass are used for different applications. For
instance, mercury vapor lamp jackets are made of fused silica (SiO2 ). Ordinary
crown glass (soda-lime) contains SiO2 as well as Na2O and CaO and is used
for window panes and bottles. Ovenware is made from borosilicate crown and
contains SiO2 , B2 O3 , Na2 O, and Al2 O3 . Finally, crystal tableware is made out of
flint, containing SiO2 , PbO, and K2 O.
Typical crown glass has a green tinge, as can be observed by looking at a
window pane from the side, or, as shown in Figure 3.19, by observing the side
of a block of glass. This colorization is due to the use of iron in the manufac-
ture of glass. It is possible to minimize the amount of green ferrous iron (Fe2+ )
by means of chemical decolorizing or physical decolorizing. Adding some man-
ganese oxide (MnO2 ), also known as glassmaker’s soap, achieves both. First,
physical decolorizing occurs because manganese oxide has a purple color, offsetting
the green of the ferrous iron. Second, chemical decolorizing is achieved by the
following reaction:

Figure 3.19. The top of this block of glass shows a green tinge.
Figure 3.20. Colored glass in stained-glass windows; Rennes, France, June 2005.
If an object is a perfect absorber, i.e., all incident radiation is absorbed, then the
fraction αλ equals 1 for all wavelengths λ . Such an object appears black and is
7 This is a radiometric term and will be explained along with other radiometric terms in Chapter 6.
Incident light
Stack of razorblades
Figure 3.21. The geometry of a stack of razor blades is such that it can absorb most light,
and thus simulate a blackbody.
8 This demonstration was given by Alain Fournier at his keynote speech for the 6th Eurographics
Figure 3.23. If the light is incident at a shallow angle, and the viewpoint is also chosen at
a shallow angle, the grooves may reflect rather than absorb.
Figure 3.24. The stack of razor blades from Figure 3.22 is heated with a butane-fueled torch.
λmax T = hc/(4.965114 kB) (3.60a)
       = 2.897791 × 10⁻³ m·K, (3.60b)
Figure 3.25. The emission spectra Iλ of blackbody radiators heated to temperatures from 4000 K to 9000 K, plotted according to Planck's radiation law; the red curve traces Wien's displacement law.
The spectral radiant exitance Meλ as a function of wavelength and temperature is:

Meλ = c μeλ / 4 (3.62a)
    = (2π hc²/λ⁵) [exp(hc/(λ kB T)) − 1]⁻¹. (3.62b)
This function is plotted for various values of T in Figure 3.25. The same figure
also shows in red the curve determined by Wien’s displacement law. Note that the
range of wavelengths shown is much larger than the visible range. Within the vis-
ible range, each of the curves is relatively flat, with the exception of very high and
very low temperatures. This means that for intermediate temperatures, blackbod-
ies radiate a broad spectrum, thereby appearing predominantly white with perhaps
a tinge of blue for higher temperatures and red for lower temperatures.
Planck’s radiation law for blackbody radiators gives a spectral distribution of
energy with its peak located at λmax . Wien’s displacement law predicts that this
peak shifts if the temperature is increased. Heating an object not only increases
the amount of light that is emitted, but also changes the apparent color. It is
therefore natural to refer to a given spectral distribution of a blackbody radiator
by its color temperature, which is given in kelvin.
The area under each of the curves in Figure 3.25 has special significance and
represents the total radiant exitance of the blackbody radiator. The radiant exi-
tance Me can be computed by integrating (3.62) over all wavelengths:
Me = ∫₀^∞ Meλ dλ = σ T⁴. (3.63)
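These relations can be checked numerically. The sketch below (assuming NumPy; the CODATA constants are standard values) evaluates (3.62b), integrates it over wavelength to recover σT⁴, and locates the peak predicted by Wien's displacement law (3.60).

```python
import numpy as np

h = 6.62607015e-34      # Planck constant (J s)
c = 2.99792458e8        # speed of light (m/s)
kB = 1.380649e-23       # Boltzmann constant (J/K)
sigma = 5.670374419e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)

def spectral_exitance(lam, T):
    """Spectral radiant exitance in W/m^3, equation (3.62b)."""
    return (2.0 * np.pi * h * c**2 / lam**5
            / (np.exp(h * c / (lam * kB * T)) - 1.0))

T = 6000.0
lam = np.linspace(50e-9, 50e-6, 200_000)  # 50 nm to 50 um
M = spectral_exitance(lam, T)

# Trapezoidal integration over wavelength approximates sigma * T^4 (3.63).
Me = np.sum(0.5 * (M[1:] + M[:-1]) * np.diff(lam))
assert abs(Me - sigma * T**4) / (sigma * T**4) < 0.01

# The spectral peak sits at lambda_max = 2.8978e-3 / T (Wien), ~483 nm here.
lam_peak = lam[np.argmax(M)]
assert abs(lam_peak - 2.897791e-3 / T) < 1e-9
```

At 6000 K, close to the sun's effective temperature, the peak falls near the middle of the visible range, consistent with the broad, roughly white appearance discussed above.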
3.5.2 Incandescence
Objects that are heated radiate energy. Some of the energy is radiated in the vis-
ible spectrum, dependent on the object’s temperature. Blackbodies are a special
case of radiating bodies, in that they absorb all incident light. In the more gen-
eral case, any radiation occurring as a result of heating objects is referred to as
incandescence.
Candles, fires, tungsten filaments, hot coal, and iron are all examples of in-
candescent light sources. The heat of a candle flame, for instance, melts the wax
which subsequently flows upward in the wick by means of capillary action and is
then vaporized. Such a flame emits yellow light near the top and blue light near
the bottom.
The vaporized wax is the fuel that can react with oxygen. The center of the
flame contains a limited amount of oxygen, which diffuses in from outside the
flame. The reaction in the center occurs at around 800°C to 1000°C [815]:
Near the outside of the flame, where more oxygen is available, further reactions
occur at higher temperatures between 1000°C and 1200°C, yielding yellow
light:
Figure 3.26. The flame of a lighter. The lower part of the flame contains a pre-mixed
combination of air and fuel, causing total combustion which leads to a blue flame. The
fuel in the top part of the flame is only partially combusted, leading to soot as well as a
yellow flame.
Thus, the above reactions produce energy in the form of light, as well as water and
carbon dioxide. A candle flame is classified as a diffusion flame, because oxygen
diffuses inward before it reacts with the fuel.
Near the bottom of the flame, a different situation arises, since before ignition
the fuel is mixed with oxygen. Such premixed flames appear blue and burn cleanly
since no soot particles are formed in the process.
Both premixed and diffusion flames occur in candle light, but they are gener-
ally easier to observe in the flame of a cigarette lighter, as shown in Figure 3.26.
The lower temperatures in the center of a flame due to a lack of oxygen in the
flame’s interior can be demonstrated by lowering a wire mesh into the flame. The
flame will burn underneath the mesh, but not above it. The mesh therefore allows
us to observe a cross-section of the flame, as shown in Figure 3.27. The dark
center is easily observed through the mesh.
Figure 3.27. The emission of light from an open flame occurs near the surface of the
flame. The interior lacks sufficient oxygen to burn. That a flame is hollow is demonstrated
here by lowering a mesh into the flame of a candle, revealing a dark center. The flame
emits light underneath the mesh, but not above.
Other forms of incandescent light are found in pyrotechnics, for instance when
magnesium powder is burned. This reaction produces very high temperatures,
leading to very bright white light, which may be colored by adding nitrates, chlo-
rates, or perchlorates. This reaction is also used in flares and tracer bullets. A final
example of incandescence is the photographic flash bulb, which contains shredded
zirconium metal and oxygen. On ignition, this produces molten zirconium and
zirconium oxide, which radiate at a color temperature of around 4000°C [815].
Incandescent light is characterized by a lack of polarization. In addition, the
phase of each emitted electromagnetic wave is random. Light is generally emitted
equally in all directions. The temperature of an incandescent light source
determines its color, similar to blackbody radiation. With increasing temperature,
the radiation progresses from black to red, orange, yellow, white, and finally blue-white.
Figure 3.30. Line spectra of sodium, neon, and xenon. Note that the reproduced colors are approximate. (These spectra were generated using an online applet provided by John Talbot, available from http://www.physik.rwth-aachen.de/~harm/aixphysik/atom/discharge/index1.html.)
Figure 3.31. Energy levels of sodium. Transitions back toward the 3S1/2 ground state emit at the wavelengths indicated; the doublet of levels near 2.03 eV and 2.05 eV is responsible for the characteristic yellow emission.
used are helium, krypton, and xenon, because they either do not produce sufficient
intensity or are too costly. These gases produce yellow, pale lavender, and blue
light, respectively. Further colors may be produced by adding phosphor powders
or by using phosphorescent glass tubes.
Sodium vapor lamps operate on a similar principle, although sodium is a solid
metal at room temperature. To produce sodium vapor, neon is added which pro-
duces the initial discharge, as well as heat.9 The heat then vaporizes the sodium
(Na) which is then ionized due to collisions with electrons:
Na → Na+ + e− . (3.68)
Figure 3.32. Sodium street lights emit near-monochromatic light, making it difficult to
distinguish object reflectance; Zaragoza, Spain, November 2006.
9 A sodium vapor lamp emits pink light for the first couple of minutes after being switched on as a
The ion may later recombine with an electron and thereby return to a lower excited
energy level, before returning to a yet lower level and emitting a photon:

Na⁺ + e⁻ → Na∗ → Na + photon. (3.69)
There are several different energy states the sodium may reach, as shown in Fig-
ure 3.31. Many of the energy states are outside the visible range and, therefore,
produce radiation that is not detected by the human eye. Around half of the light
emitted is due to a doublet of states at 2.03 eV and 2.05 eV, which both correspond
to yellow light (λ ≈ 589.6 and 589.0 nm). The strong yellow color makes sodium
vapor lamps less suitable for a range of applications, as it is difficult to determine
the color of objects under sodium lighting. Sodium lamps are predominantly used
in street lighting (Figure 3.32 and Section 9.3).
Mercury vapor lamps produce a bluish overall spectrum by a mechanism similar
to that of sodium vapor lamps. Their principal emission lines in the visible region
occur at around 405, 436, 546, and 578 nm. The color of mercury vapor lamps
can be improved by coating the glass envelope with a phosphor. Such phosphor-
coated mercury vapor lamps are in wide use today.
3.5.4 Fluorescence
In most cases, when a photon is absorbed by an atom, an electron jumps to a
higher state and immediately returns to its ground state, whereby a new photon is
emitted of the same wavelength as the photon that was absorbed. Some materials,
however, allow electrons to return to their ground state via one or more intermedi-
ate energy states. This creates state transitions during which photons of different
wavelengths are emitted. The wavelength of the emitted photon is determined by
the difference in energy levels of the two states involved in the transition. This
means that new photons are emitted at longer wavelengths, a phenomenon known
as Stokes’ law. The process of absorption and re-emission at a different wave-
length is called fluorescence.
As indicated in the preceding section, the inside of a mercury vapor lamp may
be coated with a phosphor, creating a fluorescent light source. An example of a
fluorescent light source is shown in Figure 3.33. The mercury vapor emits light,
predominantly in the ultraviolet region (253.7 nm), which excites the phosphor.
The phosphor in turn re-radiates light at a longer wavelength. The color of a
fluorescent tube is determined by the type of phosphor used. For instance, mag-
nesium tungstate creates an emission spectrum with a maximum emittance at 480
nm. Maximum emittance at longer wavelengths is possible with, for instance,
Figure 3.34. Relative spectral power distributions for CIE F2, F7, and F11 fluorescent light sources (data from [509]).
Figure 3.35. This set-up demonstrates frequency up-conversion. The small transparent
panel is coated with phosphors that emit at blue wavelengths when subjected to near-
infrared light. (Demonstration kindly provided by Janet Milliez, School of Optics, Uni-
versity of Central Florida.)
Figure 3.36. Red, green, and blue phosphors shown in bright and dim variants. The
substrate was opaque. (Demonstration kindly provided by Janet Milliez, School of Optics,
University of Central Florida.)
3.5.5 Phosphorescence
For some substances, electrons may be excited into a higher energy state, and
they may alternate between two metastable states before returning to the ground
state. These alternate states may be reached by further absorption of radiant en-
ergy. If the eventual return to the ground state emits a photon, it will be at a
later time than the initial absorption. Materials exhibiting this behavior are called
phosphorescent.
If the source of radiation that caused the transition between states is switched
off, there will be a delay before all atoms in the substance have returned to their
ground states. The decay time can be measured by noting the time necessary
before the emitted radiant power is reduced to a fraction e⁻¹ of its original
value.
Fluorescent materials generally have a very short decay time, on the order of
10⁻⁸ seconds. Phosphorescent materials have decay times that range from 10⁻³
seconds to many days [1262]. The decrease in energy is typically exponential.
However, when an electron absorbs enough energy to ionize the atom, the electron
will follow a power-law decay. The radiance L emitted by a phosphor as a function
of time t may be modeled with an empirical formula [675]:

L = 1 / [ b ( 1/√(L0 b) + t )² ], (3.70)

where L0 is the radiance at time t = 0 and b is a constant.
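As a check on (3.70), the formula reduces to L = L0 at t = 0 and decays monotonically thereafter; a short sketch with invented values for L0 and b:

```python
import numpy as np

def phosphor_radiance(t, L0, b):
    """Equation (3.70): empirical power-law decay of phosphor radiance."""
    return 1.0 / (b * (1.0 / np.sqrt(L0 * b) + t) ** 2)

L0, b = 100.0, 0.5  # illustrative constants, not measured phosphor data
assert np.isclose(phosphor_radiance(0.0, L0, b), L0)  # starts at L0

t = np.linspace(0.0, 10.0, 50)
L = phosphor_radiance(t, L0, b)
assert np.all(np.diff(L) < 0.0)  # radiance decays monotonically
```

For large t, the expression behaves as 1/(b t²), the power-law tail that distinguishes this regime from simple exponential decay.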
Figure 3.37. The hands of this pocket watch are painted with a phosphorescent paint.
Before this photograph was taken, the watch was lit for around a minute by a D65 daylight
simulator. The photograph was taken in low light conditions to show the phosphorescent
behavior of the paint.
Figure 3.38. Red, green, and blue dots of a phosphorescent cathode ray tube. At a dis-
tance, the human visual system fuses the separate dots and the result is perceived as a
uniform white.
3.5.6 Luminescence
Fluorescence and phosphorescence discussed in Sections 3.5.4 and 3.5.5 are two
examples of a general principle, called luminescence. The term luminescence
refers to all production of light that is not caused by thermal radiation. This leaves
many forms of luminescence [815]:
Figure 3.39. Glow sticks emit light by chemiluminescence. The yellow, blue and green
glow sticks are enclosed in clear plastic holders, while the pink color occurs in part due to
the red plastic of the tube.
The principle used by LEDs can be reversed, such that light absorbed at a p-n
junction creates a voltage, a principle used by solar panels (Figure 3.41).
Figure 3.41. Solar panels use the same semiconductor technology as LEDs, albeit in
reverse by absorbing light and thus producing an electric current.
3.5.9 Lasers
As we have seen in Section 3.5.4, an atom may absorb energy in the form of
a photon, causing an electron to jump to a higher energy level, or equivalently,
a different orbital. When the electron jumps back to its ground state, possibly
via one or more intermediate energy levels, new photons may be emitted, lead-
ing to fluorescence. If energy is added to the atom in the form of kinetic en-
ergy (heat), then electrons also jump to higher energy levels. Upon return to the
ground state, emission of photons in the visible range may also occur, leading to
incandescence.
We will assume that there is a population of N1 atoms at a given lower energy
level E1 , and a population of N2 atoms in a higher (excited) energy level E2 . In
a material with vast quantities of atoms, the relaxation to lower energy levels
occurs at an exponential rate, determined by the material itself. With t indicating
time, and γ2 the spontaneous decay rate, the size of the population of atoms at the
upper energy level decays as

N2(t) = N2(0) exp(−γ2 t). (3.71)
The reciprocal of γ2 is given by τ2 = 1/γ2 and signifies the lifetime of the upper
level E2 . The above decay rate is independent of which lower energy levels are
reached. The decay rate from level E2 to level E1 can be modeled with a similar
equation, except that γ2 is replaced by a different constant γ21 . For convenience,
we may write this equation as a differential equation, which models the sponta-
neous emission rate:
dN2(t)/dt = −γ21 N2(t). (3.72)
So far we have not discussed how atoms end up in a higher energy state, other
than by absorbing photons of the right wavelengths. Normally, this happens
irregularly, depending on the temperature of the material and the amount of
incident light. However, it is possible to force atoms into higher energy levels,
for instance, by flashing the material with a short pulse of light that is tuned to the
transition frequency of interest. This process is called pumping.
After such a short pulse is applied, the population of atoms in the higher en-
ergy state can be modeled by the following atomic rate equation:
dN2(t)/dt = K n(t) N1(t). (3.73)
Here, n(t) is the photon density of the applied signal measured in number of
photons per unit volume, or equivalently as the electromagnetic energy density
divided by h̄ ω . This is the process of stimulated absorption, i.e., the application
of a short pulse of light forces atoms that are initially in the lower energy state E1
into the higher energy state E2 .
Similarly, the same pulse forces atoms that are initially in the higher energy
state E2 into the lower energy state E1 , thereby emitting photons. This process is
called stimulated emission and is modeled by the following atomic rate equation:
dN2(t)/dt = −K n(t) N2(t). (3.74)
Note that the constant of proportionality K in the above two equations is identical
but appears with opposite sign. This constant is largest if the applied pulse of light
has the same energy (frequency/wavelength) as the transition energy between
levels E1 and E2 of the material.
Combining spontaneous emission (3.72), stimulated absorption (3.73), and
stimulated emission (3.74) gives the net rate of change of the upper-level
population:

dN2(t)/dt = K n(t) (N1(t) − N2(t)) − γ21 N2(t). (3.75)
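The rate equation (3.75) can be integrated numerically; the sketch below uses forward Euler with invented values for K, n, and γ21. It also illustrates the point made next in the text: optical pumping between two levels alone never drives N2 above N1.

```python
# Forward-Euler integration of the two-level rate equation (3.75).
# K, gamma21, and the photon density n are illustrative values only.
K, gamma21, n = 1.0, 0.5, 2.0
N1, N2 = 1.0, 0.0         # all atoms start in the lower level E1
dt = 1e-4
for _ in range(200_000):  # integrate to t = 20 (arbitrary units)
    dN2 = (K * n * (N1 - N2) - gamma21 * N2) * dt
    N2 += dN2
    N1 -= dN2             # the total population N1 + N2 is conserved
# At steady state the right-hand side of (3.75) vanishes ...
assert abs(K * n * (N1 - N2) - gamma21 * N2) < 1e-6
# ... and no population inversion is reached: N2 stays below N1.
assert N2 < N1
```

However strong the pumping term K n is made, the steady state of (3.75) satisfies N2 < N1, which is why practical lasers rely on three- or four-level schemes to obtain an inversion.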
Stimulated emission produces photons with energies that are able to force
stimulated absorption or emission in other atoms. As a result, stimulated emission
causes amplification of the initial pulse of light (hence the acronym laser, which
stands for light amplification by stimulated emission of radiation). Whether a
population of atoms, i.e., a volume of material, amplifies light or absorbs light
depends on the ratio of atoms in the lower and higher energy states, i.e., N2 /N1 .
In essence, if we can force a majority of the population of atoms into the higher
energy state, then the volume of material will act as an amplifier; otherwise, the
material will absorb energy.
This can also be seen from (3.75), where the difference between N1 and N2
is on the right-hand side of the equation. If this difference is positive, then the
material absorbs energy. If this difference is negative, i.e., when the material is in
a condition known as a population inversion, then more state transitions are from
E2 to E1 , leading to emission of additional photons which cause further stimulated
emissions. Under this condition, the initial pulse of light is amplified.
For a material that is in thermal equilibrium at temperature T , the ratio N2 /N1
adheres to a fundamental law of thermodynamics, known as Boltzmann’s
Principle:
N2 / N1 = exp(−(E2 − E1) / (k T)). (3.76)
This ratio can also be expressed as a difference:
N1 − N2 = N1 (1 − exp(−h̄ω / (k T))). (3.77)
This equation shows that for a material in thermal equilibrium, the majority of the
population is in the lower energy level, and therefore light is absorbed. Thus, to
create laser light, a pumping mechanism may be used to move the majority of the
population into a non-equilibrium state.
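To get a feeling for the magnitudes involved in (3.76), the following sketch computes the equilibrium ratio N2/N1 for an optical transition; the 633 nm wavelength and 300 K temperature are illustrative values only:

```python
import math

# Physical constants (SI units)
k_B = 1.380649e-23   # Boltzmann constant, J/K
h = 6.62607015e-34   # Planck constant, J*s
c = 2.99792458e8     # speed of light, m/s

def boltzmann_ratio(wavelength_nm, temperature_K):
    """Equilibrium population ratio N2/N1 from (3.76), for a transition
    whose energy gap E2 - E1 corresponds to the given wavelength."""
    delta_E = h * c / (wavelength_nm * 1e-9)  # E2 - E1 in joules
    return math.exp(-delta_E / (k_B * temperature_K))

# For a visible transition (633 nm) at room temperature, essentially
# the entire population sits in the lower level E1.
ratio = boltzmann_ratio(633.0, 300.0)
```

The ratio is vanishingly small at room temperature, which is why a pumping mechanism is needed to achieve a population inversion.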
When such a pumping process is applied, each atom may react to it by chang-
ing energy level. Since each atom responds in the same manner, the resulting
amplification is coherent in both phase and amplitude. Thus, the output of a laser
is an amplified reproduction of the input (pumping) signal—at least for a narrow
band around the wavelength associated with the transition energy E2 − E1 . How-
ever, some noise will be introduced due to spontaneous emission and absorption.
A laser medium, which is the collection of atoms, molecules, or ions that will
mediate amplification.
A pumping process, which will excite the laser medium into a higher energy level,
causing a population inversion.
Optical elements, which reflect a beam of light so that it passes through the laser
medium one or more times. If the beam of light passes through once, we
speak of a laser amplifier; otherwise, of a laser oscillator.
Lasers emit light at a narrow band of wavelengths, and therefore produce near
monochromatic light. In addition, as outlined above, this light is typically coher-
ent, i.e., in phase and amplitude, making laser light very different from most light
sources normally encountered. Most lasers are set to one specific wavelength, al-
though high-end lasers can be made to be tunable. For such lasers, it is possible to
dial in the wavelength at which they will emit monochromatic light (Figure 3.42).
distance away from the nucleus. As a result, the negative charge of the electrons
is slightly offset with respect to the positive charge of the nucleus. An atom under
the influence of an electromagnetic field may thus form an electric dipole. The
atom is said to be polarized under these conditions.
Some dielectrics are known as polar substances, where the molecular struc-
ture is such that dipoles exist even in the absence of an external electromagnetic
field. This is for instance the case with water. However, for polar substances,
the orientation of each molecule is random, and therefore the net polarization of
the material in bulk is zero. When an electric field is applied, the forces acting
upon the molecules will align them with the field, and the material will become
polarized. This phenomenon is called orientational polarization.
Some materials consist of ions that are bound together. As an example,
sodium chloride NaCl consists of positive and negative ions. Application of an
external electric field will separate the ions and thus form electric dipoles. The
result is called ionic polarization.
A polarized atom may be modeled as a positive point charge +Q and a neg-
ative point charge −Q separated by a distance r. The dipole moment p is then
defined as
p = Q r. (3.78)
It would be difficult to model the interaction of an electromagnetic wave with a
material if the dipole moment of each individual atom or molecule had to be
considered. It is more convenient to compute the average behavior of the material
at a macroscopic scale. The electric polarization P of a material is then given
by
P = N Q ra = N p, (3.79)
where N is the number of atoms per unit volume and ra is the average separa-
tion between positive and negative charges in atoms. It is now possible to model
the electric displacement D as an additive quantity rather than a multiplicative
quantity as shown in (2.5):
D = E + 4 π P. (3.80)
For suitably weak fields, the electric polarization can be taken to be proportional
to the magnitude of the electric field, i.e., P ∝ E. The constant of proportionality
is then called the dielectric susceptibility η :
P = η E. (3.81)
In this case, by comparing (2.5) and (3.81), we find a relation between the dielec-
tric susceptibility and the material constant ε :
ε = 1 + 4 π η. (3.82)
p = α E′. (3.84)
The constant α is called the mean polarizability. With N atoms per unit volume,
as given in (3.79), the electric polarization can be rewritten as
P = N p = N α E′. (3.85)
3.6.2 Dispersion
Newton’s famous experiment, in which light is refracted through a prism to reveal
a set of rainbow colors, cannot be explained with the machinery described so far.
In this experiment, the different wavelengths impinging upon a dispersing prism
are refracted by different amounts. This suggests that the index of refraction is
not a simple constant related to the permittivity and permeability of the material,
but that it is dependent on wavelength λ (or equivalently, angular frequency ω ).
It is indeed possible to derive a wavelength-dependent model for the index
of refraction to account for the wavelength-dependent nature of certain dielectric
materials.
We begin by assuming that electrons move with a velocity much smaller than
the speed of light. The Lorentz force acting on electrons, as given in (2.9), then
simplifies, since the second term may be neglected. With mass m, the position r
of the electron, as a function of the forces acting upon it, is modeled by the
equation of motion:
m ∂²r/∂t² + q r = e E′. (3.91)
If the electric field acting upon the electron is given by
N α = N e² / (m (ω0² − ω²)). (3.95)
We can now use the Lorentz-Lorenz formula (3.89) to relate the index of refrac-
tion to the angular frequency:
(n² − 1) / (n² + 2) = 4π N e² / (3 m (ω0² − ω²)). (3.96)
It can be seen that the right-hand side of this equation tends to infinity for angular
frequencies ω approaching the resonance frequency ω0 . This is a consequence of
the formulation of the motion equation (3.91), which does not include a damping
term that models a resisting force. If this term were added, the position of an
electron would be modeled by
m ∂²r/∂t² + g ∂r/∂t + q r = e E′, (3.97)
and the solution would be
r = e E′ / (m (ω0² − ω²) − i ω g). (3.98)
Figure 3.43. Dispersion through a prism. The slit is visible towards the left. The dispersed
light is projected onto a sphere for the purpose of visualization.
Figure 3.44. Dispersion through a prism. The light source is a Gretag Macbeth D65
daylight simulator, which produces light similar to that seen under an overcast sky.
Figure 3.45. A Gretag Macbeth D65 daylight simulator lights two sheets of polarizing
material from behind. Wedged between the two sheets are two pieces of crumpled cello-
phane.
Figure 3.47. Two polarizing sheets are aligned. The wedged pieces of cellophane produce
complementary colors to those shown in Figure 3.46.
3.7 Dichroism
There exist at least three phenomena which are each called dichroism. In this
section, we briefly describe each for the purpose of clarifying the differences. In
each case, dichroism refers to the ability of a material or substance to attain two
different colors. However, the mechanism by which this two-coloredness occurs
is different.
First, dichroism can be related to polarization. In Section 2.3, we described
how birefringence may occur in certain minerals. In that case, the index of re-
fraction was dependent on the polarization of light. For certain materials, the
wavelength composition of transmitted light may depend on polarization as well.
Ruby crystals, for instance, have octahedral structures. If such crystals are lit by
polarized light, it is possible to change the color of the ruby from purple-red to
orange-red.
The absorption of polarized light by rods in certain species of fish has been
shown to differ by the angle of polarization [248]. Thus, at least some fish retinas
show dichroism of this form, particularly if the retina is unbleached.
A second effect occurs in other minerals, where the color of the transmitted
light is determined by the angle at which light strikes the stone. As such, the color
of transmitted light depends on the angle it makes with the optical axis of the ma-
terial. This effect is also known as dichroism, but it should be distinguished from
the case where the wavelength composition of transmitted light is dependent on
polarization. Iolite, a relatively common gem stone, exhibits this form of dichro-
ism, as shown in Figure 3.48. Some materials show different colors along three
or more optical axes, yielding an effect known as pleochroism.
Thirdly, dye dichroism refers to the different colors that dye solutions may
obtain as a function of the concentration of the dye, as discussed in Section 3.4.3.
Finally, dichroic filters and dichroic mirrors are based on interference effects.
Such devices selectively reflect part of the visible spectrum while transmitting the
remainder. They are built by coating a glass substrate with very thin layers of
Figure 3.48. The stone in this pendant is made of iolite, a dichroic material. As a result,
viewing this stone from different angles reveals a different set of colors; in this case the
color varies between purple and yellowish-gray.
ferent aspects of the fluid/gas. For flames, viscosity is unimportant, and therefore
the equations for incompressible flow are appropriate.
Consider a velocity field in which each discrete grid position holds a velocity u;
incompressibility can then be enforced by ensuring that the divergence is zero:
∇·u = 0. (3.102)
Then, at time t the velocities are computed from the velocities of the previous
time step t − 1 using:
ut = −(ut−1 · ∇) ut−1 − ∇p/ρ + f, (3.103)
where p is the pressure, ρ is the density, and f models any external forces affecting
the flow. These quantities are defined at each grid point, and may or may not
themselves be updated for each time step. The above two equations together
are termed the Navier-Stokes equations for incompressible flow. Solutions to
these equations can be obtained, for instance, using semi-Lagrangian methods [307,
1079].
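A single semi-Lagrangian advection step of the kind used by these solvers can be sketched as follows; the uniform grid, clamped backtrace, and bilinear sampling are illustrative implementation choices:

```python
import numpy as np

def advect(q, u, v, dt, dx):
    """One semi-Lagrangian advection step: for each grid point, trace the
    velocity field backwards over dt and bilinearly sample q at the
    departure point. q, u, v are (ny, nx) arrays; dx is the grid spacing."""
    ny, nx = q.shape
    jj, ii = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    # Departure points in grid coordinates, clamped to the domain.
    x = np.clip(ii - dt * u / dx, 0.0, nx - 1.001)
    y = np.clip(jj - dt * v / dx, 0.0, ny - 1.001)
    i0, j0 = x.astype(int), y.astype(int)
    fx, fy = x - i0, y - j0
    # Bilinear interpolation of q at the departure points.
    return ((1 - fx) * (1 - fy) * q[j0, i0] + fx * (1 - fy) * q[j0, i0 + 1] +
            (1 - fx) * fy * q[j0 + 1, i0] + fx * fy * q[j0 + 1, i0 + 1])
```

Because values are interpolated rather than extrapolated, this update remains stable for large time steps, which is the main attraction of the semi-Lagrangian approach.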
Unfortunately, the gas expansion discussed above cannot be taken into ac-
count with this model (it would require the more complex equations for com-
pressible flow). However, the Navier-Stokes equations for incompressible flow
can still be used, for instance by modeling the reaction zone where fuel is burnt as
a thin surface in 3D space, rather than a volume [741]. Recall from Section 3.5.2
that a deflagration requires oxygen which diffuses in from the surrounding air,
leaving a core within the flame where no reaction occurs. Thus, the reaction zone
is reasonably well modeled with an implicit surface.
which fuel is burned (a fuel-dependent property [1155]), the surface area AS of the
blue core can be estimated using
vf Af = S AS, (3.104)
where vf is the speed of fuel injection. Thus, the size of the blue core depends on
the speed of injection in this model.
The reactions occurring in the flame, including the blue core, cause expansion
which is normally modeled as the ratio of densities ρ f /ρh , where ρ f is the density
of the fuel, and ρh is the density of the hot gaseous products. Across the dynamic
implicit surface modeling the thin flame, the expansion of the gas is modeled by
requiring conservation of mass and momentum, leading to the following equations:
ρf (Vf − D) = ρh (Vh − D), (3.105a)
ρf (Vf − D)² + pf = ρh (Vh − D)² + ph. (3.105b)
Here, pf and ph are pressures, and Vf and Vh are normal velocities of the fuel and
hot gaseous products. The speed of the implicit surface in its normal direction is
given by
D = V f − S. (3.106)
For the special case of solid fuels, the pressure will be zero; with the density and
normal velocity of the solid fuel given by ρs and Vs , we have
ρf (Vf − D) = ρs (Vs − D), (3.107)
and therefore
Vf = Vs + (ρs/ρf − 1) S. (3.108)
ρf
Hence, the velocity of the gasified fuel in its normal direction at the boundary
between its solid and gaseous state is related to the velocity of the solid fuel plus
an expansion-related correction.
The dynamic implicit surface indicating the boundary of the blue core moves
at each time step with a velocity given by w:
w = u f + S n, (3.109)
where the velocity of the gaseous fuel is u f and S n is a term describing the veloc-
ity induced by the conversion of solid fuel to gas. At each grid point, the surface
normal n is determined by
n = ∇φ / |∇φ|, (3.110)
where φ defines the region of the blue core. Its value is positive where fuel is
available, zero at the boundary, and negative elsewhere. The motion of the implicit
surface φ is then defined as
φt = −w · ∇ φ . (3.111)
This equation can be solved at each grid point, after which an occasional condi-
tioning step is inserted to keep the implicit surface φ well-behaved [835, 1027,
1146].
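On a discrete grid, (3.110) is typically evaluated with finite differences; a minimal sketch using central differences:

```python
import numpy as np

def surface_normal(phi, dx):
    """Grid evaluation of n = grad(phi) / |grad(phi)| from (3.110), using
    central differences on a uniform grid of spacing dx. Returns the x and
    y components of the normal field."""
    gy, gx = np.gradient(phi, dx)        # gradients along rows (y), columns (x)
    mag = np.hypot(gx, gy)
    mag = np.where(mag > 0.0, mag, 1.0)  # guard against flat regions of phi
    return gx / mag, gy / mag
```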
The implicit surface indicates the boundary between the fuel and hot gaseous
products. These two types of gas have their own velocity fields, u f and uh , which
are both independently updated with the Navier-Stokes equations above. As these
velocity fields are valid on opposite sides of the implicit surface, at this boundary
care must be taken to update the velocities appropriately. For instance, when
computing velocities uh for hot gaseous products near the implicit surface, values
on the other side of the boundary must be interpolated with the aid of the normal
velocity of the fuel V f , which is computed with
V f = u f · n. (3.112)
The corresponding normal velocity for the hot gaseous products can be computed
with
VhG = Vf + (ρf/ρh − 1) S. (3.113)
The superscript G indicates that this is a ghost value, i.e., it is extrapolated for
a region where no hot gaseous fuel is present. The velocity of the hot gaseous
products on the opposite side of the implicit surface can now be computed with
uGh = VhG n + (uf − (uf · n) n). (3.114)
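The ghost-value construction of (3.112)-(3.114) replaces the normal component of the fuel velocity while keeping its tangential part. A minimal sketch for a single grid point:

```python
import numpy as np

def ghost_velocity(u_f, n, S, rho_f, rho_h):
    """Ghost velocity of the hot gaseous products on the fuel side of the
    implicit surface, per (3.112)-(3.114): V_f = u_f . n, the ghost normal
    velocity is V_f + (rho_f/rho_h - 1) S, and the tangential part of u_f
    is carried over unchanged."""
    V_f = float(np.dot(u_f, n))              # (3.112)
    V_h = V_f + (rho_f / rho_h - 1.0) * S    # (3.113)
    return V_h * n + (u_f - V_f * n)         # (3.114)
```

Note that when the densities are equal there is no expansion, and the ghost velocity reduces to the fuel velocity itself.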
fT = α (T − Tair ) z, (3.115)
fC = ε h (nω × ω ) . (3.116)
ω = ∇ × u, (3.117a)
nω = ∇|ω| / |∇|ω||. (3.117b)
As fuel is converted from solid to gas, ignites, and moves through the blue
core out into the area where the flame is categorized as a diffusion flame (see
Section 3.5.2), the temperature follows a profile which rises at first and then cools
down again. To model this both over space and over time, it is useful to track fuel
particles during their life cycle, as follows:
Yt = − (u · ∇) Y − 1. (3.118)
This equation is solved once more with the aid of semi-Lagrangian fluid solvers.
With suitably chosen boundary conditions, the value 1−Y can be taken to indicate
the time elapsed since a particle left the blue core. The temperature T is then
linearly interpolated between the ignition temperature Tignition and the maximum
temperature Tmax , which occurs some time after leaving the blue core. In the
temperature fall-off region, with the help of a cooling constant cT , the temperature
can be computed with [835]
Tt = −(u · ∇) T − cT ((T − Tair) / (Tmax − Tair))⁴. (3.119)
Finally, the density ρt at time step t is computed similarly, using
ρt = − (u · ∇) ρ . (3.120)
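The temperature profile described above can be sketched per fuel particle; the rise time t_rise below is an assumed parameter (the text only states that Tmax occurs some time after leaving the blue core), and the fall-off integrates (3.119) in the particle frame, where the advection term vanishes:

```python
def temperature_along_path(t, T_ignition, T_max, T_air, t_rise, c_T, dt=1e-3):
    """Temperature of a fuel particle t seconds after leaving the blue core:
    a linear rise from T_ignition to T_max over t_rise (an assumed
    parameter), followed by the cooling law of (3.119) integrated in the
    particle frame with forward Euler steps."""
    if t <= t_rise:
        return T_ignition + (T_max - T_ignition) * t / t_rise
    T, elapsed = T_max, t_rise
    while elapsed < t:
        T -= dt * c_T * ((T - T_air) / (T_max - T_air)) ** 4
        elapsed += dt
    return T
```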
Figure 3.49. A rendering of a campfire using the techniques discussed in this section.
(Image courtesy of Henrik Wann Jensen.)
p(Θ, Θ′) = (1 − g²) / (4π (1 + g² − 2g Θ · Θ′)^1.5). (3.121)
This function is then used in (2.234), which models in-scattering. Accounting for
in-scattering, out-scattering, as well as light emitted along each ray segment, we
use (2.233). However, light emission for modeling flames is governed by Planck’s
radiation law, so that Leλ equals the spectral radiant exitance Meλ given in (3.62).
An example result obtained with this combined modeling and rendering tech-
nique is shown in Figure 3.49, showing a campfire where a cylindrical log emits
fuel which then catches fire.
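The phase function (3.121) is straightforward to implement in terms of the cosine of the angle between the two directions; a minimal sketch:

```python
import math

def henyey_greenstein(cos_theta, g):
    """Henyey-Greenstein phase function of (3.121), written in terms of
    the cosine of the scattering angle (the dot product of the incoming
    and outgoing directions); g in (-1, 1) controls the anisotropy."""
    return (1.0 - g * g) / (4.0 * math.pi *
                            (1.0 + g * g - 2.0 * g * cos_theta) ** 1.5)
```

For g = 0 it reduces to isotropic scattering, 1/(4π), and it integrates to one over the sphere for any admissible g.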
Chapter 4
Human Vision
In the preceding chapters, light and its interaction with matter were discussed. One
particularly important interaction occurs when light enters the human eye. Light
falling on the retina triggers a most remarkable sequence of events. In this and the
following chapter, we will focus on this chain of events, insofar as the current state
of knowledge allows us. It should be noted that much of the neurological underpinnings
of color still forms an active area of research, and knowledge of higher-level
color processing in the human brain therefore remains relatively sparse [496]. In
addition, the more we learn about the functional organization of the visual cortex,
the more complex the human visual system turns out to be. Nonetheless, we con-
sider this topic the other half of essential background information, necessary to
understand and effectively use theories of color.
In this chapter, we primarily deal with the substrate that forms the human
visual system: the eye, the retina, and neural circuitry in the brain. Much of
the information collected in this chapter stems from studies of Old-World mon-
keys [227, 1032]. It has been shown that their photoreceptors have the same spec-
tral tuning as human photoreceptors [78, 1009, 1010], as well as a similar retinal
layout [972]. In addition, cell types and circuits found in the macaque visual
system are very similar to those in the human retina [224–226,604,606,608,892].
Both anatomy and histology use terms and descriptions for structures relative
to the orientation of the body. Figure 4.1 shows the planes of reference used to
describe where structures are. These planes are [329]:
The midsagittal plane, which divides the head into a left and right half. Objects
located closer to this plane are called medial and those that are further away
are called lateral.
The coronal plane divides the head into an anterior and posterior region.
The horizontal plane divides the head into a superior and inferior region. These
regions are also called cranial and caudal.
One further term of reference is deep, indicating distance from the surface of the
body.
The two orbital cavities lie between the cranium and the facial skeleton, and
are separated by the nasal cavity. They serve as sockets to hold the eyes as well as
the adnexa, and they transmit nerves and blood vessels that supply the face around
the orbit [329].
The shape of each orbit is roughly a quadrilateral pyramid with the apex form-
ing the optic canal (which provides passage for the optic nerve and the ophthalmic
artery between the orbits and the cranium). Their walls are approximately trian-
gular, except for the medial wall which is oblong. The medial walls are parallel
to the midsagittal plane, whereas the lateral walls are angled at 45 degrees to this
plane. The height, width, and depth of the orbits are around 40 mm each, with a
volume of 30 ml.
Figure 4.2. Cross section of the human eye, indicating the cornea, pupil, anterior and
posterior chambers, trabecular meshwork, canal of Schlemm, limbus, iris, zonular
apparatus, lens, ciliary body, rectus muscle (tendon and belly), retina, sclera, choroid,
optic disk, fovea, and vitreous, as well as the corneo-scleral coat, retinal layer, and
uveal tract.
The eye has a layered structure, with three layers (or tunics) present: the outer
corneoscleral envelope, the uveal tract, and the inner neural layer. The sclera and
the cornea together form the corneoscleral coat, which provides a tough outer
shell supporting and protecting the inner layers as well as the contents of the eye.
Extraocular muscles attach to this layer to mediate eye movement. The uveal
tract, or uvea, consists of the iris, ciliary body, and the choroid and contains two
openings: the pupil anteriorly and the optic nerve canal posteriorly. The innermost
layer consists of the retina which is located in the image plane of the eye. It has
two layers, the inner neurosensory retina and the retinal pigment epithelium. The
retina is described in greater detail in Section 4.3.
The transparent contents of these three layers are the aqueous humor, the lens,
and the vitreous body. Light passes through these parts as well as through the
cornea before it reaches the retina. In the following, we describe each of the
components of the eye that are involved in the projection of images on the retina.
Figure 4.3. The normalized probability of the eye deviating from the fixation point as a
function of deviation angle (after [75, 321]).
the probability of the eye deviating from a fixation point as a result of miniature
eye movements can be modeled as a function of angle Θ, plotted in Figure 4.3 [75].
These miniature movements are important for normal vision. If they are artifi-
cially suppressed, the visual image may break down or disappear altogether [256].
The cornea consists of fibrils which are large compared with the wavelength of
light. This was established by measuring the birefringence of the fibrils [236,755],
suggesting that scattering in the cornea should lead to a highly opaque material.
However, these fibrils are uniform in shape and are arranged in parallel rows.
The diffracted light thus cancels out, leaving the undiffracted light which passes
through the cornea unhindered [755].
For medical purposes, some work has focused on modeling the cornea [70,
418].
Figure 4.4. An ocular prosthesis. The iris was created by alternating dozens of layers of
paint with clear coats. Compare with the green eye of Figure 2.46, which served as the
model for this prosthesis. (Prosthesis courtesy of Richard Caruso, Eye Prosthetics of Utah,
Inc.)
Figure 4.6. Entrance pupil diameter as a function of ambient illumination for young
adult observers (after [305]). The photometric unit of candela per square meter is formally
introduced in Chapter 6.
including age, accommodation, drugs, and emotion [166]. The range of pupillary
constriction affords a reduction in retinal illumination by at most a factor of seven
(0.82 log units [118]) to ten [1262].
Figure 4.7. Chromatic aberration in the human eye, measured in diopters D, as function
of wavelength λ (after [80]).
The curvature of the lens is not the same in horizontal and vertical direc-
tions, leading to astigmatism (the phenomenon that horizontal and vertical lines
have different focal lengths). A consequence of astigmatism is that, at the retina,
horizontal edges will consistently show colored fringes, chromatic aberration,
whereas vertical edges will show differently colored fringes [477]. Both Young
and Helmholtz have demonstrated chromatic aberration in their own eyes [164,
165], and this effect was later measured by Wald and Griffin [1191] and Bed-
ford and Wyszecki [80]. The amount of chromatic aberration in the human eye is
measured in diopters, i.e., the reciprocal of focal length, as shown in Figure 4.7.
This effect is believed to be corrected with later neural mechanisms and possi-
bly also by waveguiding due to the packing of photoreceptors, as discussed in
Section 4.2.7.
Other than accommodation, a second function of the lens is to prevent optical
radiation between 295 and 400 nm from reaching the retina [344]. This filter-
ing is required to protect the retina from damage by ultra-violet (UV) radiation.
With age, the lens yellows and also becomes more opaque, which increases the
amount of UV filtering that occurs. The yellowing is due to photochemical re-
actions rather than biological aging [254, 255, 344, 1223] and causes changes in
color perception with age. The decrease in light transmittance is in part due to
disturbances in the regular packing of crystallins, which in turn induces increased
scattering [329], thus yielding a more opaque lens. This is known as a cataract
(see also Section 5.11).
Finally, the lens is a source of fluorescence (Section 3.5.4), in that incident
UV radiation is absorbed by the lens and re-emitted in the visible range. This
phenomenon increases with age. For a healthy lens of a 30-year old, the ratio
between incident (sky) light and the amount of produced fluorescence is around
0.002 [1222]. This ratio increases with age to around 0.017 (noticeable) for a
60-year old and 0.121 for an 80-year old. At these ratios, the occurrence of fluo-
rescence is seen as veiling glare.
Dependent on the wavelength of the incident UV light, the range and peak
wavelength of the fluorescence varies. For instance, incident light at 360 nm
produces fluorescence with a range between 380 nm to over 500 nm, with a peak
at 440 nm [672,1302]. The wavelength of 360 nm is near the absorption maximum
of the lens.
In the aging lens, the UV absorption band additionally broadens into the blue,
suggesting that human vision may be aided by employing filters that block UV
and short wavelengths in the visible range. This would reduce both veiling glare
as well as fluorescence [657, 1297].
Medium                 n
Cornea                 1.376
Aqueous humor          1.336
Lens (center)          1.406
Lens (outer layers)    1.386
Vitreous body          1.337
Table 4.1. Refractive indices of the ocular media.
Image formation on the retina is thus mediated by the cornea, the aqueous
humor, the lens, and the vitreous body. Their respective refractive indices are
summarized in Table 4.1 [447]. Together, they are able to focus light on the
retina. However, the ocular media is not a perfect transmitter. Light losses occur
due to Fresnel reflection, absorption, and scattering [1262]. Fresnel reflection is
generally less than 3 to 4 percent [166]. The optical system is optimized for use in
open air. For instance, in underwater environments, the index of refraction from
water to cornea is different, and vision appears blurred as a result [208].
Further, the index of refraction of the ocular media is dependent on wave-
length, causing edges projected on the retina to show colored fringes, known as
chromatic dispersion [50].
Aberrations are strongest at the first refractive surface, the anterior corneal
surface. The aberrations occurring here only slightly increase with age and are
largely corrected by subsequent ocular surfaces in young individuals. However,
the corrective power of later refractive boundaries in the ocular media decreases
with age, so that the total aberration at the retina increases with age [43]. For
instance, the continued growth of the lens causes aberration of the lens to increase
with age [367].
Absorption of short wavelengths occurs at the cornea. This may disrupt sur-
face tissue, leading to photokeratitis (snow blindness or welder’s flash) [166].
The damage threshold is lowest (0.4 J/cm2 ) for light at 270 nm [904]. Further
absorption at short wavelengths occurs in the center of the lens [772]. This ab-
sorption increases with age. Finally, anterior to the outer segments of the cones
in the fovea lies a layer of yellow pigment (xanthophyll), which absorbs light at
short wavelengths [1065]. The average transmittance of the human eye is plotted
in Figure 4.8.
Figure 4.8. The average normalized transmittance of the human eye as a function of
wavelength λ (from 400 to 1400 nm).
Figure 4.9. Point spread function for ocular light scatter (after [75]).
Some light is scattered in the eye. This means that if the eye is exposed to
a point light source, a small area of the retina is illuminated, rather than a single
point. Ocular light scatter is also known as veiling glare. The intensity fall-off
as a function of radial angle (in arc minutes) can be modeled by a point spread
function p(Θ) [75]:
p(Θ) = 1.13 e^(−5.52 Θ²) + 1.42 / (9.85 + Θ^3.3),   |Θ| ≤ 7,
p(Θ) = 9.35 × 10⁻⁶ (7.8 + Θ)^(−2.5),                |Θ| > 7.   (4.2)
This function is plotted in Figure 4.9, and following Baxter et al. [75], it consists
of data collected for visual angles less than 5 arc minutes [615] superposed with
data for angles larger than 9 arc minutes [336, 402, 1185].
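Equation (4.2) can be implemented directly. In the sketch below, Θ is in arc minutes, and the absolute value of Θ is used inside the powers so that the function is symmetric, matching Figure 4.9 (an assumption about the intended reading of the printed formula):

```python
import math

def ocular_psf(theta):
    """Point spread function for ocular light scatter, after (4.2);
    theta is the radial angle in arc minutes. Absolute values are used
    so that p(theta) is symmetric, as in Figure 4.9."""
    a = abs(theta)
    if a <= 7.0:
        return 1.13 * math.exp(-5.52 * a * a) + 1.42 / (9.85 + a ** 3.3)
    return 9.35e-6 * (7.8 + a) ** -2.5
```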
The area centralis is circular with a diameter of 5 to 6 mm. It lies between the
superior and inferior temporal arteries.
Figure 4.10. The retina consists of various regions of interest (after Forrester et al. [329]).
The fovea is 1.5 mm in diameter and lies 3 mm lateral to the optic disk. It contains
a layer of xanthophyll pigment which gives it a yellowish color (hence the
clinical term macula lutea).
The optic disk contains no retinal layer or photoreceptors. It is known as the blind
spot. It lies 3 mm medial to the center of the fovea.
passing through the pupil nearer the pupillary margin. This is known as the Stiles-
Crawford effect of the first kind [167, 789, 1088].
At the same time, the absorption shows a wavelength dependence with angle
of incidence, causing changes in hue with eccentricity of retinal location. This
is known as the Stiles-Crawford effect of the second kind [34, 1089, 1262]. One
of the explanations for this effect is that light incident upon the retina at an angle
passes through matter over a longer distance before being absorbed. As a result,
more attenuation has occurred, hence explaining color difference with angle of
incidence [85]. In addition, backscattering off the choroid may account for some
of this effect [85].
Waveguiding is also implicated in the explanation of the Stiles-Crawford ef-
fects [40, 1183]. The size of the photoreceptors and their dense packing creates
an effect similar to fiber-optics (more on waveguiding in Section 4.7), creating di-
rectional sensitivity. The waveguiding appears to reduce the effect of aberrations
in the ocular media [1183].
Figure 4.11. Neural cell layers are pushed aside in the fovea to allow more light to reach
the photoreceptors.
Figure 4.12. The thickness of the retinal layers varies with position (after [329]).
The layers of cells preceding the retina are varied in thickness, and, in partic-
ular in the fovea, light passes through a much thinner layer before reaching the
photoreceptors, as shown in Figure 4.11. A plot of the variation in thickness of
the retina is given in Figure 4.12.
The neurosensory retinal layer transduces light into neural signals that are then
transmitted to the brain for further processing. To this end, the retina consists of
several types of cells, including photoreceptors, bipolar, ganglion, amacrine, and
Figure 4.13. Connectivity of neural cells in the retina (after [329]). In this figure, light
comes from the top, passing through the layers of ganglion, amacrine, bipolar, and
horizontal cells before reaching the light-sensitive photoreceptors.
horizontal cells [262]. Within each class, further sub-classifications are possible based
on morphology or response type. Dependent on species, there are between one
and four types of horizontal cells, 11 types of bipolar cells, between 22 and 40
types of amacrine cells, and around 20 types of ganglion cells [227, 609, 723, 749,
1177, 1217]. Although the predominant cell types are all involved in neural pro-
cessing of visual signals, the retina also contains glial cells, vascular endothelium,
pericytes, and microglia.
The photoreceptors, bipolars, and ganglion cells provide vertical connectiv-
ity [915], whereas the horizontal and amacrine cells provide lateral connectivity.
A schematic of the cell connectivity of neural cells in the retina is given in Fig-
ure 4.13. These cells are organized into three cellular layers, indicated in the
figure with blue backgrounds, and two synaptic layers where cells connect. The
latter are called the inner and outer plexiform layers [262]. Each of the cell
types occurring in the retina is discussed in greater detail in the following sec-
tions. Visual processing typically propagates from the outer layers towards the
inner retina [935].
Cells involved in the transmission of the signal to the brain can be stimulated
by a pattern of light of a certain size, shape, color, or movement. The pattern
of light that optimally stimulates a given cell, is called the receptive field of the
cell [430]. Photoreceptors, for instance, respond to light directly over them. Their
receptive field is very narrow. Other cells in the visual pathways have much more
complex receptive fields.
4.3.1 Photoreceptors
Photoreceptors exist in two varieties, namely rods and cones [1015]. The rods
are active in low light conditions and mediate the perception of contrast, bright-
ness, and motion. Such light conditions are called scotopic. Cones function in
bright light (photopic light levels) and mediate color vision. Scotopic and pho-
topic lighting conditions partially overlap. The range of light levels where both
rods and cones are active is called the mesopic range.
There are approximately 115 million rods and 6.5 million cones in the retina.
The density of rods and cones varies across the retina. The periphery consists
predominantly of rods, having a density of around 30 000 rods per mm2 . The
cone density is highest in the fovea with around 150 000 cones per mm2 . In the
fovea there are no rods [1015]. The rod and cone density across the retina is
plotted in Figure 4.14.
Rods and cones form the beginning of a sequence of processing stages called
pathways. Pathways are sequences of cells that are interconnected. For instance,
photoreceptors are connected to bipolar cells as well as to horizontal cells. Bipolars are
Figure 4.14. The density of rods and cones in the retina, in units of 10^5 per mm^2,
plotted against angular extent (after [901]).
4.3.2 Rods
Under favorable circumstances the human visual system can detect a single pho-
ton [446, 1018, 1205], a feature attributed to the sensitivity of the rods, and the
pooling occurring in the rod pathway. Rods have a peak density in a ring 5 mm
from the center of the fovea [862] (see also Figure 4.14). Rods are packed in a
hexagonal pattern, separating cones from each other.
Much is known about the molecular steps involved in light transduction in
both rods and cones [289,928]. A simplified overview is given here. Rods contain
a visual pigment called rhodopsin, which has a peak spectral sensitivity at λmax =
496 nm (see Figure 4.15). Whenever a rhodopsin molecule absorbs a photon, a
chemical reaction occurs that bleaches the molecule, leaving it in a state that does
not allow it to absorb another photon. This reduces the sensitivity of the rod. After
some period of time, the process is reversed, and the rhodopsin is regenerated.
The regeneration process is relatively slow, although not as slow as the process
of dark adaptation. Dark adaptation can be measured psychophysically as the time
it takes to regain sensitivity after entering a dark room. While dark adaptation may
take at least 30 minutes, rhodopsin regeneration requires around 5 minutes. Some
of the reasons for the duration of dark adaptation are outlined in Section 3.4.2.
The rate of regeneration is related to the amount of rhodopsin present. If
p ∈ [0, 1] is the fraction of unbleached rhodopsin, the rate of change dp/dt can be
described by [118]

dp/dt = (1 - p) / 400,   (4.3)

where 400 s is the time constant. A solution to this differential equation is given by

p = 1 - exp(-t / 400).   (4.4)
This behavior was indeed observed [987]. The bleaching process follows a similar
rate of change [118]:

dp/dt = -Lv p / (400 L0).   (4.5)

Here, Lv is the intensity of the light causing the bleaching, and L0 is the intensity
that bleaches 1/400 of the rhodopsin per second. The solution to this differential
equation is

p = exp(-Lv t / (400 L0)).   (4.6)
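These two rate equations are easy to check numerically. The sketch below (an illustration added here, not part of the original text) integrates (4.3) and (4.5) with a simple forward-Euler scheme and compares the results against the closed-form solutions (4.4) and (4.6); the ratio Lv/L0 = 2 is an arbitrary choice.

```python
import math

T0 = 400.0  # rhodopsin time constant in seconds, from (4.3)

def simulate(p0, rate, dt=0.01, t_end=200.0):
    """Integrate dp/dt = rate(p) with forward Euler, returning p at t_end."""
    p, steps = p0, int(round(t_end / dt))
    for _ in range(steps):
        p += rate(p) * dt
    return p

# Regeneration in the dark, (4.3): dp/dt = (1 - p) / 400.
p_num = simulate(0.0, lambda p: (1.0 - p) / T0)
p_ana = 1.0 - math.exp(-200.0 / T0)            # closed form (4.4) at t = 200 s

# Bleaching under steady light, (4.5): dp/dt = -(Lv / L0) * p / 400.
ratio = 2.0                                    # assumed Lv / L0, for illustration
b_num = simulate(1.0, lambda p: -ratio * p / T0)
b_ana = math.exp(-ratio * 200.0 / T0)          # closed form (4.6) at t = 200 s

print(abs(p_num - p_ana) < 1e-4, abs(b_num - b_ana) < 1e-4)
```

With a small step size, the numerical and analytic solutions agree to a few parts per million, confirming that (4.4) and (4.6) are indeed solutions of the stated rate equations.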
4.3.3 Cones
The bleaching and regeneration processes known to occur in rhodopsin also occur
in the opsins (visual pigments) associated with cones. The rate of change of opsins
p as a function of intensity Lv is given by [118]

dp/dt = (1/120) (1 - p - Lv p / L0).   (4.9)
As only the constant 120 is different from (4.7), the steady-state behavior for
opsins is the same as for rhodopsin and is given by (4.8).
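Setting dp/dt = 0 in (4.9) makes the steady state explicit: the rate constant cancels, leaving p = 1/(1 + Lv/L0), which is why the steady-state behavior of opsins matches that of rhodopsin. A minimal numeric sketch (added here for illustration):

```python
def steady_state(lv_over_l0):
    """Unbleached pigment fraction at equilibrium. Setting dp/dt = 0 in (4.9):
    0 = 1 - p - (Lv/L0) * p  =>  p = 1 / (1 + Lv/L0).
    The rate constant (120 for opsins, 400 for rhodopsin) drops out."""
    return 1.0 / (1.0 + lv_over_l0)

# Dim light leaves nearly all pigment unbleached; bright light bleaches most of it.
fractions = [steady_state(r) for r in (0.0, 1.0, 10.0, 100.0)]
print(fractions[0], fractions[1])  # 1.0 and 0.5
```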
The cones come in three types which have peak sensitivities to different wave-
lengths. These three types can be broadly classified as sensitive to long, medium,
and short wavelengths (abbreviated as L, M, and S cones), as shown in Fig-
ure 4.16. The peak sensitivities λmax lie around 565 nm, 545 nm, and 440 nm,
respectively [1036]. With three different cone types, human color vision is said
to be trichromatic.
Figure 4.16 also shows that each cone type is sensitive to a wide range of
wavelengths, albeit in different relative proportions. The signal produced by a
photoreceptor is, therefore, proportional to the mix of wavelengths incident upon
1 Note that the curves shown here are not corrected for scattering and absorption of the ocular
media.
Figure 4.16. The normalized sensitivities of the three cone types peak at different wave-
lengths (after [1092]).
Figure 4.17. The log spectral sensitivity template log S(λmax/λ) plotted against normalized
wavelength λmax/λ.
On a linear scale, the curves for the three peak sensitivities mentioned above are
shown in Figure 4.18. These curves compare well with those of Figure 4.16,
which were obtained through measurement [1092]. They also fit data obtained
through psychophysical experiments on human observers, as well as electro-
physiological experiments on single photoreceptors in monkey, squirrel, and hu-
man retinas [644].
Figure 4.18. The spectral sensitivity template S(λ) plotted against wavelength λ for peak
sensitivities of λmax = 440 nm, λmax = 545 nm, and λmax = 565 nm.
log Lv    -0.82    0       1       2       3       4       5       6
A          7.18    6.56    5.18    3.09    1.87    1.31    1.09    1.00
Figure 4.19. Photoreceptor mosaic. (From Mark D Fairchild, Color Appearance Models,
2nd edition, Wiley, Chichester, UK, 2005.)
a diameter of 100 μm (0.34 deg) [220]. This means that even in humans with
normal color vision, the fovea is tritanopic (loss of S-cone function) for small
and brief targets [610, 1248].
With the peak response of L- and M-cones being only 30 nm apart, they ex-
hibit large amounts of overlap in spectral sensitivity, thus contributing more or
less evenly to a luminance signal. The S-cones, on the other hand, have a very dif-
ferent peak sensitivity and are a poor representation for a luminance signal [861].
The central part of the fovea, therefore, appears to be geared towards maintaining
high spatial acuity rather than high contrast color vision [352].
An alternative explanation for the absence of S-cones in the center of the fovea
is possible, however. For instance, it can be argued that this absence of S-cones
in the fovea helps to counteract the effects of Rayleigh scattering in the ocular
media [1036].
The S-cones in the periphery are packed at regular distances in non-random
order [1245]. The packing of L- and M-cones is essentially random. A schematic
of the packing of cones in the retina is given in Figure 4.19.
The ratio between cone types in the human retina varies considerably between
individuals [785, 977]. For instance, the measured ratio of L- and M-cones was
3.79:1 for one individual and 1.15:1 for another [1245]. This difference has only
minor influence on the perception of color [121, 911], which may be due to a
combination of various factors. First, the statistics of natural scenes are such that
high frequency, high contrast signals are rare. Further, the scatter occurring in
the ocular media may help to prevent aliasing. Lastly, post-receptoral process-
ing may compensate for the idiosyncrasies of the photoreceptor mosaic [824].
Nonetheless, some variability in photopic spectral sensitivity remains between
healthy individuals, which is largely attributed to the random arrangement of L-
and M-cones and the variability in ratio between these cone types [227, 670].
The ratio of the combined population of L- and M-cones to S-cones is around
100:1 [1205]. If we assume that the L/M ratio is 2:1, then the ratio between all
three cone types is given by L:M:S = 0.66 : 0.33 : 0.01.
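The arithmetic behind these proportions can be made explicit (a sketch added here; the 2:1 L/M ratio is the assumption stated above):

```python
# (L+M) : S = 100 : 1 from the text, and L : M = 2 : 1 by assumption.
s = 1.0 / 101.0           # S-cone fraction of all cones
lm = 100.0 / 101.0        # combined L- and M-cone fraction
l, m = 2.0 * lm / 3.0, lm / 3.0
print(round(l, 2), round(m, 2), round(s, 2))  # 0.66 0.33 0.01
```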
Finally, the packing density of cones in the fovea varies significantly be-
tween individuals. Measurements have shown individuals with as few as 98,200
cones/mm2 in the fovea up to around 324,100 cones/mm2 [219]. Outside the
fovea, the variance in receptor density is much lower. It can be argued that the
cone density outside the fovea is just enough to keep the photon flux per cone
constant, given that the retinal illumination falls off with eccentricity. Between
φ = 1◦ and φ = 20◦ of eccentricity, the cone density d(φ) can be modeled
by [1157]

d(φ) = 50,000 (φ/300)^(-2/3).   (4.14)

Finally, the packing density is slightly higher towards the horizontal meridian as
well as towards the nasal side [219].
In the fovea, L- and M-cones each connect to a single midget bipolar cell. As
a consequence, midget bipolar cells carry chromatic information. However, the
fovea also contains diffuse bipolar cells that connect to different cone types, and
thus carry achromatic information [1217]. Diffuse bipolar cells have a center-
surround organization [223] (see Section 4.3.6).
The S-cones form the start of a separate ON pathway. Separate blue-cone
bipolar cells connecting to blue cones have been identified [612], which mediate
the S-cone ON pathway. In addition, each S-cone connects to a single midget
bipolar cell [597, 598], which potentially forms the substrate of a blue OFF sig-
nal [227].
feedback mechanism also affords the possibility of color coding the response of
bipolar cells [609]. As a result, the notion that retinal processing of signals is
strictly feed-forward is false, as even the photoreceptors appear to receive feed-
back from later stages of processing.
to a color axis that goes from pinkish-red to cyan [209], see Section 4.6.
receptive field size gradually increases with eccentricity, and here midget ganglion
cells receive input from L- and M-cones for both center and surround. This is
consistent with psychophysical evidence that red/green sensitivity declines with
eccentricity [223, 801].
Small-bistratified ganglion cells form a further cell type that carries
blue-ON/yellow-OFF signals [221]. They are not spatially opponent, in that
their center and surround have nearly the same size. The ON signal is derived
from S-cones (through the blue-cone bipolars), whereas the surround is derived
from L- and M-cones (by means of diffuse bipolar cells). The bistratified ganglion
cells thus carry an S-(L+M) signal.
Several other ganglion cell types with complex receptive fields exist in the
retina, for instance, ones that are selective for direction and motion [60, 62].
Ganglion cells are tuned to spatial frequency, i.e., their sensitivity to spatially
varying patterns is highest for certain spatial frequencies of the pattern. The fre-
quency response of ganglion cells can be characterized by a grating composed of
a sine wave oriented in some direction. The contrast of the sine wave is measured
as the difference between the lightest and darkest parts of the wave. By intro-
ducing such a grating into the receptive field of the ganglion cell, and reducing
the contrast until the cell just barely responds, the threshold of visibility can be
deduced [826].
By repeating this process for sine waves of different spatial frequencies, a
threshold curve can be created. Such curves, known as contrast-sensitivity func-
tions (CSFs), show a peak at a given frequency [284]. This is the frequency to
which the ganglion cell is most responsive.
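The threshold-measurement procedure described above can be sketched as follows; the threshold values here are hypothetical, chosen only to illustrate how a peaked contrast-sensitivity function emerges when sensitivity is taken as the reciprocal of the threshold contrast:

```python
# Hypothetical threshold contrasts, one per spatial frequency (cycles/degree).
# Real values would come from the grating procedure described in the text.
threshold = {0.5: 0.020, 1.0: 0.008, 4.0: 0.003, 8.0: 0.006, 16.0: 0.040}

# Contrast sensitivity is conventionally the reciprocal of the threshold contrast.
csf = {f: 1.0 / c for f, c in threshold.items()}

peak_frequency = max(csf, key=csf.get)
print(peak_frequency)  # the frequency to which this hypothetical cell responds best
```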
In addition to peaked contrast sensitivity, ganglion cells may be characterized by
their sensitivity to motion within their receptive fields (called hyper-acuity). The
smallest detectable displacements are frequently much smaller than might be
expected given the size of the receptive-field center, or the peak contrast
sensitivity [1031].
A useful feature of both contrast sensitivity and hyper-acuity measures is that
they apply to ganglion cells as well as human psychophysical performance. This
leads to the conclusion that organisms cannot detect visual stimuli that are not
detected by ganglion cells [826].
Ganglion receptive fields can be reasonably well modeled by the subtraction
of two Gaussian profiles. The center is modeled by a positive Gaussian function
with a smaller radius and larger amplitude than the surround. The Gaussian profile
modeling the surround is subtracted from the center profile. This is the so-called
Difference of Gaussians or DoG model.
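A minimal sketch of the DoG model follows (added here for illustration); the amplitudes and radii are arbitrary values, chosen only so that the center Gaussian is narrower and stronger than the surround, as the model requires:

```python
import math

def dog(r, a_center=1.0, s_center=0.5, a_surround=0.2, s_surround=2.0):
    """Difference-of-Gaussians receptive-field profile at radius r:
    a narrow, strong excitatory center minus a wide, weak inhibitory surround."""
    center = a_center * math.exp(-r * r / (2.0 * s_center ** 2))
    surround = a_surround * math.exp(-r * r / (2.0 * s_surround ** 2))
    return center - surround

# Excitatory at the center, inhibitory at intermediate radii, near zero far away.
print(dog(0.0) > 0.0, dog(2.0) < 0.0, abs(dog(10.0)) < 1e-5)
```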
Finally, in the mesopic range, rod vision has a (relatively small) effect on color
vision [599, 1077, 1078].
Figure 4.21. A diagram of the LGN showing both its main inputs and main outputs, which
project to the striate cortex. The diagram distinguishes the magnocellular layers (1, 2),
the parvocellular layers (3–6), and the koniocellular layers; ipsilateral (same side) and
contralateral (opposite side) connections from the left and right eyes; and projections to
layers 4Cα/6, 4Cβ/6, and the CO blobs/4A of V1.
Magnocellular layers. These layers receive input from parasol ganglion cells and
are therefore responsible for carrying the achromatic L+M signal [227,
660]. These cells have a spatial center-surround configuration, where ON-
center cells receive excitatory input from both L- and M-cones in the center
and inhibitory input from both L- and M-cones in the surround. OFF-center
cells have opposite polarity [352].
Parvocellular layers. Receiving input from midget ganglion cells, these layers
are responsible for carrying the L-M signal, i.e., they mediate red-green
color opponency [673,891]. Wiesel and Hubel first proposed the possibility
of neural circuitry whereby the center of the receptive field was excited by
one cone type, and the surround by another [1244]. Evidence for such
organization has been confirmed, for instance, in the red-green opponent
ganglion cells [661] and the LGN [746, 946, 947]. However, it should be
noted that the dendritic fields of midget ganglion cells in the parafovea increase
in size, and as a result both their centers and surrounds take input from a mix of
L- and M-cones.
5 Note that the following description of these three layers in the LGN is simplified and not necessar-
ily universally accepted. However, for the purpose of this book, the current description is sufficiently
complete.
Koniocellular layers. These layers are also sometimes called intercalated [322,
406], although the preferred naming is koniocellular [564]. There are three
pairs of layers of this type. The dorsal-most pair transmits low-acuity infor-
mation to the primary visual cortex (also known as area V1). The middle
pair projects the signal that originates in bistratified ganglion cells and car-
ries the blue-yellow S-(L+M) signal to the cytochrome oxydase blobs of
the primary visual cortex [257, 458]. The ventral-most pair is related to
the superior colliculus. In addition to these dominant pathways, konio-
cellular neurons project to parts of the extrastriate cortex, bypassing V1
altogether [458, 973, 1052, 1286].
V1. This is the principal part of the visual cortex that receives input from the
LGN. In an older classification scheme, this area is also known as Brod-
mann area 17 [138, 710]. It is also known as the primary visual cortex or
the striate cortex.
V2. Area V2, after staining with cytochrome oxydase, reveals thin stripes, thick
stripes, and pale stripes, each receiving input from different areas in V1.
V3. This is an area thought to be involved in the integration of signals from dif-
ferent pathways.
V4. This area is thought to play an important role in the processing of color in-
formation.
Figure 4.22. The main feed-forward projections between several cortical areas (af-
ter [592]).
MT. Area MT is also known as area V5, or the middle temporal area. Cells in
this area tend to be strongly sensitive to motion. It predominantly receives
input from the geniculate magno system.
Many other areas exist, but most is known about areas V1, V2, and MT [563].
The main feed-forward projections between these areas are diagrammed in Fig-
ure 4.22. Color-selective cells have been found in several areas, including V1,
V2, V3, and in particular V4 [206,624,705,1113,1289,1290]. However, it should
be noted that the localization of V4 as a center for color processing is still un-
der debate, as some studies suggest a different organization [282, 407, 1139].
In the following we briefly discuss some of the cortical areas relevant to color
processing.
4.5.1 Area V1
Area V1 is the main entry point to the visual cortex for signals from the lateral
geniculate nucleus. It consists of six layers, with some layers divided into
sub-lamina [138, 710]. In addition, several layers, including layers two and three,
have columnar structures called blobs or patches. The areas in between are called
inter-blobs or inter-patches. The separation between patches and inter-patches is
based on a histochemical stain for cytochrome oxydase (CO) that differentiates
cells based on metabolism, revealing structure in layers two and three of
the primary visual cortex.
One of the challenges still remaining is to determine what the function might
be of cells located in different layers. This problem may be interpreted as un-
Figure 4.23. Area V1 (primary visual cortex) and its intracortical circuitry between layers
3, 4A, 4B, 4Cα, 4Cβ, 5, and 6 (after [1051]).
not remain separate, but are combined instead [148, 636, 1051, 1265, 1281]. A
summary of intra-cortical connectivity is shown in Figure 4.23. This finding was
corroborated by studies revealing that orientation-selective cells can be
color-selective as well [209, 491, 669, 674, 1127, 1181, 1189].
A further complicating factor is that area V1 has both feed-forward and feed-
back connections from many other functional areas in the brain [1008]:
• Feedback projections from other areas to V1 originate in: V2, V3, V4, MT,
MST, frontal eye field (FEF), LIP, and the inferotemporal cortex [69, 889,
970, 1041, 1108, 1164, 1165].
• Area V1 projects forward to V2, V3, MT, MST, and FEF [115, 323, 693,
709, 753, 1041, 1164, 1165].
For the purpose of modeling the micro-circuitry of area V1, as well as understand-
ing the function of V1, these feedback mechanisms are likely to play roles that
are at least as important as the geniculate input [747].
While the color processing in the LGN occurs along dark-light, red-green, and
yellow-blue axes, this is not so in area V1. LGN projections are recombined in
both linear and non-linear ways, resulting in color sensitivity that is more sharply
tuned to specific ranges of color, not necessarily aligned with the color directions
seen in the LGN [209, 419, 1175]. In particular, clusters of cells have been iden-
tified that respond to purple [1189]. Psychophysical studies have confirmed the
likelihood of a color transform from the LGN to the striate cortex [1174, 1176].
There is a possibility that these non-linear recombinations underlie the human
perception of color categories. It is also possible that separate color transfor-
mations in V1 contribute to the perception of color boundaries and colored re-
gions [1030]. A proposed three-stage color model, accounting for phenomena
measured through neurophysiological experiment as well as psychophysics, is
discussed in Section 4.6.
In summary, many cells in area V1 respond to multiple stimuli, such as color,
position, orientation, and direction. The notion that pathways for the processing
of color, shape, and the like remain separate is unlikely to be true. In addition,
area V1 is one of many areas where color processing takes place, thus shifting
attention away from the idea of a single, dedicated site of color processing.
4.5.2 Area V2
Area V2 is characterized by thick, thin, and pale stripes that are revealed under
cytochrome oxydase staining. There is some degree of functional segregation
along these histochemically defined zones, although many cells are found that are
selective for multiple stimulus attributes [592]. The thick and pale stripes have
a higher degree of orientation selectivity than the thin stripes [726]. The thin
stripes have a tendency towards color selectivity and are found to be sensitive to
wavelength composition [800]. Nonetheless, the precise nature of functional seg-
regation in area V2 is still under dispute [1051] and significant cross-talk between
the processing of various stimulus attributes exists.
The pathway between V1 and V2 is currently known to be bipartite [1050],
with patches in V1 connecting to thin stripes and inter-patches connecting to both
thick and pale stripes in V2 [1051]. These projections are well segregated [490].
There is also significant feedback from V2 to V1 [971], although the organi-
zation of these feedback projections is only beginning to emerge [38, 726, 1044,
1251]. Removing these feedback connections produces only subtle changes
in the responses of V1 neurons [511, 993]. It is possible that these feed-
back projections alter the spatial extent of surround inhibitory circuits [1033],
increase the saliency of stimuli [510], or possibly aid in disambiguation [854].
4.5.3 Area V3
Current techniques have not revealed anatomically or functionally distinct com-
partments within area V3 [592], although a columnar organization has been re-
ported [4]. In addition, little is known about the physiology of neurons in this
area [308]. V3 may play a role in the integration of different stimulus attributes,
aiding in visual perception. It has also been suggested that this area is involved in
the analysis of three-dimensional form [4].
The selectivity of neurons in V3 for different stimulus attributes is similar to
those in V2, except that a higher sensitivity to direction was found (40% of the
cells in V3 versus 20% of the cells in V2 [592]). Orientation selectivity is found
in around 85% of the cells, selectivity for size is found in approximately 25%,
and color sensitivity in approximately 50% of the cells in both V2 and V3 [592].
There is no correlation between selectivity for different attributes, suggesting that
different attributes are not processed along separate pathways in V3 (or in V2).
4.5.4 Area V4
Area V4 has been the subject of significant controversy. It has been implicated
as a color center on the basis of a rare condition called achromatopsia. This is a
deficiency that causes the world to appear colorless and is also known as cerebral
color blindness. This deficiency has been related to lesions in a particular part of
the brain, known as V4. As a result, this area may be involved in computations re-
lating to the perception of color [71, 230, 887, 1289]. It should be noted, however,
that this viewpoint is challenged by several studies [282, 407, 1139] and, as such,
the question of whether V4 can be deemed a color center remains open [392].
4.5.5 Area MT
Area MT is known to be involved in the perception of motion. It responds more
strongly to moving versus stationary stimuli. Direction-selective populations of
neurons have been found in both humans [498,1138] and macaque monkeys [135,
754, 991]. This area is predominantly sensitive to achromatic stimuli.
Figure 4.24. The retinal image, in (x, y) coordinates centered on the fovea, is laid out in
the cortex according to a log-polar mapping to (r, θ) in V1 (after [392]).
4.5.7 Summary
As the simple model of segregated pathways through the modules in the visual
cortex is being refined, it is too early to attribute specific functionality to separate
regions in the visual cortex. The exact processing that takes place is yet to be
unraveled, and it is therefore not appropriate to assign a substrate to the higher
levels of functioning of the human visual system. This is the case for the primary
visual cortex, as well as for processing elsewhere in the visual cortex.
Depending on the techniques used to measure functional aspects of the visual
cortex, it is possible to arrive at different conclusions. For instance, there are
those that argue in favor of segregated pathways for color, motion, form, etc. Oth-
ers have argued in favor of a distributed processing of features, where functional
areas in the visual cortex are sensitive to combinations of motion, form, direction,
orientation, color, and the like.
On perceptual grounds, it is also not possible to arrive at a single conclusion.
While some perceptual attributes appear to be dissociated from others, this is not
always strictly true [1202]. For instance, the apparent velocity of an object is
related to its contrast and color [159], and high spatial and temporal frequencies
appear desaturated [908].
One difficulty for determining the functional architecture of the visual cor-
tex is due to the limitations of the currently available methodologies. Most use
point methods, i.e., record responses from single neurons by means of single-
cell recordings or tracer micro-injections. Given the number of neurons present,
and their vast interconnection network, the problems involved are obvious [494].
Nonetheless, there is promise for the future, in that new techniques are becoming
available that allow large populations of cells to be imaged at cell resolution [846].
In the following chapter, we therefore turn our attention to an altogether dif-
ferent strategy to learn about the (color) processing that takes place in the human
visual system, namely psychophysics. Here, the human visual system is treated
as a black box, which reveals some of its functionality through measuring task
response.
The relative strength of each color is adjusted such that all colors appear
equally bright. In particular, the contrasts along the L-M axis are 8% for the
L-cones and 16% for the M-cones, whereas the S-cone contrast is 83% [1175].
Contrasts for intermediate angles are interpolated. The resulting colors are called
iso-luminant. This configuration is known as the MBDKL color space, named
after MacLeod, Boynton, Derrington, Krauskopf, and Lennie [250, 722]. Compu-
tational details of this color space are given in Section 8.6.3.
Although its axes are commonly referred to as red-green and yellow-blue, the
actual colors they encode are as given above. The axes of the color space used by
the LGN therefore do not coincide with the perceptual dimensions of red, green,
blue, and yellow, which are rotated in the MBDKL color space. It was postulated
by Russell and Karen de Valois that the red, green, yellow, and blue perceptual
color systems are formed from the LGN output by adding or subtracting the S
signal from the L and M opponent cells [1171].
The multi-stage color model proposed by Russell and Karen de Valois is
composed of four separate stages. They will be discussed in the following sec-
tions [1171]:
2. Cone opponency;
3. Perceptual opponency;
In the discrete model, diffuse bipolars are weighted as 6L - 5M. In either case, the
result is close to a conventional “red-green” color opponent cell. The same is true
for Mo and So , which end up being weighted as 11M - 10L - S and 15S - 10L -
5M, respectively.
6 An alternative model is also presented, whereby the receptive fields are modeled as L-M, -L+M,
M-L, and -M+L. This model assumes that the surround consists of only L and M inputs, and is there-
fore termed “discrete.”
The diffuse bipolars appear to receive input from a small number of L- and
M-cones, as well as a surround input derived from horizontal cells. The signals
computed by diffuse bipolar cells are therefore designated L+M - (L+M+S) and
-L-M + (L+M+S), forming the start of the magnocellular pathway.
Table 4.5. Cone contributions to each of the four perceptual color axes. The weights given
to So are 2 for all four axes in the original model, but have been adjusted here to account
for more recent hue-scaling results [1174].
Figure 4.25. Sinusoidal fit to color responses of various LGN cells (+L - M, -L + M, and
+S - (L+M)), plotted as normalized spikes/s against MBDKL color angle (after [1175]).
modeled by sinusoids, indicating that the input transmitted from the photore-
ceptors is linearly recombined in the LGN.
The optimal tuning of LM opponent cells is along the 0◦ -180◦ axis in MBDKL
space. The M center cells (+M-L and -M+L) on average have their maximum
response at 180◦ and 0◦ . The L center cells (+L-M and -L+M) peak slightly off-
axis at approximately 345◦ and 165◦ . The +S-(L+M) cells peak at 90◦ . The LGN
cells, in general, respond strongly to colors along the 0◦ -180◦ and 90◦ -270◦ axes
and show little response to other colors, as shown in Figure 4.26.
This figure also shows the color tuning of a set of cortical cells. Cortical
cells are tuned to a variety of different colors, albeit with dips around the 0◦ -180◦
and 90◦ -270◦ axes [1175]. Other studies have found a similarly wide variety of
tunings in V1 [669, 1127], although other experiments have shown cell types that
respond discretely to red, green, yellow, and blue [1147, 1180].
Figure 4.26. The tuning distribution of 100 LGN cells and 314 striate cortex cells. Oppo-
site phases are combined, i.e., 0◦ and 180◦ are both mapped to 180◦ (after [1175]).
Figure 4.27. The distribution of exponents n for cells in the LGN and in the striate cortex.
The median values for LGN and V1 cells are 1.08 and 1.90 (after [1175]).
While the LGN cells respond linearly to their inputs, V1 cells exhibit a range
of non-linearities. Such non-linearities can be modeled with an exponentiated
sine function in the MBDKL space:

R = A sin^n (C - φ).   (4.16)

Here, the response R is modeled as a function of gain A, the color axis C under
investigation, and phase φ. The exponent n models the degree of non-linearity.
For a group of LGN and V1 cells, the distribution of exponents that best fit the
individual cells’ responses is plotted in Figure 4.27. The median value found for
V1 cells is 1.90, indicating that nearly one half of the cells exhibit significant
non-linear behavior.
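The narrowing effect of the exponent in (4.16) can be made concrete. In the sketch below (added here for illustration; the gain and color-axis values are arbitrary), the half-width at half-maximum of sin^n(C - φ) shrinks as n grows, which is exactly the sharpening shown in Figure 4.28:

```python
import math

def response(phi_deg, gain=1.0, axis_deg=90.0, n=1.0):
    """Tuning curve of (4.16), R = A sin^n(C - phi), on the positive half-cycle."""
    s = math.sin(math.radians(axis_deg - phi_deg))
    return gain * max(s, 0.0) ** n

def half_width(n):
    """Angular distance from the peak at which sin^n falls to half its maximum:
    cos(phi)^n = 0.5  =>  phi = acos(0.5^(1/n))."""
    return math.degrees(math.acos(0.5 ** (1.0 / n)))

# Larger exponents give narrower tuning: roughly 60, 46, and 21 degrees.
print(half_width(1.0), half_width(1.9), half_width(10.0))
```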
Figure 4.28. A sine function sin^n(x) exponentiated with different values of n (n = 1,
n = 1.9, and n = 10). As can be seen here, larger exponents cause the function to be more
peaked, i.e., tuned to a narrower set of wavelengths.
Figure 4.30. Color scaling results for 0.1◦ stimuli, showing normalized responses
(spikes/s) for blue, green, yellow, and red as a function of MBDKL color angle. Near-zero
responses, as well as data around 90◦ and 270◦ , are not shown (after [1176]).
significant contributions to the perception of both green and blue. This is consis-
tent with the possibility that in V1, S signals are added to the +M-L cells to form
blue and subtracted from the -M+L cells to form green. Similarly, these findings
are in accord with a model which adds S-cone signals to +L-M signals to form
red and subtracts them from -L+M signals to create yellow.
Moreover, the responses in the absence of S-cone input are not peaked. This
suggests that the sharpening of responses in V1 is due to the contribution of the
S-cone pathway.
4.6.7 Discussion
The magnocellular pathway is traditionally seen as predominantly responsible
for carrying the luminance signal. In the multi-stage model, this pathway car-
ries essentially luminance information akin to the V (λ ) curve (introduced in Sec-
tion 6.1). However, there also exists evidence that the parvocellular pathway may
carry achromatic information which is multiplexed with color information [1170,
1173]. The multi-stage color model is able to model this and then separate the
chromatic and achromatic information at the cortical stage.
In this model, the L- and M-cones provide the dominant input to both the red-
green and yellow-blue systems. The S-cones modulate this dominant pathway in
the third stage to construct perceptual color axes. It suggests that the dominant
input to the blue system is formed by the M-cones, not the S-cones. This feature
is in accord with measurements of tritanopes (see Section 5.11) who are missing
or have non-functional S-cones. They typically see all wavelengths below 570
nm as blue or partially blue. However, S-cones do not respond to wavelengths
above around 520 nm. The gap between 520 and 570 nm where blue is detected
cannot be explained by conventional color models, but can be explained with the
multi-stage model.
Finally, the rotation to perceptual color axes is in accord with both hue-cancel-
lation experiments [515], as well as cross-cultural color-naming research [87,
449].
cones: a center cone surrounded by six hexagonally packed cones. In each fol-
lowing iteration, a ring of new cones is placed outside this cluster. Based on the
location where cones are created, they have a larger or smaller anatomical target
radius. The cone centers are then migrated inward.
This migration is governed by two rules, the first being adherence to each
cone’s target radius and the second a tendency to move inward as far as possible.
For a pair of cones with positions p_i and target radii r_i, the normalized distance d
is given by

d(i, j) = |p_i − p_j| / (r_i + r_j). (4.18)

Cones i and j are defined as neighbors if d(i, j) < 1.5.
The cone positions are then iteratively updated using the cone force equation:

p_i = p_i + k_1 n + k_2 r + k_3 Σ_{j | d(i, j) < 1.5} s(d(i, j)) (p_i − p_j) / |p_i − p_j|. (4.19)
Here, n is a vector pointing from p_i to the center of the retina, and r is a random
vector such that |r| ∈ [0, 1]. The factors k_1, k_2, and k_3 are constants, and s is an
interpolation function given by
s(d) = 0 for d > 1,
s(d) = 3.2 (1 − d) for 0.75 < d ≤ 1, (4.20)
s(d) = 0.266 (3.75 − d) for 0 ≤ d ≤ 0.75.
These migration steps are executed between 25 and 41 times to let the cones settle
into a pattern. The upper bound of migration iterations leads to a retinal mosaic
that is mixed regular and irregular in a manner similar to real retinas. With this
parameterization, each cone has on average 6.25 neighbors, which is a number
also found in human retinas. Outside the fovea, fewer migration iterations are
carried out to simulate the less regular packing of the peripheral retina.
Once the current ring of cones has finished migrating, the next iteration starts
by surrounding the current cluster with a new ring of cones, which in turn will
migrate inwards.
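The migration rules of Equations (4.18)–(4.20) can be sketched as follows. Since s(d) vanishes for d > 1, the sum runs over nearby cones only; the constants k1..k3 and the random-vector construction are illustrative assumptions, not values from the text:

```python
import math
import random

def normalized_distance(p_i, p_j, r_i, r_j):
    # Equation (4.18): d(i, j) = |p_i - p_j| / (r_i + r_j).
    return math.dist(p_i, p_j) / (r_i + r_j)

def s(d):
    # Interpolation function, Equation (4.20).
    if d > 1.0:
        return 0.0
    if d > 0.75:
        return 3.2 * (1.0 - d)
    return 0.266 * (3.75 - d)

def migrate(positions, radii, k1=0.01, k2=0.001, k3=0.05):
    """One update of the cone force equation (4.19). The constants
    k1..k3 are illustrative; the text does not give their values."""
    updated = []
    for i, p_i in enumerate(positions):
        # n: unit vector from p_i toward the retinal center (origin).
        norm = math.hypot(*p_i) or 1.0
        n = (-p_i[0] / norm, -p_i[1] / norm)
        # r: random vector with |r| in [0, 1].
        ang, mag = random.uniform(0.0, 2.0 * math.pi), random.random()
        r = (mag * math.cos(ang), mag * math.sin(ang))
        fx, fy = k1 * n[0] + k2 * r[0], k1 * n[1] + k2 * r[1]
        for j, p_j in enumerate(positions):
            if j == i:
                continue
            d = normalized_distance(p_i, p_j, radii[i], radii[j])
            if d < 1.5:  # only neighbors contribute; s(d) = 0 for d > 1 anyway
                dist = math.dist(p_i, p_j) or 1.0
                fx += k3 * s(d) * (p_i[0] - p_j[0]) / dist
                fy += k3 * s(d) * (p_i[1] - p_j[1]) / dist
        updated.append((p_i[0] + fx, p_i[1] + fy))
    return updated
```

Iterating this update 25 to 41 times, per the text, lets a ring of cones settle into a quasi-regular mosaic.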
It would be possible to model the response of each cone with any of a num-
ber of models. A recent model that predicts a wide range of measurements was
presented by van Hateren [434]. It is based on state-of-the-art knowledge of the
various stages involved in photo-transduction. For a full account of this model,
the reader is encouraged to track down the reference [434], which is freely avail-
able on the Journal of Vision website. As the details of the model are beyond the
scope of this book, we only give an overview of this model in the remainder of
this section.
The basic building blocks are temporal low-pass filters, static (non-)linear
filters, and subtractive and divisive feedback loops. Each of these building blocks
forms a plausible model of some aspect of cone functionality, whether cellular or
molecular. A low-pass filter transforms an input x(t) to an output y(t) with
y(t) = ∫_{−∞}^{∞} h(t − s) x(s) ds, (4.21a)

h(t) = (1/τ) exp(−t/τ) for t ≥ 0,
h(t) = 0 for t < 0, (4.21b)

or, equivalently, in differential form,

τ dy(t)/dt = x(t) − y(t). (4.22)
In both forms, τ is a time constant.
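A discrete-time sketch of this low-pass filter, using the exact update of Equation (4.22) for a piecewise-constant input (the step-input demonstration is our own):

```python
import math

def lowpass(x, tau, dt):
    """Discrete solution of tau * dy/dt = x - y (Equation 4.22),
    exact for a piecewise-constant input:
    y[k] = y[k-1] + (1 - exp(-dt / tau)) * (x[k] - y[k-1])."""
    a = 1.0 - math.exp(-dt / tau)
    y, out = 0.0, []
    for xk in x:
        y += a * (xk - y)
        out.append(y)
    return out

# A unit step settles toward 1; after t = tau the response reaches
# 1 - 1/e (about 63%) of the step.
y = lowpass([1.0] * 1000, tau=0.1, dt=0.001)
```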
In Figure 4.31, the cascading of processing functions is indicated with a con-
trol flow diagram. The four main blocks are the photo-transduction stage, a cal-
cium feedback loop, processing occurring in the inner segment, and finally the
[Figure 4.31: control-flow diagram of the cone model. Legend: τ = low-pass filter with time constant τ; / = divisive feedback; − = subtractive feedback; the output projects to the horizontal cells.]
horizontal cell feedback. While the constants and functions indicated in this flow
chart are detailed in van Hateren’s paper [434], it is included here to show the sort
of complexity that is required to model only the very first stage of human vision,
the process of photoreception in a single cone.
Compare this complexity with the simple sigmoidal response functions
(Michaelis-Menten or Naka-Rushton) discussed in Section 4.3.3. These mod-
els treat photoreceptors as black boxes with an input and an output, whereas the
present model incorporates most of the separate molecular and cellular compo-
nents that define cone functionality. In addition, it takes temporal aspects into
account (except slow processes such as bleaching).
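For contrast, the black-box alternative fits in a single line. A sketch of the Naka-Rushton function, with the Michaelis-Menten form as the n = 1 special case; the parameter values are illustrative:

```python
def naka_rushton(I, sigma, n=1.0, R_max=1.0):
    """Naka-Rushton response: R = R_max * I^n / (I^n + sigma^n).

    sigma is the semi-saturation intensity (half-maximal response);
    n = 1 gives the Michaelis-Menten form. Values are illustrative."""
    return R_max * I ** n / (I ** n + sigma ** n)
```

The response rises from zero, passes through half-maximum at I = sigma, and compresses toward R_max at high intensities.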
No matter which model is preferred, it is now known that most of the sensi-
tivity regulation in the outer retina of primates is already present in the horizontal
cells and can be measured there [1057]. This is necessarily so, and is consistent
with the model discussed here.
This makes the model useful for applications in visual computing. For instance,
it has been shown to be capable of encoding high dynamic range video [435].
Chapter 5
Perception
So far, we have described components of the human visual system (HVS) and
some of what is known about their interconnectivity. Long before anything was
known about the components of the human visual system—the substrate that me-
diates vision—inferences about how humans make sense of the world were based
on psychophysical experiments.
In a typical psychophysical experiment, a participant is shown a stimulus and
is given a task. By varying aspects of the stimulus, the task performance can be
related to the variation in the stimulus. For instance, it is possible to ask a par-
ticipant if a patch of light can be distinguished from its background. By varying
the intensity of the patch between trials, the threshold at which the patch becomes
visible can be determined.
In a sense, psychophysical experiments treat the human visual system as a
black box. The input is the stimulus, and the output is the task response given by
the observer. With suitably designed psychophysical experiments, many aspects
of human vision can be inferred. These types of experiments can be designed
to ask questions at a much higher level than neurophysiology currently is able to
address. In particular, many experiments exist that answer questions related to
human visual perception.
Although many different definitions of human visual perception exist, we will
refer to human perception as the process of acquiring and interpreting sensory
information. Human visual perception is then the process of acquiring and in-
terpreting visual information. Thus, perception is related to relatively low-level
aspects of human vision.
High-level processing of (visual) information is the domain of cognition and
involves the intelligent processing of information, as well as memory, reasoning,
attention, thought, and emotion. In general, the higher-level processes are much
more difficult to understand through psychophysical means than the lower-level
perception attributes of human vision. As such, most of this chapter will focus on
perception, rather than cognition.
One of the cornerstones of color perception is the triple of trichromatic, color-
opponent, and dual-process theories. After giving general definitions and dis-
cussing some of the problems faced by the human visual system, we therefore
discuss these early in this chapter.
The trichromatic, color-opponent, and dual-process theories are able to explain
several aspects of human vision. However, there are many more peculiarities
of vision that they do not explain. Thus, dependent on the application, further
refinements and additions to these models will have to be considered. Rather than
present a comprehensive catalogue of visual illusions, we show a more-or-less
representative selection. It will be clear that human vision has many complexities
that may have to be taken into account when developing visual applications. In
addition, visual illusions play an important role in understanding the HVS [272].
After all, knowing when visual processing breaks down allows us to learn about
the operation of the HVS.
In the remainder of this chapter, we then discuss some of the mechanisms of
early vision that have been uncovered that help to explain some of these visual
illusions. These mechanisms, which include various forms of adaptation, sen-
sitivity, and constancy, can be used as building blocks in engineering solutions.
As an example, models of lightness and brightness perception have been used
as part of tone-reproduction operators (which are the topic of Chapter 17), and
retinex theory has been used in several different applications. Chromatic adapta-
tion (Chapter 10) is important in white-balancing applications in photography.
We conclude this chapter with a discussion of higher-level processing involv-
ing color naming, as well as color deficiencies and tests for them.
Light is that aspect of radiant energy of which a human observer is aware through
the visual sensations that arise from the stimulation of the retina of the eye
by radiant energy.
Hue is the attribute of the perception of a color denoted by blue, green, yellow,
red, purple, and so on.
Unique hues are hues that cannot be further described by the use of hue names
other than their own. There are four unique hues, each of which shows no
perceptual similarity to any of the others. They are red, green, yellow, and
blue.
Lightness is the attribute of visual sensation according to which the area in which
the visual stimulus is presented appears to emit more or less light in propor-
tion to that emitted by a similarly illuminated area perceived as the “white”
stimulus. It is sometimes referred to as relative brightness. Lightness
ranges from “dark” to “light”.
In addition to these basic terms, it should be noted that the above definition of
color refers to a perceived quantity. It is therefore also sometimes referred to as
perceived color.
Unrelated colors are colors perceived to belong to an area seen in isolation from
other colors.
[Diagram: a light spectrum and a surface reflectance spectrum combine to give the spectrum reaching the retina, all plotted against wavelength.]
Figure 5.1. The light spectrum and the surface reflectance together determine the spectrum
of light reaching our eyes (after [868]).
come from something other than the retinal input. As the human visual system
has evolved to view natural scenes, i.e., environments as commonly encountered
in daily life, it is reasonable to believe that the HVS has come to rely on statistical
regularities available in such environments. There may be many assumptions that
the HVS makes about its environment. Some of these are known and others have
yet to be discovered. In any case, these assumptions serve the role of additional
information that helps disentangle illumination and geometry.
The key insight, therefore, is that human visual processing is beautifully
matched to commonly encountered environments. On the other hand, if some
of these known or unknown assumptions are broken, the HVS may not be able to
infer its environment correctly. In those cases, a visual illusion may result.
Hence, it was surmised that the human visual system has three different mech-
anisms to detect combinations of wavelengths. It was confirmed many years later
that the human visual system indeed has three types of photoreceptors that are
active under photopic conditions, as outlined in Section 4.3.3.
[Plot: chromatic valence as a function of wavelength for the red-green and blue-yellow opponent channels, together with the achromatic channel.]
Figure 5.2. The spectral response of the red-green and yellow-blue opponent processes.
Also shown is the spectral response of the achromatic channel (after [516]).
Figure 5.3. The elephants in both panels are identical, yet they appear to be darker on the
left and lighter on the right.
For the perception of lightness, it is now accepted that the retinal input is split
into separate components that are processed more or less independently [602].
However, there are two proposals as to how such a decomposition may occur,
namely into frameworks and into layers [366].
For frameworks, the retinal image is thought to be decomposed into contigu-
ous regions of illumination. Within each region, the highest intensity serves as an
anchor (see Section 5.8.5), and all other values within a region are relative to this
anchor.
The alternative theory presumes that the retinal image is split into overlapping
layers that are each treated separately. The image is then assumed to consist of
an illumination pattern that is projected onto a pattern of surface reflectances. Of
course, this would enable the human visual system to reverse engineer reflectance
from illumination by a process known as inverse optics [742]. There are many
real-world examples where reflectance and illumination are perceived simultane-
ously. An example is a white house reflected in a black car [366]. The intensity
of the reflected pattern is gray; yet, neither the house nor the car appear gray.
Instead, the image is perceptually split into two separate layers.
Another example where an image is split into separate layers is shown in Fig-
ure 5.3 [35]. Here both images show an elephant with identical pixels. Only the
surround is changed, namely lightened in the left image and darkened in the right
image. Both images, however, still appear to be composed of a cloudy pattern
with a dark elephant on the left and a light elephant on the right. The result of this
apparent layered processing is that the human visual system has not perceived the
two elephants as identical.
Thus, the accuracy with which the human visual system is able to disentan-
gle reflectance from illumination is not perfect, and this is one source of visual
illusions. Visual illusions can be seen as systematic errors made by the visual sys-
tem, and they constitute a “signature of the software used by the brain” [365,366].
Many of the errors made by the human visual system can be explained by a model
of frameworks [13, 363]. Likewise, a layers model can also be expanded to in-
clude an account for errors [35, 366]. It is currently not known which model is
more plausible, or even if the two models can be unified.
In any case, visual illusions are important to understand aspects of human
visual processing. In addition, some of them have been directly applied in some
applications. For instance, counter-shading is a well-known technique used by
artists to increase the apparent contrast of images and paintings (see Section 17.9).
The Cornsweet-Craik-O’Brien illusion, introduced in Section 5.4.4, has been used
to assess tone-reproduction operators (Section 17.11.4).
Figure 5.4. The Ponzo illusion. Two identical colored horizontal bars appear to be of
different length.
Figure 5.5. An example of size constancy. The menhirs in this photograph are all ap-
proximately the same size; Parque Escultórico de la Torre de Hércules, A Coruña, Spain,
August 2005.
Figure 5.6. The simultaneous-contrast illusion. The two gray patches appear somewhat
different, but they are in fact identical.
science that deals with this is called color-appearance modeling, and is discussed
in detail in Chapter 11.
Figure 5.7 shows two checkerboard patterns each surrounded by a larger
checkerboard pattern with either a higher or a lower contrast. Although the
two center panels have the same contrast, the one on the right appears to have
higher contrast. This phenomenon is known as contrast constancy or contrast
contrast [180, 1069]. Compare this illusion to the one shown in Figure 5.3.
Figure 5.7. Contrast constancy. The center part of both checkerboard patterns has the
same contrast (after [180]).
Figure 5.8. The café wall at the bottom of St. Michael’s Hill in Bristol, UK. The mortar
between the tiles forms straight lines that are perceived to be curved; March 2005.
Figure 5.9. The Cornsweet-Craik-O’Brien illusion. The left and right quarters of this
image have the same value, whereas the middle half contains two smooth ramps, separated
by a sharp jump. This configuration is perceived as a step function, with the left half and
right half of unequal brightness (after [208]).
Figure 5.10. The Hermann grid (left), showing phantom dark blobs on the intersections
of the white bands. The classical explanation of the Hermann grid can be invalidated by
creating curved sides to each block, which makes the illusion disappear (right).
Figure 5.11. By placing white dots at the cross sections of a Hermann grid, the scintillation
effect can be induced. The dark blobs take on the same color as the blocks.
versions. Note that the black boxes give rise to black illusory patches, whereas
the green boxes give rise to green illusory patches.
Figure 5.13. Neon color-spreading examples. On the top left, an achromatic version is
shown, which is also known as neon brightness spreading. The top-right figure is due to
Varin [1179], whereas the bottom-left illusion was developed by van Tuijl [1151]. The
bottom right illusion is from [127].
[Figure 5.15: log threshold intensity as a function of time in the dark (0–30 min), showing a cone branch, the rod-cone break, and a rod branch.]
over the course of time the human visual system adapts to a much larger range
of illumination. These processes are not instantaneous. The time course of adap-
tation for the transition between a light environment and a dark environment is
called dark adaptation.
There are several mechanisms at play during dark adaptation, including di-
lation of the pupil, photochemical activity in the retina, and neural processes in
the remainder of the visual system. While the pupil rapidly adjusts to new light-
ing conditions, the purpose of constriction is to improve the optical resolution
of the stimulus falling on the retina, rather than to stabilize image luminance
[150, 151, 1238].
Dark adaptation can be measured by letting an observer gaze at a pre-adapting
light for approximately five minutes before placing the observer in a dark room
and increasing the intensity of a test stimulus until it is detected by the observer
[53]. The intensity of this threshold stimulus can be plotted as a function of time,
as shown in Figure 5.15. The adaptation curve followed by individual observers
varies somewhat, but tends to lie within the light gray region in this figure.
The curve consists of two separate regions that are marked by a break. During
the first few minutes of dark adaptation, the cones gain in sensitivity and dominate
the adaptation curve. The threshold level at the end of the cone adaptation curve
is called the cone threshold. The rod system is slower to adapt, and only begins
to dominate the adaptation process after the cones are at equilibrium. The curve
followed for the rods ends with a maximum sensitivity after around 40 minutes.
The threshold reached at that point is called the rod threshold and has a value of
around 10^−5 cd/m².
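The two-branch shape of this curve can be sketched as the minimum of a cone branch and a rod branch, each decaying exponentially. Every constant below is invented for illustration; none is a measured value from the text:

```python
import math

def dark_adaptation_threshold(t_min,
                              cone_floor=5.5, cone_amp=2.5, cone_tau=1.5,
                              rod_floor=4.0, rod_amp=3.5, rod_tau=9.0,
                              rod_delay=5.0):
    """Illustrative two-branch dark-adaptation curve: log threshold as
    a function of minutes in the dark. Every constant here is invented
    for illustration; none is a measured value."""
    cone = cone_floor + cone_amp * math.exp(-t_min / cone_tau)
    if t_min < rod_delay:
        return cone                      # rods have not yet taken over
    rod = rod_floor + rod_amp * math.exp(-(t_min - rod_delay) / rod_tau)
    return min(cone, rod)                # rod-cone break where curves cross
```

The rod-cone break appears where the slowly adapting rod branch first dips below the settled cone branch.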
If the test stimuli used during dark adaptation are colored, then before the rod-
cone break is reached, these stimuli appear colored. After the rod-cone break, they
appear colorless.
Dark adaptation is affected by several factors, including the intensity of the
pre-adapting light, the exposure duration, the size and location of the spot on the
retina used to measure adapting behavior, the wavelength of the threshold light,
and rhodopsin regeneration [208, 444, 445, 899].
[Figure 5.16: log threshold intensity as a function of log background intensity, plotted separately for rods and cones.]
[Figure 5.17: threshold-versus-intensity curve; log ΔI as a function of log I, with successive regions governed by the dark light, the square root law, Weber's law, and saturation.]
by [310]:

log(TVI(L_A)) = −0.72 if log(L_A) ≤ −2.6,
log(TVI(L_A)) = log(L_A) − 1.255 if log(L_A) ≥ 1.9, (5.1)
log(TVI(L_A)) = (0.249 log(L_A) + 0.65)^2.7 − 0.72 otherwise.
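Equation (5.1) translates directly into code; the function operates on base-10 logarithms of the adapting luminance L_A:

```python
def log_tvi(log_LA):
    """Threshold-versus-intensity function of Equation (5.1); input and
    output are base-10 logarithms."""
    if log_LA <= -2.6:
        return -0.72
    if log_LA >= 1.9:
        return log_LA - 1.255
    return (0.249 * log_LA + 0.65) ** 2.7 - 0.72
```

The three branches correspond to the flat dark-light region, the Weber region of unit slope, and the curved transition between them.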
threshold, because the sensitivity at these levels is determined by neural noise (the
dark light).
The second part of the curve behaves according to the square root law, also
known as the de Vries-Rose law [979, 1188]:
ΔI / √I = k. (5.3)
Here, light levels are such that quantal fluctuations determine the threshold of
vision. Thus, for a test stimulus to be detected, it must be sufficiently different
from the background to exceed these fluctuations. Under such conditions, an ideal
light detector would record a threshold that varies according to the square root of
the background intensity. In a log-log plot, as shown in Figure 5.17, this would
yield a curve segment with a slope of 0.5. The rod pathway typically has a slope
of 0.6 [417]. The shape of the transition to the next curve segment is determined
by several factors, including the size of the test spot and the duration of the test
stimulus.
A large part of the threshold-versus-intensity curve has a slope of around 1.
This means that the intensity threshold ΔI is proportional to the background in-
tensity I [114, 1225, 1226]. This relation is known as Weber’s law [100, 900]:
ΔI / I = k. (5.4)
The constant k is the Weber constant or the Weber fraction. The intensity thresh-
old ΔI is also known as the just noticeable difference (JND). To account for neural
noise, the dark light may be modeled as a small constant I_0 that is added to the
background intensity:
ΔI / (I + I_0) = k. (5.5)
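A sketch of Equation (5.5); the Weber fraction k and dark light I_0 below are illustrative values, not ones given in the text:

```python
def weber_threshold(I, k=0.14, I0=1e-3):
    """Just noticeable difference with a dark-light term (Equation 5.5):
    delta_I = k * (I + I0). The values of k and I0 are illustrative."""
    return k * (I + I0)
```

At high background intensities the dark light is negligible and ΔI/I ≈ k, recovering Weber's law; as I approaches zero the threshold bottoms out at k·I_0 rather than vanishing.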
The last part of the curve in Figure 5.17 steepens, indicating that at those back-
ground levels, very large intensity differences become undetectable. Thus, the
system saturates. For the rod system, this occurs at around 6 cd/m² for a natural
pupil in daylight [63].
The cone system does not saturate, but shows a constant difference threshold
over a very large range of background intensities [133].
well above the detection threshold. To yield a relation between the magnitude of
the stimulus and the strength of the evoked sensation, it is tempting to integrate
Weber's law, after noting that the Weber fraction can be interpreted as a constant
times the differential change in perception dψ, i.e., k = k′ dψ. Restating Weber's
law and integrating yields

dψ = ΔI / (k′ I), (5.6a)

ψ = (1/k′) ln(I) + C. (5.6b)
ψ = k (I − I_0)^n, (5.8)

where the exponent n varies according to which of the many sensations is mea-
sured. For example, the exponent is 0.33 for brightness, 1.2 for lightness, and
3.5 for electric shock.
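Both magnitude laws can be sketched side by side; the constants are illustrative:

```python
import math

def fechner(I, k=1.0, C=0.0):
    # Fechner's logarithmic law, Equation (5.6b): psi = k * ln(I) + C.
    return k * math.log(I) + C

def stevens(I, n, k=1.0, I0=0.0):
    # Stevens' power law, Equation (5.8): psi = k * (I - I0)^n.
    return k * (I - I0) ** n

# Reported exponents: 0.33 for brightness (compressive), 1.2 for
# lightness, and 3.5 for electric shock (expansive).
```

Exponents below 1 compress the sensation relative to the stimulus; exponents above 1 expand it.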
[Plot: contrast sensitivity (inverse modulation threshold) as a function of spatial frequency (cycles/deg), for retinal illuminances of 0.0009, 0.009, 0.09, 0.9, 9, and 90 trolands.]
Figure 5.19. Contrast sensitivity for different levels of retinal illumination. (Data from
van Ness [827]; figure redrawn from [646].)
where x_max and y_max are the maximum display dimensions. The resulting image
is shown in Figure 5.18, which is called a Campbell-Robson contrast-sensitivity
chart. In this chart, it is possible to trace from left to right where the sine grating
transitions into a uniform gray. In the uniform gray area there is still contrast,
but it falls below the threshold of visibility. The trace forms an inverted U-shape,
which delineates the contrast sensitivity as a function of spatial frequency.
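A chart of this kind can be generated per pixel; the book's exact formula precedes this passage and is not reproduced here, so the exponential frequency and contrast sweeps below are our own choices:

```python
import math

def campbell_robson(x, y, x_max, y_max, f_min=1.0, f_max=50.0):
    """One pixel of a Campbell-Robson-style chart (a sketch; the book's
    formula is not reproduced here). Spatial frequency grows
    exponentially from left to right, contrast falls exponentially
    from bottom (y = 0) to top."""
    f = f_min * (f_max / f_min) ** (x / x_max)   # cycles across the image
    c = math.exp(-4.0 * y / y_max)               # contrast sweep
    return 0.5 + 0.5 * c * math.sin(2.0 * math.pi * f * x / x_max)

chart = [[campbell_robson(x, y, 255, 255) for x in range(256)]
         for y in range(256)]
```

Viewing such an image, the visible/invisible boundary of the grating traces out the observer's own contrast-sensitivity function.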
Contrast sensitivity is, in addition, dependent on the amount of light falling
on the retina. This amount itself is modulated by both the environment and
the pupil size. For this reason, the preferred measure for retinal illuminance is
the troland, which is luminance multiplied by the pupillary area. The contrast
sensitivity for different levels of retinal illuminance is plotted in Figure 5.19.
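Converting to trolands is a one-line computation; the example pupil diameter is an assumption:

```python
import math

def trolands(luminance_cd_m2, pupil_diameter_mm):
    """Retinal illuminance in trolands: luminance (cd/m^2) multiplied
    by the pupillary area (mm^2)."""
    area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
    return luminance_cd_m2 * area_mm2

# A 100 cd/m^2 surface viewed through a 3 mm pupil (an example value):
td = trolands(100.0, 3.0)
```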
Finally, the sensitivity to contrast may be altered in individuals with certain
diseases such as multiple sclerosis and disorders such as cataracts and ambly-
opia [41].
These relations are known as the von Kries coefficient law [622]. The coefficients
k_l, k_m, and k_s are inversely related to the relative strengths of activation [544]. Von
Kries chromatic adaptation is further discussed in Chapter 10.
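A sketch of von Kries scaling, taking the adapting levels to be the cone responses to the scene white (a common choice, assumed here):

```python
def von_kries(L, M, S, L_white, M_white, S_white):
    """Von Kries-style scaling (a sketch): each cone signal is divided
    by its adapting level, here taken to be the cone response to the
    scene white (an assumption)."""
    kl, km, ks = 1.0 / L_white, 1.0 / M_white, 1.0 / S_white
    return kl * L, km * M, ks * S
```

With this choice the adapting white itself always maps to (1, 1, 1), which is what makes the scaling a model of chromatic adaptation.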
Figure 5.20. An example of chromatic adaptation. Stare for 20 seconds at the small cross
in the top panel. Then focus on the cross in the bottom panel. The image should look
normal; Rathaus Konstanz, Germany, June 2005.
Figure 5.21. An image processed to simulate human vision after 0, 4, and 8 weeks of
development (top row), and 3 and 6 months of development followed by adult vision (bot-
tom row). The simulation is from tinyeyes.com. Statue by James Tandi; photographed at
Native Visions Art Gallery, Orlando, FL.
[Plot: visual acuity as a function of log intensity, shown separately for rods and cones, with the maximum rod and cone resolutions marked.]
Figure 5.22. The visual acuity changes with illumination, shown here separately for the
rods and the cones (after [962]).
diffraction [51]. Thus, the size of the pupil has a direct effect on the point-spread
function of the projected light.
The amount of background illumination has a direct impact on visual acuity.
For recognition tasks, the relationship between the two is given in Figure 5.22.
The resolving power is not uniformly distributed over the retina, but decreases
with distance to the fovea. The relationship is shown in Figure 5.23.
Under photopic lighting conditions, the best visual acuity is achieved for test
stimuli with the same intensity as the intensity to which the eye is adapted. Under
scotopic conditions, the visual acuity is much lower and is determined by the AII
amacrine cell [606, 1218].
[Plot: Snellen fraction as a function of eccentricity (0 to 30 degrees).]
Figure 5.23. Visual acuity as function of retinal position, measured as the eccentricity
from the fovea (after [1239]).
Figure 5.24. Landolt rings (left) and illiterate Es (middle) are used for target detection,
whereas Snellen charts (right) are useful for target recognition.
Figure 5.25. Vernier acuity can be measured by having observers locate small displace-
ments. In both panels, the displacements are increased from left to right.
Figure 5.26. An example of simultaneous contrast. The central circle on the left appears
larger than the one on the right, even though they are the same size (after [868]).
The human visual system enhances such differences in cues, allowing for eas-
ier detection of objects. Take the identical inner circles in Figure 5.26, for ex-
ample. In this artificial scenario, the circle on the left is surrounded by smaller
circles, and the one on the right is surrounded by larger circles. The human visual
system enhances the difference between the sizes of the central circles and their
surrounds, making the circle on the left appear larger and the one on the right
appear smaller. As a result, the two central circles appear to have different sizes.
The enhancement of the difference between cues from the object and its surround
is termed simultaneous contrast.
Similarly, a gray region appears brighter when viewed on a darker surround,
and darker when viewed on a brighter surround [171]. This type of simultaneous
contrast is called simultaneous lightness contrast. Simultaneous color contrast
refers to the perceived enhancement of differences in the color cue. Both these
types of simultaneous contrast are discussed in the following sections.
Figure 5.27. A gray ring of uniform lightness appears to have four sections of different
lightness. The perceived lightness of each section is influenced by the lightness of the
surround (after [868]).
the region on the right. The graph shows the difference between the perceived and
the actual lightness across the edge.
[Plot: actual and perceived intensity profiles across an edge.]
Figure 5.28. Bands are perceived on either side of the edge shown above. The presence
of these bands enhances the physical difference across the edge.
Figure 5.29. The uniformly colored ring is perceived to have different colors, determined
by the background color behind the ring (after [868]).
286 5. Perception
Green
Wavelength (nm)
Figure 5.30. Nulling technique for measuring the effect of simultaneous color contrast
(after [614]).
Figure 5.31. The left and right half of the lighthouse each stimulate the photoreceptors
differently, as demonstrated by the insets. However, the wall of the lighthouse is still
perceived as one continuous surface. The difference in illumination is discounted by the
human visual system, as long as the context is present; Lighthouse at Ponce's Inlet, FL,
June 2005.
Figure 5.32. The check shadow illusion. Although the two squares marked A
and B have the same luminance, they are perceived differently due to the pres-
ence of a shadow (left). If the illusion of a shadow is removed, then the
two checks appear the same (right). (Images courtesy of Edward H. Adelson,
http://web.mit.edu/persci/people/adelson/checkershadow_illusion.html.)
Though adaptation may aid constancy, it cannot give a full account of color
and lightness constancy. Adaptation theories can only explain global
changes in lighting conditions; they do not address local conditions such as the
change in lighting due to shadows within the same environment. In particular, the
simultaneous effects in Figures 5.31 and 5.32 are not well explained by adaptation
theories.
Figure 5.33. The experimental setup of Wallach’s experiment. Subjects were asked to
modify the light of the circle on the right so that it was perceived to have the same re-
flectance as the circle on the left (after [1196]).
Participants in this experiment are then asked to adjust the intensity of the second
disk such that its reflectance is perceived to be identical to the first disk.
If participants are able to perceive absolute values, then the values of the two
disks should be matched. If participants are only able to perceive relative values,
then the ratios of the values of the disk and ring should be matched. Wallach’s
experiment revealed that participants tended to match ratios rather than absolute
values. This gives weight to the hypothesis that ratios of light entering the eye are
important in the perception of lightness.
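The ratio-matching prediction is easily sketched; the luminance values are invented for illustration:

```python
def ratio_match(test_ring, standard_disk, standard_ring):
    """Predicted disk setting if observers match luminance ratios
    (Wallach's finding) rather than absolute luminances."""
    return standard_disk / standard_ring * test_ring

# Standard: disk 20, ring 40 (a 1:2 ratio); values are invented.
# With a test ring of 100, a ratio match predicts a setting of 50,
# whereas an absolute match would predict 20.
setting = ratio_match(test_ring=100.0, standard_disk=20.0, standard_ring=40.0)
```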
5.8.3 Filling-In
Given that relative contrast is important to human vision, the question arises at
which spatial scale such computations may occur. In particular, such computa-
tions could be carried out on either a local or global scale. Visual phenomena,
such as the Cornsweet-Craik-O’Brien illusion imply that this computation may
be local (see Section 5.4.4).
In this illusion, the left and right quarters of the image have identical values.
Nevertheless, these parts of the image are perceived to be different due to the
presence of an edge. This edge is easily perceived, but the gradual ramps directly
adjoining the edge contain no C¹ discontinuities and are therefore not noticed.
As a result, the change in value at the edge is propagated along the surface,
producing the perceived intensity difference between the left and right sides of
this illusion.
The propagation of edge information to the interior of surfaces is called filling-
in [280, 395, 616, 1197]. This process is under-constrained, since many different
290 5. Perception
ways to reconstruct a scene from edge information are possible. It is thought that
the human visual system makes specific assumptions about the world to enable
a reasonable reconstruction. For instance, it may be assumed that the luminance
distribution between sharp discontinuities is more or less uniform. This leads to
a computational model of filling-in which employs a (reaction-) diffusion scheme
to propagate edge information into the interior of uniform regions [830].
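The diffusion idea can be sketched in a few lines. This is an illustrative one-dimensional toy, not the actual model of [830]: edge locations hold fixed values, and the interior repeatedly relaxes toward the mean of its neighbors.

```python
# Toy 1D filling-in by diffusion (illustrative; not the model of [830]).
# Edge cells hold fixed values; interior cells relax to their neighbors' mean.

def fill_in(values, is_edge, iterations=2000):
    v = list(values)
    n = len(v)
    for _ in range(iterations):
        nxt = v[:]
        for i in range(n):
            if not is_edge[i]:
                nxt[i] = 0.5 * (v[max(i - 1, 0)] + v[min(i + 1, n - 1)])
        v = nxt
    return v

# Two edge cells carry values 0.2 and 0.8; the interior fills in between them.
signal = [0.0] * 9
signal[0], signal[8] = 0.2, 0.8
edges = [i in (0, 8) for i in range(9)]
filled = fill_in(signal, edges)
assert abs(filled[4] - 0.5) < 1e-3  # the midpoint converges to the average
```

With only the edge values as constraints, the relaxation converges to a smooth (here linear) interpolation, which is one way of making the "more or less uniform between discontinuities" assumption concrete.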
A different useful assumption is that the real world exhibits the well-known
1/f image statistic, which is discussed further in Section 15.1 [52, 55, 58, 82, 83,
144, 172, 259, 313, 314, 651, 878, 879, 983, 985, 986, 1001]. Given that low spatial
frequencies are attenuated in the edge signal, but are not completely absent, a
signal may be reconstructed by boosting low frequencies according to this 1/f
statistic [228].
Figure 5.34. Retinex theory proposes an explanation for how humans achieve color
constancy despite gradual changes in incident light falling on surfaces (after [647]).
Three patches A, B, and C, with reflectances of 80%, 40%, and 20%, lie under a smooth
illumination gradient. The reflectance ratio from A to C follows from the luminance
ratios measured at the two edges: (48/24) × (32/16) = 4/1.
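The edge calculation in Figure 5.34 can be reproduced directly: multiplying the luminance ratios measured only at edges cancels the smooth illumination gradient and recovers the reflectance ratio between non-adjacent patches. A sketch with the figure's values:

```python
# Reproducing the Figure 5.34 edge calculation: luminance ratios taken only at
# edges are multiplied; a smooth illumination gradient cancels out.

def reflectance_ratio(edge_pairs):
    """edge_pairs: (luminance just before, luminance just after) each edge."""
    ratio = 1.0
    for before, after in edge_pairs:
        ratio *= before / after
    return ratio

# Values from the figure: edge A|B measures 48 vs. 24, edge B|C measures 32 vs. 16.
assert reflectance_ratio([(48.0, 24.0), (32.0, 16.0)]) == 4.0   # = 80% / 20%
```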
5.8.5 Anchoring
While human vision is predominantly driven by relative values, the absolute im-
pression of how light or dark an environment, or part thereof, is also plays an
important role. The Retinex theory offers no model for how humans are able to
perceive absolute values. To compute absolute values given the relative values
from the Retinex theory, several heuristics may be applied. Deriving absolute
values from relative values is termed the scaling problem.
One heuristic that may be involved is the anchoring heuristic, which maps the
surface with the highest reflectance to white [679]. Evidence for this heuristic
can be demonstrated by showing a range of gray values between black and dark
gray. The lightest of these will be perceived as white, whereas the black will be
perceived as gray [158]. A related example is shown in Figure 5.35, where in the
left image a piece of paper is shown that appears white. When successively whiter
sheets are added, the original piece of paper appears less and less white. This is
known as the Gelb effect and is an example of a situation where lightness
constancy breaks down.
In addition to a tendency for the highest luminance to appear white, there is
a tendency for the largest surface area to appear white [363]. If the largest area
also has the highest luminance, this becomes a stable anchor which is mapped
to white. However, when the largest area is dark, the highest luminance will be
perceived as self-luminous.
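A toy formalization of the two anchoring rules may help; the function below is my own illustration, not a published model:

```python
# Toy illustration (my own, not a published model) of two anchoring rules:
# anchor either the highest luminance or the largest area's luminance to white.

def anchor_lightness(luminances, areas, rule="highest"):
    if rule == "highest":
        anchor = max(luminances)
    elif rule == "largest_area":
        anchor = luminances[areas.index(max(areas))]
    else:
        raise ValueError(rule)
    # lightness relative to the anchor; values > 1 read as self-luminous
    return [y / anchor for y in luminances]

luminances = [10.0, 40.0, 80.0]
areas = [5.0, 1.0, 2.0]
assert anchor_lightness(luminances, areas, "highest") == [0.125, 0.5, 1.0]
# Anchoring to a dark largest area pushes the brightest patch above 1.0,
# consistent with it being perceived as self-luminous.
assert anchor_lightness(luminances, areas, "largest_area")[2] == 8.0
```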
In all likelihood, a combination of both heuristics is at play when anchoring
white to a surface in an environment. In addition, the number of different sur-
Figure 5.35. The paper in the left image appears less and less white when whiter paper is
added to the scene.
faces and patches in a region affects the perception of white. This is known as
articulation. A higher number will facilitate local anchoring (see Figure 5.36).
Finally, insulation is a measure of how different surfaces or patches in a scene are
grouped together [363]. The patches surrounding the test square in Figure 5.36,
for instance, may be considered a local framework, and the rest of the page may
be considered a global framework.
In general, frameworks can be viewed as regions of common illumination. In
addition, proximity is an important grouping factor. The latter essentially means
that nearby regions are more likely to be in the same framework than distant
regions.
Another anchoring rule may be that the average luminance in an environment
is mapped to middle gray [456]. Similar rules are used frequently in photography,
as well as in tone reproduction [950] (see Chapter 17). However, experimental ev-
idence suggests that anchoring to highest and lowest luminance as well as largest
area provides a better model of human visual perception [362, 679]. A compu-
tational model of lightness constancy was implemented for the purpose of tone
reproduction, as discussed in Section 17.8.
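The middle-gray rule can be sketched as follows; the geometric (log) mean and the key value of 0.18 are common photographic conventions, used here purely for illustration:

```python
import math

# Sketch of the middle-gray anchoring rule: map the log-average (geometric
# mean) luminance to a target key value. The 0.18 key and the use of the
# geometric mean follow common photographic convention; both are illustrative.

def scale_to_middle_gray(luminances, key=0.18, eps=1e-6):
    log_avg = math.exp(sum(math.log(eps + y) for y in luminances) / len(luminances))
    return [key * y / log_avg for y in luminances]

scaled = scale_to_middle_gray([0.5, 2.0, 8.0])
geo = math.exp(sum(math.log(v) for v in scaled) / len(scaled))
assert abs(geo - 0.18) < 1e-3   # the scene's log average now sits at middle gray
```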
Figure 5.36. The patches in the centers of all four groups have the same reflectance. The
right pair of patches look somewhat different because of their surrounds. If the surrounds
are made of a larger number of patches, the effect of lightness difference is enhanced (left
pair), even though the average intensity of light reaching the eyes from the surrounds in
both groups is the same (after [13]).
Once an anchor is found, all other surface reflectances can then be derived
from the lightest surface by relative Retinex-like computations [647].
Figure 5.37. The set-up of Gilchrist's experiment to determine the importance of shadow-
edge identification for constancy. Subjects were asked to select, from the squares on the
right, the one that best matched the reflectance of the square on the left. They performed
the experiment twice: once without the mask (top right), so they knew the edge in the
center was a shadow edge, and once with the mask (bottom right), so they did not know
what type of edge it was (after [364]).
Figure 5.38. The results from Gilchrist’s experiment. When subjects were aware that
the edge in the middle was a shadow edge, they were able to ignore it and select a square
such that its reflectance was almost identical to the original patch. When this information
was not available, they chose squares that had the same amount of light reflected off their
surfaces as the original square. Thus, lightness constancy failed when the subjects were
unaware that an edge was a shadow edge (after [364]).
When the subjects knew that the edge was due to a change in incident light,
they chose squares from the candidate squares that were very close to the actual
reflectance of the original square. However, when they were not aware that the
edge was a shadow edge, they chose squares that reflected the same intensity of
light as the original square, as shown in Figure 5.38.
These results imply that the human visual system is able to ignore shadow
edges, but not reflectance edges. This, in turn, implies that we may encode light-
ing information separately from reflectance information by computing local con-
trasts separately in light maps and reflectance maps.
environment is different depending on whether there are clouds, what time of day
it is, etc. However, these changes are relatively minor. For indoor environments,
a similar argument can be made. Here, there are several different kinds of light
sources, but their variation is not significant.
Finally, only a limited number of different surface reflectances are typically
encountered [233]. Thus, the limited variability in lighting and surface reflectance
may be exploited by the human visual system to achieve color constancy.
Figure 5.39. Possible changes in spectra across shadow and reflectance edges (af-
ter [868]).
Figure 5.40. The four different cases in the edge-classification algorithm proposed by
Rubin and Richards [982].
dent of the illuminant [317, 319, 474]. Recent evaluations of collections of computational
color-constancy algorithms were presented by Barnard et al. [66] and
Hordley and Finlayson [487]. Such algorithms are important in situations where
the illumination is not controlled, for instance in remote sensing, robot guidance,
and color reproduction [1203, 1204]. Color constancy also finds practical
application in matching paints [474]. Finally, Retinex theory has been adapted to
solve the color constancy problem [738, 965].
are predominantly covered with snow, and who thus have a greater need to dis-
tinguish between different kinds of snow, have more than a dozen names for it.
Cultural relativists further believed that this was the only factor influencing the
naming of colors in different languages, and that human physiology played no
part in determining it. Hence, there would be no correlation in the division of
color space into various categories across different languages.
The first cross-cultural research to study the nature of color categorization was
performed by Brent Berlin and Paul Kay in 1969 [87]. This work was prompted
by their intuitive experience with several languages from three unrelated language
stocks. They felt that color words translated too easily among various pairs of
unrelated languages for the relativity thesis to be valid.
Basic color terms are names of colors with the following properties. First,
they are single lexical terms such as red, green, or pink; not light-green or
greenish-blue. Second, they refer to the color of objects, rather than objects them-
selves. Therefore, colors such as bronze and gold are not valid color terms. Third,
the colors are applicable to a wide range of objects, so that colors like blonde are
not valid. Fourth, the colors are in frequent use.
Berlin and Kay used an array of Munsell color chips (see Section 8.9.1) to
determine the references of the basic color terms of a language. For each basic color
term, participants in the experiment were asked to specify the best example of the
color, as well as the region of chips on the array that could be called by the same
name as the basic color term.
was shown to the participants.
Berlin and Kay experimentally examined 20 languages directly and 78 more
through literature reviews. They discovered two interesting trends. First, there
Figure 5.41. An illustration of the Munsell Chart that was shown to partici-
pants in the work by Berlin and Kay [87]. (See also the World Color Survey,
http://www.icsi.berkeley.edu/wcs/.)
Type Basic terms White Black Red Green Yellow Blue Brown Pink Purple Orange Gray
1 2 • • ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
2 3 • • • ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
3 4 • • • • ◦ ◦ ◦ ◦ ◦ ◦ ◦
4 4 • • • ◦ • ◦ ◦ ◦ ◦ ◦ ◦
5 5 • • • • • ◦ ◦ ◦ ◦ ◦ ◦
6 6 • • • • • • ◦ ◦ ◦ ◦ ◦
7 7 • • • • • • • ◦ ◦ ◦ ◦
8 8 • • • • • • • • ◦ ◦ ◦
9 8 • • • • • • • ◦ • ◦ ◦
10 8 • • • • • • • ◦ ◦ • ◦
11 8 • • • • • • • ◦ ◦ ◦ •
12 9 • • • • • • • • • ◦ ◦
13 9 • • • • • • • • ◦ • ◦
14 9 • • • • • • • • ◦ ◦ •
15 9 • • • • • • • ◦ • • ◦
16 9 • • • • • • • ◦ • ◦ •
17 9 • • • • • • • ◦ ◦ • •
18 10 • • • • • • • • • • ◦
19 10 • • • • • • • • • ◦ •
20 10 • • • • • • • ◦ • • •
21 10 • • • • • • • • ◦ • •
22 11 • • • • • • • • • • •
Table 5.1. The 22 subsets of basic color terms observed across languages (after [87]).
A filled circle indicates that the term is present in languages of the given type.
appear to be just 11 universal basic color terms.5 This small number is due to the
fact that color categories are shared by languages. The 11 basic colors are: white,
black, red, green, yellow, blue, brown, purple, pink, orange, and gray.
Second, there are only 22 different subsets of these basic color terms, from a
total of 2¹¹ − 1 = 2047 possibilities. If a language does not have all of these basic color
terms, then it has a specific subset of these color terms. For example, if a language
has only two basic color terms, then these color terms are the equivalent of the
English color names white and black. Table 5.1 shows the possible subsets that
are found across all these languages. Based on this result, it is hypothesized that
as languages develop, they acquire the basic color terms in a specific order that
can be termed evolutionary.
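The evolutionary ordering can be captured in a small consistency check. The stage assignment below is my own rough formalization of the ordering described above; it deliberately ignores the within-stage green/yellow alternation:

```python
# A rough consistency check for the evolutionary ordering (my own
# formalization; it ignores the within-stage green/yellow alternation).

STAGE_OF = {"white": 1, "black": 1, "red": 2, "green": 3, "yellow": 3,
            "blue": 4, "brown": 5,
            "pink": 6, "purple": 6, "orange": 6, "gray": 6}

def consistent(terms):
    """True if every stage below the highest present stage is represented."""
    stages = {STAGE_OF[t] for t in terms}
    if not stages:
        return True
    return all(s in stages for s in range(1, max(stages)))

assert consistent({"white", "black"})                     # type 1
assert consistent({"white", "black", "red", "yellow"})    # type 4
assert not consistent({"white", "black", "blue"})         # skips stages 2 and 3
```

In this formalization, a set such as {white, black, blue} is rejected because it would require a language to skip the red and green/yellow stages, which the data of [87] did not exhibit.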
These results indicate that there are physiological reasons for the very specific
evolution of languages in terms of color. It is worth noting that colors important
in Hering’s opponent theory (red, green, blue, yellow, white, and black) are also
5 This number was increased to 16 by later studies [584].
This model predicts that it is easier to reliably detect and remember focal
colors. A focal color clearly belongs to one category, in which its membership
value is high. For colors on the boundary, belonging to more than one category,
a decision has to be made regarding which category they belong to. This could
possibly make the task of detection and recognition less reliable.
Table 5.2. A table listing the different types of dichromacies and anomalous trichromacies.
ally, but not always, inherited. Tritans tend to confuse blues and yellows, a defect
that is almost always acquired. Figure 5.42 and Figure 5.43 show images as they
appear to a normal trichromat, as well as how they might appear to the different
types of dichromats. These images were created with Vischeck, a freely available
program that processes images so that they appear as they would to people with
different color anomalies. Computer simulation of color deficiencies is further
discussed by Brettel et al. [129].
Achromatopsias are rare anomalies in which the affected person behaves
largely as a monochromat [3, 910]. As the name implies, monochromats see the
world as shades of a single color. The most common form of achromatopsia
is rod monochromacy, in which individuals have no cones. Rod monochromats
do not show the Purkinje effect (see Section 6.1), have achromatic vision, low
visual acuity, high sensitivity to bright lights, nystagmus (involuntary eye move-
ment), and macular dystrophy. It is a color deficiency that occurs in about 1 in
30,000 [236, 329].
Figure 5.44. Achromatic response function for normal trichromats, protanopes, and
deuteranopes (after [516]).
One such mechanism was introduced in Section 5.3.3 and quantifies an ob-
server’s spectral sensitivity through a nulling method [516]. When the experiment
is performed by an observer with a color anomaly, the different spectral sensitiv-
ity of this observer may be compared with that of a normal observer, resulting
in the achromatic response functions shown in Figure 5.44. The protanope's
response function is significantly different from those of the trichromat and the
deuteranope. Figure 5.45 shows the theoretical chromatic and achromatic
responses of a protanope and a deuteranope. In this case, there is no perception
of reds and greens, leading to only a single chromatic response.
Figure 5.45. Spectral response for dichromats with a red-green defect: relative visual
responses of the white, yellow, and blue channels as a function of wavelength,
400–700 nm (after [516]).
[Figure: wavelength-discrimination curves Δλ (nm) for normal trichromats compared
with deuteranopes, protanopes, and tritanopes.]
plates. Each plate is drawn such that its pattern is only detectable by observers
with normal color vision.
The Ishihara test set exists in two variations, one consisting of 24 plates and
one consisting of 34 plates. If administering an Ishihara test, it is recommended
to use plates from the larger set, as the 24-plate set contains relatively few reliable
plates. Both sets contain two groups of plates. The first group is suitable for
numerate observers and contains numerals painted in a different color from the
background. The second group, suitable for innumerate observers, contains snaking
patterns; it is rarely used because the results are much more difficult to assess,
and these plates are less reliable.
The plates containing numerals can be classified into four colorimetrically
different groups:
Transformation plates cause anomalous color observers to give different answers
than those with normal vision. Plates 2–9 are in this group (of the 34-plate
set).
Disappearing digit plates, also known as vanishing plates, contain numerals that
can only be observed by those with normal color vision. For color-deficient
viewers, these plates appear as a random dot pattern. These are Plates 10–
17.
Hidden digit plates are designed such that anomalous observers will report a nu-
meral, whereas normal observers do not see any pattern. Plates 18–21 are
in this group.
Qualitative plates are designed to assess the severity of color blindness and sep-
arate protan from deutan color perception. These are Plates 22–25.
Of these four types, only the transformation and disappearing plates are reliable
and should be used to test for color blindness. Figure 5.48 shows examples of all
four plate types.
It should be noted that the Vischeck simulation applied to the hidden-digit
plate does not reveal the number that was intended. Hidden-digit plates are typi-
cally not very reliable and, in addition, the Vischeck algorithm may not be opti-
mally matched to this particular stimulus. It is included for completeness.
A typical color plate test procedure begins by showing the observer Plate 1
(see Figure 5.49), which is a demonstration plate. If the observer does not give
the correct answer for this plate, then the remainder of the test should not be
administered.
To interpret the results, the false positives and false negatives should be taken
into consideration. The false-positive rate varies significantly with age, and to a
Figure 5.48. Ishihara test plates from the 24-set to determine color blindness. From left
to right: Ishihara plate, same plate as seen by a protanope, a deuteranope, and a tritanope.
Top to bottom: a transformation plate, a disappearing plate, a hidden-digit plate, and a
qualitative plate. Note that due to color reproduction issues as well as viewing conditions,
these reproductions are not suitable to determine color blindness. (Test plates from [531].
Color anomalies simulated using Vischeck (www.vischeck.com).)
lesser extent with education [638]. The false-negative rate depends on the type of
anomalous vision, but tends to be low. The question then is where to place the
cut-off, i.e., how many errors may be made before we decide that someone is not color normal?
The answer depends on many factors. For a male population, a good balance
between false positives and false negatives is achieved by placing the cut-off be-
tween four and six errors, provided Plates 2–17 are used (in particular excluding
the hidden-digit plates).
For females, the occurrence of anomalous color vision is lower, meaning the
cut-off point would have to be chosen differently, and even then the interpretation
Figure 5.49. Demonstration test. This numeral should be visible to color-blind as well
as color-normal observers. (Test plate from [531].)
310 5. Perception
where Θi, j is the reflectance of pixel j with respect to pixel i and P is the path
from pixel i to pixel j. This computation requires knowledge about the location
of the pixel with the highest reflectance. Since the value at a pixel is affected
by its reflectance, as well as the intensity of incident light, it cannot be used to
determine the pixel with the highest reflectance. Such a pixel may only be found
with an exhaustive evaluation of all possible sequential products, comparing the
results and selecting the lowest value. This value corresponds to the ratio of pixels
with the highest and lowest reflectance in the scene.
where ρ j is the final relative reflectance of pixel j and N is the number of paths
used in the computation.
The ratios computed between different adjacent pixels may give some indi-
cation of whether the pixels are on opposite sides of an edge or belong to the
same uniform region. If the pixels have similar values, then the ratio will be close
to 1. In log space, this is equivalent to a value close to 0. Log ratios significantly
different from 0 will mean that an edge exists between the pixels. To eliminate
the effects of slowly varying lighting, a threshold t may be applied to the ratios.
Those ratios that are close to 0 will be set to 0, as shown below:
Θi,j = ∑_{k∈P, k<j} T(d pk), (5.15a)
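The thresholding scheme can be sketched along a single path; the threshold value and all names below are illustrative:

```python
import math

# Sketch of the thresholded log-ratio computation of (5.15a) along one path:
# small log ratios, attributed to slowly varying illumination, are zeroed; the
# surviving edge ratios are accumulated. Threshold and names are illustrative.

def relative_log_reflectance(path_values, t=0.1):
    total = 0.0
    for a, b in zip(path_values, path_values[1:]):
        log_ratio = math.log(b / a)
        total += log_ratio if abs(log_ratio) > t else 0.0
    return total

# A slow illumination ramp contributes nothing; the sharp edge survives.
ramp_then_edge = [1.00, 1.02, 1.04, 1.06, 2.12]
rel = relative_log_reflectance(ramp_then_edge)
assert abs(rel - math.log(2.0)) < 1e-9
```

The gentle 2% steps all fall below the threshold and are discarded, so only the factor-of-two edge contributes to the accumulated log reflectance.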
red is displayed against a background that would yield good contrast with yellow
as well. Making the text large and bold will also be beneficial.
A saturated red right next to a saturated blue is a special case. The wavelengths
of these colors are at opposite ends of the visible spectrum, and this causes the
focal length for lenses (including the ones in human eyes) to be distinctly different
for these colors. This may leave the impression that the text is not stationary, but
moves around with eye movements. This is due to continuous refocusing of the
eye for this color combination, even though the eye-object distance is identical.
This effect occurs for color-normal observers.
Figure 5.50. From left to right, top to bottom: input image, simulation of a deuteranope,
daltonized input image, and a simulation of how a deuteranope may perceive the daltonized
input image; Insel Mainau, Germany, August 2005. (Simulations created using Vischeck
(www.vischeck.com/daltonize).)
While the above ought to help with choosing text and backgrounds, it is not
so easy to adjust images so that they are viewable by all web users. However,
it is possible to adjust images so that color-deficient observers find it easier to
recognize objects in them. One approach is to increase the red-green contrast
in the image, as many color-deficient viewers have residual red-green sensitivity.
Second, red-green variations can be mapped to other image attributes, such as the
yellow-blue channel, or luminance. The combination of these image-processing
techniques is called daltonization [261]. An example of such a simulation is
shown in Figure 5.50.
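The spirit of such a remapping can be sketched with a toy per-pixel transform. This is NOT the algorithm of [261]; the channel arithmetic and coefficients are purely illustrative:

```python
# Toy sketch in the spirit of daltonization; this is NOT the algorithm of
# [261]. Part of the red-green opponent signal, unusable by a deuteranope,
# is pushed into lightness. All coefficients are illustrative.

def daltonize_pixel(r, g, b, gain=0.5):
    red_green = r - g                  # crude red-green opponent signal
    shift = gain * red_green
    clamp = lambda v: min(1.0, max(0.0, v))
    # add the remapped difference to every channel, i.e., to lightness
    return (clamp(r + shift), clamp(g + shift), clamp(b + shift))

reddish = daltonize_pixel(0.8, 0.2, 0.2)
greenish = daltonize_pixel(0.2, 0.8, 0.2)
# The two colors, confusable for a deuteranope, now differ in overall lightness.
assert sum(reddish) > sum(greenish)
```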
• http://www-psych.rutgers.edu/~alan/theory3/ [363]
Interesting web sites with further optical illusions can be found at:
• http://www.echalk.co.uk/amusements/OpticalIllusions/illusions.htm
• http://www.michaelbach.de/ot/
• http://www.ritsumei.ac.jp/~akitaoka/index-e.html
• http://www.viperlib.org/
Part II
Color Models
Chapter 6
Radiometry and
Photometry
Photons are the carriers of optical information. Light, emitted by display devices,
light sources, or the sun, reaches our eyes, usually after reflection off objects in
a scene. After transduction, a signal is transmitted to the brain, where it is
interpreted; depending on the circumstances, this may provoke a response from
the observer.
Since light is the carrier of visual information, a fundamental activity related
to light is its measurement. Light measurement exists in two forms. The first,
radiometry, is the science of measuring all optical radiation whose wavelengths lie
between approximately 10 nm and 10⁵ nm (see Figure 2.4). This region includes
ultraviolet, visible, and infrared radiation. We only consider optical radiation, i.e.,
radiation that obeys the principles of optics.
Whereas radiometry considers many more wavelengths than those to which
the human visual system is sensitive, the second form of light measurement,
photometry, is concerned only with radiation in the visible range. The quan-
tities derived in radiometry and photometry are closely related: photometry is
essentially radiometry weighted by the sensitivity of the human eye. As such, ra-
diometry measures physical quantities and photometry measures psycho-physical
quantities.
In this chapter, we first revisit the topic of human wavelength sensitivity, en-
abling us to refine our explanation of radiometry and photometry. We then define
several important quantities relevant to both disciplines. We introduce computa-
tional models that attempt to compute perceptual quantities from their physical
counterparts. Thus, this chapter affords an insight into radiation measurements
that are important to understand the color measurements discussed in the next
chapter.
[Figure: luminous efficiency of the human eye (log10 scale) as a function of
wavelength, 400–800 nm, showing V(λ) and VM(λ).]
[Figure: the V(λ) and V′(λ) luminous efficiency functions as a function of
wavelength, 400–800 nm, on a linear scale (left) and a log10 scale (right).]
The most commonly used radiometric and photometric quantities are listed in
Table 6.1; they are explained in the following sections.
380 – 780 nm range. Luminous energy may be computed from radiant energy
using (6.1). It is denoted by Qv and measured in lumen seconds (lm s). The lumen
is the base unit of photometry and is explained in the next section. Radiant and
luminous energy are used in applications where the aim is to measure the total
dosage of radiation emitted from a source.
Pe = dQe / dt. (6.4)
With Qe expressed in joules and t in seconds, radiant flux Pe is given in watts
(1 W = 1 J/s). The photometric counterpart of radiant flux is called luminous
flux:
Pv = dQv / dt. (6.5)
The same quantity may be derived from radiant flux:
Pv = Km ∫₃₈₀⁷⁸⁰ Pe(λ) V(λ) dλ. (6.6)
The unit of luminous flux is the lumen (lm), which is the base unit of photometry.
Luminous flux is a measure of the capacity of radiant flux to evoke visual
sensations; one lumen equals the luminous flux of 1/683 watt of radiant power at a
frequency of 540 × 10¹² Hz, which corresponds to a wavelength of about 555 nm.
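Equation (6.6) is straightforward to evaluate numerically. The sketch below uses a Gaussian as a crude stand-in for V(λ) (the real function is tabulated by the CIE), so the numbers are only indicative:

```python
import math

# Numeric sketch of (6.6). A Gaussian peaked at 555 nm stands in for V(lambda);
# the real curve is tabulated by the CIE, so the numbers are only indicative.

KM = 683.0  # lm/W, maximum luminous efficacy

def v_approx(lam):
    return math.exp(-0.5 * ((lam - 555.0) / 45.0) ** 2)

def luminous_flux(spectral_flux, lam0=380.0, lam1=780.0, steps=400):
    dlam = (lam1 - lam0) / steps
    total = 0.0
    for i in range(steps):
        lam = lam0 + (i + 0.5) * dlam
        total += spectral_flux(lam) * v_approx(lam) * dlam
    return KM * total

def box(lo, hi):
    # 1 W of radiant power spread uniformly over [lo, hi) nm
    return lambda lam: 1.0 / (hi - lo) if lo <= lam < hi else 0.0

green = luminous_flux(box(550.0, 560.0))   # near the peak of V(lambda)
blue = luminous_flux(box(445.0, 455.0))    # far from the peak
assert 650.0 < green < 683.0
assert blue < green / 5.0
```

One watt of radiant power near 555 nm yields close to the maximum 683 lm, while the same power in the blue evokes far less visual sensation.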
Ie = dPe / dω (6.7a)
= d²Qe / (dt dω). (6.7b)
The unit of radiant intensity is watts per steradian (W/sr). If the radiant intensity
of a point source is the same in all directions, the source is said to be uniform
or isotropic. For a uniform source with radiant intensity Ie , the total radiant flux
equals
Pe = ∫ Ie dω = 4π Ie. (6.8)
For instance, if Ie = 1 W/sr, then Pe ≈ 12.56 W. A quantity related to radiant
intensity is the average radiant intensity, defined as the ratio of the total
flux emitted by a source to the solid angle of emission, which is usually 4π (i.e.,
the entire sphere surrounding the source).
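A quick numeric confirmation of (6.8), also verifying that the full sphere of directions subtends 4π steradians:

```python
import math

# Check of (6.8): integrating a uniform intensity Ie over all directions gives
# Pe = 4 * pi * Ie, since the full sphere subtends 4*pi steradians.

def total_flux(intensity):
    return 4.0 * math.pi * intensity

# Numeric confirmation that the sphere's solid angle is 4*pi:
n = 2000
dtheta = math.pi / n
omega = 2.0 * math.pi * sum(math.sin((i + 0.5) * dtheta) * dtheta for i in range(n))
assert abs(omega - 4.0 * math.pi) < 1e-4
assert abs(total_flux(1.0) - 12.566) < 1e-2   # matches the Pe ~ 12.56 W example
```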
The luminous intensity is the photometric counterpart of radiant intensity and
may be computed from luminous flux:
Iv = dPv / dω (6.9a)
= d²Qv / (dt dω), (6.9b)
or may be derived from radiant intensity:
Iv = Km ∫₃₈₀⁷⁸⁰ Ie(λ) V(λ) dλ. (6.10)
The unit of luminous intensity is lumens per steradian (lm/sr), normally called
the candela (cd). The candela is defined as the luminous intensity, in a given
direction, of a source that emits monochromatic radiation of frequency
540 × 10¹² Hz and that has a radiant intensity in that direction of 1/683 W sr⁻¹.
Luminous intensity may be computed for any source (i.e., both point and ex-
tended), but in general it is only meaningful for point sources [839]. For extended
sources, radiance and luminance are appropriate. They are defined in the follow-
ing sections.
The average luminous intensity, which is the photometric equivalent of av-
erage radiant intensity, is also referred to as the mean spherical candle power
(MSCP) [1199].
Ee = dPe / dA. (6.11)
As irradiance does not have directional dependency, to compute irradiance at a
point, the radiation arriving from all incident directions must be included. For
a differential surface area of an opaque surface, this requires integrating over a
hemisphere covering the area of interest. Irradiance is measured in watts per
square meter (W/m2 ).
The illuminance Ev at a point is defined as the luminous flux per differential
area:
Ev = dPv / dA. (6.12)
It may also be derived directly from irradiance Ee :
Ev = Km ∫₃₈₀⁷⁸⁰ Ee(λ) V(λ) dλ. (6.13)
The unit of illuminance is lumens per square meter (lm/m2 ), which is also called
lux.
Irradiance and illuminance are measures of the density of incident radiation.
To define the surface density of exitant radiation leaving a point, the quantities
called radiant exitance and luminous exitance are used. They have the same units
as irradiance and illuminance.
In the following sections we demonstrate the computation of illuminance with
two examples. In the first example, the surface lit by a light source is assumed
to be differential. Therefore, the intensity of the light source can be regarded as
the same for all directions spanning the surface. In the second example, a larger
surface is considered where this assumption cannot be made.
Figure 6.3. A differential surface area dA lit by a point light with intensity Iv; the
source lies at distance d, at angle θ from the surface normal N, and d⊥ is the
perpendicular distance from the source to the surface.
Ev = Iv cos³(θ) / (d⊥)². (6.16)
This equation is known as the cosine-cubed law. In this formulation, each point
on a surface lit by a uniform light source receives illumination that varies with θ ,
since d ⊥ is the same for all points on a large flat surface. For non-uniform light
sources, the illumination changes with θ and Iθ (the intensity of the source in the
given direction θ ).
Figure 6.5. Top: Each differential area on a ring has the same angle of incidence θ and,
therefore, has the same illuminance. Bottom: The area calculation of the ring is shown.
The luminous flux received by the ring is then
Pv,r = Iv cos³(θ) / (d⊥)² · 2π r dr. (6.17)
Integrating over all rings that form the disk yields the total luminous flux
received by the surface:
Pv = ∫₀ᴿ Iv cos³(θ) / (d⊥)² · 2π r dr. (6.18)
Here, both r and θ are changing together, making the integration problematic.
This can be avoided by rewriting r in terms of θ as r = d ⊥ tan (θ ). Then dr =
d⊥ sec²(θ) dθ, and the upper limit of the integral becomes tan⁻¹(R/d⊥):
Pv = 2π Iv ∫₀^tan⁻¹(R/d⊥) sin(θ) dθ (6.19a)
= 2π Iv (1 − d⊥ / √(R² + (d⊥)²)). (6.19b)
The average illuminance Ēv is calculated by dividing Pv by the disk's surface area πR²:
Ēv = (2 Iv / R²) (1 − d⊥ / √(R² + (d⊥)²)). (6.20)
It can be verified that as R becomes small compared to d ⊥ (i.e., R/d ⊥ → 0), this
equation reduces to Ēv = Iv /(d ⊥ )2 .
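Both the closed form of (6.19b) and the small-R limit can be checked numerically against a direct integration over rings (here `d` stands for the perpendicular distance written d⊥ in the text):

```python
import math

# Numeric check of the disk derivation (d plays the role of the perpendicular
# distance written d-perp in the text).

def flux_on_disk(Iv, R, d, steps=20000):
    # direct midpoint integration of (6.18) over rings of the disk
    dr = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        cos_t = d / math.sqrt(r * r + d * d)
        total += Iv * cos_t ** 3 / (d * d) * 2.0 * math.pi * r * dr
    return total

Iv, R, d = 100.0, 1.0, 2.0
closed_form = 2.0 * math.pi * Iv * (1.0 - d / math.sqrt(R * R + d * d))  # (6.19b)
assert abs(flux_on_disk(Iv, R, d) - closed_form) < 1e-3 * closed_form

# For R << d the average illuminance tends to the inverse-square value Iv / d^2.
small_R = 1e-3
avg = flux_on_disk(Iv, small_R, d) / (math.pi * small_R ** 2)
assert abs(avg - Iv / (d * d)) < 1e-3 * Iv / (d * d)
```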
Le = d²Pe / (dω dA cos(θ)), (6.21)
where θ is the angle of incidence or exitance (see also Figure 6.6). The unit of
radiance is watts per steradian per square meter (W sr−1 m−2 ).
Figure 6.6. Radiance Le emitted from a point (x, y) on a differential surface dA in the
direction of (θ , φ ).
The radiance Le emitted from a differential area is directly related to its radiant
intensity Ie :
Le = dIe / (dA cos(θ)), (6.22)
where θ is the exitant angle.
Luminance Lv is the photometric counterpart of radiance. It may be computed
from luminous flux Pv as
Lv = d²Pv / (dω dA cos(θ)). (6.23)
Alternatively, luminance may be derived from radiance:
Lv = Km ∫₃₈₀⁷⁸⁰ Le(λ) V(λ) dλ. (6.24)
Luminance is measured in candela per square meter (cd m−2 ). The nit is also
sometimes used instead of candela per square meter, although this term is now
obsolete.
It is easy to see that a camera records spectrally weighted radiance. Light
arriving from many directions is focused by the lens, measured over a given ex-
posure time, and for each pixel individually recorded over a small surface area.
Thus, the charges recorded in the sensor between the time that the shutter is
opened and closed represent radiance values. The firmware in the camera then
processes the data, so that we cannot be certain that the values retrieved from the
camera still correspond to radiances (although most cameras now export a raw
format, designed to bypass much of the in-camera processing).
In the human eye a similar process takes place. Here the refractive indices of
the various ocular media focus light, which is then transduced by the photorecep-
tors. The human equivalent of exposure time is not so easily defined. Photore-
ceptors transmit signals at the same time as they receive photons. The amount
of signal is modulated by the amount of light, bleaching, and by neural feedback
mechanisms. Nevertheless, this combination of processes causes the signal to
be commensurate with the radiance reaching each receptor. Subsequent visual
processing stages then cause the signal to be photometrically weighted, so that
luminance values result.
[Figure 6.7. Radiance LS is emitted by a differential surface dS with normal NS toward a second differential surface dR with normal NR at distance d; θS and θR are the angles between the respective surface normals and the line connecting the two surfaces.]
If the radiance emitted from the surface with surface area dS is equal to LS,
then the total flux dPS that is emitted in the direction of dR equals

dPS = LS cos(θS) dS dωR,  (6.25)

where dωR is the solid angle subtended by dR at the center of dS. The differential
solid angle dωR can be rewritten as follows:

dωR = cos(θR) dR / d²,  (6.26)

where d is the distance between the centers of the surfaces. Substituting dωR into
the previous equation we obtain

dPS = LS cos(θS) cos(θR) dS dR / d².
Figure 6.8. The flux emitted by dS in the direction of dR is equal to the flux received by
dR from the direction of dS.
An interesting result is obtained if both the source dS and the receiver dR lie
on the interior surface of a sphere. In that case, we have

θR = θS = θ,  (6.37a)
d = 2r cos(θ),  (6.37b)

where r is the radius of the sphere as shown in Figure 6.9. Substituting these
into (6.35), the irradiance on dR is given by

ER = LS dS / (4r²).  (6.38)
Figure 6.9. The two differential patches dS and dR lie on a sphere. In this case, the
irradiance of dR due to emission from dS is independent of the distance d and both of the
angles θS and θR .
Note that this result is independent of both θ and d. Therefore, all points
on a sphere are equally irradiated by any other point on the sphere. This is
a particularly convenient configuration, which may be exploited to measure both
power and irradiance (or their photometric equivalents) using a device called
an integrating sphere (see Section 6.8.2).
For a Lambertian surface, the intensity emitted by a differential area falls off
with the cosine of the exitant angle, dI(θ) = I0 cos(θ), so that its radiance is
the same in all directions:

L = dI / (dA cos(θ)) = I0 / dA.  (6.40)

The total flux emitted from the entire surface is then computed as

P = ∫_A dP  (6.42a)
  = ∫_A π L dA  (6.42b)
  = π L A.  (6.42c)
Figure 6.10. The differential surface patch dR is illuminated by a flat circular extended
source. A differential ring on the source surface is shown, and the radiance L emitted from
a differential area dS of this ring is shown to strike dR.
Figure 6.11. The angle θN may change based on the surface curvature whereas θ is fixed.
yields

P = 2π L dR ∫₀^tan⁻¹(R/d⊥) sin(θ) cos(θ) dθ  (6.46a)
  = π L dR R² / (R² + (d⊥)²),  (6.46b)

and the irradiance E is equal to

E = dP/dR = π L R² / (R² + (d⊥)²).  (6.47)
In the general case, the radiant intensity of an extended source in the direction
of (θ, φ) is defined as

I(θ, φ) = ∫_S L(θ, φ) cos(θN) dS,  (6.48)

where ∫_S indicates that the integral is computed over the entire surface. Also note
that θN is the angle between the surface normal and the direction of emission.
The angle θN is not necessarily equal to θ, as illustrated in Figure 6.11.
For an extended circular source (Figure 6.10), the intensity in the direction of
the surface normal is given by

I = π R² L.  (6.49)

By substituting this result into (6.47), we obtain the irradiance E on the differential
surface dR in terms of the intensity of the source:

E = I / (R² + (d⊥)²).  (6.50)

If the source is very small (R → 0) or if the distance from the source is very large
(d⊥ → ∞) (e.g., starlight), the approximate irradiance Ea is given by

Ea ≈ I / (d⊥)².  (6.51)
Figure 6.12. The relative irradiance on a differential surface due to an extended source.
Both exact values E and the point-source approximation Ea are shown.
Therefore, the error associated with treating an extended source as a point source
stems from ignoring the R² in the denominator. We can see this by factoring R²
out of the denominators of both E and Ea, so that

E = (I/R²) / (1 + (d⊥/R)²),  (6.52a)
Ea ≈ (I/R²) / (d⊥/R)².  (6.52b)
If we plot both E and Ea for different ratios of d ⊥ and R (Figure 6.12), we see
that the two curves almost overlap when the distance between the source and the
receiver is at least ten times the radius of the source. At this distance, the error
drops below 1%, and it reduces further for longer distances.
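This error behavior can be verified directly from (6.50) and (6.51); the relative error of the point-source approximation works out to exactly (R/d⊥)² (function names below are ours):

```python
def E_exact(I, R, d):
    """Irradiance due to an extended circular source (Equation 6.50)."""
    return I / (R**2 + d**2)

def E_approx(I, d):
    """Point-source (inverse-square) approximation (Equation 6.51)."""
    return I / d**2

# The relative error of the approximation is (R/d)^2:
I, R = 1.0, 1.0
for ratio in (1, 5, 10, 100):
    d = ratio * R
    err = (E_approx(I, d) - E_exact(I, R, d)) / E_exact(I, R, d)
    print(ratio, err)  # ratio 10 gives 0.01, the 1% threshold
```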
in lumens per watt (lm/W). On the other hand, efficiency is a unitless term that
expresses the ratio of the efficacy at a particular wavelength to the efficacy at
555 nm [1090].
Two types of efficacies are distinguished: the efficacy of electromagnetic ra-
diation and the efficacy of lighting systems. The former is defined as the ratio of
the luminous power to the radiant power in a beam of radiation, whereas the lat-
ter is defined as the ratio of the luminous power produced to the electrical power
consumed in a lighting system. These concepts are discussed in the following
sections.
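The first kind of efficacy can be sketched numerically as follows. The code uses a Gaussian stand-in for V(λ) instead of the tabulated CIE data, so only the monochromatic 555 nm case is exact:

```python
import math

def V(lam):
    """Gaussian stand-in for the CIE V(lambda) curve (illustrative only)."""
    return math.exp(-0.5 * ((lam - 555.0) / 45.0) ** 2)

def efficacy_of_radiation(P, lams):
    """Luminous efficacy of radiation in lm/W: ratio of luminous power,
    683 * sum P(l) V(l), to radiant power, sum P(l)."""
    luminous = 683.0 * sum(P(l) * V(l) for l in lams)
    radiant = sum(P(l) for l in lams)
    return luminous / radiant

lams = range(380, 781)

# Monochromatic light at 555 nm attains the maximum of 683 lm/W:
print(efficacy_of_radiation(lambda l: 1.0 if l == 555 else 0.0, lams))

# An equal-energy spectrum wastes power away from the peak of V(lambda):
print(efficacy_of_radiation(lambda l: 1.0, lams))
```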
Figure 6.13. Luminous efficacy curves for photopic (K(λ)) and scotopic (K′(λ)) vision.
Table 6.2. Luminous efficacies of commonly encountered light sources (from [761]).
Table 6.3. Luminous efficacies of lighting systems for commonly used light sources.
Figure 6.14. The car lights are on in both images, but they appear much brighter in the
dark environment. (This image was published in Erik Reinhard, Greg Ward, Sumanta
Pattanaik, and Paul Debevec, “High Dynamic Range Imaging, Acquisition, Display, and
Image-Based Lighting,” © Elsevier 2006.)
Brightness B may be modeled as a power function of luminance L:

B = a L^(1/3) − B0.  (6.57)

Here, a and B0 are constants depending on the viewing conditions. These conditions
include the state of adaptation of the observer as well as the background and
spatial structure of the viewed object [865].
A fundamental quantity which is closely related to luminance and brightness
is contrast. Essentially, two types of contrast may be identified: physical contrast
and perceived contrast.
Physical contrast, also known as luminance contrast, is an objective term used
to quantify luminance differences between two achromatic patches.1 It may be
defined in several ways depending on the spatial configuration of patches. For
instance, the contrast C between a foreground patch superimposed on a background
patch is characterized by

C = |Lf − Lb| / Lb,  (6.58)

where Lf and Lb are the luminances of the foreground and background patches. If
two patches with luminances L1 and L2 reside side by side, forming a bipartite field,
their contrast may be computed by

C = |L1 − L2| / max(L1, L2).  (6.59)
Finally, for periodic patterns, such as sinusoidal gratings, the contrast may be
computed by
C = (Lmax − Lmin) / (Lmax + Lmin),  (6.60)

where Lmax and Lmin are the maximum and the minimum luminances in the grating.
This type of contrast is also called Michelson contrast or modulation.
For pairs of diffuse surfaces, the luminance contrast depends solely on their
reflectances [941]. Thus, even if the illumination on these surfaces is altered, the
luminance will change proportionally, so that the luminance contrast will remain
the same.
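The three definitions are easy to compare in code (function names are ours); the last lines illustrate the invariance of luminance contrast under a change of illumination for diffuse surfaces:

```python
def weber_contrast(L_f, L_b):
    """Foreground patch on a background, Equation 6.58."""
    return abs(L_f - L_b) / L_b

def bipartite_contrast(L1, L2):
    """Side-by-side patches, Equation 6.59."""
    return abs(L1 - L2) / max(L1, L2)

def michelson_contrast(L_max, L_min):
    """Periodic patterns such as gratings, Equation 6.60."""
    return (L_max - L_min) / (L_max + L_min)

# Scaling the illumination scales both luminances of a pair of diffuse
# surfaces, leaving the contrast unchanged:
for scale in (1.0, 10.0):
    print(michelson_contrast(80.0 * scale, 20.0 * scale))  # 0.6 both times
```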
The second type of contrast is perceived contrast, also known as brightness
contrast. It is the apparent difference between the brightnesses of two achromatic
patches. Since brightness depends on more than luminance alone, brightness contrast does not strictly depend on luminance contrast either. In general, brightness
1 Chromatic patches might also induce color contrast; this is discussed in Chapter 5.
contrast increases with luminance, and this phenomenon is called the Stevens
effect [302]. For instance, a black-and-white photograph appears less rich in con-
trast if viewed indoors compared to viewing it outdoors. Outdoors, blacks appear
more black and whites appear more white. The luminance contrast is the same in
both cases, unless the photograph is held at angles that cause highlights.
6.5.1 Responsivity
The output signal of an optical detector may be expressed in several ways, such
as a change in the voltage, current, or conductivity. If we denote the output signal
by S and the incident flux by P, the flux responsivity RP of a detector is given by
RP(λ, λ0) = S(λ, λ0) / P(λ),  (6.61)
where λ0 is the wavelength to which the detector is most sensitive. Thus, the
responsivity may change as a function of position, direction, and wavelength.
Furthermore, the responsivity may change as a function of incident radiation. A
detector is said to have a linear response if its flux responsivity does not change
with the amount of incident radiation, assuming all other factors are kept constant.
Linearity is a desired property of optical detectors and often satisfied within the
wavelength range of the detector [839].
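A linearity check on measured (flux, signal) pairs might be sketched as follows (a simplified illustration; real detector calibration is considerably more involved):

```python
def flux_responsivity(S, P):
    """R_P = S / P (Equation 6.61): output signal per unit incident flux."""
    return S / P

def is_linear(samples, tol=1e-6):
    """A detector responds linearly if its responsivity S/P stays constant
    over the measured flux range. `samples` is a list of (P, S) pairs."""
    rs = [flux_responsivity(S, P) for P, S in samples]
    return max(rs) - min(rs) < tol

print(is_linear([(1.0, 0.5), (2.0, 1.0), (4.0, 2.0)]))  # True
print(is_linear([(1.0, 0.5), (2.0, 0.9), (4.0, 1.4)]))  # False: saturating
```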
Figure 6.15. The spectral response of various materials used in detectors, as well as the
responsivity of the human eye (after [1090]).
[Figure: Schematic of a vacuum phototube. Light rays strike the cathode inside a vacuum tube; the resulting electron flow to the anode is registered through an external resistor and power supply. Guard rings are also shown.]
2 Suggested radiators were an iron vessel containing boiling zinc, a coil of platinum wire heated to […]tions can be made with a thin hollow cylinder with absorbing walls.
Of course, all of the standards described above can be realized only under
laboratory conditions; they are not practical for everyday use. For practical
measurements, more portable and readily usable light sources are created by
comparison with primary standards. These are called secondary standards
(or sub-standards).

Secondary standards should approximate the primary standards to an adequate
degree of accuracy. In most cases, the hierarchy is deepened one more level
to produce what are called working standards by comparing against secondary
standards. The accuracy of working standards should be adequate for ordinary
photometric work [1199].
[Figure: Schematic of the ESR. Radiant flux enters through a shutter onto a receiving cone; the cone carries a heater (resistance R) and a temperature sensor (T), and is connected by a thermal link to a heat sink at temperature T0. A power supply drives the heater current i.]
Figure 6.18. An illustration of the ESR, simplified to bring out its important components.
(This illustration is based on the drawing in NIST Technical Note 1421 [877].)
Figure 6.19. The detector with responsivity R and total surface area A is configured to
measure the radiance emitted from the light source. For simplicity, the optical system that
typically sits in front of the detector is not shown.
where ω is the solid angle subtended, at the point (x, y), by the source being
measured. As an aside, note that ω may change with the position (x, y) on the
detector surface and is therefore itself a function of position.
Equation (6.65b) is valid for the configuration given in Figure 6.19. In this
form, we are measuring the incident radiance on the detector rather than the emit-
ted radiance from the source. In a lossless medium, these would be equal due to
the invariance of radiance. In practice, however, the radiance may be attenuated
by the intervening medium and due to losses during refraction by the optical sys-
tem. As discussed in Section 6.2.8, the incident radiance L is related to the source
radiance LS via the propagance τ of the optical path:

L = τ LS.  (6.66)
Here, LS^s is the known radiance of the standard lamp. The known quantities are
the results of the two measurements S and S^s and the radiance LS^s of the standard
lamp. To solve for LS using (6.67) and (6.68), we assume that the values of LS and
LS^s are constant over the integration intervals A, ω, ω^s, and Δλ. In practice, the
aperture size of the measuring device is about 1°, and the response is sufficiently
flat over Δλ, making this assumption valid [839]. As a result, LS and LS^s may be
taken out of their integrals in (6.67) and (6.68).

In addition, we will assume that τ equals τ^s and ω equals ω^s. The integrals in
these equations are then identical, so that combining these two equations yields a
solution for LS:

LS = (S / S^s) LS^s.  (6.69)
Figure 6.20. The left integrating sphere is designed to measure the radiant power of a
light source whereas the right sphere is designed to measure the irradiance at the entrance
port. In both cases, a screen is used to prevent direct light from reaching the detector.
the position of the light source [1105]. Furthermore, the amount of illumination
bears a direct relation to the power of the light source, as can be derived as follows.
The flux received by a differential surface dR from another differential surface
dA with luminance L can be deduced from (6.38) under the condition that both
surfaces lie on a sphere (see Figure 6.9):
dP = (dR / (4r²)) L dA.  (6.70)
The luminance reflected by a surface Lo is equal to the incident luminance Li
times the bi-directional reflectance distribution function (BRDF) fr , which is de-
termined by the reflective properties of the coating of the interior of the integrating
sphere:
Lo (x, y, Θi , Θo ) = fr (x, y, Θi , Θo ) Li (x, y, Θi ). (6.71)
The luminance outgoing into a given direction at a surface element is the result of
all incident luminance that is being reflected into the outgoing direction. Within
an integrating sphere, the integral over all incoming directions is equivalent to an
integral over all surface elements dA:
Lo = ∫_A fr Li dA.  (6.72)
The coating of the interior of the integrating sphere is diffuse, and therefore the
BRDF fr can be approximated with a constant, the reflectance factor ρ . The
integrating sphere enables this integral to be evaluated with the help of (6.41d):
Lo = (ρ/π) ∫_A dP  (6.73a)
   = ρP/π.  (6.73b)
Substitution into (6.38) yields the flux dPi received by differential area dR as
a function of the flux P emitted by the iso-radiant interior surface of the sphere:

dPi/dR = ρP / (4πr²).  (6.74)

This expression is valid for all flux arising from a single reflection inside the sphere.
As light can reflect multiple times before being absorbed or detected by the sensor,
this expression can be refined to account for multiple reflections:

dPi/dR = ρP / (4πr² (1 − ρ)).  (6.75)
Finding the illuminance due to reflected light is a matter of tracing light rays
for a few bounces. First, light emitted by the light source will reflect off the
sphere's walls. We call this reflected luminance Lo,1. The total flux received by
any area dR due to the first reflection is equal to

dPi,1 = (dR/(4r²)) ∫_A Lo,1 dA.  (6.76)
The value of Lo,1 may be different for different points on the sphere. For instance,
the points closer to the light source will have a higher luminance. However, the
aggregate value ∫_A Lo,1 dA is related to the total flux of the source via

∫_A Lo,1 dA = ρ ∫_A Li,0 dA = (ρ/π) ∫_A dP = ρP/π,  (6.77)
where Li,0 is the incident luminance due to direct lighting. Therefore the flux
received by dR due to the first reflection is equal to
dPi,1 = (dR/(4r²)) (ρP/π),  (6.78)
and, consequently, the luminance received due to the first reflection equals

Li,1 = dPi,1/(π dR) = ρP/(4π²r²).  (6.79)
At this point, the luminance emitted from any point on the sphere will be equal to

Lo,2 = ρ Li,1.  (6.80)

Similar to the computation just performed, one may compute the flux received by
dR due to the first and the second reflections:

dPi,1,2 = (dR/(4r²)) ∫_A (Lo,1 + Lo,2) dA.  (6.81)
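The reflections accumulate as a geometric series: each additional bounce multiplies the circulating flux by the wall reflectance ρ. A small sketch (with an assumed reflectance value) confirms that the series converges to ρ/(1 − ρ):

```python
def reflected_flux_fraction(rho, n_bounces):
    """Sum of the per-bounce factors rho + rho^2 + ... + rho^n: each pass
    over the sphere wall multiplies the circulating flux by rho."""
    return sum(rho**k for k in range(1, n_bounces + 1))

rho = 0.95  # assumed high-reflectance diffuse coating
closed_form = rho / (1.0 - rho)  # limit of the geometric series
print(reflected_flux_fraction(rho, 500), closed_form)  # both ~19.0
```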
where S denotes the output signal of the photo-detector and the subscripts S and
T identify the standard and the test lamps.

In the substitution method, the lamps are placed inside the sphere one by one,
preferably in the center of the sphere [1199]. In the simultaneous method, both
lamps are in the sphere together (placed symmetrically along the vertical axis),
although only one lamp is lit for each measurement. An additional screen is placed
between the lamps to prevent the opaque parts of the lamps from absorbing direct
light from each other. In the auxiliary lamp method, a third lamp is placed inside
the sphere to control for the self-absorption of light emitted by each lamp.
A⊥ = A cos(θ).  (6.87)

The difficulty in satisfying the cosine law arises from the fact that as light rays
arrive at the surface from grazing angles, they also tend to reflect more off the
surface (see Section 2.5.2).5 Without accounting for this effect, irradiance and
illuminance values will be underestimated.
5 This is a different way of saying that near grazing angles, paints tend to be more reflective than a
Lambertian surface.
Figure 6.21. An extruded detector surface. Paths of three of the light rays are depicted
with dashed lines. The rays incident from narrow angles are received by the top of the
detector, while the rays incident from wide angles are received by the edges. Screening
prevents contribution of light from very wide angles.
The solution to this problem is called cosine correction and can be achieved
in several ways. Most detectors have highly diffuse covers to minimize the
reflectance of the detector. In addition, the detector surface may be extruded so
that light can also enter through the edges of the detector. Such a detector is
depicted in Figure 6.21. In this case, light arriving from the horizontal direction
(θ = 90◦ ) should be blocked with proper screening. A second alternative is to use
an integrating sphere to average out light arriving from a very wide solid angle.
Figure 6.22. Incident radiation passing through the aperture is depolarized and diffracted
into its constituent components. Each component impinges on the detector cell sensitive to
that wavelength.
placed screens are used to block stray light so that light enters the photometer
head only from the light sources. One of the sources is a known standard whose
intensity is previously established, and the other source is a test source whose
intensity is to be measured. The observer is able to adjust the brightness of the
illuminated surfaces by sliding the lights towards and away from the photometer
head, effectively using the inverse-square law (Section 2.9.3).
The brightness match is obtained when the illumination of both surfaces inside
the photometer head appear the same:
Is ρs / ds² = It ρt / dt²,  (6.88)
where Is and It are the intensities of the standard and the test sources, ρs and ρt
are the reflectances of the illuminated surfaces, and ds and dt are the distances of
the sources from the surfaces. The measured term It may then be computed with
It = Is (ρs / ρt) (dt / ds)².  (6.89)
that of the other light with which the comparison is made. Then, classical meth-
ods of homochromatic photometry may be used to measure the intensity of the
transmitted light. Of course the transmission characteristics of the filter should
be known (or measured with some other method) to adjust the measured intensity
accordingly.
In the case of large chromatic differences between source and test patches, it
is still possible to carry out brightness matches with good accuracy, for instance,
with the flicker photometer method. An instrument called the flicker photometer
is used for the purpose of presenting two lights on a surface in alternation at a
minimum frequency such that when the brightness of the two lights match, the
perception of flicker will disappear [1199]. The method relies on the fact that the
temporal resolving power of the human eye is lower for colored stimuli than for
achromatic stimuli. This means that there exist frequencies for which chromatic
differences between the alternating patches are not perceived. By adjusting the
intensity of one of the patches, a brightness match occurs if the perceived achro-
matic differences disappear as well. Typical frequencies are between 10 Hz and
40 Hz [1262].
While a heterochromatic flicker-photometry experiment can be constructed
with a physical device whereby synchronized shutters are opened and closed in
opposite phase, an accurate experiment carried out on a display device is compli-
cated. Computers and graphics cards are getting ever faster, but are rarely able
to guarantee a fixed and stable frame-rate. Drift and fluctuations are likely to
occur, and they affect the accuracy of the measurement [591]. A general prob-
lem with heterochromatic flicker photometry is that the frequencies required to
measure brightness matches may induce photosensitive epileptic seizures in some
individuals, making the technique usable only in controlled laboratory experi-
ments [530].
Brightness matching, in the case of large chromatic differences between
source and test patches, moves into the realm of color matching, which is dis-
cussed in the following chapter on colorimetry. There, we will also introduce
a recent technique to improve the accuracy of visual brightness matching that
can be performed on computer displays and show how it can be used to derive
observer-specific iso-luminant color maps (Section 7.7.1).
synthesis can only be achieved if modeling, rendering, and display are all based
on accurate data and measurements. Even the best renderer is still a garbage-in-
garbage-out system, meaning that if the modeling is not accurate, the resulting
image will not be accurate.
Modeling of an environment requires the acquisition of geometry, the emis-
sion spectra of light sources, as well as the measurement of materials. In this
section, we consider the problem of measuring materials. Opaque spatially ho-
mogeneous materials can be modeled with bi-directional reflectance distribution
functions (BRDFs), which were introduced in Section 2.9.7. Such functions re-
late the amount of light reflected in a direction of interest to the amount of light
incident upon a point on a surface. The relation is given as a ratio and may be
defined as a function of wavelength.
With incident and exitant direction defined as two-dimensional polar coordi-
nates, a BRDF is a five-dimensional function. For isotropic BRDFs the dimen-
sionality is reduced to four, as rotation of the surface around the surface normal
at the point of interest does not alter the ratio of reflected light. If the BRDF is, in
addition, measured with tristimulus values rather than a full spectral set of sam-
ples, the wavelength dependence is removed, and the dimensionality is reduced
to three dimensions.
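The reduction from four angular dimensions to three can be made concrete: for an isotropic material only the two polar angles and the azimuth *difference* matter. A minimal sketch (names are ours):

```python
import math

def isotropic_brdf_coords(theta_i, phi_i, theta_o, phi_o):
    """Reduce the four angular dimensions of a BRDF to the three that
    matter for an isotropic material: rotating both directions about
    the surface normal leaves the reflectance ratio unchanged."""
    dphi = (phi_o - phi_i) % (2.0 * math.pi)
    return theta_i, theta_o, dphi

# Rotating both directions by the same azimuth yields identical coordinates:
a = isotropic_brdf_coords(0.3, 0.0, 0.6, 1.0)
b = isotropic_brdf_coords(0.3, 2.0, 0.6, 3.0)
print(a == b)  # True
```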
This means that to measure an isotropic BRDF for the purpose of using it in a
renderer requires a material sample to be measured along three dimensions. The
traditional way to measure a BRDF is with the use of a gonioreflectometer.6 A
flat sample containing the material to be measured is placed in a fixed position.
A calibrated point light source can move along one direction over the hemisphere
placed over the sample material. The detector can independently move over the
entire hemisphere. Together, they span the three degrees of freedom required to
sample all possible combinations of incident light and measurement angles, as
shown in Figure 6.24 [744]. Anisotropic BRDFs can be measured by giving the
position of the light source a second degree of freedom.
Every measurement, thus, provides a data point for one angle of incidence
paired with one angle of reflectance. A similar result can be obtained by plac-
ing the detector in a fixed position and rotating the sample over two degrees of
freedom. The light source is then still moved over the same path as before.
In either set-up, the number of samples needs to be large to get a dense sam-
pling of the BRDF. Such a dense sampling is required in the absence of further
knowledge about the sort of material being measured. This makes the acquisition
of a full BRDF long-winded, even if the gantry maneuvering the detector and the
6 See the Cornell Light Measurement Laboratory (http://www.graphics.cornell.edu/research/
measure/) and the NIST reference reflectometer at the STARR facility (http://physics.nist.gov/).
Figure 6.24. In a gonioreflectometer, the light source and detector have one and two
degrees of freedom to move over the hemisphere placed over the sample to be measured.
This configuration enables the acquisition of isotropic BRDFs.
light source is under computer control. If specific materials are being measured,
then it may be possible to measure only a subset of all angles of incidence and
reflectance.
A different approach to make the acquisition procedure tractable is to capture
many samples at once, usually by employing a digital camera. Two of the de-
grees of freedom in the measurement device are then replaced by the two image
dimensions. For flat samples this may be achieved by placing a hemispherical
mirror over the sample and photographing it through a fish-eye lens [1212]. A
perhaps simpler hardware set-up may be achieved by requiring the sample to be
curved. Spherical [744,745,751,752] or cylindrical samples [701] are simplest to
capture, since a single photograph will capture many different orientations of the
surface normal with respect to the light source. The requirement of simple shapes
exists because this allows an analytic model to infer the surface orientation at each
pixel.
More complex material samples can also be used, as long as they are convex
to avoid inter-reflections within the sample material. In that case, the camera
may be augmented with a laser scanner device to capture the 3D geometry of
the sample [744]. From the geometry, the surface normals at each pixel can be
inferred; these are, in turn, used to relate pixel intensities to angles of incidence
and angles of reflectance.
For concave objects one may expect significant inter-reflections that will ren-
der the material measurement inaccurate. One would have to separate the pixel
values into direct and indirect components. While to our knowledge this has not
been achieved in the context of material measurements, separation of direct and
indirect components is possible by placing an opaque surface with small regu-
larly spaced holes (an occlusion mask) in front of the light source. After taking
multiple captures with the occlusion mask laterally displaced by small amounts,
image post-processing techniques can be employed to infer the direct and indirect
illumination on surfaces separately [818].
Some materials reflect almost all incident light. If, in addition, the BRDF is highly
specular, then the radiance reflected in the specular direction is limited only by the
intensity of the light source. As a result, the imaging system should be able to
capture a potentially very large dynamic range. The capture of high dynamic
range images is the topic of Chapter 12.
The number of materials that can be represented by a BRDF, although large,
is still limited as it assumes that all surface points have identical reflectance prop-
erties. It is possible to extend the measurement of BRDFs to account for spatially
varying reflectance. Such spatially varying reflectance functions are called bi-
directional texture functions (BTFs). They can be measured with the aid of a
robot arm, a light source, and a digital camera using flat samples [231, 232].7
In the case of BRDFs, and especially BTFs, the number of samples required to
represent a material is very large. It is therefore desirable to fit this quantity of data
to analytic BRDF models. This can be achieved in several different ways. Some
of the resulting models are discussed in Section 2.9.7. For further information,
we refer the interested reader to the relevant references [231, 582, 601, 637, 639,
640, 671, 744, 764, 998, 1013, 1212, 1240, 1285].
7 A collection of BTFs and BRDFs obtained with this technique is available from the Columbia-Utrecht Reflectance and Texture Database (CUReT).
Chapter 7
Colorimetry
It is very important to stress that although colorimetry is based upon the hu-
man visual system, it is not a technique for describing how colors actually appear
to an observer. It is purely a method for accurately quantifying color measure-
ments and describing when two colors “match” in appearance. The ultimate vi-
sual response to any stimulus requires much more information about the viewing
conditions, including (but not limited to) size, shape, and structure of the stimulus
itself, color of surround, state of adaptation, and observer experience. The sci-
ence of describing what a stimulus actually looks like is called color appearance
modeling and is described in detail in Chapter 11.
Figure 7.1. Additive color mixing of the spectral power distributions of two color stimuli
A and B.
the overall spectral power distribution cannot change. This can be impor-
tant when designing color-matching experiments. For instance, if the three
primaries are tungsten light bulbs, the radiant power cannot be changed
by decreasing or increasing the current to the light source as this will also
change the fundamental spectral power distribution of the lights.
Additivity Law. The additivity law is the most important law for generalized color
matching and forms the basis for colorimetry as a whole. Essentially, if
there are four color stimuli A, B, C, and D, and if A matches B and C
matches D, then it follows that A + C matches B + D and, likewise, that
A + D matches B + C.
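If we model each stimulus by its integrated tristimulus response (anticipating Section 7.2) and a match as equality of responses, the additivity law follows from simple vector addition. A sketch with invented numbers:

```python
def matches(x, y, tol=1e-9):
    """Two stimuli match when their integrated tristimulus responses are
    equal (a modeling assumption, made precise in Section 7.2)."""
    return all(abs(a - b) < tol for a, b in zip(x, y))

def mix(x, y):
    """Additive mixture: spectral powers, and hence responses, add."""
    return tuple(a + b for a, b in zip(x, y))

# Four stimuli with A matching B and C matching D (equal responses,
# possibly produced by different spectra):
A = B = (1.0, 2.0, 0.5)
C = D = (0.2, 0.1, 0.9)
print(matches(mix(A, C), mix(B, D)))  # True
print(matches(mix(A, D), mix(B, C)))  # True
```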
the match will continue to hold if one of the stimuli is placed in a different
condition.
2. Observer adaptation state. Although the match will generally hold if the
observer’s adaptation state changes while viewing both stimuli, an effect
known as persistence of color matches, the match may break down when
viewing the stimuli independently under two different adaptation states.
More details on chromatic adaptation are given in Chapter 10.
3. The dependence of the match on the given observer. If two stimuli (with
disparate spectral power distributions) match for one person, there is no
reason to assume that the match will hold for another person. The variability
between color matches among the general population is surprisingly
large. When we add in the color-deficient population, it should be obvious
to the casual observer that what they consider to be a color match will not
necessarily match for all (or any!) other people.
Grassmann’s laws provide the framework for developing a robust method for
measuring and communicating colors. Remember that an ideal method for mea-
suring colors would be to directly measure the human cone responses for any
given stimulus. Unfortunately this is not a feasible method for specifying color.
If we had the spectral sensitivities of the cone photoreceptors, as described in
Section 4.3.3, it would be possible to estimate the response through mathematical
integration of the color stimulus and the cones. It is only within the last several
decades that we have had a solid understanding of what the cone responses actu-
ally are. So, historically, we have not been able to use the cones as a method for
describing color measurements.
Using Grassmann’s laws of additive color mixing, researchers in the late
1920s were able to estimate the cone responses in a series of color-matching
experiments. Section 7.2 describes those historical experiments and how they were
used to generate computational systems of colorimetry.
From these equations, we can state that if the relative integrated cone responses
L1 , M1 , and S1 for one stimulus equal those for a second, L2 , M2 , and S2 , then
the two color stimuli must, by definition, match.
The integrated LMS cone responses can be thought of as tristimulus values.
We can now state that two color stimuli match when their integrated tristimulus
values are identical. Knowing the cone spectral-power distributions would allow
us to perform these tristimulus calculations to specify color measurements. Un-
fortunately, it is only recently that we know how to obtain cone sensitivities. His-
torically, they have been very difficult to measure. If the cone sensitivities were
idealized, such as those shown in Figure 7.2, then they would be easy to mea-
sure. We could just use a monochromator to scan the visible wavelength range
and record the relative response to each wavelength. The real cone spectral sen-
sitivities, as discussed in Section 4.3.3 and shown again in Figure 7.3 have a high
degree of overlap. This overlap means that it is impossible to isolate the visual
response to a single cone excitation and impossible to experimentally measure the
individual cone sensitivities. Other techniques for estimating the cone responses
are necessary.
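The idea of integrated cone responses as tristimulus values can be sketched in a few lines of Python. The Gaussian cone sensitivities below are hypothetical stand-ins for the measured curves of Section 4.3.3; only the structure of the calculation matters here.

```python
import math

# Hypothetical cone sensitivities: Gaussians standing in for the real
# L, M, and S curves (peak wavelengths and widths are illustrative only).
def sensitivity(lam, center, width):
    return math.exp(-((lam - center) / width) ** 2)

wavelengths = range(400, 701, 10)  # coarse 10 nm sampling

def cone_responses(stimulus):
    """Discrete approximation of the integrals L, M, S = sum of stimulus * cone."""
    L = sum(stimulus(lam) * sensitivity(lam, 565, 50) for lam in wavelengths)
    M = sum(stimulus(lam) * sensitivity(lam, 540, 45) for lam in wavelengths)
    S = sum(stimulus(lam) * sensitivity(lam, 445, 30) for lam in wavelengths)
    return (L, M, S)

def stimuli_match(a, b, tol=1e-9):
    """Two stimuli match when their integrated LMS triples are equal."""
    return all(abs(p - q) < tol
               for p, q in zip(cone_responses(a), cone_responses(b)))

equal_energy = lambda lam: 1.0
print(stimuli_match(equal_energy, equal_energy))     # True: identical spectra
print(stimuli_match(equal_energy, lambda lam: 2.0))  # False: responses differ
```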
At this point, one might question why our eyes have evolved in such a way that
our photodetectors are so highly overlapping. From an information-processing
point of view, this suggests that the LMS cone signals are highly correlated and
not data-efficient. This is true, although processing at the ganglion level does
partially de-correlate the cone signals into opponent colors before transmission to
the brain. The high degree of cone overlap actually provides the human visual
system with a very high sensitivity to wavelength differences, achieved by comparing the
relative LMS signals against each other.
Imagine two monochromatic lights with the same radiant power at 540 nm and
560 nm. If these lights are viewed using the idealized cone spectral sensitivities
shown in Figure 7.2, only the middle sensitive cone will be excited. The integrated
Figure 7.2. Idealized cone spectral sensitivities. The cones evenly sample the visible
wavelengths and have minimal overlap.
response to both the lights will be the same, and the observer will not be able to
distinguish the color difference between the two light sources.
Now consider the same two light sources, and the real cone spectral sensitiv-
ities as shown in Figure 7.3. All three cones will respond to the 540 nm light,
while the L and M cones will respond to the 560 nm light. The relative LMS
responses to the two lights will be very different, however. By comparing the
Figure 7.3. Actual cone spectral sensitivities. Notice the high degree of overlap (correla-
tion) between the three cone types. (From [1091].)
relative responses, it is then possible for the visual system to distinguish the two
color stimuli as being different. This behavior is discussed in further detail in
Section 7.2.1.
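This discrimination argument is easy to verify numerically. In the sketch below, idealized non-overlapping "cones" (box functions, as in Figure 7.2) respond identically to 540 nm and 560 nm lights, while overlapping Gaussian sensitivities (hypothetical shapes, not the measured curves) yield distinct relative LMS triples:

```python
import math

def box(lam, lo, hi):
    """Idealized, non-overlapping sensitivity (cf. Figure 7.2)."""
    return 1.0 if lo <= lam < hi else 0.0

def gauss(lam, center, width):
    """Overlapping sensitivity; the shapes here are hypothetical."""
    return math.exp(-((lam - center) / width) ** 2)

def ideal_lms(lam):
    return (box(lam, 600, 700), box(lam, 500, 600), box(lam, 400, 500))

def overlapping_lms(lam):
    return (gauss(lam, 565, 50), gauss(lam, 540, 45), gauss(lam, 445, 30))

print(ideal_lms(540) == ideal_lms(560))              # True: indistinguishable
print(overlapping_lms(540) == overlapping_lms(560))  # False: distinguishable
```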
Figure 7.5. Iconic example of metamerism. The two color stimuli, represented by the solid
and dashed black lines, integrate to the same LMS signals and appear to match, despite
having markedly different spectral power distributions.
now conceive of creating a color match for any given spectral power distribution
by generating a different spectral power distribution that integrates to the
same LMS cone responses. An iconic example of this is shown in Figure 7.5.
values so that they can open that file on their own system. If the manufacturer of
the computer and LCD has excellent quality control, then it is possible that your
friend will see the same color that matches the flower on your desk. Of course,
because color perception depends on many factors other than just the display it-
self, such as whether the computer is viewed in the dark or in a brightly lit office,
there is no way to guarantee that your friend is seeing the same color.
Now we can imagine a device specifically designed to generate these types
of visual color matches. It could be as simple as a box with three light sources
and a screen. We can generate color matches by adjusting the relative amounts of
those three primaries and then send those coordinates to anyone else in the world
who has the same device. If the device is viewed in identical conditions, then
we can be relatively satisfied that when they dial in the same relative amounts
of the primaries, they will get a color match. With this concept, it is possible
to measure the color of any stimulus in the world simply by dialing in a color
match. This form of color measurement does not need to have any knowledge of
the actual cone spectral sensitivities, as long as we create the color matches with
a consistent set of primaries.
There are obvious practical limitations to this type of color measurement. For
one, it is quite inconvenient to have to create visual matches for all measurement
needs. There is also the problem with the difference between matches created by
different people. What matches for one person will quite possibly not match for
another person.
If we instead perform this type of visual color matching for a wide variety
of colored stimuli and a wide variety of observers, it is possible to generate an
“average” color-matching data set. Experimentally, this can be performed using a
set-up similar to that shown in Figure 7.6.
An observer looks at an opening in a wall and sees a circular patch with two
colors shown in a bipartite field. His task is to adjust the colors of one of the fields
such that it matches the other. Behind the wall are three light-source primaries on
one side and a single monochromatic light source on the other, all pointing at
a white screen. The observer can adjust the relative amounts of the three pri-
maries (without altering the spectral power distribution) to create a match for the
monochromatic light. He can then repeat this procedure across all wavelengths in
the visible spectrum.
In practice, as shown in Figure 7.6, it is necessary to also have adjustable
primaries in the reference field. This is because monochromatic light appears
very saturated to humans, and it is difficult to match with three broad primaries.
Adding light from the broad primaries to the monochromatic light will desaturate
the appearance. In essence, using Grassmann’s laws as described in Section 7.1,
Figure 7.6. A generic color-matching experiment. The observer adjusts the relative pri-
maries of the test field to create a match to the monochromatic illumination in the reference
field.
this is equivalent to subtracting light from the test field. Algebraically, a color
match can then be defined using (7.2), where R, G, and B are the test primaries; r,
g, and b are arbitrary units or amounts of those primaries; and λ is the monochromatic test light:

λ ≡ r R + g G + b B. (7.2)
These are exactly the experiments that were performed by two scientists
in the late 1920s. Wright performed this experiment with ten observers
in 1928–1929 using monochromatic primaries in the test field [1253]. Independently,
Guild performed a similar experiment with seven observers, using broadband
primaries in 1931 [405]. Both experiments used similar set-ups, with the
reference and test bipartite field subtending approximately two degrees (2◦ ) and
viewed in darkness. The two-degree field is the approximate size of the fovea in
the retina, as discussed in Section 4.2.7. These experiments generated two sets
of spectral tristimulus values that are essentially the relative amounts of each of
the RGB primaries needed to match any given wavelength. In (7.2), these are the
amounts of r, g, and b needed at each wavelength.
Figure 7.7. The spectral tristimulus values from the Wright and Guild experiments, aver-
aged and transformed to the primaries at 435.8, 546.1, and 700 nm. These spectral tristim-
ulus values are also referred to as the CIE r̄ (λ ), ḡ (λ ), b̄ (λ ) color-matching functions.
The spectral tristimulus values indicate the amount of each of the three pri-
maries (hence, tristimulus) necessary to generate a color match, for any given
spectral wavelength. For instance, at 440 nm, we would need approximately
−0.0026 units of R, 0.0015 units of G and 0.3123 units of B. These curves
can also be thought of as color-matching functions. We can determine the
amount of each primary necessary to create a color match by integrating the
spectral power distribution of the color stimuli with the color-matching functions.
These integrated values can then be thought of as tristimulus values. (Note, we no
longer need to refer to them as spectral.) The generic form of this calculation is:
R = ∫λ Φ (λ ) · r̄ (λ ) d λ , (7.3a)
G = ∫λ Φ (λ ) · ḡ (λ ) d λ , (7.3b)
B = ∫λ Φ (λ ) · b̄ (λ ) d λ . (7.3c)

Two stimuli, Φ1 (λ ) and Φ2 (λ ), then match when their integrated tristimulus
values are equal:

∫λ Φ1 (λ ) · r̄ (λ ) d λ = ∫λ Φ2 (λ ) · r̄ (λ ) d λ , (7.4a)
∫λ Φ1 (λ ) · ḡ (λ ) d λ = ∫λ Φ2 (λ ) · ḡ (λ ) d λ , (7.4b)
∫λ Φ1 (λ ) · b̄ (λ ) d λ = ∫λ Φ2 (λ ) · b̄ (λ ) d λ . (7.4c)
We now have a method for calculating when colors match, or we can measure
a color, based upon the spectral power distribution of the stimuli. It should be
possible to communicate the RGB tristimulus values to our friend and have them
generate a color match using their three monochromatic primaries at 435.8, 546.1,
and 700 nm. If they do not have access to those primaries, they can calculate the
appropriate linear transform (3 × 3 linear matrix) necessary to generate a match
with the primaries that they do have. One such transformation, to an imaginary
primary set, is discussed in Section 7.4.
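As a concrete sketch of the tristimulus integration in (7.3), the discrete form reduces to a sum of products. The color-matching-function samples below are invented, coarse stand-ins for the tabulated CIE r̄(λ), ḡ(λ), b̄(λ) data:

```python
# Invented 5-sample stand-ins for the CIE r, g, b color-matching functions.
wavelengths = [440, 490, 540, 590, 640]
cmf_r = [-0.003, -0.030, 0.010, 0.250, 0.200]
cmf_g = [ 0.002,  0.030, 0.200, 0.120, 0.020]
cmf_b = [ 0.310,  0.050, 0.010, 0.000, 0.000]

def tristimulus(spd):
    """R, G, B as discrete sums of Phi(lambda) * cmf(lambda), per (7.3)."""
    R = sum(p * r for p, r in zip(spd, cmf_r))
    G = sum(p * g for p, g in zip(spd, cmf_g))
    B = sum(p * b for p, b in zip(spd, cmf_b))
    return (R, G, B)

spd = [0.2, 0.5, 1.0, 0.5, 0.2]   # an arbitrary spectral power distribution
print(tristimulus(spd))
# Per (7.4), any other SPD producing this same (R, G, B) triple is a
# metamer of spd: the two stimuli match despite differing spectrally.
```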
Figure 7.8. The CIE 1931 x̄ (λ ), ȳ (λ ), and z̄ (λ ) color-matching functions.
⎡X⎤      1      ⎡0.4900   0.3100   0.2000 ⎤   ⎡R⎤
⎢Y⎥ = ───────── ⎢0.17697  0.81240  0.01063⎥ · ⎢G⎥ . (7.5)
⎣Z⎦    0.17697  ⎣0.0000   0.0100   0.9900 ⎦   ⎣B⎦
The scaling factor 0.17697 shown in this equation was used to normalize the
functions to the same units as CIE V (λ ). Applying (7.5) to the r̄ (λ ), ḡ (λ ), and
b̄ (λ ) color-matching functions gives us the x̄ (λ ), ȳ (λ ), and z̄ (λ ) color-matching
functions:
The very first thing to notice about these color-matching functions is that,
indeed, they are all positive. It should also be immediately obvious that these
functions do not look like the human cone spectral sensitivities, although we can
use them to calculate color matches in much the same way as we could with
cones. The ȳ (λ ) color-matching function should look almost identical to the
1924 photometric standard observer. These color-matching functions have come
to be known as the CIE 1931 standard observer, or similarly the CIE 2◦ -observer;
they are meant to represent the color-matching results of the average human pop-
ulation [186].
The term 2◦ -observer comes from the fact that both Wright and Guild used
a bipartite field that subtended approximately two degrees of visual angle, and
as such all the color matches were made using just the fovea. As we learned
in Section 4.3.1, this meant that the color matches were performed using only the
cones while eliminating any rod contribution. This was vitally important, because
Grassmann’s laws of additive color mixing can break down when there are both
rod and cone contributions to color mixing.
What this means is that the CIE 1931 standard observer is designed to predict
color matches of very small stimuli (think of a thumbnail at arm’s length). Another
important consideration is that the average human response used for color mea-
surements was taken from a pool of 17 observers, over 75 years ago! Despite the
age, and apparent limitations of the 1931 standard observer, it is still used with
great success today and is still the foundation of modern colorimetry.
Not willing to rest on their laurels with the introduction of the 1931 stan-
dard observer, the CIE continued to encourage active experimentation to both
validate the existing color-matching functions and to test the use of them for
larger color patches. Through observations led by Stiles at the National Physical
Laboratory in the UK, with ten observers using both 2◦ and 10◦ fields, it was
determined that the 1931 standard observer was appropriate for the small-field color
matches [1262].
There were discrepancies between the small field results and the large field re-
sults, and so further testing was required. Stiles and Burch proceeded to
measure an additional 49 observers using the 10◦ -bipartite field [1087]. The
Stiles and Burch data was collected at a relatively high luminance level, in an
attempt to minimize the contribution of the rods to the color matches. Some
computational techniques were also utilized to eliminate any remaining rod
contribution [93].
Around the same time, Speranskaya measured the color-matching functions
for 27 observers, also using a 10◦ viewing field [1074]. This data was measured
at a lower luminance level, and so the contributions of the rods were thought to
be higher. The CIE computationally removed the rod contribution and combined
these two data sets. A transformation to imaginary primaries, very similar to that
performed to generate the 1931 standard observer, followed. What resulted were
the 1964 CIE supplementary standard observer [186, 1262].
Like the 1931 standard observer, the 1964 standard observer is also commonly
referred to as the 10◦ -standard observer. The color-matching functions for the
10◦ observer are expressed as: x̄10 (λ ), ȳ10 (λ ), and z̄10 (λ ). These functions are
shown in Figure 7.9. For comparisons to the 1931 standard observer, both sets of
color-matching functions are shown in Figure 7.10.
The CIE recommends the use of the 1964 color-matching functions for all
color stimuli that are larger than 4◦ of visual angle. It is important to emphasize,
as shown in Figure 7.10, that the ȳ10 (λ ) color-matching function does not equal
the photometric observer (ȳ (λ ) or V (λ )), and that calculations using the 1964
standard observer do not directly translate into luminance measurements.
Figure 7.9. The CIE 1964 x̄10 (λ ), ȳ10 (λ ),and z̄10 (λ ) color-matching functions.
Figure 7.10. The CIE 1931 x̄ (λ ), ȳ (λ ), and z̄ (λ ) and the CIE 1964 x̄10 (λ ), ȳ10 (λ ), and
z̄10 (λ ) color-matching functions.
X = ∫λ x̄ (λ ) · Φ (λ ) d λ , (7.6a)
Y = ∫λ ȳ (λ ) · Φ (λ ) d λ , (7.6b)
Z = ∫λ z̄ (λ ) · Φ (λ ) d λ . (7.6c)
From these calculations, we can measure the “color” of any given stimulus,
Φ (λ ), and determine if two stimuli match. By definition, two stimuli will match
if they have identical tristimulus values. This match is considered to be valid for
the average population. It is also important to stress that this calculation of a
tristimulus value does not give us any insight into what the color actually looks
like, nor does it guarantee a match if any aspect of the viewing conditions changes.
In practice, we often want to make calculations for real objects, rather than
generic radiant spectral power distributions. Typically, these objects are reflect-
ing objects. In this case, the color stimulus, Φ (λ ), is calculated by multiplying
the spectral reflectance or reflectance factor of an object with the spectral power
distribution of the light source. In most cases, we use relative reflectance factors
that have been normalized between 0–1 by dividing by the reflectance of a perfect
reflecting diffuser. Likewise, as will be discussed in Section 9.1, we typically use
the normalized spectral power distribution of the light sources, which is defined
to be either 1.0 or 100 at 560 nm. The normalized reflectance power and spectral
power distribution of the light source are then multiplied by the chosen standard
observers and integrated to calculate tristimulus values. This relationship is shown
in Figure 7.11.
The CIE color-matching functions are defined from 360 to 800 nm, in 1 nm
intervals. In practice, light-source and reflectance data are often not available over this
full range, or at so fine a wavelength interval. Typically, tristimulus values are
calculated using a more limited range of data; often from 380–720 nm or 400–700
nm at 5 or 10 nm increments. We also typically measure what are known as rela-
tive tristimulus values, which are normalized to a Y of 100.0 for the light source
itself. The relative discrete tristimulus-value calculation for reflecting objects, as
seen in Figure 7.11 is shown in (7.8). Similar calculations can also be performed
for transmitting colors.
Figure 7.11. Tristimulus values X, Y , and Z for real objects are calculated by multiplying
the light source, S (λ ), with the reflectance factor, R (λ ), and with the color-matching
functions, x̄ (λ ), ȳ (λ ), and z̄ (λ ) and then integrating.
X = k ∑λ x̄ (λ ) S (λ ) R (λ ) Δλ , (7.8a)
Y = k ∑λ ȳ (λ ) S (λ ) R (λ ) Δλ , (7.8b)
Z = k ∑λ z̄ (λ ) S (λ ) R (λ ) Δλ , (7.8c)
k = 100 / ∑λ ȳ (λ ) S (λ ) Δλ . (7.8d)
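The discrete calculation of (7.8) for a reflecting object can be sketched directly. The light source, reflectance, and color-matching samples below are invented; real computations would use tabulated data at 5 or 10 nm increments:

```python
# Invented coarse samples (50 nm spacing) for illustration only.
S    = [0.80, 1.00, 1.00, 0.90, 0.80]   # light source, relative power
Rfl  = [0.20, 0.40, 0.70, 0.60, 0.30]   # reflectance factor, 0-1
xbar = [0.35, 0.05, 0.10, 0.90, 0.45]
ybar = [0.02, 0.20, 0.95, 0.75, 0.15]
zbar = [1.70, 0.25, 0.02, 0.00, 0.00]
dlam = 50.0                              # wavelength increment (nm)

# Normalization (7.8d): a perfect reflecting diffuser (R(lambda) = 1) gets Y = 100.
k = 100.0 / sum(y * s * dlam for y, s in zip(ybar, S))

# Relative tristimulus values (7.8a-c).
X = k * sum(x * s * r * dlam for x, s, r in zip(xbar, S, Rfl))
Y = k * sum(y * s * r * dlam for y, s, r in zip(ybar, S, Rfl))
Z = k * sum(z * s * r * dlam for z, s, r in zip(zbar, S, Rfl))
print(0.0 < Y < 100.0)   # True: a real reflectance yields Y below 100
```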
Figure 7.12. The visible spectrum displayed in CIE XYZ tristimulus space, seen from
different vantage points. This can also be considered a three-dimensional plot of the 2◦
color-matching functions. Its projection onto a 2D plane produces the more familiar
chromaticity diagrams.
do not represent color appearances, we can use them to represent a stimulus’ lo-
cation in a three-dimensional color space. Each of the axes in this space represents
one of the imaginary X, Y, and Z primaries, and a color stimulus’ location in the space
is given by its integrated tristimulus values. We can plot the location of the visible
spectrum to get an idea of the behavior and shape of the CIE XYZ space. Figure 7.12
shows the spectral colors, which are also the color-matching functions of the 1931
standard observer, plotted in XYZ tristimulus space.
Although we cannot ascertain the appearance of any color based upon its tris-
timulus values, we show the spectral lines in Figure 7.12 in color. While not en-
tirely appropriate and done mostly for illustrative purposes, the monochromatic
colors of the spectrum do not drastically change as a result of viewing conditions
and chromatic adaptation. In essence, we are assuming that the colors drawn are
those of the individual wavelengths when viewed in isolation, otherwise known
as unrelated colors.
It is difficult to visualize the shape of the spectral colors in the three-dimen-
sional space. It is also fairly difficult to determine where any given color lies in
the space. Often we are interested in knowing and specifying the general region
of space a color occupies, but in an easy to understand manner. We know, by
definition, that for the 1931 standard observer, the Y tristimulus value is a measure
of luminance, which is directly related to our perception of lightness.
The other two tristimulus values do not have such easily interpretable meanings.
By performing a projection from three to two dimensions, we can generate a space
that approximates chromatic information: information that is independent of lu-
minance. This two-dimensional projection is called a chromaticity diagram and is
obtained by performing a type of perspective transformation that normalizes the
tristimulus values and removes luminance information.
x = X / (X + Y + Z) , (7.10a)
y = Y / (X + Y + Z) , (7.10b)
z = Z / (X + Y + Z) , (7.10c)
x + y + z = 1. (7.10d)
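The projection of (7.10) is a one-line computation. A quick sketch, with arbitrary tristimulus values, confirms two properties: the chromaticity coordinates sum to one, and they are unchanged when the stimulus is scaled (luminance information is removed):

```python
def chromaticity(X, Y, Z):
    """Project XYZ tristimulus values to xyz chromaticity coordinates (7.10)."""
    s = X + Y + Z
    return (X / s, Y / s, Z / s)

x, y, z = chromaticity(41.2, 21.3, 1.9)   # arbitrary example values
print(abs(x + y + z - 1.0) < 1e-12)       # True: (7.10d)

x2, y2, z2 = chromaticity(2 * 41.2, 2 * 21.3, 2 * 1.9)
print(abs(x - x2) < 1e-12 and abs(y - y2) < 1e-12)  # True: scale-invariant
```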
Plotting the spectral colors in the chromaticity diagram results in the familiar
horseshoe-shaped curve. These spectral locations are referred to as the spectrum
locus, and they represent the boundary of physically realizable colors. The spec-
trum locus for the 1931 standard observer is shown in Figure 7.13. Once again,
in this figure we plot the spectral wavelengths in color. While this is somewhat
appropriate for the monochromatic lights (again, when viewed in isolation) it is
not appropriate to draw colors inside the spectral locus.
As we have emphasized in this chapter, colorimetry is not designed to predict
how colors appear, but rather just provide a means for measuring and describ-
ing color matches. Chromaticity diagrams have an even more limited description
Figure 7.13. The visible spectrum displayed in the CIE xy chromaticity diagram.
Figure 7.14. The location of various CIE standard illuminants in the 1931 xy chromaticity
diagram.
Figure 7.15. The chromaticity boundaries of the CIE RGB primaries at 435.8, 546.1, and
700 nm (solid) and a typical HDTV (dashed).
Figure 7.15 for the monochromatic primaries used to specify the CIE RGB color-
matching functions, as well as the primaries specified for a typical high-definition
television [536]. It is tempting, and erroneous, to refer to these triangles as the
gamut of the display system. This is very misleading, as the gamut of reproducible
colors must at the very least include the third dimension of luminance to have
any meaning. We can say that these triangles represent the range of all possible
chromaticities that we can create through additive mixing of the three primaries,
following Grassmann’s laws.
Another word of warning: Chromaticity diagrams, while sometimes useful,
should be used with diligence and care. They do not represent meaningful colors
and should not be used to specify what a color looks like. They do provide insight
into additive color mixing, which is why they are often still used in the display
industry. There is no reason to believe that equal distances, or areas for that matter,
should represent equal perceptual distances. In fact, we should be surprised if they
did. The CIE color-matching functions that they are based upon are
not meant to be human cone sensitivities (though we can consider them to be
approximate linear transformations from cones).
Nevertheless, there has been considerable research on creating chromaticity
diagrams that are more nearly uniformly spaced. In 1976, the CIE recommended
using the u′ v′ Uniform Chromaticity Scale (UCS) chromaticity diagram [186].
The general goal of this diagram was to be more uniformly spaced, such that
distances and areas held a closer relationship to actual perceptions. It should be
noted that just like the xy chromaticity diagram, the u′ v′ diagram does not contain
any luminance information. Thus, any distances that are meant to be perceptually
meaningful do not include the luminance relationships between the colors. This
may be acceptable when dealing with primaries or light sources that have similar
luminances, but it is not acceptable when examining those with widely disparate
luminances. The calculation of u′ v′ from CIE XYZ tristimulus values is given by
u′ = 4X / (X + 15Y + 3Z) , (7.12a)
v′ = 9Y / (X + 15Y + 3Z) . (7.12b)

In terms of xy chromaticity coordinates, the equivalent calculation is

u′ = 4x / (−2x + 12y + 3) , (7.13a)
v′ = 9y / (−2x + 12y + 3) . (7.13b)
that is not tied to any specific display device or specific set of primaries. Most
readers have probably encountered the situation where they create or examine an
image on one computer display and then look at the image on a second computer
display, only to find out that the image looks markedly different. Despite the
fact that the digital file contains the exact same red, green, and blue values, each
computer interprets these data differently. The images are assumed to be device-
dependent, essentially measurements tied to the specific device on which they
were created. Imagine if other measurements in the world were defined in such a
manner, for instance a unit of length based upon each individual’s foot size!
By removing the display device from the picture, we can specify color in a
meaningful way, based upon the imaginary but well-defined XYZ primaries. This
is similar to basing the measurement of length on an imaginary, but well-defined,
golden foot. We can then transform our colorimetric units into any other unit, or
any other display device, much like we can convert the foot to the meter, fathom,
or furlong. This is the basis of modern color management.
So how can one take advantage of the CIE system of colorimetry? When creat-
ing digital images, we can create them directly in tristimulus space, though due to
the imaginary nature of the space this may prove difficult. For computer-graphics
renderings that rely on physics, there are several ways to directly take advantage
of colorimetry. If, in the rendering process, all calculations are performed in the
spectral domain, including light source propagation as well as material and light
source interactions, then it is easy to integrate the resulting spectral radiance
values of the scene with the CIE x̄ (λ ), ȳ (λ ), and z̄ (λ ) color-matching functions,
as shown in (7.6). We will then have perceptually accurate measurements
of the rendered image.
But how do we display colorimetric data? For that, it is necessary to apply
another transformation of primaries.
Recall from Section 7.3, that the original color-matching data from Wright and
Guild were transformed to a common set of primaries before being averaged to-
gether. This transformation utilized Grassmann’s laws to express one set of spec-
tral tristimulus values in terms of another. This was accomplished using a linear
transform, which for a three-primary display is represented by a 3 × 3 matrix
transform. For additive color systems that obey Grassmann’s laws, the calcula-
tion of this transform is very straightforward.
Desktop display systems, such as CRTs, behave as additive systems and, as
such, calculating the transformation from XYZ tristimulus values into the native
Figure 7.17. The relative spectral power distribution of a representative CRT display.
In (7.15), the Xred , Yred , and Zred terms represent the colorimetric measurements
of the CRT red primary, while the same holds for green and blue. These are
measured by driving the display with the maximum RGB digital counts, typically
(255, 0, 0) for the red primary on an 8-bit system.
Examining the form of the matrix in (7.15), we can immediately see that the
tristimulus calculation of any color on the display is literally just an additive mix-
ture of the three primaries, e.g., Xcolor = Xred + Xgreen + Xblue . By inverting the
3 × 3 matrix, it is easy to calculate the RGB values necessary to display a given
tristimulus value:
⎡R⎤   ⎡Xred   Xgreen   Xblue ⎤ −1 ⎡X⎤
⎢G⎥ = ⎢Yred   Ygreen   Yblue ⎥    ⎢Y⎥ . (7.16)
⎣B⎦   ⎣Zred   Zgreen   Zblue ⎦    ⎣Z⎦
So in our computer rendering example, if we have rendered the spectral ra-
diance values, and then calculated the tristimulus values, we can apply (7.16) to
determine the RGB values to display on our CRT (again, temporarily ignoring the
nonlinear behavior). To display the same colors on a different device, it is not nec-
essary to re-render the scene, but rather just calculate a different transformation
matrix.
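A minimal sketch of this characterization, with invented primary measurements standing in for real colorimetric data. The matrix of (7.15) maps RGB drive values to XYZ; its inverse, as in (7.16), recovers the drive values for a target tristimulus value:

```python
def inverse3(m):
    """Invert a 3 x 3 matrix via the adjugate formula (no libraries needed)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# Columns: invented XYZ measurements of the red, green, and blue primaries
# at full drive (e.g., digital counts (255, 0, 0) for red on an 8-bit system).
M = [[0.41, 0.36, 0.18],
     [0.21, 0.72, 0.07],
     [0.02, 0.12, 0.95]]
M_inv = inverse3(M)

XYZ = matvec(M, [1.0, 1.0, 1.0])   # display "white": all primaries at maximum
RGB = matvec(M_inv, XYZ)           # (7.16): recover the drive values
print([round(c, 6) for c in RGB])  # [1.0, 1.0, 1.0]
```

To drive a different display, only M changes; the rendered tristimulus values stay the same.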
If it is not possible to directly measure the output display device, there are
assumptions that can be made. For instance, the primary set for HDTV (ITU-
R BT.709/2) is well defined [536]. Not coincidentally, this same primary set,
dubbed sRGB, was chosen to be the “standard” RGB for color management as
well as for the Internet [524]. If the primaries of the output display are unknown,
it is possible to assume that they are consistent with the sRGB primaries. The
transformation to sRGB from CIE XYZ tristimulus values is given by
⎡sR⎤   ⎡ 3.2410  −1.5374  −0.4986⎤ ⎡X⎤
⎢sG⎥ = ⎢−0.9692   1.8760   0.0416⎥ ⎢Y⎥ . (7.17)
⎣sB⎦   ⎣ 0.0556  −0.2040   1.0570⎦ ⎣Z⎦
mapping for this case is simple clipping, though more complicated algorithms are
in wide use (see Section 15.4).
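The transform of (7.17) followed by simple clipping can be sketched as below. The clamp to [0, 1] is the crude gamut mapping mentioned above; display encoding would additionally apply the sRGB nonlinearity, which is omitted here:

```python
# XYZ-to-linear-sRGB matrix from (7.17).
M_SRGB = [[ 3.2410, -1.5374, -0.4986],
          [-0.9692,  1.8760,  0.0416],
          [ 0.0556, -0.2040,  1.0570]]

def xyz_to_srgb_linear(X, Y, Z, clip=True):
    rgb = [m[0] * X + m[1] * Y + m[2] * Z for m in M_SRGB]
    if clip:
        # Crude gamut mapping: clamp each component to the displayable range.
        rgb = [min(1.0, max(0.0, c)) for c in rgb]
    return rgb

# A saturated bluish stimulus (invented values) falls outside the sRGB gamut:
print(xyz_to_srgb_linear(0.1, 0.05, 0.6, clip=False))  # red component < 0
print(xyz_to_srgb_linear(0.1, 0.05, 0.6))              # clipped into [0, 1]
```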
Figure 7.18. The r̄sRGB (λ ), ḡsRGB (λ ), and b̄sRGB (λ ) color-matching functions for the
sRGB primaries. Note the negative lobes.
transform from the desired primaries. The special case, where the spectral sen-
sitivities of the capture device are a direct linear transform from CIE XYZ tris-
timulus values, means that the device satisfies the Luther-Ives condition and is
colorimetric [538, 715].
where x j are three coefficients. The values for these coefficients can be found by
substituting this equation into (7.6), giving
3
X= x̄ (λ ) ∑ x j f j (λ ) d λ , (7.19a)
j=1
λ
3
Y= ȳ (λ ) ∑ x j f j (λ ) d λ , (7.19b)
j=1
λ
3
Z= z̄ (λ ) ∑ x j f j (λ ) d λ . (7.19c)
j=1
λ
c0 = X, (7.20a)
c1 = Y, (7.20b)
c2 = Z, (7.20c)
c̄0 = x̄, (7.20d)
c̄1 = ȳ, (7.20e)
c̄2 = z̄, (7.20f)
With the color-matching functions c̄i (λ ), the basis functions f j (λ ), and the XYZ
tristimulus value ci known, the coefficients x j can be computed. Plugging these
coefficients into (7.18) yields the desired spectral representation of the input XYZ
tristimulus value ci .
However, the choice of basis functions f j (λ ) is important, as it will have
a profound impact on the shape of the resulting spectrum. For computational
simplicity, a set of delta functions could be chosen [369]:
f j (λ ) = δ (λ − λ j ) = { 1 for λ = λ j , (7.22)
                           { 0 for λ ≠ λ j .
To achieve a numerically stable solution, the values for λ j can be chosen to co-
incide with the peaks in the color-matching functions c̄i , which are λ1 = 590 nm,
λ2 = 560 nm, and λ3 = 440 nm. Of course, with this approach the resulting
spectrum will only have three non-zero values. This is far removed from typical
spectra, which tend to be much smoother.
Alternatively, the following box functions could be used as basis functions
[414, 415]:
f j (λ ) = { 1 for λ j < λ < λ j+1 , (7.24)
           { 0 otherwise.
The boundaries of the intervals are given by λ1 = 400 nm, λ2 = 500 nm, λ3 = 600
nm, and λ4 = 700 nm. This approach yields spectra which vary abruptly at the
interval boundaries.
For use in rendering, it is desirable to create spectra which are relatively
smooth, especially if they represent reflectance functions of materials [1063]. To
enforce a smoothness constraint, the first three Fourier functions can be chosen as
basis functions [265, 1204]:
f 1 (λ ) = 1, (7.25a)
f 2 (λ ) = cos (2π (λ − λ min ) / (λ max − λ min )) , (7.25b)
f 3 (λ ) = sin (2π (λ − λ min ) / (λ max − λ min )) . (7.25c)
This approach leads to smooth spectra, but there is a significant probability that
parts of the spectra are negative, and they can therefore not be used. This tends to
happen most for highly saturated input colors. Although the negative parts of any
spectrum could be reset to zero, this would introduce undesirable errors.
An arguably better approach would be to parameterize the basis functions to
make them adaptable to some characteristic of the tristimulus value that is to be
converted to a spectrum. In particular, it would be desirable to create smooth
spectra for desaturated colors, whereas saturated colors would yield more spiked
spectra. As an example, Gaussian basis functions may be parameterized as fol-
lows [1106]:
f_j(λ) = exp( −ln 2 · ( 2(λ − λ_j) / w_j )² ).    (7.26)
Here, λ j indicates the center position of the jth Gaussian, and w j is its width. The
values λ j are λ1 = 680 nm, λ2 = 550 nm, and λ3 = 420 nm.
For a tristimulus value given in, for instance, the sRGB color space, a rough
indication of the degree of saturation is obtained by the following pair of values:
s1 = |sR − sG| / (sR + sG),    (7.27a)
s2 = |sB − sG| / (sB + sG).    (7.27b)
We can now compute the widths of the Gaussian basis functions by linearly in-
terpolating between the user-defined minimum and maximum Gaussian widths,
wmin = 10 nm and wmax = 150 nm.
This approach will produce smooth spectra for desaturated colors and progres-
sively more peaked spectra for saturated colors. As there are infinitely many
solutions to the conversion between XYZ tristimulus values and spectral repre-
sentations, this appears to be a reasonable solution, in keeping with many spectra
found in nature.
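The construction above can be sketched in a few lines of linear algebra. In this sketch the Gaussian color-matching functions are crude placeholders for the real CIE functions c̄i(λ), and the peak positions, widths, and target tristimulus value are illustrative only; what matters is the structure of the solve.

```python
import numpy as np

# Reconstruct a spectrum sum_j x_j f_j(lambda) whose projection onto the
# color-matching functions reproduces a target tristimulus value.
lam = np.arange(380.0, 781.0, 1.0)  # wavelengths in nm

def gauss(center, width):
    # Gaussian of Equation (7.26): full width at half maximum = width
    return np.exp(-np.log(2.0) * (2.0 * (lam - center) / width) ** 2)

# Placeholder color-matching functions (NOT the real CIE curves).
cmf = np.stack([gauss(590.0, 80.0), gauss(560.0, 80.0), gauss(440.0, 60.0)])
# Gaussian basis functions f_j at the center positions given in the text.
basis = np.stack([gauss(680.0, 80.0), gauss(550.0, 80.0), gauss(420.0, 80.0)])

A = cmf @ basis.T                      # A[i, j] = sum over lambda of cmf_i * f_j
target = np.array([40.0, 50.0, 30.0])  # illustrative tristimulus value
x = np.linalg.solve(A, target)         # coefficients x_j
spectrum = x @ basis                   # reconstructed spectral distribution
```

By construction, projecting the reconstructed spectrum back onto the color-matching functions returns the target tristimulus value; smoothness and positivity, however, depend entirely on the choice of basis, as discussed above.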
[Figure: the color-management pipeline. Input-device characterization maps device values into the device-independent CIE XYZ space; output-device characterization maps from CIE XYZ to the output device.]
Figure 7.20. The two-tone pattern on the left is easily recognized as a human face, whereas
the image on the right is not. (Gordon Kindlmann, Erik Reinhard, and Sarah Creem, “Face-
Based Luminance Matching for Perceptual Color Map Generation,” Proceedings of IEEE
Visualization, pp. 309–406, © 2002 IEEE.)
Figure 7.21. Example of varying the colors in face stimuli. Note the different cross-
over points for the three colors, where the perception of a face flips from the left image
to the right image. (Gordon Kindlmann, Erik Reinhard, and Sarah Creem, “Face-Based
Luminance Matching for Perceptual Color Map Generation,” Proceedings of IEEE Visualization, pp. 309–406, © 2002 IEEE.)
Figure 7.22. The two-tone pattern used in the minimally distinct border experiment.
(Gordon Kindlmann, Erik Reinhard, and Sarah Creem, “Face-Based Luminance Matching
for Perceptual Color Map Generation,” Proceedings of IEEE Visualization, pp. 309–406, © 2002 IEEE.)
figure is easily recognized as a human face, whereas inverting this pattern breaks
human face perception.
In this stimulus, it is possible to replace the black with a desired shade of
gray. The white is replaced with a color, and the observer is given control over
the intensity value of the color. By appropriately changing the intensity value of
the color, the stimulus that was initially seen as a face will disappear, and the
pattern on the other side will then be recognized as a face. The range of intensity
Figure 7.23. Mean and standard deviation for both the minimally distinct border and
the face-based luminance matching techniques. (Gordon Kindlmann, Erik Reinhard, and
Sarah Creem, “Face-Based Luminance Matching for Perceptual Color Map Generation,”
Proceedings of IEEE Visualization, pp. 309–406, © 2002 IEEE.)
Figure 7.24. Color maps generated using face-based luminance matching. (Gordon Kindl-
mann, Erik Reinhard, and Sarah Creem, “Face-Based Luminance Matching for Perceptual
Color Map Generation,” Proceedings of IEEE Visualization, pp. 309–406, © 2002 IEEE.)
values where both the left and right halves of the stimulus are perceived to be
ambiguous tends to be small, suggesting a high level of accuracy. An example of
a set of stimuli is shown in Figure 7.21.
In a user study, the face-based luminance-matching technique was compared
against a conventional minimally distinct border technique, using a stimulus that
can not be recognized as a human face, but has otherwise the same boundary
length and a similar irregular shape (as shown in Figure 7.22). The results, shown
in Figure 7.23, reveal that the two techniques produce the same mean values, but
that the face-based technique has a significantly higher accuracy.
Color maps generated with this technique are shown in Figure 7.24. This
figure shows that different observers generate somewhat different iso-luminant
color maps. The variances between different observers have been measured using
face-based luminance matching. Results for seven participants, each for normal
color vision, are shown in Figure 7.25. Aside from showing differences between
observers, this figure also shows that the standard rainbow color map is far from
iso-luminant.
It is also shown in Figure 7.24 that monotonically increasing color maps can
be constructed with this technique. This is achieved by choosing increasing lu-
minance values from left to right along the color scale. Thus, different colors
are matched to increasing luminance values, yielding the monotonically increas-
ing luminance map shown in this figure. Note that face-based techniques have
also been developed to test if a given color map is monotonically increasing or
not [974].
Figure 7.25. The perception of lightness varies between observers, as shown here for
18 different colors and seven participants. (Gordon Kindlmann, Erik Reinhard, and Sarah
Creem, “Face-Based Luminance Matching for Perceptual Color Map Generation,” Proceedings of IEEE Visualization, pp. 309–406, © 2002 IEEE.)
Chapter 8
Color Spaces
As shown in Chapter 7, due to the trichromatic nature of the human visual sys-
tem, color may be represented with a color vector consisting of three numbers,
or tristimulus values. The CIE XYZ color space is derived from spectral power
distributions by means of integration with three color-matching functions. These
color-matching functions can be considered approximate linear transformations
of the human cone sensitivities. For the XYZ color space, the Y channel was
derived to be identical with the photopic luminance response function.
Closely tied to all color-matching functions, and their corresponding tristim-
ulus values, are the primaries or light sources that can be used to re-derive the
functions in a color-matching experiment. One of the main goals in creating the
CIE XYZ color space was to have tristimulus values that are all positive. In
order to do that, it was necessary to select an imaginary set of primaries. Essen-
tially, this means that it is not possible to construct a set of only three positive
light sources that, when combined, are able to reproduce all colors in the visible
spectrum.
In general practice, the XYZ color space may not be suitable for performing
image calculations or encoding data, as there are a large number of possible XYZ
values that do not correspond to any physical color. This can lead to inefficient
use of available storage bits and generally requires a higher processing bit-depth
to preserve visual integrity. It is not possible to create a color display device that
corresponds directly to CIE XYZ; it is possible to create a capture device that
has sensor responsivities that are close to the CIE XYZ color-matching functions,
though due to cost of both hardware and image processing, this is not very com-
mon. Therefore, since the corresponding primaries are imaginary, a transcoding
into a specific device space is necessary. Depending on the desired task, the use
of a different color space may be more appropriate. Some of the reasons for the
development of different color spaces are:
Physical realizability. Camera and display manufacturers build sensors and dis-
plays that have specific responsivities (cameras) and primaries (displays).
The output of any given camera is often implicitly encoded in a color space
that corresponds to the camera’s hardware, firmware, and software. For in-
stance, most digital sensors capture data linearly with respect to amounts
of light, but this data is gamma corrected by compressing the signal using
a power function less than 1.0. This gamma-corrected data may or may
not still correspond to the physical RGB sensors used to capture the image
by the camera. A display device such as a cathode ray tube has its own
device-dependent color space, defined by its phosphors; if the signal sent
to it is not in that color space, it must first be transformed, or color
reproduction will be impaired. In a
closed system, it is possible to directly capture images using a camera that
is specifically tied to an output device, removing the need for transforming
the colors between capture and display.
Efficient encoding. Some color spaces were developed specifically to enable ef-
ficient encoding and transmission of the data. This encoding and trans-
mission may rely on certain behaviors of the human visual system, such
as our decrease in contrast sensitivity for chromatic signals compared to
achromatic signals. In particular, the color encodings used by the different
television signals (NTSC, PAL, SECAM) exploit this behavior of the visual
system in their underlying color models.
Perceptual uniformity. For many applications, we would like to measure the per-
ceived differences between pairs of colors. In most linear color spaces,
such as CIE XYZ, a simple Euclidean distance does not correspond to the
perceptual difference. However, color spaces can be designed to be percep-
tually uniform, such that the Euclidean distance in those color spaces is a
good measure of perceptual difference.
Intuitive color specification. Many color spaces are designed to be closely tied
to output display primaries, such as the red, green, and blue primaries of a
typical CRT display. However, the mapping from such values encoded in
these spaces to the appearance of a color can be very non-intuitive. Some
color spaces are designed to provide a more intuitive meaning to allow for
ease of specifying the desired color. These spaces may encode the data in
ways that more closely match our perception of color appearance, using
terms such as hue, saturation, and lightness (HSL).
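To make the perceptual-uniformity point concrete, the sketch below implements the standard CIE 1976 L*a*b* transform from XYZ, in which Euclidean distance approximates perceived difference far better than it does in XYZ. The function name and the default D65 white point are our choices for illustration.

```python
def xyz_to_lab(X, Y, Z, Xn=95.047, Yn=100.0, Zn=108.883):
    # CIE 1976 L*a*b*; (Xn, Yn, Zn) is the reference white (here D65).
    def f(t):
        # Cube root with a linear segment near black, per the CIE definition.
        if t > (6.0 / 29.0) ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * (6.0 / 29.0) ** 2) + 4.0 / 29.0
    fx, fy, fz = f(X / Xn), f(Y / Yn), f(Z / Zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

The reference white itself maps to (100, 0, 0), and the Euclidean difference between two L*a*b* vectors is the familiar ΔE*ab measure.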
Color spaces are often characterized by their bounding volumes, or the range
of colors that they are capable of representing. This is often referred to as the
gamut of colors that a space can represent. This characterization is only valid if
the values that color vectors can store are limited, for example to positive values
in the range between 0 and 1. For instance, many (but not all) color spaces do not
allow negative values—typically because the encoded values correspond to the
light energy coming from a specific set of output primaries, and light cannot be
negative. We generally assume that only output devices have color gamuts, while
capture devices have a dynamic range or limitations in their spectral sensitivity.
This is often an area of great philosophical and semantic debate and discussion in
the color-imaging community.
The conversion of colors represented in one gamut to another, often smaller,
gamut is a frequently occurring problem in color reproduction. This field of study
is known as gamut mapping. There are many ways to remap colors from a larger
to a smaller gamut, with some of the possible approaches outlined in Section 15.4.
As display devices and color printers get more sophisticated, the opposite problem
is also becoming an active area of research, often called gamut expansion. Moving
colors that have been encoded in a limited range to a device capable of displaying
a much larger range poses problems both similar to and distinct from those of
traditional gamut mapping.
The CIE XYZ color space is still actively used for converting between differ-
ent color spaces, as many color spaces are actually defined by means of a trans-
form to and from XYZ. We can think of CIE XYZ as a device-independent method
for representing colors, and any given output device can be described by its rela-
tionship to XYZ. It is important to stress that when a color space that corresponds
to physical primaries has a well-documented and specified relationship with CIE
XYZ, then this device can also be thought of as device-independent. Although the
space may be tied to physically realizable primaries, it is still possible to specify
any color match using that space (allowing for both positive and negative encoded
values), or also to derive a unique set of color-matching functions that correspond
to those primaries.
For simple, linear and additive trichromatic display devices the transformation
to and from CIE XYZ can usually be given by a 3 × 3 matrix. This transforma-
tion relies on Grassmann’s Laws of additive color mixing. Often, color spaces are
defined by the linear relationship to CIE XYZ, with additional non-linear process-
ing. In most cases, such non-linear processing is designed to minimize perceptual
errors when storing in a limited bit-depth without ample color precision, or to di-
     R        G        B        White
x    0.6400   0.3000   0.1500   0.3127
y    0.3300   0.6000   0.0600   0.3290
Table 8.1. The xy chromaticity coordinates for the primaries and white point specified by
ITU-R BT.709. The sRGB standard also uses these primaries and white point.
XW = xR SR + xG SG + xB SB , (8.1a)
YW = yR SR + yG SG + yB SB , (8.1b)
ZW = zR SR + zG SG + zB SB . (8.1c)
⎡ (xw / yw) Yw ⎤   ⎡ xr / yr   xg / yg   xb / yb ⎤ ⎡ Yr   0    0  ⎤ ⎡ R ⎤
⎢      Yw      ⎥ = ⎢    1         1         1    ⎥ ⎢ 0    Yg   0  ⎥ ⎢ G ⎥ .   (8.4)
⎣ (zw / yw) Yw ⎦   ⎣ zr / yr   zg / yg   zb / yb ⎦ ⎣ 0    0    Yb ⎦ ⎣ B ⎦
This expansion still requires knowledge of both the luminance of the white, Yw,
and the luminances of the red, green, and blue channels, Yr, Yg, Yb. Since we do
not have this information, we can calculate the luminance ratios Y^R that would
be necessary to obtain the chromaticity of the given white point from the chro-
maticities of the red, green, and blue channels we are given. First, we assume
that the maximum luminance Yw occurs when R = G = B = 1, and that it has a
luminance ratio of 1.0. Equation (8.4) can then be reduced to
⎡ xw / yw ⎤   ⎡ xr / yr   xg / yg   xb / yb ⎤ ⎡ Yr^R ⎤
⎢    1    ⎥ = ⎢    1         1         1    ⎥ ⎢ Yg^R ⎥ .   (8.5)
⎣ zw / yw ⎦   ⎣ zr / yr   zg / yg   zb / yb ⎦ ⎣ Yb^R ⎦
We can solve for the luminance ratios, represented by Yr^R, Yg^R, and Yb^R, by
inverting the chromaticity matrix, as shown in Equation (8.6a). We then use these
luminance ratios to solve for the 3 × 3 matrix from Equation (8.3) that will trans-
form from device RGB into XYZ. This technique is also useful when you want
to force a set of primaries with given chromaticities into having a specific white
point chromaticity:
⎡ Yr^R ⎤   ⎡ xr / yr   xg / yg   xb / yb ⎤⁻¹ ⎡ xw / yw ⎤
⎢ Yg^R ⎥ = ⎢    1         1         1    ⎥   ⎢    1    ⎥ ;   (8.6a)
⎣ Yb^R ⎦   ⎣ zr / yr   zg / yg   zb / yb ⎦   ⎣ zw / yw ⎦

⎡ XRmax   XGmax   XBmax ⎤   ⎡ xr / yr   xg / yg   xb / yb ⎤ ⎡ Yr^R    0      0   ⎤
⎢ YRmax   YGmax   YBmax ⎥ = ⎢    1         1         1    ⎥ ⎢  0     Yg^R    0   ⎥ .   (8.6b)
⎣ ZRmax   ZGmax   ZBmax ⎦   ⎣ zr / yr   zg / yg   zb / yb ⎦ ⎣  0      0     Yb^R ⎦
Y = yR SR + yG SG + yB SB . (8.7)
It is important to note that we can also calculate the white point of any given
display by calculating the XYZ tristimulus values of the maximum RGB values:
⎡ XW ⎤   ⎡ 0.4124   0.3576   0.1805 ⎤ ⎡ Rmax ⎤
⎢ YW ⎥ = ⎢ 0.2126   0.7152   0.0722 ⎥ ⎢ Gmax ⎥ .   (8.10)
⎣ ZW ⎦   ⎣ 0.0193   0.1192   0.9505 ⎦ ⎣ Bmax ⎦
The ITU-R BT.709 primaries have a defined white point equivalent to that of CIE
D65:
⎡ XW^709 ⎤   ⎡ 0.4124   0.3576   0.1805 ⎤ ⎡ 100 ⎤   ⎡  95.05 ⎤
⎢ YW^709 ⎥ = ⎢ 0.2126   0.7152   0.0722 ⎥ ⎢ 100 ⎥ = ⎢ 100.00 ⎥ .   (8.11)
⎣ ZW^709 ⎦   ⎣ 0.0193   0.1192   0.9505 ⎦ ⎣ 100 ⎦   ⎣ 108.90 ⎦
This tristimulus value is equivalent to the chromaticity values of the white point
listed in Table 8.1.
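The derivation of Equations (8.4) through (8.6) condenses into a short numerical sketch (the function name is ours). Fed the ITU-R BT.709 / sRGB chromaticities of Table 8.1, it reproduces the matrix used in Equations (8.10) and (8.11):

```python
import numpy as np

def rgb_to_xyz_matrix(xy_rgb, xy_white):
    """Build the RGB-to-XYZ matrix from primary and white chromaticities.

    Implements Equations (8.5)-(8.6): solve for the luminance ratios that
    make R = G = B = 1 produce the requested white point, then scale the
    chromaticity-matrix columns by those ratios.
    """
    # Columns are (x/y, 1, z/y) for each primary, with z = 1 - x - y.
    C = np.array([[x / y, 1.0, (1.0 - x - y) / y] for x, y in xy_rgb]).T
    xw, yw = xy_white
    w = np.array([xw / yw, 1.0, (1.0 - xw - yw) / yw])
    Y_ratio = np.linalg.solve(C, w)      # Equation (8.6a)
    return C * Y_ratio                   # Equation (8.6b): scale each column

# ITU-R BT.709 / sRGB primaries and D65 white point (Table 8.1).
M = rgb_to_xyz_matrix([(0.6400, 0.3300), (0.3000, 0.6000), (0.1500, 0.0600)],
                      (0.3127, 0.3290))
```

Multiplying the result by (1, 1, 1)ᵀ returns the D65 white of Equation (8.11), and its middle row contains the luminance weights Yr, Yg, Yb.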
In this chapter, many of the more common color spaces used in various indus-
tries are discussed. These are generally grouped into device-specific and device-
independent color spaces. While it is tempting to group the color spaces that are
defined by a specific (and real) primary set as being device-dependent, if they
have a well-defined transform to and from a device-independent space such as
CIE XYZ, then these spaces are indeed device-independent. To facilitate imple-
mentation and ease of communication between the wide variety of color spaces,
we include transformation matrices where possible.
implicitly define the primaries and native white points of display devices. Typical
additive display devices use a red, a green, and a blue primary. However, different
display devices can have very disparate primaries, even if they are all described
as being RGB. As a result, each display device has a different range of colors,
or color gamut, that it is able to display. Nonetheless, these color spaces are all
generically referred to as RGB color spaces.
It is important to remember that, as a result, there is no such thing as the RGB
color space. Every device has its own color space, and these classes of RGB
color spaces are therefore called device-dependent; if we send the same RGB
values to each of the displays we will see different colors depending on their
choice of primaries. This is in contrast to the XYZ color space, which is well
defined, based on a set of imaginary primaries, and is an example of a device-
independent color space. The practical implication is that an image can appear
very different depending on the display, unless care is taken to process the data
prior to output. The process of measuring how a display handles color is called
device-characterization and is necessary to achieve color matches between differ-
ent devices (see Section 14.18).
To enable some consistency in how images appear on a display, the images
can be converted to a standardized set of primaries, with a standardized white
point. This standardized set can be specified by the target display, e.g., a display
requesting that all images be described using the ITU-R BT.709 primaries, which
then performs an internal conversion from this space into the primaries of the
display, or this conversion can be done using the model of modern color manage-
ment. This model is achieved by converting the image into a device-independent
space such as the XYZ or CIELAB color space and, subsequently, to the RGB
color space of the target display. All too often, in general practice, such correc-
tion does not happen and images appear different on different displays. This may
be because the color space of the image is unknown or the primaries and white
point of the display are unknown.
Some standardization has occurred in recent years, for instance, through the
use of the sRGB color space. Some cameras are able to output images in this
color space, and some display devices are sRGB compliant. This means that in
an imaging pipeline that employs the sRGB color space throughout, conversion
between different device-dependent RGB color spaces is unnecessary. We can
think of using the sRGB color space in this manner as a means for a default form
of color management. If all devices work in sRGB, then all devices theoretically
should match each other. This makes the sRGB color space suitable for such
consumer applications as shooting a digital image and displaying this image on a
CRT, or placing it on a website and having it appear somewhat consistent wherever
it is viewed. If the output device is not assuming an sRGB input, however,
then the colors may mismatch.
The sRGB color space, like many other RGB color spaces, is defined by a
3 × 3 matrix as well as a non-linear luminance encoding. A maximum luminance
of 80 cd/m2 is assumed, as this was representative of a typical display when the
sRGB space was created. The matrix transform between linearized sRGB and
XYZ is the same as that specified by ITU-R BT.709, as given in (8.8). The non-
linear encoding is given by
RsRGB = { 1.055 R^(1/2.4) − 0.055   for R > 0.0031308,
          12.92 R                   for R ≤ 0.0031308;    (8.12a)
GsRGB = { 1.055 G^(1/2.4) − 0.055   for G > 0.0031308,
          12.92 G                   for G ≤ 0.0031308;    (8.12b)
BsRGB = { 1.055 B^(1/2.4) − 0.055   for B > 0.0031308,
          12.92 B                   for B ≤ 0.0031308.    (8.12c)
The non-linear encoding, which is also known as gamma encoding, and which
is therefore frequently confused with gamma correction, is required to minimize
quantization errors in digital applications. Since sRGB is an 8-bit quantized color
space, it is beneficial to have a non-linear relationship between the color values
and the luminance values that they represent. By compressing the data with a
power function less than 1.0, we use more bits of precision for the darker colors
where color differences are more perceptible. This helps keep quantization errors
below the visible threshold where possible. It is important to note that although
gamma correction and gamma encoding are different entities, the ultimate behav-
ior is very similar. Gamma correction is necessary when an output display device
responds non-linearly with regard to the input signal. Typical CRTs have a non-
linear relationship between input voltage and output luminance, often described
by a power function greater than 1.0 (generally around 2.4). Gamma correction
is necessary to compress the signal prior to sending it to a display, in order to get
a linear, or close to linear, response out. Since sRGB was defined to be represen-
tative of a typical CRT display, the gamma encoding also serves the purpose of
gamma correction and the non-linearly encoded data can be sent directly to the
output device.
For very small values in the sRGB space, the encoding is linear, as this pro-
duces better behavior for near-black values. When we combine the linear compo-
nent with the exponent of 1/2.4, we get a behavior that is very similar to having
just an exponent of 1/2.2, although it deviates for the dark colors. In practice, this
short linear ramp is often abandoned for simplicity of implementation, and a sin-
gle exponent of 1/2.2 is used. Note that this simplified formulation is a deviation
from the sRGB specification and is therefore not recommended.
The non-linear encoding given for the sRGB space can be parameterized as in
Equation (8.13) [882]:
Rnonlinear = { (1 + f) R^γ − f   for t < R ≤ 1,
               s R               for 0 ≤ R ≤ t;    (8.13a)
Gnonlinear = { (1 + f) G^γ − f   for t < G ≤ 1,
               s G               for 0 ≤ G ≤ t;    (8.13b)
Bnonlinear = { (1 + f) B^γ − f   for t < B ≤ 1,
               s B               for 0 ≤ B ≤ t.    (8.13c)
Together with specifications for the primaries and the white point, the parameters
f , s, and t specify a class of RGB color spaces that are used in various industries.
The value of s determines the slope of the linear segment, and f is a small off-
set. The value of t determines where the linear slope changes into the non-linear
encoding.
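A minimal sketch of Equation (8.13), with sRGB recovered by substituting its parameter values (the function names are ours):

```python
def encode(v, gamma, f, s, t):
    # Parameterized non-linear encoding of Equation (8.13);
    # v is a linear R, G, or B value in [0, 1].
    return s * v if v <= t else (1.0 + f) * v ** gamma - f

def encode_srgb(v):
    # The sRGB instance, Equation (8.12):
    # gamma = 1/2.4, f = 0.055, s = 12.92, t = 0.0031308.
    return encode(v, 1.0 / 2.4, 0.055, 12.92, 0.0031308)

def decode_srgb(v):
    # Inverse of the sRGB encoding; 0.04045 is the standard
    # threshold of the inverse (approximately 12.92 * t).
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4
```

Encoding and then decoding a value round-trips it, which makes the pair easy to sanity-check in an implementation.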
Table 8.2 lists the primaries and white points of a collection of commonly
encountered device RGB spaces. The associated conversion matrices and non-
linearities are given in Table 8.3. The two-dimensional gamuts spanned by each
color space are shown in Figure 8.1, as projections in the CIE u′v′ chromaticity
diagrams. The projected gamut for the HDTV color space is identical to the sRGB
standard, and it is therefore not shown again. It is important to stress that device
gamuts are limited by the range of colors that a real output device can create; how-
ever, the encoding color space inherently has no gamut boundaries. Only when we
impose a limitation on the range of values that can be encoded, such as [0, 1] or
[0, 255], do color spaces themselves have gamuts. It is also important to point
out that the two-dimensional gamut projection into a chromaticity diagram is in-
herently simplifying gamut descriptions. Since we need at least three dimensions
to fully specify color appearance (see Chapter 11) color gamuts should also be
at least three-dimensional. Since the non-linear encoding of HDTV and sRGB is
different, it is possible for these spaces to have different gamuts when represented
in a three-dimensional space, such as CIELAB. For additive display devices, how-
ever, the triangle formed by the primaries in a chromaticity diagram is often used
in industrial applications to represent the ultimate gamut of the device.
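With that caveat noted, the industrial two-dimensional test reduces to a point-in-triangle check on chromaticity coordinates, sketched here with a sign-of-cross-product test (the function names are ours):

```python
def in_chromaticity_triangle(p, r, g, b):
    # True if chromaticity p = (x, y) lies inside (or on the edge of)
    # the triangle spanned by the primaries r, g, b in the same diagram.
    def cross(o, a, c):
        # z-component of the 2D cross product (a - o) x (c - o)
        return (a[0] - o[0]) * (c[1] - o[1]) - (a[1] - o[1]) * (c[0] - o[0])
    d1, d2, d3 = cross(r, g, p), cross(g, b, p), cross(b, r, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside iff p is on the same side of all three edges.
    return not (has_neg and has_pos)
```

For the BT.709 primaries of Table 8.1, the D65 white point tests as inside, while a chromaticity such as (0.7, 0.2) falls outside the triangle.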
The Adobe RGB color space was formerly known as SMPTE-240M, but was
renamed after SMPTE’s gamut was reduced. It has a larger gamut than sRGB, as
shown in the chromaticity diagrams of Figure 8.1. This color space was developed
with the printing industry in mind, as there were many colors that could be printed
that could not be represented by a smaller gamut color space. It is important to
note that when a limited precision is used, e.g., 8 bits for each color channel,
an extended gamut inherently means there will be fewer bits for all colors, and
quantization may occur. Thus, it is important to choose an encoding color space
that is representative of the most likely or most important colors, in order not
to “waste bits” where they are not needed. Many digital cameras today provide
an option to output images in either the Adobe RGB color space or sRGB.
High definition television (HDTV) and sRGB standards specify identical pri-
maries, but they differ in their definition of viewing conditions. This difference
is represented by the non-linear encoding. The sRGB space has a linear segment
and a power function of 1/2.4, while the ITU-R BT.709 has a linear segment and
power function of 1/2.2. These linear segments make the effective non-linearity
approximately 1/2.2 for sRGB and 1/2 for HDTV. Thus, if improper assumptions
are made between these color spaces, color mismatches can occur. The American
National Television Systems Committee (NTSC) standard was created in 1953
and has been used as the color space for TV in North America. As phosphors
Table 8.2. Chromaticity coordinates for the primaries and white points of several com-
monly encountered RGB color spaces.
Color space: sRGB
  XYZ to RGB:  ⎡  3.2405  −1.5371  −0.4985 ⎤    RGB to XYZ:  ⎡ 0.4124  0.3576  0.1805 ⎤
               ⎢ −0.9693   1.8760   0.0416 ⎥                 ⎢ 0.2126  0.7152  0.0722 ⎥
               ⎣  0.0556  −0.2040   1.0572 ⎦                 ⎣ 0.0193  0.1192  0.9505 ⎦
  Non-linear transform: γ = 1/2.4 ≈ 0.42, f = 0.055, s = 12.92, t = 0.0031308

Color space: Adobe RGB
  XYZ to RGB (first row):  2.0414  −0.5649  −0.3447, …
  RGB to XYZ (first row):  0.5767   0.1856   0.1882, …
  Non-linear transform: γ = 1/(2 + 51/256) ≈ 1/2.2
Table 8.3. Transformations for standard RGB color spaces (after [882]).
and other display technologies now allow much more saturated colors, it has been
deprecated and replaced with SMPTE-C to match phosphors in current display
devices, which are more efficient and brighter. Phase alternation line (PAL) and
Séquentiel Couleur à Mémoire (SECAM) are the standards used for television in
Europe. These spaces are discussed further in Section 8.4.
Finally, the Wide gamut color space is shown for comparison [882]. Its pri-
maries are monochromatic light sources with wavelengths of 450, 525, and 700
nm. By moving the primaries closer to the spectrum locus, a larger triangle of rep-
resentable colors is formed. In the limit, the primaries become monochromatic,
as in the case of this color space.
Figure 8.1. CIE (u′, v′) chromaticity diagrams showing the color gamuts for various color
spaces.
The shift of primaries toward the spectrum locus yields primaries that, when
plotted against wavelength, are more peaked. This is where the name sharpen-
ing stems from. Hence, color spaces formed by employing primaries with such
peaked response functions are called sharpened color spaces. The associated
widening of the gamut is beneficial in several applications, including white bal-
ancing, color appearance modeling (Chapter 11), and rendering (Section 8.12).
Transforms for sharpened color spaces are introduced in Section 8.6.2.
Finally, to demonstrate that conversion between color spaces is necessary for
correct color reproduction, Figure 8.2 shows what happens when an image’s pri-
maries are misinterpreted. In this figure, we have converted an image between
different RGB color spaces. This has the same effect as displaying an image
on different monitors without adjusting for the monitors’ primaries. The error
created with this procedure is what is commonly seen when color conversion is
omitted. Note that even if care is taken to transform image data between color
spaces, it is also important to assure that white points are correctly mapped
together.
Figure 8.2. A set of images obtained by converting to different RGB color spaces. These
images demonstrate the types of error that can be expected if the primaries of a display
device are not taken into account. From left to right and top to bottom: the original image
(assumed to be in sRGB space) and the image converted to Adobe RGB, NTSC, PAL, and
SMPTE-C spaces; Zaragoza, Spain, November 2006.
For instance, when moving from a space with a white point of D65 to
one with a white point of D50, we cannot just move into CIE XYZ and then back
out into the new RGB color space. We must first perform a chromatic-adaptation
transform between the white points, as discussed in Chapter 10.
8.2 Printers
Printers are one technology that can be used to produce hardcopy images. We
use the term hardcopy for printed images because once an image is created it
cannot be changed, unlike the softcopy display device. Typically a printed image
C = 1 − R, (8.14a)
M = 1 − G, (8.14b)
Y = 1 − B; (8.14c)
and
R = 1 −C, (8.15a)
G = 1 − M, (8.15b)
B = 1 −Y. (8.15c)
The process therefore removes some of the cyan, magenta, and yellow ink, and
replaces it with black. The most that can be removed is the minimum of the three
C, M, and Y inks, so that neutral colors are preserved. The new values for C, M,
and Y are as they were before, but with the value of K subtracted. Note that it
would be possible to choose any value for K as long as
C = C′ + K,    (8.18a)
M = M′ + K,    (8.18b)
Y = Y′ + K,    (8.18c)
where C′, M′, and Y′ denote the new ink values.
We must again stress that the CMY(K) values obtained in this manner are directly
related to the primaries and white point of the original (perhaps unknown) RGB
color space. These primaries will no doubt be very different from the CMY(K)
primaries of whichever printing device will ultimately be used, and so the results
of this type of RGB to CMY(K) conversion can only be considered approximate
at best.
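Equations (8.14) and (8.18) combine into the familiar naive conversion, sketched below. As stressed above, it ignores the actual printer primaries and should be treated as approximate (the function name is ours):

```python
def rgb_to_cmyk(r, g, b):
    # Naive RGB -> CMYK: complement the channels (Equation (8.14)),
    # then remove the largest common component as black, K.
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c, m, y)          # the most ink that can be replaced by black
    return c - k, m - k, y - k, k
```

White maps to no ink at all, black to pure K, and a pure primary such as red to its two complementary inks.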
The color gamuts of the CMY(K) spaces are typically said to be much smaller
than the gamuts of RGB devices. This means that there are ranges of colors that
can be displayed on a monitor that can not be reproduced in print. Section 15.4
discusses gamut mapping, a process that maps a larger gamut to a smaller one. It
is important to stress that not all printers have a smaller gamut than all additive
display devices, especially for colors closely associated with the printer primaries,
such as bright yellows or dark cyan colors. The ultimate gamut of a printer device
is heavily tied to the materials used to create the printed colors. For example, the
same printer may produce a gamut much smaller than a typical LCD display when
printing on plain or matte paper, but may have a gamut much larger than the LCD
when printing on a high-quality glossy paper. In some instances, gamut-mapping
algorithms may be best served by both reducing a larger gamut to a smaller one in
some areas, and mapping a smaller to a larger (gamut expansion) in others. This
just suggests that overarching statements such as “displays have larger gamuts
than printers” should be met with a high degree of skepticism.
More recently, printers are using more than four inks to achieve a higher de-
gree of color fidelity. For instance, some desktop printers have added two extra
inks for light cyan and light magenta. The resulting device-dependent colorant
space is known as CcMmYK, in which the c and m indicate light cyan and light
magenta, respectively. This may be done to reduce the visibility of the half-toning
process, and to create a more uniform printed surface. Other multiple ink config-
urations, such as the addition of green, red, orange, blue, or even clear inks, are
certainly possible as well. These inks can be added to increase the overall gamut
of the printer, increase the color constancy of the printer, or to increase overall
image quality (perhaps by reducing the size of the halftone dots).
an XYZ color to dot percentages for each of the three colorants in terms of the
eight Neugebauer primaries (which need to be measured for these equations to be
useful).
As the solution is non-linear, an analytical inversion of the printer model can
only be achieved if simplifying assumptions are made. Otherwise, regression
techniques can be used to express each of the three dot percentages in terms of
X, Y , and Z. First, we will assume that the color of paper results in a tristimulus
value of (XW ,YW , ZW ) = (1, 1, 1). Second, we assume that the densities of inks
are additive, i.e., X12 = X1 X2 and so on for all other ink combinations [912–914].
This allows the Neugebauer equations to be factored as follows [1287]:
X = (1 − c1 + c1 X1 ) (1 − c2 + c2 X2 ) (1 − c3 + c3 X3 ) , (8.21a)
Y = (1 − c1 + c1 Y1 ) (1 − c2 + c2 Y2 ) (1 − c3 + c3 Y3 ) , (8.21b)
Z = (1 − c1 + c1 Z1 ) (1 − c2 + c2 Z2 ) (1 − c3 + c3 Z3 ) . (8.21c)
If magenta and yellow are assumed to be fully transparent in the X band (X2 = X3 = 1), and yellow in the Y band (Y3 = 1), the factored equations reduce to
X = (1 − c1 + c1 X1 ) , (8.22a)
Y = (1 − c1 + c1 Y1 ) (1 − c2 + c2 Y2 ) , (8.22b)
Z = (1 − c1 + c1 Z1 ) (1 − c2 + c2 Z2 ) (1 − c3 + c3 Z3 ) . (8.22c)
Assuming further that each ink absorbs its own band completely (X1 = Y2 = Z3 = 0), the system can be inverted sequentially, one ink at a time:
1 − c1 = X, (8.23a)
1 − c2 = Y / (Y1 + (1 − Y1 ) X), (8.23b)
1 − c3 = Z / [(Z1 + (1 − Z1 ) X) (Z2 + (1 − Z2 )(1 − c2 ))]. (8.23c)
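The sequential inversion can be sketched numerically. This is a hypothetical example, assuming each ink completely absorbs its own band (X1 = Y2 = Z3 = 0) and is transparent elsewhere; the primary values Y1, Z1, Z2 below are illustrative, not measured data.

```python
def neugebauer_forward(c1, c2, c3, Y1, Z1, Z2):
    """Predict XYZ from dot percentages via (8.22), with X1 = Y2 = Z3 = 0."""
    X = 1.0 - c1
    Y = (1.0 - c1 + c1 * Y1) * (1.0 - c2)
    Z = (1.0 - c1 + c1 * Z1) * (1.0 - c2 + c2 * Z2) * (1.0 - c3)
    return X, Y, Z

def neugebauer_invert(X, Y, Z, Y1, Z1, Z2):
    """Recover dot percentages from XYZ via (8.23), one ink at a time."""
    c1 = 1.0 - X
    c2 = 1.0 - Y / (Y1 + (1.0 - Y1) * X)
    c3 = 1.0 - Z / ((Z1 + (1.0 - Z1) * X) * (Z2 + (1.0 - Z2) * (1.0 - c2)))
    return c1, c2, c3

# Round trip: invert the model prediction and recover the dot percentages.
c = (0.3, 0.5, 0.2)
XYZ = neugebauer_forward(*c, Y1=0.4, Z1=0.6, Z2=0.5)
recovered = neugebauer_invert(*XYZ, Y1=0.4, Z1=0.6, Z2=0.5)
```

Note that c2 must be recovered before c3 can be computed, which is why the inversion is inherently sequential.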
The three-ink model may be extended to four colors, for use with CMYK
printer models [426]. While the procedure is straightforward [577], the equations
become much more difficult to invert so that numerical techniques are required.
Real printing processes often deviate from the Demichel-Neugebauer model.
The above equations therefore have to be used with care. The Neugebauer printer
model can suffer from accuracy problems when the general assumptions are sim-
plified beyond physical reality.
Various extensions have been proposed [540, 907], including the Clapper-
Yule model for multiple inks [196, 197, 1287] and the spectral Neugebauer equa-
tions [1182]. The Clapper-Yule model itself was extended to model printing with
where c̄i is the specified dot percentage, and γi is an ink-specific parameter that
can be derived from experimental data [1093]. Again, it is important to stress
that this non-linear exponent is only an empirical approximation to better predict
printed results and should not be taken as a model of a real physical process.
A further extension takes into account trapping. This is a phenomenon
whereby the amount of ink that sticks to the page is less than the amount of ink
deposited. Trapping thus reduces the dot percentage of coverage by a fraction t_i^p,
where the superscript p indicates that ink i sticks directly to the paper. For overprinting,
the fraction t_i^j models the trapping of ink i printed on top of ink j. The Neugebauer
equations (given only for X) are modified accordingly [1093].
The Neugebauer equations assume that the primaries of the printing inks are
known, or that they can be measured. This may not always be practical to do. In
that case, it may be possible to compute an approximation of the primaries using
a more complicated non-linear model such as the Kubelka-Munk theory used to
calculate the absorption and scattering of translucent and opaque pigments [1093]
(see Section 3.4.6).
Figure 8.3. The images on the left have all three channels subsampled, whereas the images
on the right have only their chrominance channels subsampled. From top to bottom the
subsampling is by factors of 2, 4, 8, 16, and 32; Long Ashton, Bristol, UK.
improvements over Yxy in this regard. Examples are the CIE Yuv and CIE Yu′v′ uni-
form chromaticity scale (UCS) diagrams, and also the CIE L∗ a∗ b∗ (CIELAB) and
CIE L∗ u∗ v∗ (CIELUV) color spaces. It should be noted that although the CIELAB
color space was developed as a perceptually uniform luminance-chrominance
space, it does not have a defined chromaticity diagram that goes along with it.
The Yuv chromaticity coordinates are derived from the (x, y) chromaticities:
u = 2x / (6y − x + 1.5), (8.26a)
v = 3y / (6y − x + 1.5). (8.26b)
A further refinement was introduced by the CIE Yu′v′ color space, achieving yet
greater uniformity in its chromaticity coordinates:
u′ = 2x / (6y − x + 1.5), (8.27a)
v′ = 4.5y / (6y − x + 1.5). (8.27b)
Note that the only difference between Yuv and Yu′v′ is between the v and v′ channels,
whereby v′ = 1.5 v. Finally, as both these color spaces are derived from CIE
XYZ, they can also be considered well-defined device-independent color
spaces. It should also be noted that the Yu′v′ space has generally superseded the
Yuv space, although for historical compatibility the latter is still used for calculating
correlated color temperatures.
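The relation v′ = 1.5v can be checked directly from (8.26) and (8.27); the D65 chromaticities below are standard values:

```python
def xy_to_uv(x, y):
    """CIE 1960 (u, v) chromaticities per (8.26)."""
    d = 6.0 * y - x + 1.5
    return 2.0 * x / d, 3.0 * y / d

def xy_to_upvp(x, y):
    """CIE 1976 (u', v') chromaticities per (8.27)."""
    d = 6.0 * y - x + 1.5
    return 2.0 * x / d, 4.5 * y / d

# D65 white point chromaticities.
x, y = 0.3127, 0.3290
u, v = xy_to_uv(x, y)
up, vp = xy_to_upvp(x, y)
```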
To ensure that this series of events leads to an image that is reasonably close
to the image captured at the start of this sequence, standard definitions for each of
the processing steps are required. Such standards pertain to the color spaces used
at various stages, the encoding of the signal, sample rates, the number of pixels in
the image, and the order in which they are scanned.
Unfortunately, there is no single standard, and different broadcasting systems
are currently in use. For instance, for scanning, North America and Japan use the
480i29.97 system, whereas most of the rest of the world uses the 576i25 system.1
The notation refers to the number of active scanlines (480 or 576), the i refers to
interlaced video, followed by the frame rate. Note that the North American system
has in total 525 scanlines. However, only 480 of them appear on screen. Aside
from interlaced video, progressive scan systems are also in use. For instance, high
definition television (HDTV) includes standards for both: 720p60 and 1080i30.
The transmission of signals requires the conversion of a captured RGB signal
into a space that is convenient for transmission. Typically, a non-linear transfer
function is applied first, leading to an R′G′B′ signal. The general form of this
non-linearity is given in (8.13). Following Charles Poynton, in this section primed
quantities denote non-linear signals [923].
The luminance-chrominance color spaces are suitable, as they allow the
chrominance channels to be compressed, saving transmission bandwidth by ex-
ploiting the human visual system’s relatively poor color acuity. Converting from
a linear RGB space to one of these spaces yields luminance and chrominance
signals. However, if the input to the transform is deliberately non-linear, as in
R′G′B′, the conversion does not result in luminance and chrominance, but in a
non-linear luminance and chrominance representation. These are called luma and
chroma. Note that the non-linear luma representation is a rough approximation to
the color appearance description of lightness, though the chroma bears little to no
resemblance to the color appearance attribute of chroma. These appearance terms
are defined later in Chapter 11.
An image encoded with chroma subsampling is shown in Figure 8.3. The
sampling is usually described by a three-part ratio x:y:z, which encodes the
horizontal and vertical subsampling. The first number represents the horizontal
sampling reference of luma with respect to a base sampling rate of 3.375
MHz; thus, a 4:4:4 signal has a luma sampling rate of 13.5 MHz. The second
number gives the horizontal subsampling factor of chroma with respect to the first
number. The third number is either the same as the second, or it is zero, in which case
the chroma channels are additionally subsampled vertically by a factor of two (as in 4:2:0).
1 We use the notation introduced by Charles Poynton in Digital Video and HDTV: Algorithms and
Interfaces [923].
Figure 8.4. Four instances of chroma subsampling: 4:4:4, 4:2:2, 4:1:1, and 4:2:0.
Table 8.4. Primaries, non-linear transfer function, and luma encoding for the different
television systems currently in use (after [923]).
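The chroma averaging underlying these schemes can be illustrated with a toy box filter (real encoders use better filters); the array sizes below are arbitrary:

```python
import numpy as np

def subsample_chroma_422(chroma):
    """Average horizontal pairs of chroma samples (4:2:2-style subsampling).
    Illustrative box filter only; practical codecs use longer filters."""
    h, w = chroma.shape
    return chroma.reshape(h, w // 2, 2).mean(axis=2)

luma = np.random.rand(4, 8)  # full-resolution luma stays untouched
cb = np.random.rand(4, 8)    # one chroma channel
cb_sub = subsample_chroma_422(cb)
```

The luma plane keeps its full resolution, while each chroma plane loses half its horizontal samples.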
8.4.1 EBU Y′U′V′
Starting with linear RGB values, the EBU standard, which is used in European
PAL and SECAM color encodings, assumes that the D65 white point is used,
as well as the primaries listed under PAL/SECAM in Table 8.2. The associated
transformation matrix, as well as the parameters for the non-linear transfer func-
tion are listed under the PAL/SECAM entry in Table 8.3. The R′G′B′ signal is
then converted to Y′U′V′ transmission primaries. Here, the chromatic channels
are formed by subtracting luma from blue and red, respectively [975]:
U = 0.492111 (B′ − Y′), (8.28a)
V = 0.877283 (R′ − Y′). (8.28b)
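A sketch of (8.28), assuming the Rec. 601 luma weights (0.299, 0.587, 0.114) for Y′, as used for these systems; a neutral input should carry zero chroma:

```python
def ebu_yuv(rp, gp, bp):
    """Compute Y', U, V from non-linear R'G'B' per (8.28).
    The luma weights are the Rec. 601 values (an assumption here)."""
    yp = 0.299 * rp + 0.587 * gp + 0.114 * bp
    u = 0.492111 * (bp - yp)
    v = 0.877283 * (rp - yp)
    return yp, u, v

# A neutral gray carries no chroma: U = V = 0.
yp, u, v = ebu_yuv(0.5, 0.5, 0.5)
```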
Figure 8.5. Visualization of the Y′IQ color channels. The original image is shown in the
top-left panel. The Y′ channel is shown in the top right, and the bottom two images show
the Y′+I channels (left) and Y′+Q channels (right); Zaragoza, Spain, November 2006.
Figure 8.6. Standard NTSC color bars along with their corresponding luminance bars.
The I channel encodes the orange-blue axis, while Q encodes the purple-green axis
(see Figure 8.5 for a visualization). The reason for this choice is to ensure backward compatibil-
ity (the Y channel alone could be used by black and white displays and receivers)
and to utilize peculiarities in human perception to transmit the most visually pleas-
ing signal at the lowest bandwidth cost. Human visual perception is much more
sensitive to changes in luminance detail than to changes in color detail. Thus,
the majority of bandwidth is allocated to encoding luminance information and the
rest is used to encode both chrominance information and audio data. The color
coding is given by
\[
\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} =
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
0.596 & -0.275 & -0.321 \\
0.212 & -0.523 & 0.311
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}, \tag{8.32}
\]
and its inverse transform is
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix}
1.000 & 0.956 & 0.621 \\
1.000 & -0.272 & -0.647 \\
1.000 & -1.107 & 1.704
\end{bmatrix}
\begin{bmatrix} Y \\ I \\ Q \end{bmatrix}. \tag{8.33}
\]
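A quick check of (8.32) and (8.33); because the printed entries are rounded to three decimals, the round trip is only accurate to about 0.01:

```python
import numpy as np

# Matrices of (8.32) and (8.33), at the printed precision.
M_YIQ = np.array([[0.299, 0.587, 0.114],
                  [0.596, -0.275, -0.321],
                  [0.212, -0.523, 0.311]])
M_RGB = np.array([[1.000, 0.956, 0.621],
                  [1.000, -0.272, -0.647],
                  [1.000, -1.107, 1.704]])

rgb = np.array([0.2, 0.5, 0.8])
yiq = M_YIQ @ rgb       # forward transform
rgb_back = M_RGB @ yiq  # approximate inverse
```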
The standard NTSC color bars are shown alongside their luminance bars in
Figure 8.6.
For practical purposes, the transform between EBU and NTSC transmission
primaries is possible because their RGB primaries are relatively similar. The
transform is given by
\[
\begin{bmatrix} I \\ Q \end{bmatrix} =
\begin{bmatrix} -0.547 & 0.843 \\ 0.831 & 0.547 \end{bmatrix}
\begin{bmatrix} U \\ V \end{bmatrix}; \tag{8.34a}
\]
\[
\begin{bmatrix} U \\ V \end{bmatrix} =
\begin{bmatrix} -0.547 & 0.843 \\ 0.831 & 0.547 \end{bmatrix}
\begin{bmatrix} I \\ Q \end{bmatrix}. \tag{8.34b}
\]
Note that the reverse transform uses the same matrix. The conversion between
NTSC and EBU primaries and back is given by
⎡ ⎤ ⎡ ⎤⎡ ⎤
RNTSC 0.6984 0.2388 0.0319 REBU
⎣GNTSC ⎦ = ⎣ 0.0193 1.0727 0.0596⎦ ⎣GEBU ⎦ ; (8.35a)
BNTSC 0.0169 0.0525 0.8459 BEBU
⎡ ⎤ ⎡ ⎤⎡ ⎤
REBU 1.4425 −0.3173 −0.0769 RNTSC
⎣GEBU ⎦ = ⎣−0.0275 0.9350 0.0670⎦ ⎣GNTSC ⎦ . (8.35b)
BEBU −0.0272 −0.0518 1.1081 BNTSC
8.4.6 SECAM Y′DBDR
The transmission encoding of SECAM is similar to the YIQ color space. Its
conversion from RGB is given by
⎡ ⎤ ⎡ ⎤⎡ ⎤
Y 0.299 0.587 0.114 R
⎣DB ⎦ = ⎣−0.450 −0.883 1.333⎦ ⎣G ⎦ ; (8.42a)
DR −1.333 1.116 0.217 B
⎡ ⎤ ⎡ ⎤⎡ ⎤
R 1.0000 0.0001 −0.5259 Y
⎣G ⎦ = ⎣ 1.0000 −0.1291 0.2679⎦ ⎣DB ⎦ . (8.42b)
B 1.0000 0.6647 −0.0001 DR
This color space is similar to the YUV color space, and the chromatic channels
DB and DR may therefore also be directly derived as follows:
DB = 3.059 U, (8.43a)
DR = −2.169 V. (8.43b)
The Kodak Y′CC color space includes an explicit quantization into an 8-bit for-
mat. The Y′CC channels are quantized using
\[
\begin{bmatrix} Y'_8 \\ C_{1,8} \\ C_{2,8} \end{bmatrix} =
\begin{bmatrix} (255/1.402)\, Y' \\ 111.40\, C_1 + 156 \\ 135.64\, C_2 + 137 \end{bmatrix}. \tag{8.45}
\]
color spaces are generally device-dependent. If these spaces are defined using a
specific set of RGB primaries, they could be considered to be device-independent,
though in practice this is rarely done. Despite the suggestive names given to the
axes, the hue, saturation, and lightness do not correlate well with the perceptual
attributes of hue, saturation, and lightness as defined in Chapter 11. In particular,
the lightness axis (or value, or brightness, or whatever other name is given to
this channel) is not linear in any perceived quantity. To compute lightness, one
should consider computing XYZ tristimulus values first and then converting to an
appropriate perceptually uniform color space, such as CIE L∗ a∗ b∗ .
Further, as the computation for hue is usually split into 60◦ segments, one may
expect discontinuities in color space at these hue angles. Computations in these
color spaces are complicated by the discontinuity that arises at the 0◦ / 360◦ hue
angle [922].
In summary, the HSL-related color spaces discussed in the following sections
could be used for selecting device-dependent colors, for instance in drawing pro-
grams. Nonetheless, it should be noted that the meaning of the axes does not
correlate well with perceptual attributes of the same names, and care must be
taken not to confuse these spaces with more advanced color appearance spaces.
8.5.1 HSI
Replacing the lightness axis with an intensity axis, which is computed somewhat
differently, we arrive at the HSI space. The minimum of the three R, G, and B
values, vmin = min(R, G, B), is used as an intermediary variable:
I = (R + G + B) / 3, (8.50a)
S = 1 − vmin / I, (8.50b)
h = (1/360) cos⁻¹ [ ((R − G) + (R − B)) / (2 [(R − G)² + (R − B)(G − B)]^{1/2}) ], (8.50c)
\[
H = \begin{cases}
1 - h & \text{if } B/I > G/I \wedge S > 0,\\
h & \text{if } B/I \le G/I \wedge S > 0,\\
\text{undefined} & \text{if } S = 0.
\end{cases} \tag{8.50d}
\]
The hue angle H is undefined for colors with S = 0. This stems from the fact that
in this case the color is monochromatic, and defining a value for hue would not
be meaningful.
The conversion from HSI to RGB begins by converting H to degrees:
H′ = 360 H. (8.51)
An RGB triplet is then formed as follows:
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{cases}
\begin{bmatrix} \frac{1}{3}\left(1 + \dfrac{S \cos(H')}{\cos(60^\circ - H')}\right) \\ 1 - (B + R) \\ \frac{1}{3}\,(1 - S) \end{bmatrix} & \text{if } 0 < H' \le 120^\circ,\\[3ex]
\begin{bmatrix} \frac{1}{3}\,(1 - S) \\ \frac{1}{3}\left(1 + \dfrac{S \cos(H' - 120^\circ)}{\cos(180^\circ - H')}\right) \\ 1 - (R + G) \end{bmatrix} & \text{if } 120^\circ < H' \le 240^\circ,\\[3ex]
\begin{bmatrix} 1 - (G + B) \\ \frac{1}{3}\,(1 - S) \\ \frac{1}{3}\left(1 + \dfrac{S \cos(H' - 240^\circ)}{\cos(300^\circ - H')}\right) \end{bmatrix} & \text{if } 240^\circ < H' \le 360^\circ.
\end{cases} \tag{8.52}
\]
It should be noted that some of the R, G, and B values depend on the prior com-
putation of other values in the same triplet. This implies that the components
must be computed sequentially, in the order given for each case.
8.5.2 HSV
The conversion between RGB and HSV begins by finding the minimum and max-
imum values of the RGB triplet:
vmin = min (R, G, B) , (8.53a)
vmax = max (R, G, B) . (8.53b)
Saturation S and value V are defined in terms of these values:
vmax − vmin
S= , (8.54a)
vmax
V = vmax . (8.54b)
Of course, when R = G = B we have a gray value, and thereby a color for which
the saturation is 0, since vmax = vmin. In that case, hue H is undefined. Otherwise,
hue is computed as an angle ranging between 0° and 360°:
\[
H = \begin{cases}
60\, \dfrac{G - B}{v_{max} - v_{min}} & \text{if } R = v_{max},\\[1.5ex]
60 \left( 2 + \dfrac{B - R}{v_{max} - v_{min}} \right) & \text{if } G = v_{max},\\[1.5ex]
60 \left( 4 + \dfrac{R - G}{v_{max} - v_{min}} \right) & \text{if } B = v_{max}.
\end{cases} \tag{8.55}
\]
If the hue angle is less than 0 after this computation, the angle should be brought
within range by adding 360°. The inverse transformation, from HSV back to
RGB, begins by dividing the hue angle by 60:
H′ = H / 60. (8.56)
The fractional part of H′ is given by
f = H′ − ⌊H′⌋. (8.57)
Three intermediary values are then computed:
a = V (1 − S), (8.58a)
b = V (1 − S f ), (8.58b)
c = V (1 − S (1 − f )) . (8.58c)
Figure 8.7. Hue angles (right) computed for the image on the left; Zaragoza, Spain,
November 2006.
Figure 8.8. Cardinal hue angles. From top to bottom, left to right: original image, fol-
lowed by images with hue set to 0◦ , 60◦ , 120◦ , 180◦ , 240◦ , and 300◦ ; Zaragoza, Spain,
November 2006.
The integer part ⌊H′⌋ determines in which of the six sextants of the hue circle
the angle H′ lies. It is used to select which values to assign to R, G, and B:
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{cases}
\begin{bmatrix} V & c & a \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 0,\\
\begin{bmatrix} b & V & a \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 1,\\
\begin{bmatrix} a & V & c \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 2,\\
\begin{bmatrix} a & b & V \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 3,\\
\begin{bmatrix} c & a & V \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 4,\\
\begin{bmatrix} V & a & b \end{bmatrix}^T & \text{if } \lfloor H' \rfloor = 5.
\end{cases} \tag{8.59}
\]
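The forward and inverse transforms of (8.53)-(8.59) can be sketched as follows (pure grays are excluded, since hue is undefined there):

```python
import math

def rgb_to_hsv(r, g, b):
    """RGB -> HSV per (8.53)-(8.55); assumes the input is not a pure gray."""
    vmin, vmax = min(r, g, b), max(r, g, b)
    s = (vmax - vmin) / vmax
    v = vmax
    if r == vmax:
        h = 60.0 * (g - b) / (vmax - vmin)
    elif g == vmax:
        h = 60.0 * (2.0 + (b - r) / (vmax - vmin))
    else:
        h = 60.0 * (4.0 + (r - g) / (vmax - vmin))
    if h < 0.0:
        h += 360.0
    return h, s, v

def hsv_to_rgb(h, s, v):
    """HSV -> RGB per (8.56)-(8.59)."""
    hp = h / 60.0
    k = math.floor(hp) % 6          # sextant index
    f = hp - math.floor(hp)          # fractional part (8.57)
    a = v * (1.0 - s)
    b = v * (1.0 - s * f)
    c = v * (1.0 - s * (1.0 - f))
    return [(v, c, a), (b, v, a), (a, v, c),
            (a, b, v), (c, a, v), (v, a, b)][k]

rgb = (0.2, 0.4, 0.6)
back = hsv_to_rgb(*rgb_to_hsv(*rgb))
```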
Figure 8.9. Relative response functions for the LMS cone-excitation space (after [318]).
⎡ ⎤ ⎡ ⎤⎡ ⎤
X 1.9102 −1.1121 0.2019 L
⎣Y ⎦ = ⎣ 0.3710 0.6291 0.0000⎦ ⎣M ⎦ . (8.60b)
Z 0.0000 0.0000 1.0000 S
As the LMS cone space represents the response of the cones in the human visual
system, it is a useful starting place for computational models of human vision.
It is also a component in the CIECAM02 and iCAM color appearance models
(see Chapter 11). The relative response as a function of wavelength is plotted in
Figure 8.9.
Figure 8.10. Relative response functions for the Bradford chromatic-adaptation transform
(after [318]).
Figure 8.11. Relative response functions for the CAT02 chromatic-adaptation transform.
\[
M_{CAT02} = \begin{bmatrix}
0.7328 & 0.4296 & -0.1624 \\
-0.7036 & 1.6975 & 0.0061 \\
0.0030 & 0.0136 & 0.9834
\end{bmatrix}; \tag{8.62a}
\]
\[
M_{CAT02}^{-1} = \begin{bmatrix}
1.0961 & -0.2789 & 0.1827 \\
0.4544 & 0.4735 & 0.0721 \\
-0.0096 & -0.0057 & 1.0153
\end{bmatrix}. \tag{8.62b}
\]
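A quick numerical check that the matrix and inverse of (8.62) are consistent; since the entries are rounded to four decimals, their product matches the identity only to roughly 10⁻³:

```python
import numpy as np

M = np.array([[0.7328, 0.4296, -0.1624],
              [-0.7036, 1.6975, 0.0061],
              [0.0030, 0.0136, 0.9834]])
M_inv = np.array([[1.0961, -0.2789, 0.1827],
                  [0.4544, 0.4735, 0.0721],
                  [-0.0096, -0.0057, 1.0153]])

# Product of the rounded inverse with the forward matrix.
product = M_inv @ M
```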
For any cell in the human visual system that linearly combines cone signals, there
will be an azimuth and elevation pair (φ0 , θ0 ) defining a null plane. The cell will
not respond to any colors lying in this plane. Modulating colors along a direction
perpendicular to this plane will maximally excite this cell. For a given color
encoded by (φ , θ ), the response R of a cell in spikes per second is
\[
f(r) = \begin{cases}
\sqrt[3]{r} & \text{for } r > 0.008856,\\[1ex]
7.787\, r + \dfrac{16}{116} & \text{for } r \le 0.008856.
\end{cases} \tag{8.67}
\]
The letter E stands for difference in sensation (in German, Empfindung) [560].
2 This inverse transform is approximate, but widely used in practice. It would only be inaccurate
for colors that are both very dark and saturated. Such colors are rare in nature.
Figure 8.12. An example of the use of color-difference metrics. Here the ΔE∗ab metric is
applied to a pair of images, with the top-left image encoded in a lossless file format (PPM),
whereas the top-right image was stored as a JPEG file. The bottom image shows the per-
pixel ΔE∗ab color difference; Wick, UK, October 2006.
shades indicating larger errors. This image shows qualitatively where in the im-
age the largest errors occur. For quantitative evaluation of errors, the actual ΔE∗ab
values should be examined.
It should be noted that this color-difference metric is only approximately lin-
ear with human visual perception. The reason is that the laboratory conditions
under which this difference metric was tested, namely observing flat reflective
samples on a uniform background, have little bearing on typical imaging applica-
tions. Hence, this is a case where laboratory conditions do not extend to the real
world. Also, it should be stressed that the CIELAB color-difference equations
were not designed for predicting very large color differences. Further develop-
ments have led to improved color difference metrics, and these are outlined in
Section 8.8.
[Figure: the chroma C∗ab and hue angle hab shown in the (a∗, b∗) plane; +b∗ indicates yellowness, −b∗ blueness.]
ΔH∗ab = [(ΔE∗ab)² − (ΔL∗)² − (ΔC∗ab)²]^{1/2}, (8.71a)
ΔE∗ab = [(ΔL∗)² + (ΔC∗ab)² + (ΔH∗ab)²]^{1/2}. (8.71b)
Figure 8.14. Tolerances in (a∗ ,b∗ ) space yield axis-aligned error metrics, whereas in
CIELAB L∗ C∗ hab space, the error metric is aligned with the principal axis of the ellipsoids.
tolerances, the use of weighted values for lightness, hue, and chroma yields closer
approximations to these elliptical areas of equal tolerance [93]. This is visualized
in Figure 8.14. Thus, color-difference metrics using well-chosen weights on light-
ness, hue, and chroma differences are considered to give more accurate results
than the color-difference formulae associated with the CIELAB and CIELUV
color spaces.
The inverse transform from L∗C∗hab to CIE L∗a∗b∗ is given by
L∗ = L∗, (8.72a)
a∗ = C∗ cos (hab), (8.72b)
b∗ = C∗ sin (hab). (8.72c)
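The relationship between (8.71a) and (8.71b) can be verified numerically; the Lab values below are arbitrary illustrative samples:

```python
import math

def delta_h_ab(lab1, lab2):
    """Hue difference per (8.71a): the part of dE not explained by dL and dC."""
    dE2 = sum((p - q) ** 2 for p, q in zip(lab1, lab2))
    dL = lab1[0] - lab2[0]
    dC = math.hypot(lab1[1], lab1[2]) - math.hypot(lab2[1], lab2[2])
    return math.sqrt(max(dE2 - dL ** 2 - dC ** 2, 0.0))

lab1 = (52.0, 20.0, 30.0)
lab2 = (50.0, 25.0, 25.0)
dH = delta_h_ab(lab1, lab2)
dL = lab1[0] - lab2[0]
dC = math.hypot(lab1[1], lab1[2]) - math.hypot(lab2[1], lab2[2])
dE = math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))
```

Reassembling the three components per (8.71b) recovers ΔE∗ab exactly.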
The primed quantities in the above equations are computed from (X, Y, Z) as
follows:
u′ = 4X / (X + 15Y + 3Z), u′n = 4Xn / (Xn + 15Yn + 3Zn); (8.74a)
v′ = 9Y / (X + 15Y + 3Z), v′n = 9Yn / (Xn + 15Yn + 3Zn). (8.74b)
The inverse transform from CIE L∗u∗v∗ to CIE XYZ begins by computing the
luminance channel Y,
Y = (Yn / 100) ((L∗ + 16) / 116)³, (8.75)
and three intermediary parameters,
a = u∗ / (13 L∗) + u′n, (8.76a)
b = v∗ / (13 L∗) + v′n, (8.76b)
c = 3Y (5b − 3). (8.76c)
The Z coordinate is then
Z = ((a − 4) c − 15abY) / (12b), (8.77)
and the X coordinate is
X = −(c/b + 3Z). (8.78)
The transformation to CIE L∗ u∗ v∗ creates a more or less uniform color space,
such that equal distances anywhere within this space encode equal perceived color
differences. It is therefore possible to measure the difference between two stimuli
(L1∗ , u∗1 , v∗1 ) and (L2∗ , u∗2 , v∗2 ) by encoding them in CIELUV space, and applying the
following color difference formula:
ΔE∗uv = [(ΔL∗)² + (Δu∗)² + (Δv∗)²]^{1/2}, (8.79)
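A round trip through CIELUV per (8.74)-(8.78) can be sketched as follows. The forward equations for L∗, u∗, and v∗ are not reproduced above; the standard CIELUV definitions are assumed here, with Y normalized so that the white point has Y = 1 (absorbing the Yn/100 factor of (8.75)), and a D65 white point is assumed:

```python
XYZ_N = (0.9505, 1.0, 1.089)  # assumed D65 white point, Y normalized to 1

def upvp(X, Y, Z):
    """(u', v') chromaticities per (8.74)."""
    d = X + 15.0 * Y + 3.0 * Z
    return 4.0 * X / d, 9.0 * Y / d

def xyz_to_luv(X, Y, Z):
    """Standard CIELUV forward transform (assumed; valid for Y > 0.008856)."""
    un, vn = upvp(*XYZ_N)
    u, v = upvp(X, Y, Z)
    L = 116.0 * Y ** (1.0 / 3.0) - 16.0
    return L, 13.0 * L * (u - un), 13.0 * L * (v - vn)

def luv_to_xyz(L, us, vs):
    """Inverse transform per (8.75)-(8.78)."""
    un, vn = upvp(*XYZ_N)
    Y = ((L + 16.0) / 116.0) ** 3          # (8.75), normalized white
    a = us / (13.0 * L) + un               # (8.76a)
    b = vs / (13.0 * L) + vn               # (8.76b)
    c = 3.0 * Y * (5.0 * b - 3.0)          # (8.76c)
    Z = ((a - 4.0) * c - 15.0 * a * b * Y) / (12.0 * b)  # (8.77)
    X = -(c / b + 3.0 * Z)                 # (8.78)
    return X, Y, Z

xyz = (0.4, 0.3, 0.2)
back = luv_to_xyz(*xyz_to_luv(*xyz))
```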
Although real cone signals would never have negative components, the linear
matrix transform used in IPT can generate negative signals for real colors. This
is why care must be taken to maintain the proper sign before applying the non-
linear compression. The IPT opponent space is then reached by a further matrix
transform:
⎡ ⎤ ⎡ ⎤⎡ ⎤
I 0.4000 0.4000 0.2000 L
⎣ P ⎦ = ⎣ 4.4550 −4.8510 0.3960⎦ ⎣M ⎦ , (8.82)
T 0.8056 0.3572 −1.1628 S
where I ∈ [0, 1] and P, T ∈ [−1, 1] under the assumption that the input XYZ values
were normalized. If the IPT axes are scaled by (100, 150, 150), they become
roughly equivalent to those found in CIELAB. The inverse transform begins by
transforming IPT to the non-linear cone-excitation space:
\[
\begin{bmatrix} L' \\ M' \\ S' \end{bmatrix} =
\begin{bmatrix}
1.0000 & 0.0976 & 0.2052 \\
1.0000 & -0.1139 & 0.1332 \\
1.0000 & 0.0326 & -0.6769
\end{bmatrix}
\begin{bmatrix} I \\ P \\ T \end{bmatrix}. \tag{8.83}
\]
Linearization is then achieved by
\[
L = \begin{cases} L'^{\,1/0.43} & \text{if } L' \ge 0,\\ -(-L')^{1/0.43} & \text{if } L' < 0; \end{cases} \tag{8.84a}
\]
\[
M = \begin{cases} M'^{\,1/0.43} & \text{if } M' \ge 0,\\ -(-M')^{1/0.43} & \text{if } M' < 0; \end{cases} \tag{8.84b}
\]
\[
S = \begin{cases} S'^{\,1/0.43} & \text{if } S' \ge 0,\\ -(-S')^{1/0.43} & \text{if } S' < 0, \end{cases} \tag{8.84c}
\]
and the transform to CIE XYZ is given by
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix}
1.8502 & -1.1383 & 0.2384 \\
0.3668 & 0.6439 & -0.0107 \\
0.0000 & 0.0000 & 1.0889
\end{bmatrix}
\begin{bmatrix} L \\ M \\ S \end{bmatrix}. \tag{8.85}
\]
The IPT color space bears similarities with CIELAB, although with different co-
efficients. It was designed specifically to have the strengths of CIELAB, but to
avoid the hue changes that could occur when compressing chroma along lines of
constant hue. It is a suitable color space for many color-imaging applications,
such as gamut mapping.
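A minimal sketch of the IPT inverse. Rather than relying on printed inverse matrices, the opponent matrix of (8.82) is inverted numerically, and the ±0.43 non-linearity is undone per (8.84):

```python
import numpy as np

# Opponent matrix of (8.82): L'M'S' -> IPT.
M_IPT = np.array([[0.4000, 0.4000, 0.2000],
                  [4.4550, -4.8510, 0.3960],
                  [0.8056, 0.3572, -1.1628]])

def linearize(t):
    """Sign-preserving inverse of the 0.43 power non-linearity, per (8.84)."""
    return np.sign(t) * np.abs(t) ** (1.0 / 0.43)

def ipt_to_lms(ipt):
    """IPT -> linear LMS: numerical matrix inverse, then linearization."""
    lms_prime = np.linalg.inv(M_IPT) @ ipt
    return linearize(lms_prime)

# Round trip: LMS -> L'M'S' -> IPT -> LMS.
lms = np.array([0.5, 0.4, 0.3])
ipt = M_IPT @ (np.sign(lms) * np.abs(lms) ** 0.43)
back = ipt_to_lms(ipt)
```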
idea behind this approach is to determine if there is any link between the encod-
ing of signals in the human visual system and natural image statistics. A set of
natural images is used here, so that any averages computed are representative for
a canonical natural scene. The photoreceptor output is simulated by converting
the spectral images to the LMS color space.
The principal components analysis rotates the data so that the first princi-
pal component captures most of the variance. This can be understood by as-
suming that the data forms, on average, an elliptical point cloud. PCA rotates
this ellipse so that its major axis coincides with the axis defined by the first
principal component. Its second most important axis is aligned with the sec-
ond principal component. The remaining axis is aligned with the third principal
component. The principal components form an orthogonal coordinate system
and, thus, form a new color space. In this color space, the data is maximally
decorrelated.
By analyzing the resulting color space, it was found that this color space
closely resembles a color opponent space. The 3 × 3 matrix that transforms be-
tween LMS and the PCA-derived color space consists of elements that, aside from
a weight factor, recombine LMS signals in close-to-integer multiples. In addition,
it was found that the orientation of the axes is such that they can be assigned mean-
ing. The first principal component represents luminance, whereas the second and
third principal components represent yellow-blue and red-green color opponent
axes, respectively. For this reason, this color space is named Lαβ .
Additionally, it was found that the point cloud obtained becomes more sym-
metrical and well behaved if the color space is derived from logarithmic LMS
values.
The color opponency is shown in Figure 8.15, where the image is decom-
posed into its separate channels. The image representing the α channel has the β
channel reset to 0 and vice versa. We have retained the luminance variation here
for the purpose of visualization. The image showing the luminance channel only
was created by setting both the α and β channels to zero. The transform between
LMS and Lαβ is given by
\[
\begin{bmatrix} L \\ \alpha \\ \beta \end{bmatrix} =
\begin{bmatrix}
\frac{1}{\sqrt{3}} & 0 & 0 \\
0 & \frac{1}{\sqrt{6}} & 0 \\
0 & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & -2 \\
1 & -1 & 0
\end{bmatrix}
\begin{bmatrix} \log L \\ \log M \\ \log S \end{bmatrix}; \tag{8.86a}
\]
Figure 8.15. The top-left image is decomposed into the L channel of the Lαβ color space,
as well as L + α and L + β channels in the bottom-left and bottom-right images; Rochester,
NY, November 2004.
\[
\begin{bmatrix} \log L \\ \log M \\ \log S \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & -1 \\
1 & -2 & 0
\end{bmatrix}
\begin{bmatrix}
\frac{\sqrt{3}}{3} & 0 & 0 \\
0 & \frac{\sqrt{6}}{6} & 0 \\
0 & 0 & \frac{\sqrt{2}}{2}
\end{bmatrix}
\begin{bmatrix} L \\ \alpha \\ \beta \end{bmatrix}. \tag{8.86b}
\]
The diagonal matrix contains weight factors that scale the relative contribution of
each of the three axes.
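A sketch of (8.86), assuming base-10 logarithms (the base is not specified in the equations above):

```python
import numpy as np

# Forward matrices of (8.86a): opponent recombination, then diagonal scaling.
D = np.diag([1.0 / np.sqrt(3.0), 1.0 / np.sqrt(6.0), 1.0 / np.sqrt(2.0)])
M = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, -2.0],
              [1.0, -1.0, 0.0]])

def lms_to_lab(lms):
    """LMS -> Lalphabeta per (8.86a), on log10 cone signals."""
    return D @ M @ np.log10(lms)

def lab_to_lms(lab):
    """Inverse per (8.86b)."""
    Minv = np.array([[1.0, 1.0, 1.0],
                     [1.0, 1.0, -1.0],
                     [1.0, -2.0, 0.0]])
    Dinv = np.diag([np.sqrt(3.0) / 3.0,
                    np.sqrt(6.0) / 6.0,
                    np.sqrt(2.0) / 2.0])
    return 10.0 ** (Minv @ Dinv @ lab)

lms = np.array([0.8, 0.5, 0.2])
back = lab_to_lms(lms_to_lab(lms))
```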
The theoretical importance of this color space lies in the fact that decorrelation
of natural images, represented in LMS color space, yields a color opponent space
which is very close to the one thought to be computed by the human visual system.
In other words, for natural images, the human visual system appears to decorrelate
the signal it transmits to the brain.
In practice, this color space can be used whenever natural images need to be
processed. Although PCA only decorrelates the data, in practice the axes tend to
be close to independent. This means that complicated 3D color transforms on im-
age data can be replaced by three simpler 1D transforms. An example application
is discussed in Section 8.10.
\[
t = \begin{cases}
0.36 + 0.4\, |\cos(35 + \bar{H})| & \text{if } \bar{H} \le 164^\circ \vee \bar{H} > 345^\circ,\\
0.56 + 0.2\, |\cos(168 + \bar{H})| & \text{if } 164^\circ < \bar{H} \le 345^\circ.
\end{cases} \tag{8.87b}
\]
SC = 0.0638 C̄∗ / (1 + 0.0131 C̄∗) + 0.638, (8.88b)
SH = SC ( f t + 1 − f ). (8.88c)
In the above equations, L̄∗ = 0.5 (L∗1 + L∗2), and C̄∗ and H̄ are computed similarly.
The color difference ΔECMC is then computed by
ΔLCMC = (L∗1 − L∗2) / (l SL), (8.89a)
ΔCCMC = (C∗1 − C∗2) / (c SC), (8.89b)
ΔHCMC = (H∗1 − H∗2) / SH, (8.89c)
ΔECMC = [ΔL²CMC + ΔC²CMC + ΔH²CMC]^{1/2}. (8.89d)
This color difference metric quantifies the perceived difference between two col-
ors, indicated by L1C1H1 and L2C2H2.
the CMC color difference metric that the relative contributions of lightness, hue,
and chroma are divided by functions of chroma to account for the relatively small
color differences that humans can distinguish for near-achromatic colors [507]. In
addition, the CIE 1994 color difference metric also specifies a set of experimental
conditions under which this formula is valid.
The reference conditions include D65 illumination of two homogeneous
patches of color placed side by side in direct contact, and each subtending at
least 4◦ of visual field. The illuminance is rated at 1000 lux. The observer should
have normal color vision.
Like CMC, the CIE 1994 color difference formula derives from the CIE LCH
color space and is also parameterized by a set of weights. They are kL , kC , and
kH , which are normally each set to unity. However, in the textile industry, a value
of 2 is used for kL. First, values for SL, SC, and SH are computed:
SL = 1, (8.90a)
SC = 1 + 0.045 √(C∗1 C∗2), (8.90b)
SH = 1 + 0.015 √(C∗1 C∗2). (8.90c)
These values are used to compute differences along each of the dimensions L, C,
and H:
ΔL94 = (L∗1 − L∗2) / (kL SL), (8.91a)
ΔC94 = (C∗1 − C∗2) / (kC SC), (8.91b)
ΔH94 = (H∗1 − H∗2) / (kH SH). (8.91c)
The color difference ΔE∗94 (abbreviated as CIE94) is then given by
ΔE∗94 = [ΔL²94 + ΔC²94 + ΔH²94]^{1/2}. (8.92)
Finally, if the weights kL, kC, or kH differ from unity, then their values should be
included in the notation, i.e., CIE94 (2 : 1 : 1) or ΔE∗94 (2 : 1 : 1).
The interpretation of values obtained with this metric is that a ΔE∗94 value of 1
corresponds to a just perceptible difference under the reference conditions. However,
the metric was subsequently found not to correlate well with perceived color
difference for saturated blue colors and near-neutral colors [273, 500, 666, 714].
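A sketch of (8.90)-(8.92), computing the hue difference via (8.71a) and assuming the geometric mean of the two chromas in SC and SH:

```python
import math

def cie94(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIE94 color difference per (8.90)-(8.92); dH taken from (8.71a)."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    dL, dC = L1 - L2, C1 - C2
    dE2 = (L1 - L2) ** 2 + (a1 - a2) ** 2 + (b1 - b2) ** 2
    dH2 = max(dE2 - dL ** 2 - dC ** 2, 0.0)   # (8.71a), squared
    SL = 1.0
    SC = 1.0 + 0.045 * math.sqrt(C1 * C2)
    SH = 1.0 + 0.015 * math.sqrt(C1 * C2)
    return math.sqrt((dL / (kL * SL)) ** 2 +
                     (dC / (kC * SC)) ** 2 +
                     dH2 / (kH * SH) ** 2)

d0 = cie94((50, 10, 10), (50, 10, 10))
d1 = cie94((50, 10, 10), (52, 14, 8))
d_textile = cie94((50, 10, 10), (52, 14, 8), kL=2.0)  # CIE94(2:1:1)
```

With kL = 2 (the textile setting), the lightness term contributes less, so the reported difference shrinks.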
The CIEDE2000 color difference metric derives directly from the CIE L∗ a∗ b∗
color space. It was developed to improve the predictive power of the CIE94 metric
in the saturated blue and near-neutral regions. The near-neutral color prediction is
improved by scaling the a∗ axis differently. The hue and chroma differences are
weighted differently to aid in the prediction of blue color differences.
Given two colors specified in this space, we carry out the following calcula-
tions for each color. First, for both samples, we compute C∗ab:
C∗ab = [(a∗)² + (b∗)²]^{1/2}, (8.93)
and, from the arithmetic mean C̄∗ab of the two samples,
g = 0.5 (1 − [(C̄∗ab)⁷ / ((C̄∗ab)⁷ + 25⁷)]^{1/2}). (8.94)
With this value, we compute L′, a′, b′, C′, and h′ for each of the two colors:
L′ = L∗, (8.95a)
a′ = (1 + g) a∗, (8.95b)
b′ = b∗, (8.95c)
C′ = [(a′)² + (b′)²]^{1/2}, (8.95d)
h′ = (180/π) tan⁻¹ (b′/a′). (8.95e)
Note that h′ and values derived from it are specified in degrees rather than in
radians. The arithmetic means of the pairs of L′, C′, and h′ values are also com-
puted. These are denoted by L̄′, C̄′, and h̄′. Checks should be made to ensure that
hue angles remain positive. This also has implications for the computation of the
arithmetic mean of the pair of h′ values. Here, it is important to ensure that the
hue difference remains below 180°. If a larger hue difference is found, then 360°
is subtracted from the larger of the two hue angles, and the arithmetic mean is
recomputed. Before computing SL, SC, and SH, several intermediary values are
calculated:
RC = 2 [(C̄′)⁷ / ((C̄′)⁷ + 25⁷)]^{1/2}, (8.96a)
RT = −RC sin (60 exp (−((h̄′ − 275) / 25)²)), (8.96b)
T = 1 − 0.17 cos (h̄′ − 30) + 0.24 cos (2h̄′) + 0.32 cos (3h̄′ + 6) − 0.20 cos (4h̄′ − 63). (8.96c)
The weighting functions are then
SL = 1 + 0.015 (L̄′ − 50)² / [20 + (L̄′ − 50)²]^{1/2}, SC = 1 + 0.045 C̄′, SH = 1 + 0.015 C̄′ T, (8.97)
followed by
ΔLCIE00 = (L′1 − L′2) / (kL SL), (8.98a)
ΔCCIE00 = (C′1 − C′2) / (kC SC), (8.98b)
ΔHCIE00 = 2 √(C′1 C′2) sin ((h′1 − h′2) / 2) / (kH SH), (8.98c)
ΔECIE00 = [ΔL²CIE00 + ΔC²CIE00 + ΔH²CIE00 + RT ΔCCIE00 ΔHCIE00]^{1/2}. (8.98d)
and ΔE∗uv predict a much larger visible difference than the other metrics.
Figure 8.16. The top-right image is a sharpened version of the top-left image. They
are compared in the remaining images using the color difference metrics discussed in this
section; Castle Combe, UK, September 2006.
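The CIEDE2000 calculation outlined above can be assembled as follows. This is a sketch: the weighting functions SL, SC, and SH follow the published CIEDE2000 definitions, atan2 is assumed for keeping hue angles in [0°, 360°), and all trigonometric arguments are handled in degrees as the text specifies:

```python
import math

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    Cab = 0.5 * (math.hypot(a1, b1) + math.hypot(a2, b2))
    g = 0.5 * (1.0 - math.sqrt(Cab ** 7 / (Cab ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2       # (8.95b)
    C1, C2 = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1 = math.degrees(math.atan2(b1, a1p)) % 360.0  # (8.95e), kept positive
    h2 = math.degrees(math.atan2(b2, a2p)) % 360.0
    dh = h1 - h2                                    # wrap hue difference
    if dh > 180.0:
        dh -= 360.0
    elif dh < -180.0:
        dh += 360.0
    dH = 2.0 * math.sqrt(C1 * C2) * math.sin(math.radians(dh) / 2.0)
    Lm, Cm = 0.5 * (L1 + L2), 0.5 * (C1 + C2)
    hm = 0.5 * (h1 + h2)                            # mean hue, wrap-corrected
    if abs(h1 - h2) > 180.0:
        hm = (hm + 180.0) % 360.0
    T = (1.0 - 0.17 * math.cos(math.radians(hm - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * hm))
         + 0.32 * math.cos(math.radians(3.0 * hm + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * hm - 63.0)))   # (8.96c)
    RC = 2.0 * math.sqrt(Cm ** 7 / (Cm ** 7 + 25.0 ** 7))    # (8.96a)
    RT = -RC * math.sin(math.radians(
        60.0 * math.exp(-((hm - 275.0) / 25.0) ** 2)))       # (8.96b)
    SL = 1.0 + 0.015 * (Lm - 50.0) ** 2 / math.sqrt(20.0 + (Lm - 50.0) ** 2)
    SC = 1.0 + 0.045 * Cm
    SH = 1.0 + 0.015 * Cm * T
    dLt = (L1 - L2) / (kL * SL)
    dCt = (C1 - C2) / (kC * SC)
    dHt = dH / (kH * SH)
    return math.sqrt(dLt ** 2 + dCt ** 2 + dHt ** 2 + RT * dCt * dHt)
```

The metric is zero for identical inputs and symmetric in its two arguments, since swapping the colors flips the signs of all three difference terms.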
or chromaticness. For a small change in hue, the distance between two colors
of low chroma is much smaller than for colors with higher chroma. This just
suggests that steps of hue in these systems cannot be considered perceptually
uniform, though for any given value and chroma the steps should be close to
uniform.
The Optical Society of America (OSA) Uniform Color Scales System (OSA-
UCS) was designed to remedy this problem [718, 719, 836, 837]. It accomplished
this by structuring the color space according to a crystal lattice, shaped as a cuboc-
tahedron. The complexity of this system, however, requires a software implemen-
tation for it to be useful [93].
3 Natural image statistics are normally collected over large sets of images, or image ensembles.
On average, the 1/ f statistic holds to a high degree of accuracy. However, for individual images this
statistic (and all other natural image statistics) may deviate from the ensemble average by an arbitrary
amount.
Figure 8.17. Example images used to demonstrate the correlation between channels.
The first two images are reasonable examples of natural images, whereas the third image
is an example of an image taken in a built-up area. Built environments tend to have some-
what different natural image statistics compared with natural scenes [1295,1296]; Top left:
Mainau, Germany, July 2005. Top right: Dagstuhl, Germany, May 2006. Bottom: A
Coruña, Spain, August 2005.
To transfer the color palette of one image to another, both images are thus
first converted to the L α β space. Within this space, along each of the axes a
suitable statistic needs to be transferred. It turns out that a very simple matching
of means and standard deviations is frequently sufficient to create a plausible
result.
We call the image from which the color palette is gleaned the source image and the image to which this palette is applied the target image; subscripts s and t indicate values in the source and target images, respectively. We first subtract the
mean pixel value in each of the axes:
Figure 8.18. Random samples plotted in RGB color space (left) and L α β color space
(right). The top to bottom order of the plots is the same as the order of the images in
Figure 8.17.
Second, the six standard deviations \(\sigma_s^L\), \(\sigma_s^\alpha\), \(\sigma_s^\beta\), \(\sigma_t^L\), \(\sigma_t^\alpha\), and \(\sigma_t^\beta\) are computed.
It is now possible to make the means and standard deviations in the three target channels equal to those of the source image:
Figure 8.19. Color transfer between images. The left and middle images serve as input,
and the right image was produced by matching means and standard deviations.
\[
L_t(x, y) = \frac{\sigma_s^L}{\sigma_t^L}\, L_t(x, y) + \bar{L}_s, \tag{8.100a}
\]
\[
\alpha_t(x, y) = \frac{\sigma_s^\alpha}{\sigma_t^\alpha}\, \alpha_t(x, y) + \bar{\alpha}_s, \tag{8.100b}
\]
\[
\beta_t(x, y) = \frac{\sigma_s^\beta}{\sigma_t^\beta}\, \beta_t(x, y) + \bar{\beta}_s. \tag{8.100c}
\]
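In code, the statistic matching of Equation (8.100) is only a few lines. The sketch below assumes both images have already been converted to the decorrelated Lαβ space (the conversion itself is not shown) and operates channel by channel:

```python
import numpy as np

def transfer_color_statistics(source, target):
    """Match the per-channel mean and standard deviation of `target`
    to those of `source` (Eq. (8.100)).

    Both inputs are H x W x 3 arrays, assumed to be in L-alpha-beta space.
    """
    result = np.empty(target.shape, dtype=float)
    for c in range(3):
        s = source[..., c].astype(float)
        t = target[..., c].astype(float)
        # Subtract the target mean, scale by the ratio of standard
        # deviations, and add the source mean.
        result[..., c] = (t - t.mean()) * (s.std() / t.std()) + s.mean()
    return result
```

After the transfer, each channel of the result has exactly the mean and standard deviation of the corresponding source channel.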
Figure 8.20. The top two images served as input to the color transfer algorithm. The
bottom-left image shows a result whereby means and standard deviations are matched,
whereas the bottom-right image shows a result obtained with histogram matching. Top-left
image: Turtle Mound, New Smyrna, FL, November 2004. Other images: Lake Konstanz,
Germany, July 2005.
Figure 8.21. The University of Central Florida logo superimposed on a background [951].
The color transfer algorithm was applied to the image on the right and makes the logo blend
in with the background better; UCF Campus, Orlando, FL, June 2004.
Figure 8.22. The color transfer algorithm benefits from the appropriate choice of color
space. The top and left-most images are the input, whereas the bottom-middle image was
computed in RGB space. The bottom-right image was computed in the decorrelated L α β
space. Both transfers were computed using histogram matching; Nicosia, Cyprus, June
2006.
color statistics, which makes them fit in the scene better [951]. An example is
shown in Figure 8.21.
It is interesting to compare the results obtained in L α β color space with im-
ages created in different color spaces. In particular, when the results are created in
an RGB color space, it is expected that due to correlations between the axes, the
results will not be as good. That this is indeed the case is shown in Figure 8.22.
As mentioned above, the color transfer algorithm implicitly assumes that the
source and target images adhere to natural image statistics. Although image en-
sembles tend to show remarkable statistical regularities, individual images may
have different statistical properties. It is therefore not guaranteed that this algo-
rithm will work on any pair of images.
In particular, if the composition of the source and target images is very differ-
ent, the algorithm may fail. In general, transformations whereby the source image
has a fairly small color palette (such as night images, sunsets, and everything with
a limited number of different colors) tend to work well.
If the source and/or the target image is not a natural image in the statistical
sense, then the L α β color space is not an appropriate choice of color space. In
that case, it may be possible to explicitly decorrelate the data in the source and
target images separately, i.e., run PCA on both images [551]. It is then possible to
construct a linear transform between the source and target image. This transform
thus defines an image-specific color space, which may then be used to equalize
the three means and standard deviations.
This section has shown that the appropriate choice of color space is sometimes
crucial to achieve a particular goal. Most applications in computer vision and
computer graphics will benefit from the appropriate choice of color space. Color
should be an important consideration in the design of visual algorithms.
Figure 8.23. Conversion from color to grayscale using the luminance channel of the Lαβ
color space; Castle Combe, UK, September 2006.
The use of the Lαβ color space is motivated by the fact that, for natural image
ensembles, this color space is obtained by employing PCA, as outlined in the
preceding section. As the luminance channel is the first principal component, this
particular grayscale conversion will, on average, retain the most information.
Although an appropriate choice on average, for a specific image this conver-
sion may yield a distinct loss of information. For instance, if the image contains
large variations in chromatic content, without exhibiting large variations in its
luminance channel, then the Lαβ color space will yield unconvincing results.
In such cases, we may apply PCA to each image individually, yielding the
best possible color space achievable with linear transformations only. By taking
only the first principal component and removing the second and third principal
components, a grayscale image is obtained.
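A per-image version of this conversion can be sketched with a handful of numpy calls. The projection direction and its sign are whatever the eigendecomposition returns, so in practice the result may need to be flipped so that light colors map to light grays (not resolved in this sketch):

```python
import numpy as np

def pca_grayscale(image):
    """Project an H x W x 3 image onto its first principal component."""
    pixels = image.reshape(-1, 3).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the 3 x 3 covariance matrix; the eigenvector
    # with the largest eigenvalue is the direction of greatest variance.
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered.T))
    first_pc = eigenvectors[:, np.argmax(eigenvalues)]
    gray = centered @ first_pc
    # Rescale to [0, 1] for display.
    span = gray.max() - gray.min()
    gray = (gray - gray.min()) / (span if span > 0 else 1.0)
    return gray.reshape(image.shape[:2])
```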
While this approach may produce somewhat better results for certain images,
this is largely dependent on the distribution of points in color space. Principal
components analysis is only able to rotate the point cloud such that the direction
of greatest variance becomes axis-aligned. By projecting the data onto the first
principal component, we are not guaranteed a plausible grayscale conversion. For
instance, consider a dumbbell-shaped distribution of points such as in images of
green foliage with red fruit. The greens will be clustered in one sub-volume of
color space, and the reds will be clustered elsewhere. In the best case, PCA will
rotate the data such that all the greens are mapped to very light colors, and all the
reds are mapped to very dark colors. The mid-range of the grayscale will end up
being under-utilized.
A better approach would be to preserve the color difference between pairs of
pixels as much as possible. Here, it is important to start with a perceptually uni-
form color space, so that color differences have perceptual meaning. For instance,
the CIE94 color difference metric could be used. Assuming that a tristimulus
value c is mapped to grayscale by a function T(c), and the color difference metric is written as \(\|c_i - c_j\|\), then the color difference between pairs of pixels should remain constant before and after the mapping [938]:
\[
\frac{\|c_i - c_j\|}{C_{\mathrm{range}}} = \frac{\|T(c_i) - T(c_j)\|}{T_{\mathrm{range}}}. \tag{8.101}
\]
A measure of the error introduced by the mapping T is given by
\[
\varepsilon^2 = \sum_i \sum_{j=i+1} \left( \frac{\|c_i - c_j\|}{C_{\mathrm{range}}} - \frac{\|T(c_i) - T(c_j)\|}{T_{\mathrm{range}}} \right)^2. \tag{8.102}
\]
Here, Crange and Trange are the maximum color differences in the color and
grayscale images. Restricting the mapping T to linear transformations in the CIELAB color space, the general form of T is
\[
T(c) = \left(\, g \cdot c,\; 0,\; 0 \,\right). \tag{8.103}
\]
As two of the three dimensions are mapped to 0, the function T may be replaced
by the scalar function g · c, where g is the vector in color space that determines
the axis to which all color values should be projected to obtain a grayscale image.
Its direction will be determined shortly. The color difference metric for T is now
the absolute value of the difference of two gray values. The error function thus
simplifies to
\[
\varepsilon^2 = \sum_i \sum_{j=i+1} \left( \frac{\|c_i - c_j\|}{C_{\mathrm{range}}} - \frac{|\,g \cdot (c_i - c_j)\,|}{T_{\mathrm{range}}} \right)^2. \tag{8.104}
\]
A solution to the grayscale conversion is then found by minimizing this error
function, which is achieved with standard minimization techniques. For instance,
the Fletcher-Reeves conjugate gradient method can be used. The optimization
procedure can be seeded with an initial solution of g = (1, 0, 0). After conver-
gence, the vector g for which the error is minimized then determines the axis onto
which all colors are projected, leading to the desired grayscale image.
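The error function of Equation (8.104) and its minimization can be sketched as follows. For brevity, the sketch uses a plain finite-difference gradient descent instead of the Fletcher-Reeves conjugate gradient method mentioned above, and it evaluates all pixel pairs directly, which is only practical for small numbers of (sub-sampled) pixels:

```python
import numpy as np

def projection_error(g, pixels):
    """Eq. (8.104): squared error of projecting colors onto the axis g."""
    n = len(pixels)
    idx_i, idx_j = np.triu_indices(n, k=1)      # all pairs with j > i
    diffs = pixels[idx_i] - pixels[idx_j]
    color_d = np.linalg.norm(diffs, axis=1)     # ||c_i - c_j||
    gray_d = np.abs(diffs @ g)                  # |g . (c_i - c_j)|
    t_range = gray_d.max()
    if t_range == 0.0:
        return np.inf                           # g is orthogonal to all color differences
    return float(np.sum((color_d / color_d.max() - gray_d / t_range) ** 2))

def optimize_projection(pixels, steps=100, lr=0.05, h=1e-4):
    g = np.array([1.0, 0.0, 0.0])               # seed suggested in the text
    for _ in range(steps):
        grad = np.zeros(3)
        e0 = projection_error(g, pixels)
        for k in range(3):
            dg = np.zeros(3)
            dg[k] = h
            # forward-difference estimate of the gradient component
            grad[k] = (projection_error(g + dg, pixels) - e0) / h
        g = g - lr * grad
    return g
```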
The power of this approach, however, is still limited by the fact that all pixel
values are projected to a single dimension. Many different colors may project to
the same gray value. A possible alternative approach can be derived by realizing
that different colors should not be projected to the same gray value if they are
neighboring. However, pixels separated from each other by differently colored
image regions may successfully be mapped to the same gray value. Such an ap-
proach would help overcome the problem of having a limited number of different
gray values to which all pixels must be mapped.
One such approach, called Color2Gray, begins by converting the image to the
CIE L∗ a∗ b∗ space [378]. For a pair of pixels, indicated with subscripts i and j, the
\[
\sum_{i, j} \left( (g_i - g_j) - \delta_{ij} \right)^2. \tag{8.106}
\]
Here, gi is the gray value of pixel i, which is found by minimizing this function. It is initialized by setting gi = Li. Once again, conjugate gradient methods are suitable for finding a minimum.
The target differences δi j are computed under user control. Just as the a∗
and b∗ channels span a chromatic plane, the color differences in these channels,
Δa∗ and Δb∗ , span a chromatic color-difference plane. To determine which color
differences are mapped to increases in gray value, the user may specify an angle
Θ in this plane. The associated unit-length vector is denoted vΘ.
A second user parameter α steers a non-linearity cα (x) that is applied to the
chromatic color difference ΔCi j . This compressive function assigns relatively
more importance to small color differences than large ones:
\[
c_\alpha(x) = \alpha \tanh\!\left( \frac{x}{\alpha} \right). \tag{8.107}
\]
The target color difference for a pair of pixels (i, j) is then
\[
\delta_{ij} =
\begin{cases}
\Delta L^*_{ij} & \text{if } |\Delta L^*_{ij}| > c_\alpha\!\left(\|\Delta C^*_{ij}\|\right), \\
c_\alpha\!\left(\|\Delta C^*_{ij}\|\right) & \text{if } \Delta C^*_{ij} \cdot v_\Theta \ge 0, \\
c_\alpha\!\left(-\|\Delta C^*_{ij}\|\right) & \text{otherwise.}
\end{cases} \tag{8.108}
\]
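The compressive function and target differences of Equations (8.107) and (8.108) are straightforward to write down. In this sketch, ΔC*ij is the two-dimensional chromatic difference (Δa*, Δb*), and vΘ is the unit vector for the user-chosen angle Θ:

```python
import math

def c_alpha(x, alpha):
    # Eq. (8.107): compressive nonlinearity; small color differences keep
    # relatively more weight than large ones
    return alpha * math.tanh(x / alpha)

def target_difference(dL, dC, theta, alpha):
    """Eq. (8.108): signed target difference delta_ij for one pixel pair.

    dL is Delta-L*_ij, dC is the chromatic difference (Delta-a*, Delta-b*),
    theta is the user angle Theta in radians.
    """
    v_theta = (math.cos(theta), math.sin(theta))
    chroma = math.hypot(dC[0], dC[1])           # ||Delta-C*_ij||
    if abs(dL) > c_alpha(chroma, alpha):
        return dL
    if dC[0] * v_theta[0] + dC[1] * v_theta[1] >= 0.0:
        return c_alpha(chroma, alpha)
    return c_alpha(-chroma, alpha)
```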
Figure 8.24. The red of this anthurium stands out clearly in the color image. However,
conversion to gray using the luminance channel of Lαβ space makes the flower nearly
disappear against the background. The bottom-left image shows the result of applying
PCA directly to the color image. The bottom-right image shows the result obtained by
applying the spatial Color2Gray algorithm [378].
The resulting XYZ values represent the material as seen under illuminant L. The
dominant illuminant available within the environment should be chosen for L.
This is possible, even if there are many light sources in an environment, as most of these light sources typically share the same emission spectrum.
Conversion to a sharpened color space then involves multiplication by a 3 × 3
matrix:
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} = M_{\mathrm{sharp}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}. \tag{8.110}
\]
After rendering in this color space, the resulting pixels need to be converted to
the space in which the display device operates. This is achieved by converting
each pixel first to XYZ and then to the desired display space. Assuming this is
the sRGB color space, the conversion is then
\[
\begin{bmatrix} R_d \\ G_d \\ B_d \end{bmatrix} = M_{\mathrm{sRGB}}\, M_{\mathrm{sharp}}^{-1} \begin{bmatrix} R \\ G \\ B \end{bmatrix}. \tag{8.112}
\]
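The two matrix conversions in Equations (8.110) and (8.112) amount to a per-pixel matrix multiply. The sketch below treats both M_sharp and the display matrix as parameters, since their numerical values depend on the chosen sharpened space and display primaries and are not reproduced here:

```python
import numpy as np

def xyz_to_sharp(xyz, m_sharp):
    # Eq. (8.110): convert XYZ pixels (N x 3 array) into the sharpened RGB space
    return xyz @ m_sharp.T

def sharp_to_display(rgb, m_sharp, m_display):
    # Eq. (8.112): convert sharpened RGB back through XYZ into the display
    # space (e.g., sRGB, with m_display = M_sRGB)
    return rgb @ (m_display @ np.linalg.inv(m_sharp)).T
```

Composing the two conversions is equivalent to converting XYZ directly to the display space, which is a useful sanity check.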
Comparing renderings made in the sRGB and sharpened color spaces with
a direct spectral rendering, it is found that the difference between the reference
and the sharpened images remained at or below detectable levels. Measured us-
ing the CIE ΔE94∗ color difference metric (see Section 8.8.2), the average differ-
ence is 4.6 for the 98th percentile (with 5.0 visible in side-by-side image com-
parisons) [1208]. The sRGB color space does not perform as well, producing
differences with the reference image that are more easily perceived at 25.4.
Aside from the choice of color space, color-accurate rendering will require
white balancing, as spectral measurements of materials are typically made under
different illumination than they will be rendered. Second, the dominant illumi-
nant used in the rendering is usually different from the conditions under which
the resulting image is viewed. Both are sources of error for which correction is
needed. This topic is discussed in more detail in Section 10.7, where rendering is
revisited.
paint to be matched. However, the number of different pigments that can be used
in the mixture is limited to three [412].
The approach described here follows Haase and Meyer’s technique and is
based on minimizing the color difference between a trial match and the paint
sample [412]. The CIE L∗ a∗ b∗ ΔE color difference, explained in Section 8.7.1, is
used for this application.
As the aim is to find concentrations for each pigment, a mixture where one or more of the concentrations is negative is not allowed. To avoid setting up an
optimization problem where negative concentrations are a possible outcome, the
problem is restated to find the square of the concentration for each pigment, labeled qi = ci² for the ith pigment.
Initially, a random XYZ tristimulus value is guessed, and the ΔE color differ-
ence with the target sample is computed. To enable a gradient descent method to
be used, the gradient of ΔE 2 with respect to each of the squared concentrations qi
is computed:
\[
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial q_i} = \frac{1}{2\sqrt{q_i}}\,\frac{\partial \Delta E^{*\,2}_{ab}}{\partial c_i}. \tag{8.113}
\]
By the chain rule, we have
\[
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial c_i} =
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial X}\frac{\partial X}{\partial c_i} +
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial Y}\frac{\partial Y}{\partial c_i} +
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial Z}\frac{\partial Z}{\partial c_i}. \tag{8.114}
\]
The right-hand side of this equation can be expanded by expressing each of the
partials as follows:
\[
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial X} = \frac{1000\,\Delta a^*}{3 X^{2/3} X_0^{1/3}}, \tag{8.115a}
\]
\[
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial Y} = \frac{-1000\,\Delta a^* + 232\,\Delta L^* + 400\,\Delta b^*}{3 Y^{2/3} Y_0^{1/3}}, \tag{8.115b}
\]
\[
\frac{\partial \Delta E^{*\,2}_{ab}}{\partial Z} = \frac{-400\,\Delta b^*}{3 Z^{2/3} Z_0^{1/3}}. \tag{8.115c}
\]
The remaining partials are expanded as follows:
\[
\frac{\partial X}{\partial c_i} = k \sum_\lambda L(\lambda)\,\bar{x}(\lambda)\,\frac{\partial R(\lambda)}{\partial c_i}, \tag{8.116a}
\]
\[
\frac{\partial Y}{\partial c_i} = k \sum_\lambda L(\lambda)\,\bar{y}(\lambda)\,\frac{\partial R(\lambda)}{\partial c_i}, \tag{8.116b}
\]
\[
\frac{\partial Z}{\partial c_i} = k \sum_\lambda L(\lambda)\,\bar{z}(\lambda)\,\frac{\partial R(\lambda)}{\partial c_i}. \tag{8.116c}
\]
Here, x̄(λ ), ȳ(λ ), and z̄(λ ) are standard color-matching functions, L(λ ) is the
emission spectrum of the light source lighting the sample, and k is a normalization
constant given by 100/ ∑λ L(λ )ȳ(λ ). The partial ∂ R(λ )/∂ ci is given by
\[
\frac{\partial R(\lambda)}{\partial c_i} =
\frac{S_M(\lambda)\,K_i(\lambda) - K_M(\lambda)\,S_i(\lambda)}{S_M(\lambda)^2}
\left( 1 - \frac{\dfrac{K_M(\lambda)}{S_M(\lambda)} + 1}
{\sqrt{\left(\dfrac{K_M(\lambda)}{S_M(\lambda)}\right)^{2} + 2\,\dfrac{K_M(\lambda)}{S_M(\lambda)}}} \right). \tag{8.117}
\]
In this equation, KM and SM are the absorption and scattering functions for the
current trial, and Ki and Si are the absorption and scattering functions for the ith
pigment.
With the current match trial a gradient vector is computed; this is then added
to the current squares of the concentrations for each pigment qi . This leads to
new values for the concentrations ci , which in turn are used to compute a new
match trial. This process iterates until convergence is reached, i.e., all gradients
are below a specified threshold. The concentrations found for each pigment can
then be used to mix a paint that will look similar to the target XYZ tristimulus
value.
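Equation (8.117) follows from differentiating the Kubelka-Munk reflectance of an opaque layer, R = 1 + K/S − √((K/S)² + 2K/S), with the mixture quantities K_M = Σⱼ cⱼ Kⱼ and S_M = Σⱼ cⱼ Sⱼ. A small numerical sketch, using scalar quantities at a single wavelength:

```python
import math

def km_reflectance(k, s):
    # Kubelka-Munk reflectance of an opaque layer with absorption k, scattering s
    u = k / s
    return 1.0 + u - math.sqrt(u * u + 2.0 * u)

def dR_dc(k_mix, s_mix, k_i, s_i):
    # Eq. (8.117): derivative of the mixture reflectance with respect to the
    # concentration c_i of pigment i
    u = k_mix / s_mix
    return ((s_mix * k_i - k_mix * s_i) / s_mix ** 2
            * (1.0 - (u + 1.0) / math.sqrt(u * u + 2.0 * u)))
```

A finite-difference check confirms the formula: perturb one concentration, rebuild K_M and S_M, and compare the numerical slope against the analytic derivative.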
The match is valid under the chosen illuminant L. However, it may be benefi-
cial to minimize the effects of metamerism if the paint samples are to be viewed
under different illuminants. For instance, if it is assumed that a match is required under CIE illuminant C, but metamerism needs to be minimized under CIE illuminant A, with the initial match being four times as important, then the minimization function could be modified to
\[
\Delta E^2 = 4\,\Delta E_C^2 + \Delta E_A^2.
\]
The matching algorithm outlined here has important benefits over standard tristimulus matching approaches that produce a match in XYZ space. First, by using the uniform CIE L∗a∗b∗ color space, the match is found by minimizing the ΔE color-difference metric, which corresponds to how humans perceive color differences. Second, an arbitrary number of pigments can be used in the match.
It would be possible to improve this method, though, as the steepest descent
method is not guaranteed to find a global minimum. Second, the method does not
take into account the fact that it may be both costly and inconvenient to mix as
many pigments as there are reflectance functions in the database used to compute
the match. As a result, it is likely that the color match involves small concentra-
tions of many pigments. It would be beneficial to devise a method that would find
the smallest number of pigments that would create a match.
Figure 8.25. The top-left image was captured in a scene where the gray world assumption holds approximately. The image on the right shows the information captured by the luminance channel of the Yxy color space. The bottom two images
show the information encoded by the color channels (x and y, respectively). Notice how
most of the information across the shadow edge is contained in the luminance channel,
whereas all channels encode the change across reflectance edges.
and color channels are probably reflectance edges. Figure 8.25 illustrates this
difference between shadow and reflectance edges in scenes where the spectral
component of ambient light is the same as that of direct light.
It is hypothesized that the difference across shadow edges in sunlit scenes
will be in luminance as well as color, as the spectral component of ambient light
is possibly different from that of direct light for such scenes. Sunlight, which is
the direct light source, has a distinctly yellow spectrum, while the rest of the sky
appears blue due to Rayleigh scattering (as discussed in Section 2.8.1). Ambient
light therefore tends to have a bluish color. Figure 8.26 demonstrates this change
across shadow edges in outdoor scenes. There appears to be a change in color
as well as luminance across the edge. Choosing color spaces that separately en-
code luminance and color information is therefore not helpful in distinguishing
between shadow and reflectance edges in outdoor scenes.
Given that the changes in color across shadow edges are possibly in blue and
yellow (as the direct light is predominantly yellow and the ambient light is blue),
color opponent spaces may be better choices for the purpose of edge classifica-
tion than other color spaces, since these separately encode blue-yellow and red-
green information. Shadow edges will appear in the luminance and blue-yellow
Figure 8.26. Colors appear to have a bluer shade in shadows and a yellower shade under
direct light. The left image shows a color under direct sunlight, and the right image shows
the same color in shadow. Images were derived from the photograph shown in Figure 8.27.
channels, but not in the red-green channel, while all three channels will tend to
encode the difference across reflectance edges. The two edge types will therefore
often be distinguishable in outdoor scenes. The invariance to shadows in the red-
green channel was shown to have an evolutionary advantage in monkey foraging
behavior [1144].
The performance of 11 color spaces was tested for the purpose of shadow-edge identification [585]. These include RGB, XYZ, Yxy, normalized rgb, LMS,
Luv, Lab, Lαβ , HSV, AC1 C2 , and the linear version of Lαβ (where the log of
LMS values is not taken before conversion to Lαβ ). When an image is converted
into a color space, one channel in the color space might encode the difference
across reflectance edges only, while another channel might encode the difference
across both shadow and reflectance edges. In that case, classification of edges
in the image may be done by first converting the image to the color space and
then using these two channels to distinguish between the two types of edges. The
most suitable color space for distinguishing between the edge types will have one
channel that does not distinguish between the edge types and one channel that
best discriminates between the edge types when compared with all the channels
in all the color spaces in the experiment.
Data for hundreds of edges of both types were collected and encoded by each
of the 11 color spaces. Edge data was generated by photographing a variety of dif-
fuse paint samples and then cropping these photographs to obtain uniform patches
of each paint sample under direct light as well as in shadow. Figure 8.27 shows
one of the photographs and the corresponding patches obtained from it. An edge
may be represented by a pair of such patches. For a shadow edge, one patch will
contain a color in direct light, and the second patch will contain the same color in
shadow. For a reflectance edge, the patches will contain different colors in direct
light; the patches might also contain different colors in shadow.
To compute the response of a color space to a shadow or reflectance edge,
the two patches representing the edge are converted into that color space. The
values of each patch are then averaged to obtain a single value representing color,
Figure 8.27. The image at the top shows a set of colored paint samples. The patches at the
bottom were cropped from this photograph. The top row shows patches from the shadow
region, and the bottom row shows corresponding patches from the lit region.
followed by an assessment of how different these values are for the two patches. If
a channel shows large differences in the average values of the patches representing
reflectance edges and small differences for patches representing shadow edges,
then that channel is a good classifier.
The discrimination of each channel is computed using receiver operating char-
acteristic (ROC) curves [773]. An ROC curve is a graphical representation of the accuracy of a binary classifier; it is created by thresholding each channel's re-
sponse to reflectance and shadow edges. The true positive fraction is the fraction
of correctly identified reflectance edges against the total number of reflectance
edges, and the true negative fraction is the same fraction for shadow edges. The
true positive fraction is plotted against the false positive fraction to give the ROC
curve. The area under the ROC curve (AUC) is the most commonly used quan-
titative index describing the curve. This area has been empirically shown to be
Figure 8.28. ROC curves for each channel in each of the 11 color spaces included in the
experiment. (Erum Arif Khan and Erik Reinhard, “Evaluation of Color Spaces for Edge
Classification in Outdoor Scenes,” IEEE International Conference on Image Processing,
Genova, Italy, pp 952–955, © 2005 IEEE.)
Table 8.5. Channels and their AUCs for outdoor and indoor scenes.
equivalent to the Wilcoxon statistic and computes the probability of correct clas-
sification [421]. An area of 1 signifies perfect classification while an area of 0.5
means that the classifier performs at chance.
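A direct way to compute the AUC is via the Wilcoxon formulation mentioned above: the probability that a randomly chosen reflectance-edge response exceeds a randomly chosen shadow-edge response, counting ties as one half. A sketch:

```python
import numpy as np

def roc_auc(positive, negative):
    """AUC with reflectance-edge responses as positives and shadow-edge
    responses as negatives (equivalent to the Wilcoxon statistic)."""
    pos = np.asarray(positive, dtype=float)[:, None]
    neg = np.asarray(negative, dtype=float)[None, :]
    wins = np.sum(pos > neg)        # positive outranks negative
    ties = np.sum(pos == neg)       # ties count as one half
    return (wins + 0.5 * ties) / (pos.size * neg.size)
```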
A suitable color space should have one channel that has an AUC close to
0.5, and another channel with an AUC close to 1. The most suitable color space
is one for which the discriminating channel’s AUC is closest to 1. The ROC
curves computed for each channel for sunlit scenes are shown in Figure 8.28 and
the AUCs for each channel (for both indoor and outdoor scenes) are shown in
Table 8.5. Note that the ROC curve for normalized rgb shows the absence of any
channel that does not discriminate between the two edge types. It is therefore
removed from further consideration.
Four of the color spaces under consideration show color opponency. These
color spaces are CIELAB, Lαβ , linear Lαβ and AC1 C2 . While CIELAB is a
perceptually uniform space, and Lαβ is perceptually uniform to a first approxi-
mation, the latter two are linear color spaces. The performance of the perceptually
uniform color spaces is significantly better than that of their linear counterparts.
For indoor scenes, however, where there is only a change in luminance across
shadow edges, and all that is required of a color space is separate encoding of
luminance and chromatic information, other color spaces tend to perform as well
as color opponent spaces. For outdoor scenes, color opponent spaces perform
well. The two perceptually uniform color opponent spaces prove to be the most
suitable choices. CIELAB and Lαβ may be used to get the best discrimination
between shadow and reflectance edges, followed by Yxy and HSV color spaces.
This experiment shows that color is an important cue in the classification of edges for sunlit scenes, even when no assumptions are made about the
gray world and the effects of participating media. Experimental results show that
opponent color spaces are a good choice for the purpose of edge classification.
Chapter 9
Illuminants
The real world offers a wide range of illumination, which changes based on the
atmospheric conditions of the moment, the time of the day, and the season of the
year. Similarly, the range of illumination from man-made light sources is vast,
ranging from the yellowish light of a tungsten filament to the bluish-white of a
fluorescent light source.
The apparent color of an object is determined by the spectral power distribu-
tion of the light that illuminates the object and the spectral reflectance properties
of the object itself. Therefore, the wide range of illumination under which an
object may be viewed affects the perception of its color. This poses a significant
problem, especially in industrial colorimetry, where the main concern is to match
the color of a product to a given standard. The match achieved under one light
source may no longer hold when the product is displayed under a different light
source.
To alleviate this problem, the Commission Internationale de l’Éclairage (CIE)
defines a set of specific spectral power distributions called CIE Standard illumi-
nants. Using CIE illuminants in colorimetric computations ensures consistency
across measurements. It also simplifies the color match problem and makes it
independent of individual differences between light sources.
492 9. Illuminants
struction of a light source that is aimed to represent that illuminant. The rationale
is that new developments in light-source technology are likely to produce better
light sources that more accurately represent any desired illuminant [194].
CIE illuminants represent general types of light sources commonly encoun-
tered in real life. For example, incandescent and fluorescent light sources and
different phases of daylight are all represented by CIE illuminants.
\[
S_A(\lambda) = 100\,\frac{M_{e,\lambda}(2856)}{M_{e,560}(2856)}. \tag{9.3}
\]
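Equation (9.3) can be evaluated directly from Planck's law for a blackbody at 2856 K. The sketch below uses current CODATA constants; the official CIE tabulation of illuminant A is based on slightly different historical values of the radiation constants, so small numerical deviations from the published table are expected:

```python
import math

H = 6.62607015e-34    # Planck constant (J s)
C = 2.99792458e8      # speed of light (m / s)
KB = 1.380649e-23     # Boltzmann constant (J / K)

def planck_exitance(wavelength_nm, temperature):
    """Blackbody spectral radiant exitance M_e,lambda (up to a common scale)."""
    lam = wavelength_nm * 1e-9
    return (2.0 * math.pi * H * C ** 2 / lam ** 5
            / (math.exp(H * C / (lam * KB * temperature)) - 1.0))

def illuminant_a(wavelength_nm):
    # Eq. (9.3): Planckian radiator at 2856 K, normalized to 100 at 560 nm
    return 100.0 * planck_exitance(wavelength_nm, 2856.0) / planck_exitance(560.0, 2856.0)
```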
Figure 9.1. Relative spectral radiant power distribution of CIE standard illuminant A.
Table 9.1. Correlated color temperature (CCT), tristimulus values, and CIE 1931 xy and CIE 1976 u′v′ chromaticity coordinates of commonly used CIE illuminants. Spectral data for all CIE illuminants are given on the accompanying DVD-ROM.
Figure 9.2. Relative spectral radiant power distributions of CIE illuminants D50, D55, D65, and D75.
Figure 9.3. The Gretag Macbeth D65 daylight simulator consists of a halogen light source
with a blue filter, shown here.
halogen light source. An example is the Gretag MacBeth D65 daylight simulator,
shown in Figures 2.50 and 3.45. The filter is shown in Figure 9.3.
The CIE Colorimetry Committee has developed a method that can be used
to assess the usefulness of existing light sources in representing CIE D illumi-
nants [184, 192]. The method is too elaborate to reproduce here in full. Nonethe-
less, the main steps of the method are:
1. Measure the spectral power distribution of the light source under consideration by means of spectro-radiometry from λ = 300 nm to λ = 700 nm at 5-nm intervals.
2. Compute the CIE 1976 (u′10 , v′10 ) chromaticity coordinates of the light source from the measured spectral data. These coordinates must fall within a circle of radius 0.015 centered at the chromaticity point of D65 in the CIE 1976 UCS diagram. Otherwise, the tested source does not qualify as a daylight simulator.
3. Even if the test in the previous step succeeds, the quality of the daylight
simulator needs to be determined. For this purpose two tests are performed:
(a) Compute the mean index of metamerism MIvis (see Section 9.4) of
five pairs of samples which are metamers under CIE illuminant D65 .
The spectral radiance factors of these samples are provided by the
CIE [1262]. These computations should use CIE 1964 (X10 ,Y10 , Z10 )
Table 9.2. Categories of daylight simulators (from [1262]). Using the method explained
in the text, MIvis and MIuv are computed for the test source. Depending on the color-
difference formula, use either the CIE L∗ a∗ b∗ or CIE L∗ u∗ v∗ values to find the corre-
sponding category for the MIvis and MIuv indices. The result is a two-letter grade, such as
BD or AC, indicating the quality of the daylight simulator in the visible and ultra-violet
regions of the spectrum.
tristimulus values. The resulting value indicates how well the metamerism of these five sample pairs holds under the tested daylight simulator.
(b) Compute the mean index of metamerism MIuv of three pairs of sam-
ples, where one member of each pair is non-fluorescent and the other
member is fluorescent. The spectral values of these pairs are provided
by the CIE [1262]. These pairs also yield identical tristimulus val-
ues under CIE illuminants D65 . All computations should once more
use CIE 1964 (X10 ,Y10 , Z10 ) tristimulus values. The resulting value indicates how well the daylight simulator represents the standard illuminant D65 in the ultra-violet region of the spectrum.
4. Use MIvis and MIuv as an index to Table 9.2 to find the category under
which the daylight simulator falls. A category of AA indicates that the day-
light simulator has excellent performance both in the visible and ultraviolet
region of the spectrum.
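Step 2 above reduces to a distance test in the (u′10, v′10) chromaticity diagram. In the sketch below, the D65 reference coordinates are supplied by the caller, since the exact values should be taken from the CIE tables rather than hard-coded here:

```python
import math

def passes_chromaticity_test(u, v, u_d65, v_d65, radius=0.015):
    """Step 2 of the CIE daylight-simulator assessment: the test source's
    CIE 1976 (u'10, v'10) chromaticity must lie within a circle of
    radius 0.015 centered on the D65 chromaticity point."""
    return math.hypot(u - u_d65, v - v_d65) <= radius
```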
while the graphics art and photography communities favor D50 [93]. The relative
spectral radiant power distributions of these illuminants, together with the CIE
standard illuminant D65 , are shown in Figure 9.2.
Figure 9.4. CIE illuminants F2 and F7 and the CIE standard illuminant D65 .
Although these illuminants were derived from actual fluorescent lamps, their
commercial types were not disclosed. Recognizing that this may limit their us-
ability, this issue has been addressed by work of other researchers [960]. Table 9.3
lists F-series illuminants and their commercial types given by Rich [960].
In this table, several abbreviations are used to denote the commercial type of
each illuminant. For instance CWF, LWF, and WWF represent cool white, lite
white, and warm white fluorescent lamps, respectively, based on their correlated
color temperatures. CWX represents a cool white deluxe lamp which is con-
structed by supplementing a single phosphor lamp, such as CWF, with a second
phosphor to enhance the under-represented parts of the emission spectrum [941].
Figure 9.4 shows the relative spectral radiant power of F2 and F7 along with
the CIE standard illuminant D65 . F11 is shown in Figure 9.5. For this illuminant,
normalization to 100 at 560 nm exaggerates its entire spectral distribution, since
there is very little power at that wavelength. Therefore, for such illuminants,
different normalization strategies, such as normalizing by the entire area under
the spectral curve, are also employed [960].
Recognizing the need to keep up with the rapidly changing fluorescent
industry, the CIE released 15 new fluorescent illuminants in 2004 [194]. These new
illuminants are denoted by symbols from FL3.1 to FL3.15 and are classified into
several groups as follows:
• FL3.1–FL3.3: standard halophosphate lamps;
• FL3.4–FL3.6: deluxe-type lamps;
Figure 9.5. CIE illuminant F11 and the CIE standard illuminant D65 . Note that since F11
has very little power at λ = 560, normalization to 100 at this wavelength exaggerates its
entire spectral power distribution.
Figure 9.6. CIE illuminant FL3.15 (daylight simulator) shown with CIE standard illumi-
nant D65 .
Figure 9.7. Relative spectral radiant power distributions of CIE standard illuminants A,
D65 , and deprecated CIE standard illuminants B and C. Note the discrepancy of B and C
from D65 especially in the ultra-violet region.
Table 9.4. Solutions B1 and B2 used for producing filters to realize illuminant B from
illuminant A.
Table 9.5. Solutions C1 and C2 used for producing filters to realize illuminant C from
illuminant A.
Figure 9.8. Various colors of a blackbody at different absolute temperatures. Note that
these colors only approximate the true colors due to limitations in the printing process.
Figure 9.9. The Planckian locus shown on the CIE 1931 xy chromaticity diagram, annotated with absolute temperatures in kelvin (ranging from 1500 K to 10000 K).
1 This quantity was formerly known as the microreciprocal degree or mired [1262].
Figure 9.10. Iso-temperature lines shown on the CIE 1976 u′v′ chromaticity diagram.
Figure 9.11. A close-up view of iso-temperature lines shown on the CIE 1976 u′v′ chromaticity diagram.
Figure 9.12. Iso-temperature lines transformed back to the CIE 1931 xy chromaticity
diagram.
Figure 9.13. A close-up view of iso-temperature lines on the CIE 1931 chromaticity
diagram. Note that intersections are not orthogonal as a result of the perceptual non-
uniformity of this diagram.
Figure 9.14. The daylight locus together with the Planckian locus sampled at various
absolute temperatures. With the aid of iso-temperature lines, one may obtain the correlated
color temperatures for daylight illuminants.
Planckian locus perpendicularly, and the temperature at the intersection gives the
correlated color temperature for all stimuli whose chromaticity points lie on that
iso-temperature line. A close-up diagram is shown in Figure 9.11.
This diagram can be converted back into the more familiar CIE 1931 chro-
maticity diagram using (7.14) (Figure 9.12; a close-up is shown in Figure 9.13).
Note that iso-temperature lines are no longer orthogonal to the Planckian locus as
a result of the non-uniform nature of the CIE xy chromaticity diagram.
As an example, consider finding the correlated color temperature of CIE day-
light illuminants. In Figure 9.14, the chromaticity coordinates of CIE illuminants
D50 , D55 , D65 , and D75 are shown on a line, which is also called the daylight lo-
cus. Since these coordinates do not lie on the Planckian locus, we can only speak
of correlated color temperature for these illuminants. Correlated color tempera-
tures may be found by tracing the iso-temperature lines to their intersection point
with the Planckian locus.
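The geometric construction above has a widely used closed-form alternative. McCamy's cubic approximation, which is not part of the iso-temperature-line procedure described in this text, estimates the correlated color temperature directly from CIE 1931 (x, y) chromaticities near the Planckian locus. A minimal sketch:

```python
def cct_mccamy(x, y):
    """Approximate correlated color temperature (in kelvin) from CIE 1931
    (x, y) chromaticity coordinates using McCamy's cubic formula.

    Valid only for chromaticities close to the Planckian locus, roughly
    in the 2000-12500 K range.
    """
    # n is the slope of the line from the chromaticity to the
    # "epicenter" (0.3320, 0.1858) used by McCamy's fit.
    n = (x - 0.3320) / (0.1858 - y)
    return 449.0 * n ** 3 + 3525.0 * n ** 2 + 6823.3 * n + 5520.33
```

For CIE D65 (x = 0.3127, y = 0.3290) this returns approximately 6505 K, and for illuminant A (x = 0.4476, y = 0.4074) approximately 2856 K, both consistent with the nominal values quoted in the literature.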
Figure 9.15. The appearance of the CIE samples under two different lighting conditions.
These samples are specified by their radiance factors and are used to compute the color-
rendering index of light sources.
Here (uk , vk ) and (ur , vr ) are the chromaticity coordinates of the test and the
reference source, with respect to CIE 1960 UCS.
3. Determine the CIE 1931 XYZ tristimulus values and xy chromaticity coor-
dinates of each sample under the test source and the reference illuminant,
yielding (Xk,i , Yk,i , Zk,i ) and (Xr,i , Yr,i , Zr,i ) for the ith sample.
Table 9.7. Test color samples used for computation of the CIE color-rendering index of
light sources.
4. Transform the tristimulus values into the CIE 1960 UCS uv chromaticity
coordinates. This can be accomplished by either of the following equations:
u = 4X / (X + 15Y + 3Z) ,    v = 6Y / (X + 15Y + 3Z) ;    (9.12)
u = 4x / (−2x + 12y + 3) ,    v = 6y / (−2x + 12y + 3) .    (9.13)
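The two conversions of step 4 translate directly into code; a minimal sketch:

```python
def xyz_to_uv(X, Y, Z):
    """CIE 1960 UCS chromaticity from XYZ tristimulus values (Equation 9.12)."""
    d = X + 15.0 * Y + 3.0 * Z
    return 4.0 * X / d, 6.0 * Y / d

def xy_to_uv(x, y):
    """CIE 1960 UCS chromaticity from CIE 1931 (x, y) coordinates (Equation 9.13)."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 6.0 * y / d
```

Both routes give the same (u, v) for a consistent input, since (9.13) is simply (9.12) rewritten in terms of x = X/(X + Y + Z) and y = Y/(X + Y + Z).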
5. Account for the chromatic-adaptation difference (see Chapter 10) that would
occur between the test source and the reference illuminant. This results in
the following adapted-chromaticity coordinates:
cr dr
10.872 + 0.404 ck,i − 4 dk,i
ck dk
uk,i = , (9.14)
cr dr
16.518 + 1.481 ck,i − dk,i
ck dk
5.520
vk,i = , (9.15)
cr dr
16.518 + 1.481 ck,i − dk,i
ck dk
where subscripts k and r identify terms associated with the test source and
the reference illuminant, respectively, and i denotes the sample number.
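The adaptive shift of step 5 can be sketched as follows. The intermediate terms c and d are not defined in this excerpt; the definitions used below are the ones given in the CIE method (CIE 13.3) and should be checked against the full text:

```python
def cd_terms(u, v):
    """Intermediate terms of the CIE adaptive color shift.

    These definitions are assumed from CIE 13.3; they do not appear in the
    excerpt above.
    """
    c = (4.0 - u - 10.0 * v) / v
    d = (1.708 * v + 0.404 - 1.481 * u) / v
    return c, d

def adapt_uv(u_ki, v_ki, u_k, v_k, u_r, v_r):
    """Von Kries-type adaptive shift of a sample's (u, v) from the test
    source (subscript k) to the reference illuminant (subscript r)."""
    c_k, d_k = cd_terms(u_k, v_k)
    c_r, d_r = cd_terms(u_r, v_r)
    c_ki, d_ki = cd_terms(u_ki, v_ki)
    denom = 16.518 + 1.481 * (c_r / c_k) * c_ki - (d_r / d_k) * d_ki
    u_adapt = (10.872 + 0.404 * (c_r / c_k) * c_ki
               - 4.0 * (d_r / d_k) * d_ki) / denom
    v_adapt = 5.520 / denom
    return u_adapt, v_adapt
```

A useful sanity check: when the test source equals the reference illuminant, the formulas reduce to the identity, so a sample's chromaticity is left unchanged.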
6. Transform the adaptation-corrected uk,i , vk,i values into the CIE 1964 uni-
form color space.2 This is accomplished by
7. Finally, the resultant color shift is computed by the CIE 1964 color-difference
formula:
ΔEi = √[ (U*r,i − U*k,i )^2 + (V*r,i − V*k,i )^2 + (W*r,i − W*k,i )^2 ]    (9.21)
    = √[ (ΔUi* )^2 + (ΔVi* )^2 + (ΔWi* )^2 ] .    (9.22)
ΔEi denotes the color shift that occurs for the ith sample when the test
source is replaced by the reference illuminant. Once ΔEi is obtained, the
CIE special color-rendering index Ri is computed by
Ri = 100 − 4.6 ΔEi .    (9.23)
This indicates that the maximum special CRI equals 100, and this condition
only occurs if the color difference is zero for that particular sample when
the test source is replaced by the reference illuminant. The constant 4.6
ensures that a standard warm white fluorescent lamp attains a CRI of 50
when compared against an incandescent reference [190].
The general color-rendering index Ra is the average of the special indices of
the first eight samples:
Ra = (1/8) ∑_{i=1}^{8} Ri .    (9.24)
2 Although the CIE 1964 uniform color space and color-difference formula has been replaced by
the CIE 1976 uniform color space and color-difference formula, they have been retained for the time
being for computing color-rendering indices [190].
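The final scoring step can be sketched directly from the constants quoted above (a maximum of 100 and a scaling factor of 4.6 per unit of CIE 1964 color difference), with the general index taken as the mean over the first eight test samples:

```python
def special_cri(delta_e):
    """CIE special color-rendering index Ri for one sample:
    100 minus 4.6 times the CIE 1964 color difference."""
    return 100.0 - 4.6 * delta_e

def general_cri(delta_es):
    """General color-rendering index Ra: the mean of the special
    indices of the first eight test samples."""
    assert len(delta_es) == 8, "Ra is defined over eight samples"
    return sum(special_cri(de) for de in delta_es) / 8.0
```

A zero color shift on every sample yields the maximum Ra of 100, and larger average shifts lower the index linearly.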
either the source or the observer changes [1262]. The degree of this mismatch, as
defined by the CIE when either the illuminant or the observer is changed, is called
the CIE metamerism index.
Since the metamerism index indicates how much the perceived color of two
samples will change when the lighting is changed, it is of particular importance
for the paint and the textile industries. For instance, it is important to ensure that
all parts of a car exterior look the same regardless of changes in illumination.
The CIE defines two types of metamerism index. The first index is used to
quantify the degree of mismatch when the illuminant is changed for a constant
observer. This is called the CIE special metamerism index: change in illuminant .
The second index is used when the illuminant is fixed and the observer is allowed
to change. This type of index is called the CIE special metamerism index: change
in observer [188].
The CIE recommends the following procedure to compute the CIE special
metamerism index of two samples:
1. Compute the tristimulus values Xm,i ,Ym,i , Zm,i , (i = 1, 2) and (m = r), of the
two samples under the reference illuminant (the preferred reference illumi-
nant is D65 ):
where Sm=r (λ ) is the spectral power distribution (SPD) of the reference il-
luminant, the βi are the spectral reflectance of the samples, and x̄(λ ), ȳ(λ ),
and z̄(λ ) are the color-matching functions (CIE 1931 or 1964); Δλ is the
sampling resolution of the spectrum. The normalization factor k is calcu-
lated as
k = 100 / ( ∑λ Sm (λ ) ȳ(λ ) Δλ ) .    (9.31)
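Step 1, together with the normalization of (9.31), can be sketched as follows; the spectral summation is the standard one implied by the text, with all inputs given as equal-length tables sampled at interval Δλ:

```python
def tristimulus(spd, reflectance, xbar, ybar, zbar, dl):
    """Tristimulus values of a sample under an illuminant.

    spd          illuminant spectral power distribution S(lambda)
    reflectance  sample spectral reflectance beta(lambda)
    xbar, ybar, zbar  color-matching functions
    dl           sampling interval in nm (delta lambda)
    """
    # Normalization so that a perfect reflector has Y = 100 (Equation 9.31).
    k = 100.0 / sum(s * yb * dl for s, yb in zip(spd, ybar))
    X = k * sum(s * b * xb * dl for s, b, xb in zip(spd, reflectance, xbar))
    Y = k * sum(s * b * yb * dl for s, b, yb in zip(spd, reflectance, ybar))
    Z = k * sum(s * b * zb * dl for s, b, zb in zip(spd, reflectance, zbar))
    return X, Y, Z
```

By construction, a sample with unit reflectance everywhere yields Y = 100 regardless of the illuminant, which is the purpose of the normalization factor k.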
3. The CIE metamerism index Milm is equal to the CIELAB color difference
ΔE*ab between the tristimulus values Xm=t,1 , Ym=t,1 , Zm=t,1 and Xm=t,2 ,
Ym=t,2 , Zm=t,2 (see Section 8.7.1):
Milm = ΔE*ab .    (9.35)
The CIE metamerism index for a change in observer can be computed similarly.
The sole difference is that the illuminant is kept constant while the observer (i.e.,
its associated color-matching functions) is changed.
The CIE metamerism index is particularly important for the paint and textile
industries where all products of a given type should match a standard. In this case,
a lower degree of metamerism is desired [558].
stimulus from these two concepts, one should still know the progression of colors
from the short to the long wavelengths in the light spectrum.
The dominant wavelength of a color stimulus is the monochromatic wave-
length of the spectrum such that when light of this wavelength is mixed with
an achromatic stimulus, the mixture matches the original stimulus in color. It
is customary to use either CIE standard illuminant D65 or A as the achromatic
stimulus.3
The dominant wavelength is denoted by λd and can be computed with the
aid of a CIE chromaticity diagram. Recall, that in this diagram, all colors with
chromaticity coordinates lying on a straight line can be produced by mixing, in
appropriate proportions, the two colors with chromaticity coordinates defining
the end points of this line. Therefore, to find the dominant wavelength of a color
stimulus, we can draw a straight line starting from the chromaticity coordinate
(xw , yw ) of an achromatic stimulus (e.g., D65 ) and passing through the chromatic-
ity coordinate (x, y) of the stimulus in question. If we continue extending this
line until it intersects with the spectrum locus, the wavelength corresponding to
this intersection point gives the dominant wavelength of the color stimulus. This
process is illustrated in Figure 9.16.
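The construction can be sketched as a ray-polyline intersection: cast a ray from the achromatic point through the stimulus and find where it crosses the spectrum locus. The locus samples used in the example below are illustrative placeholders, not real colorimetric data:

```python
def dominant_wavelength(xw, yw, x, y, locus):
    """Dominant wavelength of the stimulus (x, y) relative to the
    achromatic point (xw, yw).

    locus is a list of (wavelength, x, y) samples of the spectrum locus.
    Returns None when the ray exits through the purple line instead
    (i.e., no dominant wavelength exists; see complementary wavelength).
    """
    dx, dy = x - xw, y - yw  # ray direction: white point -> stimulus
    for (l0, x0, y0), (l1, x1, y1) in zip(locus, locus[1:]):
        ex, ey = x1 - x0, y1 - y0  # locus segment direction
        det = dx * ey - dy * ex
        if abs(det) < 1e-12:
            continue  # ray parallel to this segment
        # Solve (xw, yw) + t*(dx, dy) = (x0, y0) + s*(ex, ey).
        t = ((x0 - xw) * ey - (y0 - yw) * ex) / det
        s = ((x0 - xw) * dy - (y0 - yw) * dx) / det
        if t > 1e-9 and 0.0 <= s <= 1.0:
            return l0 + s * (l1 - l0)  # interpolate the wavelength
    return None
```

The condition t > 0 ensures the intersection lies on the extension of the line *through* the stimulus; stimuli on the opposite side of the achromatic point return None and are handled by the complementary-wavelength construction described below.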
Figure 9.16. Finding the dominant wavelength λd of a stimulus: a line drawn from the achromatic point through the stimulus is extended until it intersects the spectrum locus on the CIE 1931 chromaticity diagram.
3 Although illuminant A has a yellowish hue, it still qualifies as achromatic under some circum-
stances due to the chromatic-adaptation mechanism of the human visual system [558].
Figure 9.17. The complementary wavelength λc (or −λd ) of stimulus S2 is computed by
finding the intersection with the spectrum locus of the straight line drawn from S2 toward
the achromatic stimulus (D65 ). The intersection of this line with the purple line is shown
by P and used for computing excitation or colorimetric purity. All data is shown on the
CIE 1931 chromaticity diagram.
an example is shown in Figure 9.17. It is evident from the figure that this happens
when the chromaticity coordinates of a stimulus lie in the triangle whose apex is
the achromatic stimulus.
For such stimuli, we speak of complementary wavelength rather than dom-
inant wavelength. The complementary wavelength λc (or −λd ) indicates the
monochromatic wavelength such that when the stimulus is mixed with a color
of this wavelength, the mixture matches a reference achromatic stimulus in color.
The value λc can be computed similarly to λd .
or equivalently,
pc = pe (yb / y) .    (9.40)
The coordinates (x, y), (xb , yb ), and (xw , yw ) have the same meanings as those
used for excitation purity with one crucial exception. For stimuli that possess
a complementary wavelength, (xb , yb ) represents the chromaticity coordinates of
the complementary wavelength λc , rather than the chromaticity coordinates on
the purple line (see Figure 9.17).
Finally, it is important to remember that the quantities discussed in this section
are defined on a non-uniform CIE chromaticity diagram. This fact may, in general,
limit their use and should be kept in mind whenever a color is expressed by using
these quantities [509].
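As a sketch, the two purity measures can be computed as follows. The definition of excitation purity is elided from this excerpt; the ratio-of-distances form used below is the standard one and is an assumption here, while the colorimetric-purity relation follows (9.40):

```python
def excitation_purity(x, y, xw, yw, xb, yb):
    """Excitation purity: the distance from the achromatic point
    (xw, yw) to the stimulus (x, y), divided by the distance to the
    boundary point (xb, yb) on the spectrum locus or purple line.

    Standard definition, assumed here (not stated in the excerpt);
    computed along whichever axis is better conditioned.
    """
    if abs(yb - yw) > abs(xb - xw):
        return (y - yw) / (yb - yw)
    return (x - xw) / (xb - xw)

def colorimetric_purity(pe, y, yb):
    """Colorimetric purity from excitation purity (Equation 9.40)."""
    return pe * yb / y
```

A stimulus exactly halfway between the achromatic point and the boundary point has an excitation purity of 0.5, and its colorimetric purity is then scaled by the ratio yb / y.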
4 This law is analogous to Moore’s law, which predicts that the number of transistors on chips
Figure 9.20. Spectral power distribution functions for some LED types: blue LED with yellow phosphor; blue LED and red LED with yellow phosphor; UV LED with RGB phosphors; and RGB LED. (after [1059]).
peak can be approximated with the superposition of two Gaussian curves [848]:
Si (λ) = (1/3) g(λ) + (2/3) g(λ)^5 ,  where  g(λ) = exp( −((λ − λ0 )/Δλ)^2 ) ,    (9.41)
where λ0 is the center of the peak, and Δλ is the width of the peak. The spectral
power distribution of an LED is then the weighted sum of the individual peaks:
S(λ ) = ∑ wi Si (λ ), (9.42)
i
where the weights wi adjust the relative magnitude of the peaks. The widths of
the peaks, given by Δλ , are between approximately 20 and 30 nm. Typical peak
wavelengths for a tetrachromatic LED are λ1 = 450 nm, λ2 = 510 nm, λ3 = 560
nm, and λ4 = 620 nm [1014].
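The peak model of (9.41) and the weighted sum of (9.42) can be sketched as follows; the peak centers are the tetrachromatic values quoted above, while the unit weights and 25 nm widths are illustrative choices within the stated 20-30 nm range:

```python
import math

def led_peak(lam, lam0, width):
    """Single LED emission peak (Equation 9.41): one third of a Gaussian g
    plus two thirds of its fifth power, where g is centered at lam0 with
    width `width` (nm)."""
    g = math.exp(-((lam - lam0) / width) ** 2)
    return (g + 2.0 * g ** 5) / 3.0

def led_spd(lam, peaks):
    """Weighted sum of peaks (Equation 9.42); `peaks` is a list of
    (weight, center, width) triples."""
    return sum(w * led_peak(lam, lam0, dl) for w, lam0, dl in peaks)

# Tetrachromatic white LED with the peak wavelengths quoted in the text
# [1014]; the unit weights are illustrative, not from the text.
white = [(1.0, 450.0, 25.0), (1.0, 510.0, 25.0),
         (1.0, 560.0, 25.0), (1.0, 620.0, 25.0)]
```

Each peak attains its maximum value of one at its center wavelength, so the weights wi directly set the relative heights of the four emission bands.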
human skin. As a result, the specular highlight has the same color as the illu-
minant, whereas off-peak angles show the color of the surface. This behavior
is captured in the dichromatic reflection model, which essentially models the bi-
directional reflectance distribution function (BRDF) as an additive mixture of a
specular and diffuse component [596, 1028, 1136].
It is now possible to analyze image areas that are known to contain highlights,
for instance the nose in human faces [1098]. The pixels away from the highlight
represent the diffuse reflection component; these are the darker pixels. They will
lie on an approximately straight line in RGB color space. The lighter pixels will
form a separate cluster, lying on a differently oriented straight line. Separating
the two clusters can be achieved using a variation of principal components analy-
sis [1098].
Alternatively, one may view each surface point as casting a vote for its illumi-
nant [664]. Using the same dichromatic reflection model, whereby each point on
the surface reflects partly its diffuse body color and partly the specular illuminant
color, we can plot pairs of surface points in a CIE chromaticity diagram. Each
pair forms a line segment which, when extended in the appropriate direction, will
include the chromaticity point of the illuminant. In other words, colors from the
same surface will have different purities, and they will all tend to lie on the same
line. Of course, when a surface does not have a strong highlight, then the change
in purity between different points on the surface can be small and, therefore, prone
to error.
Repeating this procedure for a set of points drawn from a different surface,
we find a second line in the chromaticity diagram. The intersection point with the
first line will then yield an estimation of the chromaticity coordinate of the illu-
minant [664]. If there are more surfaces in a scene, then more lines can be drawn,
which theoretically will all pass through the same intersection point. However,
noise in the measurements, as well as diffuse inter-reflection between objects,
may cause these lines to not pass exactly through the same point, thereby intro-
ducing inaccuracies. In that case, each surface in the scene creates a different line
which can be seen as a vote for a particular illuminant color. The actual color of
the illuminant is then estimated to be identical to the majority vote [664].
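The line-intersection idea can be sketched as follows: fit a total-least-squares line through each surface's pixel chromaticities, then intersect the lines. This is a simplified two-surface illustration of the voting scheme of [664], not the published algorithm:

```python
import math

def fit_line(points):
    """Total-least-squares line through 2D chromaticity samples.

    Returns a point on the line (the centroid) and a unit direction
    (the principal axis of the 2x2 covariance of the scatter).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # Angle of the principal eigenvector of the covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def intersect(p1, d1, p2, d2):
    """Intersection of two parametric lines p + t*d; None if parallel."""
    det = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(det) < 1e-12:
        return None
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / det
    return p1[0] + t * d1[0], p1[1] + t * d1[1]
```

With more than two surfaces, each pairwise intersection acts as one vote, and a robust estimate (e.g., the median of the intersections) would stand in for the majority vote described above.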
If it is known beforehand that the illuminant is a blackbody radiator, then the
above algorithm can be simplified to the analysis of a single highlight-containing
surface. Its line on the chromaticity diagram will intersect the Planckian locus
(Figure 9.9).
An example image, shown in Figure 9.21, is informally analyzed by cropping
highlight patches and plotting the pixels of these patches in a chromaticity dia-
gram, shown in Figure 9.22. The analysis is only informal as the camera was not
Figure 9.21. The bull figurine is illuminated from behind by a white light and from the
front by an incandescent source, leading to two spatially separate highlights that can be
analyzed individually. (Bronze figurine by Loet Vanderveen.)
Figure 9.22. The two highlight patches in the insets are taken from Figure 9.21. The
chromaticity coordinates of the white highlight form a line that intersects the Planckian
locus (left), whereas the incandescent highlight does not (right).
Chapter 10
Chromatic Adaptation
In everyday life we encounter a wide range of viewing environments, and yet our
visual experience remains relatively constant. This includes variations in illumi-
nation, which on any given day may range from starlight, to candlelight, incan-
descent and fluorescent lighting, and sunlight. Not only do these very different
lighting conditions represent a huge change in overall amounts of light, but they
also span a wide variety of colors. Objects viewed under these varying conditions
generally maintain an overall constant appearance, e.g., a white piece of paper
will generally appear white (or at least will be recognized as white).
The human visual system accommodates these changes in the environment
through a process known as adaptation. As discussed briefly in Chapter 5, there
are three common forms of adaptation: light adaptation, dark adaptation, and
color or chromatic adaptation. Light adaptation refers to the changes that occur
when we move from a very dark environment to a very bright environment, for in-
stance going from a matinee in a darkened movie theater into the sunlight. When
we first move outside, the experience is dazzling and may actually be painful to
some. We quickly adapt to the bright environment, and the visual experience re-
turns to normal. Dark adaptation is exactly the opposite, moving from a bright
environment to a dark one. At first it is difficult to see anything in the dark room,
but eventually we adapt and it is possible to see objects again.
In general the time required for dark adaptation is much longer than that re-
quired for light adaptation (see Section 3.4.2). Part of the light- and dark-adapting
process can be explained physiologically by the dilation of the pupil. If we con-
sider that the average pupil diameter can range from 2.5 mm in bright light, to
8 mm in darkness, this can account for about a 10:1 reduction in the amount of
light striking the retina (considering a circular area of approximately 5 mm2 to
525
50 mm2 ) [1262] (see also Section 4.2.4). This reduction is not enough to account
for the wide variety of dynamic ranges we typically encounter. The remainder
of the adaptation occurs at the photoreceptor and retinal level, as well as at the
cognitive level.
While changes in overall illumination levels are important for digital imaging
systems, they are generally not considered the most important form of adapta-
tion. Typically, we are more concerned with changes in the color of the lighting.
As we learned in Section 2.11.2, the overall color of everyday illumination can
vary widely, from orange sunlight at dawn or twilight, to the blue Northern sky.
Despite these varieties of illumination, the colors of objects do not vary nearly
as much. The human visual system is constantly evaluating the illumination en-
vironment and adapting its behavior accordingly. We can think of this process
as the biological equivalent of the automatic white balancing available in today’s
digital cameras and camcorders (much as we can think of the automatic exposure
adjustment akin to our light and dark adaptation).
This chapter details the process of chromatic adaptation, from examples and
experimental data to computational models that predict human perception. A
model for predicting corresponding colors, the stimuli necessary to generate a
chromatic response under one lighting condition that is identical to that under
a different lighting condition, is presented. These corresponding colors form
an important component of today’s color management systems. This chapter
also outlines how we can use the computational models of chromatic adapta-
tion in a digital-imaging system to form a basis for an automatic white-balancing
algorithm.
appear less chromatic on the cloudy day. Although the appearances have changed,
we do not assume that all of the objects themselves have actually changed phys-
ical properties. Rather we recognize that the lighting has influenced the overall
color appearance.
This change in color appearance has been carefully studied and quantified and
is described in detail in Chapter 11. Artists also often have a keen eye to these
overall changes in appearance caused by illumination. Claude Monet, in particu-
lar, was very fond of painting the same scene under a wide variety of illumination.
His Haystack series and Rouen Cathedral paintings of the late 19th century are ex-
cellent examples. Although the actual contents of his paintings did not vary, he
was able to capture an incredible variety of colors by painting at a variety of times
of day and seasons. Interestingly enough, the actual appearance of these scenes
may actually have looked more similar to him while he was painting through his
own chromatic adaptation.
Although we are not fully color constant, the general concept of constancy
does apply to chromatic content. A powerful example of chromatic adaptation
is that most object colors do not significantly change appearance when viewed
under different lighting conditions. This is perhaps best illustrated by thinking
of reading a magazine or newspaper in different areas of your office or home.
Under these conditions, you may have incandescent (tungsten) illumination, flu-
orescent lighting, and natural daylight through a window. Perhaps, you may also
have candlelight or a fire providing some illumination. Throughout these different
conditions, the appearance of the magazine paper will still be white. This is espe-
cially impressive considering the color of the illumination has gone from very red
for the fire, toward yellow for the tungsten, and blue for the daylight. Our visual
system is continuously adapting to these changes, allowing the paper to maintain
its white appearance.
Another example of the vast changes in illumination can be seen in Fig-
ure 10.1, where we have the same scene photographed under very different con-
ditions. The photographs in this figure were RAW digital camera images (i.e.,
minimally processed sensor data) with no automatic white balancing, so they pro-
vided an excellent approximation of the spectral radiances present in the scene.
We can see that at sunset the color of the illumination was predominantly orange,
while in the morning it was predominantly blue.
Another interesting example is illustrated in Figure 10.2, where the wall of
a building is illuminated by direct sunlight, with a bright blue sky behind it. In
the photograph the wall looks very yellow, a perception that is enhanced by the
simultaneous contrast of the blue sky behind it. In reality, the wall is in fact white,
and the coloration is due entirely to the illumination. To an observer looking at
Figure 10.1. An example of the same scene viewed under two very different illuminating
conditions.
the building it would be obvious that the building was indeed white, though the
appearance of the wall would retain a slight yellow tint. In other words, if the
person were asked to choose a paint to coat another wall such that it matched
the building, they would easily select a white paint, but if they were asked to
artistically reproduce the appearance of the building, they would select a yellow
paint. Similarly to Monet and the Rouen Cathedral, they would recognize that
although the building itself was white, the overall visual experience was actually
yellow.
We have seen how the same scene can contain a huge variation in the color of
lighting depending on the time of day it is viewed. Weather itself can also play
Figure 10.3. A building photographed under two very different weather conditions.
(Photo courtesy of Ron Brinkmann.)
a large role in changing our everyday viewing conditions. Figure 10.3 shows the
same building photographed on a relatively clear day, as well as on a foggy day.
The building itself appears much more blue when illuminated on the clear day
than it does on the foggy day. Despite these large changes in overall illumination,
we do not think that the building itself is actually changing color. Chromatic
adaptation accounts for much of the visual system’s renormalization, allowing us
to maintain a relatively large degree of color constancy.
Essentially, chromatic adaptation allows for a large number of lighting condi-
tions to appear “white.” Perhaps a better explanation is to say that a white paper
is still recognized as being white when illuminated under lights of a wide range
of chromaticities. For example, in Figure 10.4 we see a variety of real-world
lights plotted in a CIE u′v′ chromaticity diagram. Also plotted in this diagram
are the CIE Daylight coordinates. The colors of the points plotted in Figure 10.4
are the relative chromaticities when viewed in the sRGB color space with a white
point of CIE D65. All of these light sources appear white to a person viewing
them if they are viewed in isolation, due to chromatic adaptation. This is also
why it is of critical importance to not use chromaticity coordinates as a method
for describing color appearance! The color appearance of all of the light sources
plotted in Figure 10.4 would most likely all be described the same, despite their
markedly different chromaticities. Further, this is why it is important to not show
Figure 10.4. A series of real-world light sources and the CIE Daylight illuminants plotted
in the CIE u′v′ chromaticity diagram. A white piece of paper illuminated by any of these
light sources maintains a white color appearance.
chromaticity diagrams in color, since chromatic adaptation itself will change the
appearance of any given point in a chromaticity space.
colors, or stimuli that under one illumination “match” other stimuli viewed under
different illumination.
Color-matching experiments are similar in nature to those used to generate
the CIE standard observers. Essentially, stimuli are viewed under one condition
and then a match is identified (or generated) under a different viewing condition.
This can be done either through successive viewing or simultaneously using a
haploscopic approach, as explained below.
In successive-viewing experiments, an observer looks at a stimulus under one
illuminating condition and then determines a match under a different condition.
Time must be taken to adapt to each of the viewing conditions. For example, an
observer may view a color patch illuminated under daylight simulators for a pe-
riod of time and then, in turn, determine the appearance match with a second stim-
ulus illuminated with incandescent lights. Prior to making any match (or creating
a match using the method of adjustment), the observer has to adapt to the second
viewing condition. Fairchild and Reniff determined that chromatic adaptation
requires about 60 seconds to become 90 percent complete [296]. It
should be noted that this time course was determined for viewing conditions that
only changed in chromatic content and not in absolute luminance levels, and more
time may be necessary for changes in adapting luminance. In viewing situations
at different luminances and colors, we would generally say that we are no longer
determining corresponding colors based upon chromatic adaptation but rather full
color-appearance matches. Details on models of color appearance will be found
in Chapter 11.
The inherent need to delay the determination of a color match, to allow for
and to study chromatic adaptation, has led some to call this experimental technique
memory matching. This need to rely on human memory to determine a color
match adds inherent uncertainty to these measurements, when compared to di-
rect color matching such as those found using a bipartite field. Wright suggested
using a form of achromatic color matching to alleviate some of these uncertain-
ties [1255]. An achromatic color matching experiment involves determining or
generating a stimulus that appears “gray” under multiple viewing conditions. De-
termining when a stimulus contains no chromatic content should not involve color
memory and should provide a direct measurement of the state of chromatic adap-
tation. Fairchild used this technique to measure chromatic adaptation for soft-
copy (e.g., CRT) displays [302].
Another technique that attempts to overcome the uncertainties of memory
matching is called haploscopic color matching. In these experiments the observer
adapts to a different viewing condition in each eye and makes a color match
while simultaneously viewing both conditions. These matches are made under
the assumption that our state of chromatic adaptation is independent for each eye.
provide much information with regard to the underlying causes of the behavior.
In fact, the full mechanisms of chromatic adaptation in the visual system are not
yet understood. Typically the mechanisms are often broken into two categories:
physiological and cognitive. Physiology itself may be too broad a description, as
that could include the cognitive processes themselves. Perhaps a better descrip-
tion, as used by Fairchild, would be sensory and cognitive mechanisms [302].
Sensory mechanisms refer to those that occur automatically as a function of
the physics of the input illumination. An example of this type of mechanism
would be the automatic dilation of the pupils as a function of the overall amount
of illumination. Other examples include the transition from rods to cones as a
function of input luminance as well as gain-control mechanisms on the photore-
ceptors themselves. Gain control will be discussed in more detail below. While
the sensory mechanisms may be automatic, they are not instantaneous. In fact, the
60-second adaptation time necessary for the memory-matching experiments de-
scribed above is necessary to allow time for the sensory mechanisms [296]. Cog-
nitive mechanisms can be considered higher-level mechanisms where the brain
actually alters perception based upon prior experiences and understanding of the
viewing conditions.
While pupil dilation and the transition from the rods to the cones play a large
role in our overall light and dark adaptation, these sensory mechanisms do not
play a large role in chromatic adaptation. Although some imaging systems are
designed for use at low light levels, for most common situations, the rods play a
very small role in chromatic adaptation and color appearance. We will focus
instead on the role of the cones, specifically, on gain control of the cone
photoreceptors. One of the simplest explanations of chromatic adaptation is an
automatic and independent rescaling of the cone signals based upon the physical
energy of the viewing condition. This theory was first
postulated by von Kries in 1902 and still forms the basis of most modern models
of chromatic adaptation. In terms of equations, this von Kries-type adaptation is
shown below:
\[ L_a = \alpha L, \qquad (10.1a) \]
\[ M_a = \beta M, \qquad (10.1b) \]
\[ S_a = \gamma S, \qquad (10.1c) \]
where L, M, and S are the cone signals, and α, β, and γ are the independent gain
components that are determined by the viewing condition. While we know today
that the hypothesis of von Kries is not entirely true, it is still remarkable how
effective such a simple model can be for predicting chromatic adaptation.
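As a minimal sketch of Equations (10.1a)–(10.1c), one common convention (an assumption here, not prescribed above) is to take the gains as the reciprocals of the cone responses to the adapting white, so that the white itself becomes neutral after adaptation:

```python
import numpy as np

def von_kries_adapt(lms, lms_white):
    """Independently rescale cone signals by gains derived from the
    adapting white: alpha = 1/L_w, beta = 1/M_w, gamma = 1/S_w."""
    gains = 1.0 / np.asarray(lms_white, dtype=float)
    return np.asarray(lms, dtype=float) * gains

# The adapting white maps to (1, 1, 1), i.e., it appears neutral
# after adaptation; the LMS values below are illustrative only.
white = von_kries_adapt([0.9, 1.0, 0.7], [0.9, 1.0, 0.7])
```

Any stimulus proportional to the adapting white likewise maps to equal adapted cone signals, which is the sense in which the model normalizes out the illuminant.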
Evidence of the receptor gain control can be seen in the after-images that we
encounter in everyday life. An example of this is shown in Figure 10.5. Focusing
on the + symbol in the top frame of Figure 10.5 will cause half of the cones
on your retina to adapt to the red stimulus while the other half will adapt to the
green stimulus (interestingly enough, due to the optics of the eye, the left half of
each retina will adapt to the green while the right half will adapt to the red). After
focusing on the top frame for about 60 seconds, one should move one’s gaze to the
bottom frame. The image shown on the bottom should now look approximately
uniform, despite the fact that the left side is tinted red and the right side is tinted
green. This is because the photoreceptors in the retina have adapted to the red and
green fields. Staring at a blank piece of paper should result in an after-image that
is opposite to the adapting field (green/cyan on the left and reddish on the right).
See also Figure 5.20 for another example.
Another way to think about the von Kries receptor gain control is to consider
that the cones themselves become more or less sensitive based upon the physical
amount of energy that is present in a scene. For instance, if a scene is predomi-
nantly illuminated by “blue” light, then we would expect the S cones to become
less sensitive, while the L cones would become (relatively) more sensitive. Con-
versely, if there is more red energy in the illumination, such as with an incandes-
cent light bulb, then we would expect the short cones to become relatively more
sensitive than the long cones.
This example is illustrated in Figure 10.6. The spectral power distribution
(SPD) of incandescent illumination (e.g., CIE Illuminant/Source A; see Section
9.1.1) is illustrated by the green curve in Figure 10.6, while the individual cone
Figure 10.6. The spectral power distribution of CIE Illuminant A (400–700 nm) plotted with the L, M, and S cone responsivities, together with a patch of color representing CIE A rendered into the sRGB color space.
functions are shown with solid and dashed lines. The relative cone responses
prior to chromatic adaptation are shown as solid lines, while the adapted cones are
represented by the dashed lines. The patch of color in Figure 10.6 represents the
spectral power distribution of CIE A approximately rendered into the sRGB color
space with a white point of CIE D65; that is, the tristimulus values of the printed
background when viewed under daylight illumination should be very similar to a
blank page viewed under incandescent lighting.
Clearly, we can see that the patch of color in Figure 10.6 is a deep yellow-
orange. The blank paper illuminated by incandescent light will not appear nearly
as orange due to the effects of chromatic adaptation. Normalizing the photore-
ceptors independently as a function of the SPD of the light-source results in the
dashed lines shown in Figure 10.6. If we were to then go ahead and integrate
those new adapted cone responses with the background SPD, we would see that
the relative integrated responses are about equal for all the cones, and thus the
adapted perceptual response is approximately white. Alternative models based
upon the von Kries-style of receptor gain control will be discussed in the next
section.
Receptor gain control is not the only form of sensory adaptation for which
we need to account. The visual system, as a generality, is a highly nonlinear
imaging device. As a first-order approximation, it is often said that the visual
system behaves in a compressive logarithmic manner. While this is not entirely true,
it does describe the general compressive nature of the visual system (compare
Figure 17.9). Models of chromatic adaptation can include nonlinear compression
as well as subtractive mechanisms. The nonlinear compression can take the form
of simple power functions as well as more sophisticated hyperbolic or sigmoidal
equations. However, for speed and simplicity, most current adaptation models
rely on simple linear gain controls. More sophisticated color appearance models,
such as CIECAM02, do contain these nonlinear components.
Although independent photoreceptor gain controls cannot account entirely
for chromatic adaptation, models based upon this theory can still be very use-
ful. From a purely physiological standpoint, we can consider photoreceptor gain
control as a natural function of photopigment depletion, or bleaching, and regen-
eration. At a fundamental level, color begins when a photon of light hits a pho-
topigment molecule in the cones. The probability of the photopigment molecule
absorbing the light is determined by both the wavelength of the photon and the
properties of the molecules, which are slightly different for the photopigments in
each of the cone types.
When a photopigment molecule absorbs a photon, it undergoes a chemical
transformation known as isomerization, which eventually leads to the transmis-
Figure 10.7. Example of a cognitive form of chromatic adaptation. In the middle frame,
the yellow candy has a green filter placed over it causing its appearance to shift towards
green. When the same green filter is placed over the entire image (bottom frame) the yellow
candy appears yellow again (adapted from [509]).
sion of an electrical current. This process essentially depletes the number of avail-
able photopigment molecules for further light absorption. If all of the molecules
in a particular receptor are depleted, then that receptor is said to be bleached.
Until that point, however, we can say that with fewer and fewer photopigment
molecules available, that particular cone will become less and less sensitive to
light. Interested readers are encouraged to find more information on this process
in vision science texts such as Human Color Vision [566] or Color Vision: From
Genes to Perception [353]. See also Section 3.4.2.
Cognitive mechanisms of chromatic adaptation are more difficult to model, or
even comprehend. It is somewhat counter-intuitive to think that our knowledge
and understanding of a particular scene can fundamentally alter our perception
of that scene, though that is indeed the case. An example of this can be seen in
Figure 10.7. We see the same image with differently colored filters placed over
it. In the top frame, we see the original image, while in the middle frame a green
filter has been cut and placed over the yellow candy. We can see that the color
appearance of the candy has been changed drastically, and now it actually looks
more green than its neighbor.
When we instead place the same green filter over the entire image, as shown in
the bottom frame of Figure 10.7, the appearance of the same candy reverts back to
yellow. This is because our brain recognizes that the color change is a result of a
change in illumination, through the information present in the entire image. This
example is based upon a classic demonstration by Hunt [509]. Cognitive effects of
chromatic adaptation are based upon our understanding of the viewing conditions
as well as past experiences, and they can be considered a higher-level form of
adaption than the pure sensory mechanisms. These higher-level mechanisms do
not necessarily follow the same time course of adaptation either and can be almost
instantaneous. Again, this can be seen in Figure 10.7 when glancing between the
bottom and middle frame. While the color of the “yellow” candy itself does not
change between the two images, the overall appearance changes immediately.
a color looks like, nor can it tell us if two stimuli match when viewed under
disparate lighting conditions. Chromatic-adaptation models get us closer to be-
ing able to describe the appearance of a stimulus and form a fundamental part
of a more complete color-appearance model, as will be explained in Chapter 11.
Chromatic-adaptation models can also be used to extend colorimetry to predict if
two stimuli will match if viewed under disparate lighting conditions where only
the color of the lighting has changed. This section will outline several computa-
tional models of chromatic adaptation, including the von Kries model, CIELAB,
as well as the underlying adaptation model in CIECAM02.
and then transforming back to XYZ after adaptation. One commonly used linear
transform from XYZ to LMS and its inverse are given by
\[
\begin{bmatrix} L \\ M \\ S \end{bmatrix}
= M_{\mathrm{HPE}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= \begin{bmatrix}
 0.38971 &  0.68898 & -0.07868 \\
-0.22981 &  1.18340 &  0.04641 \\
 0.00000 &  0.00000 &  1.00000
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix};
\qquad (10.4a,\,b)
\]
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= M_{\mathrm{HPE}}^{-1} \begin{bmatrix} L \\ M \\ S \end{bmatrix}
= \begin{bmatrix}
 1.91019 & -1.11214 & 0.20195 \\
 0.37095 &  0.62905 & 0.00000 \\
 0.00000 &  0.00000 & 1.00000
\end{bmatrix}
\begin{bmatrix} L \\ M \\ S \end{bmatrix}.
\qquad (10.4c,\,d)
\]
The adapting illumination can be measured from a white surface in the scene,
ideally a perfect reflecting diffuser; alternatively, for a digital image, the
illumination can often be approximated by the maximum tristimulus values of
the scene, as shown in (10.5a). Since both the forward and inverse cone
transforms are linear 3 × 3 matrix equations, as is the von Kries normalization, the
entire chromatic adaptation transform can be cascaded into a single 3 × 3 matrix
multiplication.
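The cascading described above can be sketched as follows; the CIE A and D65 white points used in the example call are illustrative assumptions, and the Hunt-Pointer-Estevez matrix is the one given in Equation (10.4b):

```python
import numpy as np

# Hunt-Pointer-Estevez forward matrix (Equation 10.4b).
M_HPE = np.array([[ 0.38971, 0.68898, -0.07868],
                  [-0.22981, 1.18340,  0.04641],
                  [ 0.00000, 0.00000,  1.00000]])

def cascaded_cat(xyz_white_src, xyz_white_dst):
    """Fold XYZ->LMS, the von Kries diagonal scaling, and LMS->XYZ
    into a single 3x3 matrix: M_HPE^-1 . diag(LMS_dst/LMS_src) . M_HPE."""
    lms_src = M_HPE @ np.asarray(xyz_white_src, dtype=float)
    lms_dst = M_HPE @ np.asarray(xyz_white_dst, dtype=float)
    return np.linalg.inv(M_HPE) @ np.diag(lms_dst / lms_src) @ M_HPE

# Adapting from CIE A to CIE D65 (Y-normalized white tristimulus values).
cat = cascaded_cat([1.0985, 1.0000, 0.3558], [0.9504, 1.0000, 1.0889])
# By construction, the source white maps exactly to the destination white.
```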
The simple von Kries adaptation transform has been shown to provide a re-
markably good fit to corresponding experimental color data, such as the data gen-
erated by Breneman in 1987 [126] and Fairchild in the early 1990s [302]. Al-
though it is remarkable that such a simple model could predict the experimental
data as well as the von Kries model does, there were some discrepancies that
suggested that a better adaptation transform was possible.
Notice that for each of the CIELAB equations, the CIE XYZ tristimulus val-
ues are essentially normalized by the tristimulus values of white. This normal-
ization is in essence a von Kries-type normalization, though for CIELAB this is
performed on the actual tristimulus values rather than on the LMS cone signals.
This type of transform was actually coined a wrong von Kries transform by Ter-
stiege [1122].
Interestingly, one of the known weaknesses of the CIELAB color space, which
was designed to be perceptually uniform, is changing hue when linearly increas-
ing or decreasing chroma (these terms are explained in depth in Chapter 11). That
is to say that lines of constant perceived hue are not straight in the CIELAB color
space, most noticeably in the blue and red regions, as quantified by experiments
such as those by Hung and Berns [500]. This often results in the purple-sky phe-
nomenon when boosting chroma in the CIELAB color space. An example of this
Figure 10.8. The image on the left was modified by boosting its chroma in CIELAB color
space by a factor of two, shown on the right. Note the undesired purple color which appears
in parts of the water. The blockiness of the purple areas is due to JPEG compression artifacts;
Grand Bahama, 2004.
is shown in Figure 10.8, which shows an image before and after boosting chroma
by a factor of two.
In Figure 10.9 the Munsell renotation data is plotted in CIELAB at a given
lightness value. Since the Munsell data was experimentally generated to be constant
in hue, the “spokes” corresponding to lines of constant hue should be straight
[831]. Thus, the root cause of this type of color error was linked to the chromatic-
adaptation transform using CIE XYZ instead of LMS in evaluations by Liu et al.
in 1995 and Moroney in 2003 [690, 793].

Figure 10.9. The Munsell renotation data at Value 4 plotted in the CIELAB color space,
showing curvature in the lines of constant hue.
Here, LMSw are the cone excitations for the adapting field, or white point, LMSn
are the noise coefficients, typically a very small number, βLMS are the nonlinear
exponential values, and αLMS are linear scaling factors. The von Kries-style pho-
toreceptor gain control should be immediately obvious in the Nayatani equations,
especially as the noise coefficients approach zero.
The exponential values βLMS are functions of the adapting cone signals them-
selves, essentially meaning they monotonically increase as a function of the adapt-
ing luminance [820]. By combining both a photoreceptor gain control with the
nonlinear exponential functions, the Nayatani et al. model is able to predict cor-
responding colors caused by chromatic adaptation, as well as color appearance
changes caused by changes in overall luminance level. These include the increase
in apparent colorfulness and contrast as luminance levels increase. More details
on these color appearance changes are discussed in Chapter 11.
Figure 10.10. An example of discounting the illuminant: the laptop screen and the paper
have identical chromaticities, yet the laptop appears more orange.
computer screen has the same approximate chromaticity as the paper, though the
latter appears to be much more white. For demonstrative purposes, the photograph
in Figure 10.10 was imaged with an additional tungsten light source, though the
chromaticity of the lighting was closely matched to the laptop screen.
The fact that two objects with identical viewing conditions and tristimulus
values do not match in appearance suggests that this difference cannot be at the
sensory level and must rather be at the cognitive level. In color-appearance ter-
minology, one such cognitive mechanism is often referred to as discounting the
illuminant [302, 509]. In such cases, the observer recognizes that the color is a
combination of the object color itself and the illumination. Thus, in the viewing
conditions shown in Figure 10.10, the observer recognizes that the paper itself is
white while the laptop screen is providing the orange illumination. In the same
situation, the self-emitting laptop screen is not recognized as a white object with
orange illumination, and so it retains its overall orange cast.
Fairchild aimed to extend traditional colorimetry along with the von Kries
photoreceptor gain control to create a model that was capable of predicting this
incomplete chromatic adaptation. The model begins by converting the XYZ tris-
timulus values into LMS cone responses, using the Hunt-Pointer-Estevez trans-
formation normalized to CIE D65, from Equations 10.7c and 10.10b. The cone
signals then have a von Kries-style linear adaptation applied to them, as follows:
\[
\begin{bmatrix} L_1 \\ M_1 \\ S_1 \end{bmatrix}
= M_{\mathrm{HPE,D65}} \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}
= \begin{bmatrix}
 0.4002 & 0.7076 & -0.0808 \\
-0.2263 & 1.1653 &  0.0457 \\
 0      & 0      &  0.9184
\end{bmatrix}
\begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix};
\qquad (10.10a,\,b)
\]
\[
\begin{bmatrix} L_1' \\ M_1' \\ S_1' \end{bmatrix}
= A \begin{bmatrix} L_1 \\ M_1 \\ S_1 \end{bmatrix}
= \begin{bmatrix}
 a_L & 0 & 0 \\
 0 & a_M & 0 \\
 0 & 0 & a_S
\end{bmatrix}
\begin{bmatrix} L_1 \\ M_1 \\ S_1 \end{bmatrix}.
\qquad (10.10c,\,d)
\]
calculated using
\[
a_L = \frac{p_L}{L_W},
\qquad (10.11a)
\]
\[
p_L = \frac{1 + Y_W^{\,v} + l_E}{1 + Y_W^{\,v} + 1/l_E},
\qquad (10.11b)
\]
\[
l_E = \frac{3\,(L_W/L_E)}{L_W/L_E + M_W/M_E + S_W/S_E},
\qquad (10.11c)
\]
where
\[
c = 0.219 - 0.0784 \log_{10}(Y_W).
\qquad (10.13)
\]
\[
= \begin{bmatrix}
 0.8951 &  0.2664 & -0.1614 \\
-0.7502 &  1.7135 &  0.0367 \\
 0.0389 & -0.0685 &  1.0296
\end{bmatrix}
\begin{bmatrix} X/Y \\ Y/Y \\ Z/Y \end{bmatrix}.
\qquad (10.15b)
\]
Normalized tristimulus values must be used in this transform because of the non-
linearity introduced to the short-wavelength B signal.
The sharpened “cone” signals are plotted along with the Hunt-Pointer-Estevez
cone signals in Figure 10.11. The first thing to note is that these cone signals
are not meant to be physiologically realistic, as evidenced by the negative lobes
present. They can be thought to represent types of interactions (both inhibition
and excitation) between the cone signals that may occur at the sensory level,
though they were not designed specifically in that manner. There is experimental
evidence suggesting that the cone signals do interact during chromatic
adaptation and are not independent, as von Kries had assumed [244]. The sharpened
cone signals effectively allow for a constancy of both hue and saturation across
changes in chromatic adaptation.
Figure 10.11. The sharpened cone signals of the Bradford chromatic-adaptation transform
(solid) and the Hunt-Pointer-Estevez cone signals (dashed).
Note that the transform is based upon a reference illuminant of CIE D65 which
is calculated by transforming the CIE XYZ of D65 using the Bradford matrix
MBFD . In (10.16), RGBw are the sharpened cone signals of the adapting reference
conditions (reference white) and RGBwr are the cone signals of CIE D65. Note
the absolute value in calculating the Ba signal, necessary because the sharpened
cones can become negative.
The D term corresponds to the degree of discounting the illuminant. When
D is set to one, the observer is completely adapted to the viewing environment,
while when D is set to zero we would say that the observer is completely adapted
to the reference condition, in this case CIE D65.
Since D65 always appears in the numerator of (10.16), we can say that if D ∈ (0, 1),
the observer is in a mixed adaptation state, between the viewing condition and the
reference condition. The CIE XYZ tristimulus values for the reference D65 can be
calculated by multiplying by the inverse of the Bradford matrix and multiplying
back in the luminance value:
\[
\begin{bmatrix} X_a \\ Y_a \\ Z_a \end{bmatrix}
= M_{\mathrm{BFD}}^{-1}
\begin{bmatrix} R_a Y \\ G_a Y \\ B_a Y \end{bmatrix}
= \begin{bmatrix}
 0.9870 & -0.1471 & 0.1600 \\
 0.4323 &  0.5184 & 0.0493 \\
-0.0085 &  0.0400 & 0.9685
\end{bmatrix}
\begin{bmatrix} R_a Y \\ G_a Y \\ B_a Y \end{bmatrix}.
\qquad (10.17a,\,b)
\]
They also developed an empirical formula to calculate the D value from the lumi-
nance of the adapting conditions:
\[
R_a = \left[ D\,\frac{1}{R_w} + 1 - D \right] R,
\qquad (10.18a)
\]
\[
G_a = \left[ D\,\frac{1}{G_w} + 1 - D \right] G,
\qquad (10.18b)
\]
\[
B_a = \left[ D\,\frac{1}{B_w^{\,p}} + 1 - D \right] |B|^{\,p},
\qquad (10.18c)
\]
\[
p = \left( \frac{B_w}{1.0} \right)^{0.0834},
\qquad (10.18d)
\]
\[
D = F - \frac{F}{1 + 2\,L_A^{1/4} + L_A^{2}/300}.
\qquad (10.18e)
\]
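The empirical degree-of-adaptation formula of Equation (10.18e) is easy to evaluate directly; the choice F = 1.0 (an average surround) in this sketch is an assumption, not specified by the equation itself:

```python
def degree_of_adaptation(L_A, F=1.0):
    """Degree of adaptation D from Equation (10.18e), as a function of
    the adapting luminance L_A (cd/m^2) and the surround factor F."""
    return F - F / (1.0 + 2.0 * L_A ** 0.25 + L_A ** 2 / 300.0)

d_dim = degree_of_adaptation(10.0)      # dim adapting field
d_bright = degree_of_adaptation(1000.0) # bright adapting field
# D rises toward F (complete adaptation) as adapting luminance increases,
# and falls toward 0 (no adaptation) as the adapting field darkens.
```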
Calabria and Fairchild showed that all of the transforms that use sharpened
cone responses behave similarly (and better than non-sharpened cones), and they
are perceptually identical to each other for most practical applications [145]. The
CIE then selected a sharpened transform that best fits the existing experimental
data and has a high degree of backwards compatibility to the nonlinear trans-
form in CIECAM97s [195]. This transform is known as CIECAT02, and it is
essentially a von Kries-style normalization using the following sharpened cone
transformation:
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
= M_{\mathrm{CAT02}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix},
\qquad (10.19a)
\]
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= M_{\mathrm{CAT02}}^{-1} \begin{bmatrix} R \\ G \\ B \end{bmatrix},
\qquad (10.19b)
\]
\[
M_{\mathrm{CAT02}} =
\begin{bmatrix}
 0.7328 & 0.4296 & -0.1624 \\
-0.7036 & 1.6975 &  0.0061 \\
 0.0030 & 0.0136 &  0.9834
\end{bmatrix},
\qquad (10.19c)
\]
\[
M_{\mathrm{CAT02}}^{-1} =
\begin{bmatrix}
 1.0961 & -0.2789 & 0.1827 \\
 0.4544 &  0.4739 & 0.0721 \\
-0.0096 & -0.0057 & 1.0153
\end{bmatrix}.
\qquad (10.19d)
\]
The CAT02 sharpened cone sensitivities are plotted along with the Bradford
cone sensitivities in Figure 10.12; the two sets are very similar, though the
Bradford signals are slightly sharper for the long-wavelength cones.
To calculate corresponding colors using the CIECAT02 transform, assuming
complete adaptation to both the viewing conditions, we can then just use the
sharpened cone signals in a traditional von Kries adaptation matrix, as follows:
\[
\begin{bmatrix} X_2 \\ Y_2 \\ Z_2 \end{bmatrix}
= M_{\mathrm{CAT02}}^{-1}
\begin{bmatrix}
 R_{w,2}/R_{w,1} & 0 & 0 \\
 0 & G_{w,2}/G_{w,1} & 0 \\
 0 & 0 & B_{w,2}/B_{w,1}
\end{bmatrix}
M_{\mathrm{CAT02}}
\begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}.
\qquad (10.20)
\]
Here, RGBw,1 and RGBw,2 are the sharpened cone responses of the adapting con-
ditions (white points).
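Equation (10.20) can be sketched directly; the standard CAT02 matrix is used, and the CIE A and D65 white points in the example call are illustrative assumptions:

```python
import numpy as np

# The CAT02 sharpened transform (Equation 10.19c).
M_CAT02 = np.array([[ 0.7328, 0.4296, -0.1624],
                    [-0.7036, 1.6975,  0.0061],
                    [ 0.0030, 0.0136,  0.9834]])

def cat02_corresponding(xyz, xyz_w1, xyz_w2):
    """Map a stimulus seen under white 1 to its corresponding color under
    white 2, assuming complete adaptation to both whites (Equation 10.20)."""
    rgb_w1 = M_CAT02 @ np.asarray(xyz_w1, dtype=float)
    rgb_w2 = M_CAT02 @ np.asarray(xyz_w2, dtype=float)
    scale = np.diag(rgb_w2 / rgb_w1)
    return np.linalg.inv(M_CAT02) @ scale @ M_CAT02 @ np.asarray(xyz, float)

# A stimulus matching the CIE A white maps to the D65 white: a neutral
# patch stays neutral under the new illuminant.
out = cat02_corresponding([1.0985, 1.0000, 0.3558],
                          [1.0985, 1.0000, 0.3558],
                          [0.9504, 1.0000, 1.0889])
```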
Figure 10.12. The sharpened cone signals of the CIECAT02 chromatic-adaptation trans-
form (solid) and the Bradford cone signals (dashed).
where
\[
M_{\mathrm{sRGB}} =
\begin{bmatrix}
 0.4124 & 0.3576 & 0.1805 \\
 0.2126 & 0.7152 & 0.0722 \\
 0.0193 & 0.1192 & 0.9505
\end{bmatrix}.
\qquad (10.21e)
\]
It is important to note that the actual sRGB transfer function combines a linear
segment with a nonlinear segment, rather than being the simple 2.2 gamma that
is often quoted (see Section 8.1).
The forward transform from linearized RGB values into CIE XYZ tristimu-
lus values will provide us with values normalized for D65. To use these values
in an ICC color-managed workflow, we must apply a chromatic-adaptation ma-
trix, transforming into D50. We will use the linear Bradford matrix, since that is
recommended and used by the ICC [523]:
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{D50}
= M_{\mathrm{BFD}}^{-1}
\begin{bmatrix}
 \rho_{D50}/\rho_{D65} & 0 & 0 \\
 0 & \gamma_{D50}/\gamma_{D65} & 0 \\
 0 & 0 & \beta_{D50}/\beta_{D65}
\end{bmatrix}
M_{\mathrm{BFD}}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{D65}.
\qquad (10.22)
\]
To avoid confusion between the traditional RGB used to describe sharpened cone
signals, and the RGB of a digital device, we will use ρ , γ , and β in these equa-
tions. The sharpened cone signals for D50 and D65 are given by
\[
\begin{bmatrix} \rho \\ \gamma \\ \beta \end{bmatrix}_{D50}
= \begin{bmatrix} 0.9963 \\ 1.0204 \\ 0.8183 \end{bmatrix}
= M_{\mathrm{BFD}}
\begin{bmatrix} 0.9642 \\ 1.0000 \\ 0.8249 \end{bmatrix}_{D50};
\qquad (10.23a)
\]
\[
\begin{bmatrix} \rho \\ \gamma \\ \beta \end{bmatrix}_{D65}
= \begin{bmatrix} 0.9414 \\ 1.0405 \\ 1.0896 \end{bmatrix}
= M_{\mathrm{BFD}}
\begin{bmatrix} 0.9504 \\ 1.0000 \\ 1.0889 \end{bmatrix}_{D65}.
\qquad (10.23b)
\]
By using the values calculated in (10.23) as well as the traditional sRGB color
matrix in (10.22), we can generate a new matrix that will transform directly from
linearized sRGB into CIE XYZ adapted for a D50 white:
\[
M_{\mathrm{sRGB}\rightarrow D50}
= M_{\mathrm{BFD}}^{-1}
\begin{bmatrix}
 \dfrac{0.9963}{0.9414} & 0 & 0 \\[1ex]
 0 & \dfrac{1.0204}{1.0405} & 0 \\[1ex]
 0 & 0 & \dfrac{0.8183}{1.0896}
\end{bmatrix}
M_{\mathrm{BFD}}\, M_{\mathrm{sRGB}}
\qquad (10.24a)
\]
\[
= \begin{bmatrix}
 0.4361 & 0.3851 & 0.1431 \\
 0.2225 & 0.7169 & 0.0606 \\
 0.0139 & 0.0971 & 0.7142
\end{bmatrix}.
\qquad (10.24b)
\]
Equation (10.24b) can now be used to directly take sRGB encoded values
into the standard ICC working space. In fact, when you read any ICC profiles
encapsulating sRGB, the “colorants” in those files will generally take the form of
the pre-adapted transform from Equation (10.24b).
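The cascade of Equation (10.24a) can be reproduced numerically from the matrices given in this section; the comment on the expected first row reflects the well-known D50-adapted sRGB "colorant" values:

```python
import numpy as np

# Linear Bradford matrix (Equation 10.15b) and sRGB matrix (10.21e).
M_BFD = np.array([[ 0.8951,  0.2664, -0.1614],
                  [-0.7502,  1.7135,  0.0367],
                  [ 0.0389, -0.0685,  1.0296]])
M_sRGB = np.array([[0.4124, 0.3576, 0.1805],
                   [0.2126, 0.7152, 0.0722],
                   [0.0193, 0.1192, 0.9505]])

# Sharpened "cone" signals of the D50 and D65 whites (Equation 10.23).
rgb_d50 = M_BFD @ np.array([0.9642, 1.0000, 0.8249])
rgb_d65 = M_BFD @ np.array([0.9504, 1.0000, 1.0889])

# Equation (10.24a): fold the D65-to-D50 adaptation into the sRGB matrix.
M_sRGB_D50 = np.linalg.inv(M_BFD) @ np.diag(rgb_d50 / rgb_d65) \
             @ M_BFD @ M_sRGB
# The first row comes out near (0.436, 0.385, 0.143), matching the
# colorant tags typically found in ICC profiles that encapsulate sRGB.
```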
\[
R = \frac{\sum_\lambda I(\lambda)\, R_s(\lambda)\, S(\lambda)}{\sum_\lambda S(\lambda)};
\qquad (10.25a)
\]
\[
G = \frac{\sum_\lambda I(\lambda)\, G_s(\lambda)\, S(\lambda)}{\sum_\lambda S(\lambda)};
\qquad (10.25b)
\]
\[
B = \frac{\sum_\lambda I(\lambda)\, B_s(\lambda)\, S(\lambda)}{\sum_\lambda S(\lambda)}.
\qquad (10.25c)
\]
This equation is normalized by the relative energy of the light source, S (λ ). The
full spectral image is represented by I (λ ) and Rs (λ ), Gs (λ ), and Bs (λ ) are the
spectral sensitivities of the digital camera.
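A discrete sketch of Equation (10.25a); the flat illuminant SPD and 50% flat reflectance used here are placeholders, not measured data:

```python
import numpy as np

wavelengths = np.arange(400, 701, 10)       # nm, 10 nm sampling
S = np.ones(wavelengths.shape)              # flat illuminant SPD (placeholder)
I = np.full(wavelengths.shape, 0.5)         # flat 50% pixel spectrum (placeholder)

def camera_response(I, sensitivity, S):
    """R = sum(I * R_s * S) / sum(S), per Equation (10.25a)."""
    return np.sum(I * sensitivity * S) / np.sum(S)

R_s = np.ones(S.shape)                      # idealized flat channel sensitivity
R = camera_response(I, R_s, S)              # -> 0.5 for this flat pixel
```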
As example input into the system, we will use the METACOW image, which
is a full-spectral synthetic image designed for imaging systems evaluation [293].
In addition to providing spectral information at each pixel, METACOW was de-
signed to be maximally metameric to test the variation of an imaging system from
the color-matching functions. Recall from Chapter 7 that a metameric pair of
stimuli are stimuli that appear to match but that have different spectral reflectance
or transmittance values. Due to the disparate spectral properties of the stimuli,
the matches often break down when viewed under different lighting (illuminant
metamerism) or when viewed by a different person or capture device (observer
metamerism).
Figure 10.13. The METACOW image integrated with CIE 1931 Standard Observer and
CIE D65, then transformed into the sRGB color space.
One half of each of the objects in the METACOW image was rendered to have
the same spectral reflectance as the corresponding patch in the GretagMacbeth
ColorChecker (shown in Figure 12.30), while the other half was designed to be
a metameric match for the CIE 1931 Standard Observer when viewed under CIE
D65 [757]. An example of METACOW integrated with CIE D65 and the 1931
Standard Observer and converted to sRGB is shown in Figure 10.13. The full
spectral METACOW images are available on the DVD of this book.
For the sensor spectral sensitivities of our digital camera, we can choose a
simple three-channel RGB camera with Gaussian spectral sensitivities centered at
cr = 600 nm, cg = 550 nm, and cb = 450 nm, with a full-width half-max (FWHM)
of 100 nm. The formulation for these sensitivities is given by
\[
R_s(\lambda) = \exp\left( -\frac{(c_r - \lambda)^2}{2\sigma^2} \right);
\qquad (10.26a)
\]
\[
G_s(\lambda) = \exp\left( -\frac{(c_g - \lambda)^2}{2\sigma^2} \right);
\qquad (10.26b)
\]
\[
B_s(\lambda) = \exp\left( -\frac{(c_b - \lambda)^2}{2\sigma^2} \right).
\qquad (10.26c)
\]
Figure 10.14. The simulated spectral sensitivities of our digital camera plotted along with
CIE Illuminant A.
The full-width half-max value indicates the width of the Gaussian at half its
maximum height. This value relates to σ according to [1232]:
\[
\mathrm{FWHM} = 2\sigma\sqrt{2\ln 2}.
\qquad (10.27)
\]
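Equations (10.26) and (10.27) can be sketched together, assuming channel centers of 600 nm (red), 550 nm (green), and 450 nm (blue) and the 100 nm FWHM:

```python
import numpy as np

# Derive sigma from the 100 nm full-width half-max (Equation 10.27).
FWHM = 100.0
sigma = FWHM / (2.0 * np.sqrt(2.0 * np.log(2.0)))

wavelengths = np.arange(400.0, 701.0)   # 1 nm sampling

def gaussian_channel(center):
    """Gaussian sensitivity of Equation (10.26), peaking at `center` nm."""
    return np.exp(-(center - wavelengths) ** 2 / (2.0 * sigma ** 2))

R_s = gaussian_channel(600.0)
G_s = gaussian_channel(550.0)
B_s = gaussian_channel(450.0)
# Each channel peaks at 1.0 at its center and falls to exactly 0.5 at
# +/- FWHM/2 = 50 nm from the center, by the definition of FWHM.
```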
Figure 10.15. The scene values as captured by the digital camera simulation with no white
balancing.
Figure 10.16. The digital-camera image chromatically adapted to D65 and transformed
into the sRGB color space.
(see Section 12.12). Since we know both the spectral sensitivities of our digital
camera as well as the spectral sensitivities of the CIE 1931 Standard observer, we
can generate a transform to try to map one to the other. The simplest way to do
this is to create a linear transform through least-squares linear regression. With
X̂ Ŷ Ẑ representing the predicted best estimate of CIE XYZ tristimulus values, the
desired 3 × 3 transform is
\[
\begin{bmatrix} \hat{X} \\ \hat{Y} \\ \hat{Z} \end{bmatrix}_\lambda
= M \begin{bmatrix} R \\ G \\ B \end{bmatrix}_\lambda.
\qquad (10.28)
\]
\[
y = M\,x,
\qquad (10.29)
\]
where
\[
M = y\, x^{\mathsf{T}} \left( x\, x^{\mathsf{T}} \right)^{-1},
\qquad (10.30a)
\]
\[
M = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\begin{bmatrix} R & G & B \end{bmatrix}
\left( \begin{bmatrix} R \\ G \\ B \end{bmatrix}
\begin{bmatrix} R & G & B \end{bmatrix} \right)^{-1}.
\qquad (10.30b)
\]
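The least-squares regression of Equation (10.30a) is a one-liner in matrix form. In this sketch, the training data are synthetic placeholders (random XYZ targets and camera signals generated from an assumed ground-truth matrix), standing in for measured patch spectra:

```python
import numpy as np

rng = np.random.default_rng(0)
xyz = rng.random((3, 24))           # target XYZ for 24 training "patches"
M_true = np.array([[0.9, 0.2, 0.10],
                   [0.3, 0.8, 0.05],
                   [0.0, 0.1, 1.10]])
rgb = np.linalg.inv(M_true) @ xyz   # camera signals consistent with M_true

# Equation (10.30a): M = y x^T (x x^T)^-1 with y = XYZ, x = camera RGB.
M = xyz @ rgb.T @ np.linalg.inv(rgb @ rgb.T)
# With noise-free, consistent data the regression recovers M_true;
# with real spectral data the fit is only approximate, since a 3x3
# matrix cannot map arbitrary sensitivities onto the standard observer.
```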
Figure 10.17. Our transformed digital-camera signals (solid) and the CIE 1931 Standard
observer (dashed).
Once we have our Mcamera transformation to CIE XYZ, we can then apply a
standard chromatic-adaptation transform, such as CIECAT02. We can also move
into a standard color space, such as sRGB, for ease of communication from our
device-specific RGB values. In order to do that, we need to choose the “white” of
our input image. We could use the integrated signals of just CIE A and our camera
sensitivities, since we know that was the illuminant we imaged under, or we can
use a white-point estimation technique. In this case, we will choose a simple
white-point estimate, which is the maximum value of each of the RGB camera
signals. The output white will be set to D65, the native white point of sRGB.
The full transform to achieve a white-balanced sRGB signal from our unbalanced
camera is then
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{sRGB}}
= M_{\mathrm{sRGB}}^{-1}\, M_{\mathrm{CAT02}}^{-1}
\begin{bmatrix}
 \rho_{D65}/\rho_{\max} & 0 & 0 \\
 0 & \gamma_{D65}/\gamma_{\max} & 0 \\
 0 & 0 & \beta_{D65}/\beta_{\max}
\end{bmatrix}
M_{\mathrm{CAT02}}\, M_{\mathrm{camera}}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}.
\qquad (10.32)
\]
The resulting image is shown in Figure 10.16.
What we should see now is that the chromatic-adaptation transform did a very
good job of removing the overall yellow tint in the image, and the left halves of
the cows look much like the traditional Color Checker. The metameric halves
still look very strange and clearly do not match. This is because the sRGB color
space has a much smaller gamut than the metameric samples in the METACOW
image, and so the output image shows clipping in the color channels (below zero
and greater than 255).
The gamut of an output device essentially describes the range of colors that
the device is capable of producing. The process of moving colors from the gamut
of one device to that of another is known as gamut mapping. In this case, we
used a simple clipping function as our gamut-mapping algorithm. A more clever
gamut-mapping algorithm may be required to prevent such artifacts or to make
the image appear more realistic. Details of more sophisticated gamut-mapping
algorithms can be found in [798], and an overview is given in Section 15.4.
This example is meant to illustrate the power that a chromatic-adaptation
transform can have for white balancing digital cameras. Through device
characterization and the CIECAT02 transform, we are able to transform our
simulated digital-camera signals into the sRGB color space with successful results.
For most cameras and spectral reflectances, these techniques should prove to be
very useful.
after rendering, the transform from sharpened RGB to display values is [1208]
\[
\begin{bmatrix} R_d \\ G_d \\ B_d \end{bmatrix}
= M_d\, M_{\mathrm{sharp}}^{-1}
\begin{bmatrix}
 R_w & 0 & 0 \\
 0 & G_w & 0 \\
 0 & 0 & B_w
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix},
\qquad (10.35)
\]
where Md is the transform that takes XYZ tristimulus values to the display’s color
space, and (Rw , Gw , Bw ) represents the white balance associated with the viewing
environment.
After rendering and white balancing for the display environment, we assume
that further corrections are carried out to prepare the image for display. These
will involve tone reproduction to bring the dynamic range of the image within
the range reproducible by the display device, gamma correction to account for the
nonlinear response of the display device to input voltages, and possibly correction
for the loss of contrast due to ambient illumination in the viewing environment.
Each of these issues is further discussed in Chapter 17.
While many computer-graphics practitioners do not follow this procedure, we
hope to have shown that maintaining color accuracy throughout the rendering
pipeline is not particularly difficult, although it requires some knowledge of the
illuminants chosen at each stage. The combined techniques described in this sec-
tion, as well as those in Section 8.12, enable a high level of color fidelity without
inducing extra computational cost.
Finally, the white-balancing operation carried out to account for the illumi-
nation in the viewing environment can be seen as a rudimentary form of color-
appearance modeling. Under certain circumstances, it may be possible to replace
this step with the application of a full color-appearance model, such as iCAM,
which is the topic of Chapter 11.
Chapter 11
Color and Image Appearance Models
Color appearance models take into account measurements not only of the stimulus
itself, but also of the viewing environment, and they generate, in essence,
viewing-condition-independent color descriptions. Using color appearance models,
we can determine whether or not two stimuli match even when viewed under
markedly different conditions. Consider the problem of viewing an image on an
LCD display and trying to create a print that matches the image, but that will be
viewed in a very different context. Color appearance models can be used to pre-
dict the necessary changes that will be needed to generate this cross-media color
reproduction.
This chapter will provide an overview of the terminology used in color science
and color appearance, as well as discuss some well-known color appearance phe-
nomena. The CIECAM02 color appearance model and its uses will be discussed,
as well as two spatial or image appearance models: S-CIELAB and iCAM.
11.1 Vocabulary
As with all areas of study, it is important to have a consistent terminology or
vocabulary to enable accurate and precise communication. When studying color,
this is all the more important, as color is something we all grew up with, and we
have most likely each acquired different ways of describing the same thing. Color
and appearance vocabulary is often interchanged haphazardly, for example, by swapping terms
such as luminance, lightness, and brightness. This may be more prevalent when
discussing a subject such as color, as it is easy to assume that since we have been
exposed to color our entire life, we are all experts.
Even in our education, treatment of color is highly varied. To a child, color
might be made up of their three primary paints or crayons: red, blue, and yel-
low. Many graphic artists or printers work under the premise of four primaries:
cyan, magenta, yellow, and black. Computer-graphics researchers may assume
that color is made up of red, green, and blue. Physicists might assume that color is
simply the electromagnetic spectrum, while chemists may assume it is the absorp-
tion or transmission of this spectrum. All of these assumptions can be considered
to be correct, and to each of the disciplines, all of the others may be considered
quite wrong.
In the field of color appearance, the standard vocabulary comes from the Inter-
national Lighting Vocabulary, published by the CIE, and the Standard Terminol-
ogy of Appearance, published by the American Society for Testing and Materials
(ASTM) [49, 187]. The definition of terms presented below comes directly from
these works.
That the definition of color requires the use of the word it is trying to define
is an uncomfortable circular reference that can lead to confusion. The original
authors of this vocabulary were clearly concerned with this potential confusion,
and they added a note that summarizes well the study of color appearance:
Related colors are viewed in the same environment or area as other stimuli, while unrelated colors are viewed in isolation. In general, color stimuli are not viewed in complete isolation (unless you are an astronomer), and as such most color appearance models have been designed primarily to predict related colors. The pedigree of these models, however, is the large body of color-vision experiments performed using stimuli viewed in isolation. These experiments have been powerful tools for gaining an understanding of how the human visual system behaves. It is also important to understand the differences between these types of stimuli when utilizing models designed to predict one specific type of color appearance.
There are several important color perceptions that exist only for related stimuli. Perhaps the most fascinating of these cases is the so-called "drab" colors such as brown, olive, khaki, and gray, which can only exist as related colors. It is impossible to find an isolated brown or gray stimulus, as evidenced by the lack of brown or gray light bulbs. Even if the surface of such a bulb looked brown or gray, it would appear either orange or white when viewed in isolation.
This is not to say that we cannot make a stimulus look brown or gray using
only light, as it is quite easy to do so using a simple display device such as a
CRT or LCD. We can only generate those perceptions when there are other colors
being displayed as well, or if we can see other objects in the viewing environment.
When we talk about relative attributes of color appearance, such as lightness or
chroma, then by definition, these attributes only exist for related colors. The
absolute color appearance terminology can be used for either related or unrelated
colors.
Of all the color appearance attributes, hue is perhaps the easiest to understand.
When most people are asked to describe the color of an object, they will generally
describe the hue of the object. That said, it is almost impossible to define hue
without using examples of what a hue is.
Figure 11.1. The concept of a hue-circle or wheel. The physiological hues of red, yellow,
green, and blue form a closed ring with other colors falling between (a combination of)
adjacent pairs [49].
green, and blue. These unique hues follow the opponent color theory of vision postulated by Hering in the late 19th century [462]. It has been noted that certain pairs of these hues are never perceived together, while other colors can be described as a combination of two of them. For example, there is no stimulus that is perceived to be reddish-green or yellowish-blue, but purple can be described as a reddish-blue and cyan as a bluish-green. This opponent theory suggests the fundamental notion that human color vision is encoded into roughly red-green and blue-yellow channels. Thus, an opponent color representation is a crucial element of all color appearance models.
The definitions of achromatic and chromatic colors are also important, as the achromatic colors represent the third channel, the white-black channel theorized by Hering. Although hue, or more precisely hue angle, is often described as a location on a circle, there is no natural meaning for a color with a hue of zero, and so that value can be considered an arbitrary assignment (as is a hue angle of 360 degrees). Achromatic colors describe colors that do not contain any hue
information, but this definition does not extend to a meaningful placement on the
hue circle.
The only color appearance attributes of achromatic colors are defined to be
brightness and lightness, though these descriptions also apply for chromatic col-
ors. These terms are very often confused and interchanged with each other, but it
is important to stress that they have very different meanings and definitions.
would thus appear white. As a simple mathematical relation, we can describe the
general concept of lightness by
$$\text{Lightness} = \frac{\text{Brightness}}{\text{Brightness of White}}. \tag{11.1}$$
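Equation (11.1) describes a purely relative quantity, which is easy to demonstrate in a few lines of Python (the function name is ours, not the book's):

```python
def lightness(brightness, brightness_of_white):
    """Equation (11.1): brightness judged relative to the brightness
    of a similarly illuminated white."""
    return brightness / brightness_of_white

# A patch half as bright as the scene white has lightness 0.5 at any
# absolute level -- lightness is invariant to overall illumination.
assert lightness(50.0, 100.0) == lightness(500.0, 1000.0) == 0.5
print(lightness(50.0, 100.0))
```

This invariance is exactly why lightness, unlike brightness, is meaningful only for related colors: it requires a white in the scene to serve as the reference.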
Lightness and brightness are used to describe the relative amount of light a
colored stimulus appears to reflect or emit. Hue describes the general “color”
of a given stimulus. The terms used to describe the amount of hue or chromatic
content a stimulus has are also often confused and interchanged. These terms,
colorfulness, chroma, and saturation have very distinct meanings, and care should
be taken when choosing which attribute to describe the color of a stimulus.
Figure 11.2. The appearance attributes of lightness, chroma, and hue can be represented
as coordinates in a cylindrical color space. Brightness, colorfulness, and hue can also be
represented as such a space.
Figure 11.3. A rendered shadow series illustrating constant saturation. The glancing
illumination causes a decrease in brightness and colorfulness (as well as lightness and
chroma) across the surface of each globe, but saturation remains constant.
Figure 11.4. The appearance attributes of lightness, saturation, and hue can be represented as coordinates in a conical color space originating at a single point at black. Lines of constant saturation radiate from the black point.
As with lightness and brightness, the human visual system generally behaves more like a relative chroma detector. As the luminance of the viewing conditions increases, the perceptions of lightness and chroma tend to remain constant while the brightness of a white stimulus increases. In this same situation, the colorfulness also increases as a function of luminance. This can be visualized by thinking of a typical outdoor scene. On a sunny day, everything looks very colorful, while the same scene viewed on a cloudy day appears much less colorful. This appearance phenomenon is known as the Hunt effect and will be discussed in more detail in Section 11.2.2.
match, where the relationship between objects in the scene is held constant. We
can think of this as analogous to dynamic range reduction for image capture and
rendering (discussed in Chapter 17), since we want to generate a relative color ap-
pearance match that closely matches in appearance to the original high dynamic
range scene.
Stimulus. The stimulus is the color element of interest. As with standard col-
orimetry, the stimulus used for most color appearance research is a small
uniform color patch, which subtends 2◦ of visual angle.
Figure 11.5. The components of the viewing field used for color appearance specification: the stimulus (2°), the proximal field, the background (10°), and the surround.
being extended. Ideally, the stimulus is described by the full spectral represen-
tation in absolute radiance terms, though in practice this is difficult to achieve.
When the spectral power distribution is unavailable, the stimulus should be de-
scribed using a standard device-independent space, such as CIE XYZ tristimulus
values, or LMS cone responsivities.
As we progress and apply color appearance modeling to color imaging and
synthesis, the definition of the stimulus can get somewhat blurred. Is the stimulus
a single pixel, a region of pixels, or the entire image? While it may be most conve-
nient to simply assume that the entire image is the stimulus, that is surely an over-
simplification. Currently there is no universally correct definition of the stimulus
for spatially complex scenes such as images. Therefore when using images with
color appearance models, care should be taken to fully understand the manner in
which the models are applied. CIECAM02, the current CIE-recommended color
appearance model, is designed to be used for color management and imaging;
most applications assume that each pixel can be treated as a separate stimulus.
Spatial models of color appearance, also known as image appearance models, are
designed for use with complex images and will be discussed in more detail in
Section 11.4.
The proximal field can be useful for measuring local contrast phenomena such
as spreading and crispening, which are introduced in Section 11.2.1. Again the
question of defining the locally spatial proximal field becomes very difficult when
considering digital color images. Should the proximal field for any given pixel be
considered to be all of the neighboring pixels, or perhaps just the edge border of
the image itself? While it is an important factor in determining the color appear-
ance of a stimulus, in most real-world applications, the proximal field is assumed
to be the same as the background. Again, the ideal description of the proximal
field would be a measure of both spatial and spectral properties, though these are
often unavailable.
For practical applications, color appearance models tend to simplify and ex-
press the surround in distinct categories: dark, dim, and average. For instance,
movie theaters are usually considered to have a dark surround, while most people
tend to view televisions in a dim surround. A typical office or computing environ-
ment is often said to be an average surround. A more detailed discussion on the
effect of the surround on color appearance is available in Section 11.2.4.
factors that cannot be readily explained using our model of a simplified viewing
field can also have a large effect on the perceived appearance of a stimulus. As we
have established by now, the perception of color cannot be adequately explained
by the physics of light alone. The human observer is the critical component that is
ultimately responsible for any sensation or perception. The human visual system
relies both upon sensory mechanisms and automatic changes based on biological
and physiological factors, as well as on cognitive interpretations.
The full scope of cognitive mechanisms is not yet understood, though we are
able to recognize and model some of the ultimate behavior and responses. One of
the most important cognitive behaviors affecting color appearance is the mode of
appearance. Like much of color itself, the mode of color appearance is a difficult
concept to grasp at first and might be best described with an example.
Oftentimes, we catch a glimpse of colors that at first appear vibrant and highly chromatic and then suddenly become mundane. When walking past a semi-darkened room recently, we were struck by a wall that appeared to be painted with a mural of abstract colored rectangles in various shades of brown and orange. The appearance of the wall was quite striking and highly chromatic. Upon entering the room, it became obvious that the colors were the result of street light, from very orange low-pressure sodium lamps, entering through an open window. Almost immediately, the appearance of the various highly chromatic rectangles changed, and they appeared to be the same white wall illuminated by the orange light.
This scene was captured with a RAW (non-white balanced) camera, and is
shown at the top of Figure 11.6. The approximation of the immediate change
in appearance upon realizing it was the light from the window is shown in the
bottom frame of Figure 11.6. This immediate change in appearance was entirely
cognitive, and it can be attributed to a switch from a surface mode of viewing to
an illumination mode.
There are five modes of viewing that affect color appearance: illuminant, illu-
mination, surface, volume, and film. A brief overview of these modes is presented
here, though a more complete description can be found in the texts by the Optical
Society of America [860], or more recently by Fairchild [302].
The illuminant mode of appearance is color appearance based on the perception of a self-luminous source of light. Color appearance perceptions in the illuminant mode typically involve actual light sources, and more often than not, these are the brightest colors in the viewing field; examples include looking at a traffic light at night or at a light fixture indoors.
The cognitive assumption that the brightest objects in a scene are light sources
can lead to some interesting appearance phenomena when non-illuminant objects
Figure 11.6. The wall of a room that appeared to be a mural of highly chromatic colored squares (top), and the change in appearance once it became obvious that the color was orange light entering from an open window (bottom).
in a scene appear much brighter than the surrounding scene. These objects can sometimes be perceived in an illuminant mode and may be described as glowing. Evans coined the term brilliance for this perception, and Nayatani and Heckaman have more recently touched upon this topic [288, 448, 822].
Another example of an object appearing to glow may occur when there are
fluorescent objects in a scene. Fluorescence is found in materials that absorb en-
ergy at one wavelength and emit that energy as light at much longer wavelengths.
Fluorescent objects can appear to glow, in part because they absorb energy from
non-visible UV portions of the spectrum and emit this as light in the visible por-
tions. Thus, these objects can appear to be much brighter than the objects in the
surrounding scene, and they are often said to take on the appearance of a light
source.
The illumination mode of appearance is similar to the illuminant mode, except
that perceived color appearance is thought to occur as a result of illumination
rather than the properties of the objects themselves. Consider the example given
in Figure 11.6. Upon entering the room, it became evident that it was a single
color wall that was being illuminated through a series of windows. The orange
color was instantly recognized as coming from an illumination source. There are
many clues available to a typical observer of a scene when determining if color is
a result of illumination. These clues include the color of the shadows, the color
of the entire scene, complex color interactions at edges and corners, as well as the
color of the observer themselves.
The perceived color of the wall, as described above, is an example of the
surface mode of appearance. At first it appeared that the surface of the wall
was painted with a colored mural. In this mode of appearance, the color of a
surface is perceived as belonging to the object itself. For example, an observer
may know that the skin of an apple is red, and the apple will maintain a certain
degree of “redness” despite large changes in illumination. This is an example
of discounting-the-illuminant, as we discussed in Chapter 10. Any recognizable
object provides an example of the surface mode of appearance. For this mode of
appearance, we need both a physical surface and an illuminating light source.
The volume mode of appearance is similar in concept to the surface mode,
except that the color is perceived to be a part of a bulk or volume of a transparent
substance. An example of the volume mode of appearance can be found in the
perceived color of liquids, such as a cup of coffee. Coffee is often classified into
categories such as light, medium, and dark, based upon the color of the brewed
coffee itself. The color of the coffee is not thought to be just a characteristic
of the surface, but rather it exists throughout the entire volume. Adding milk to
the coffee increases the scattering and the beverage becomes opaque. This is an
example of a volume color changing into a surface color. Volume color requires
transparency as well as a three-dimensional shape and structure.
The final mode of appearance, called either the aperture or film mode, encom-
passes all remaining modes of appearance. In the film mode, color is perceived
as an aperture that has no connection with any object. One example is a photo-
graphic transparency on a light-box. Any object can switch from surface mode to
aperture mode, if there is a switch in focus from the surface itself. This can be
accomplished purposely, by using a pinhole or a camera lens system. The differ-
ences in our perception between a soft-copy display device and a hard-copy print
may be associated with the different modes of viewing. The print is generally
considered to be viewed as a surface mode, while the self-luminous display may
be viewed in film mode or perhaps in an illuminant mode.
Figure 11.7. An example of simultaneous lightness contrast. Each row of gray patches
has identical luminance.
The rows of this figure are gray patches of identical luminance, with each subsequent row decreasing in luminance. When placed upon a background with a white-to-black gray gradient, the patches in each row nevertheless appear to have different lightnesses (that is, different perceptions of luminance), creating the illusion of a dark-to-light contrast or tone scale.
Similarly, the relative contrast of the tone scale in each of the columns also
appears different. These observed differences are a result of the background in
the viewing field. It is for this reason that we should not consider the perception
of contrast to be a global property of an image measured by the ratio of maximum
white to black, also known as Michelson contrast. Rather, we should consider
contrast to be both a global and local perception. Peli gives an excellent overview
of contrast in complex images, and more recently Calabria and Fairchild measured
contrast perception for color imaging [146, 147, 888].
Simultaneous contrast is not limited to changes in lightness perception, as it
can also cause the color of a stimulus to shift in appearance when the color of
the background changes. This is illustrated in Figure 11.8, where each of the
small patches has an identical XYZ tristimulus value. The change in color of a
stimulus tends to follow the opponent color theory of vision. The patches on the
yellow gradient should appear more blue, while the patches on the blue gradient
should appear more yellowish-green. It is interesting to note in Figure 11.8 that the colors of neighboring patches do not appear very different, though the appearance change across the entire field is readily apparent. This again suggests a very localized perception of appearance. Shevell et al. present some very striking examples of this spatially localized simultaneous contrast; they also developed visual models to predict this spatial relationship [67, 1039]. In general, as the spatial frequency of a stimulus increases, the contrast effect ceases and in some cases reverses: the stimulus may assimilate the background color or spread its own color into the background.
Figure 11.8. An example of simultaneous color contrast. Each of the patches has identical
CIE XYZ tristimulus values.
Figure 11.9. An example of spreading. Only the lines contain color, though that color
appears in the background as well (adapted from [630]).
Figure 11.10. An example of lightness crispening. The color difference between the
two small patches is the same for all backgrounds, but should appear greatest when the
background is close in color to the stimuli.
the stimuli. In the figure, the differences between the small gray patches are the
same for all three backgrounds, but the difference looks the largest on the gray
background. Similar effects can be seen for color patches, known as chromatic
or chroma crispening. More details on lightness and chroma crispening can be
found in Semmelroth [1025] and Moroney [792].
day. Objects tend to appear very bright and colorful on a sunny day and somewhat
subdued on an overcast day. These occurrences can be well described by both the
Hunt effect and the Stevens effect.
The Hunt effect states that as the luminance of a given color increases, its
perceived colorfulness also increases. This effect was first identified in a seminal
study by Hunt on the effects of light and dark adaptation on the perception of
color [504]. A haploscopic matching experiment was performed, similar to that
used to generate the CIE Standard Observers. However, in haploscopic matching,
observers are presented with one viewing condition in their left eye and another
in their right eye.
In the Hunt experiment, observers used the method of adjustment to create matches between stimuli viewed at different luminance levels in each eye. When the adjusting stimulus was presented at low luminance levels, the observers required a significant increase in colorimetric purity to match a stimulus viewed at a very high luminance level. This indicated that the perception of colorfulness is not independent of luminance level. If we think about this in terms of a chromaticity diagram, the perceived chromaticity shifts toward the spectrum locus as the luminance levels increase, or toward the adapted white as luminance levels decrease.
Going back to the sunny day example, the Hunt effect partially explains why
objects appear much more vivid or colorful when viewed in a bright sunny envi-
ronment. To summarize, the Hunt effect states that as absolute luminance levels
increase the colorfulness of a stimulus also increases.
Similar to the Hunt effect, we also see that brightness is a function of chro-
matic content. The perception of brightness or lightness is often erroneously as-
sumed to be a function of luminance level alone. This is not the case and is well
illustrated by the Helmholtz-Kohlrausch effect. The Helmholtz-Kohlrausch effect
shows that brightness changes as a function of the saturation of a stimulus; i.e., as
a stimulus becomes more saturated at constant luminance, its perceived brightness
also increases.
Another way to describe this effect is to say that a highly chromatic stimulus will appear brighter than an achromatic stimulus at the same luminance level. If brightness were truly independent of chromaticity, then this effect would not exist. It is important to note that the Helmholtz-Kohlrausch effect is a function of hue angle as well; it is less noticeable for yellows than for purples, for instance. Essentially, this means that our perception of brightness and lightness is a function of saturation and hue, and not just luminance. Fairchild and Pirrotta published a general review of the Helmholtz-Kohlrausch effect as well as some models for predicting the effect [295].
from the surface of the screen. Flare can be considered stray light that is not part
of the desired stimulus; it has the effect of reducing overall contrast by essentially
increasing the luminance of the black. We should not confuse the removal of flare
with the change in perception caused by the luminance of the surround.
the Hunt and Stevens effects. Spatially structured phenomena, such as crispen-
ing and simultaneous contrast, require models of spatial vision as well as color
appearance.
There are many color appearance models, each designed with specific, and perhaps different, goals in mind. Models that one may come across include CIELAB, Hunt, Nayatani, ATD, RLAB, LLAB, ZLAB, CIECAM97s, and CIECAM02. The interested reader is encouraged to examine Fairchild's thorough treatment of the history and development of many of these models [302]. Most of these models are beyond the scope of this text, though we will examine CIELAB as well as CIECAM02 as simple color appearance predictors. The latter is the current recommended color appearance model from the CIE; it builds upon the strengths of many of the other models mentioned above.
Table 11.1. The CIECAM02 surround parameters.

Surround condition    c      N_c    F
Average               0.69   1.0    1.0
Dim                   0.59   0.9    0.9
Dark                  0.525  0.8    0.8

The surround ratio is computed from the luminance of the surround white $L_{SW}$ and the luminance of the device white $L_{DW}$:
$$S_R = \frac{L_{SW}}{L_{DW}}. \tag{11.3}$$
If the surround ratio is zero, then a dark surround should be chosen; if it is less than 0.2, a dim surround should be selected. An average surround is selected for all other values. The values $c$, $N_c$, and $F$ are then input from Table 11.1. It is possible to linearly interpolate the values between two surround conditions, though it is generally recommended to use the three distinct categories.
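The surround-selection rule above can be sketched as a small Python helper (the function name and dictionary layout are ours, not part of CIECAM02):

```python
# Hypothetical helper: choose CIECAM02 surround parameters (c, Nc, F)
# from the surround ratio SR of Equation (11.3), using the three
# categories of Table 11.1.
def surround_parameters(l_sw, l_dw):
    sr = l_sw / l_dw  # Equation (11.3)
    if sr == 0.0:
        return {"c": 0.525, "Nc": 0.8, "F": 0.8}   # dark
    elif sr < 0.2:
        return {"c": 0.59, "Nc": 0.9, "F": 0.9}    # dim
    else:
        return {"c": 0.69, "Nc": 1.0, "F": 1.0}    # average

# A movie theater (no surround luminance) falls in the dark category.
print(surround_parameters(0.0, 48.0))
```

A living-room television with a little ambient light (say, a surround white of 5 cd/m² against a 48 cd/m² device white) would yield SR ≈ 0.1 and therefore the dim parameters, matching the informal categories described above.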
CIECAM02:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = M_{CAT02} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \tag{11.4}$$
with $M_{CAT02}$ defined as
$$M_{CAT02} = \begin{bmatrix} 0.7328 & 0.4296 & -0.1624 \\ -0.7036 & 1.6975 & 0.0061 \\ 0.0030 & 0.0136 & 0.9834 \end{bmatrix}. \tag{11.5}$$
The degree of adaptation $D$ is then computed from the surround factor $F$ and the adapting luminance $L_A$:
$$D = F\left[1 - \frac{1}{3.6}\,e^{-(L_A + 42)/92}\right]. \tag{11.6}$$
This equation was designed to produce a maximum value of 1.0 for a stimulus viewed at 1000 lux.
The adapted cone signals are then calculated using a simple linear von Kries-
style adaptation transform:
$$R_c = \left[\frac{Y_w D}{R_w} + (1 - D)\right] R, \tag{11.7a}$$
$$G_c = \left[\frac{Y_w D}{G_w} + (1 - D)\right] G, \tag{11.7b}$$
$$B_c = \left[\frac{Y_w D}{B_w} + (1 - D)\right] B. \tag{11.7c}$$
Note that these equations multiply back in the relative luminance of the adapt-
ing white Yw . While typically this is taken to be 100, there are some applications
where this is not the case. Fairchild presents a compelling argument that Yw should
always be replaced with 100.0 in these calculations, especially when using the
chromatic-adaptation transform outside of CIECAM02 [302].
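The chromatic-adaptation step described so far can be sketched in a few lines of Python (function names are ours; we keep the $Y_w$ parameter explicit so it can be fixed at 100 as discussed above):

```python
# A sketch of the CIECAM02 chromatic-adaptation step: the matrix of
# Equation (11.4) followed by the von Kries-style scaling of
# Equations (11.7a-c).
M_CAT02 = [[ 0.7328, 0.4296, -0.1624],
           [-0.7036, 1.6975,  0.0061],
           [ 0.0030, 0.0136,  0.9834]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cat02_adapt(xyz, xyz_w, d, y_w=100.0):
    """Map a stimulus XYZ to adapted signals Rc, Gc, Bc, given the
    adapting white xyz_w and the degree of adaptation d."""
    rgb = mat_vec(M_CAT02, xyz)
    rgb_w = mat_vec(M_CAT02, xyz_w)
    # Equations (11.7a-c): scale each channel toward the adapting white.
    return [(y_w * d / w + 1.0 - d) * s for s, w in zip(rgb, rgb_w)]

# With full adaptation (d = 1), the adapting white itself maps to
# equal signals of Yw in each channel (approximately [100, 100, 100]).
print(cat02_adapt([95.05, 100.0, 108.9], [95.05, 100.0, 108.9], d=1.0))
```

With $D = 0$ the transform reduces to the identity, as expected for no adaptation.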
There are several other factors that can be considered part of the greater chromatic-adaptation transform. Although not explicitly part of the von Kries adaptation, these terms account for changes in chromatic adaptation as a function of the overall luminance level, along with background luminance and induction factors that model some color appearance changes as a function of the background. These factors are calculated using
$$k = \frac{1}{5L_A + 1}, \tag{11.8a}$$
$$F_L = 0.2\,k^4 (5L_A) + 0.1\left(1 - k^4\right)^2 (5L_A)^{1/3}, \tag{11.8b}$$
$$n = \frac{Y_b}{Y_w}, \tag{11.8c}$$
$$N_{bb} = N_{cb} = 0.725\left(\frac{1}{n}\right)^{0.2}, \tag{11.8d}$$
$$z = 1.48 + \sqrt{n}. \tag{11.8e}$$
The adapted RGB values are then converted into cone signals prior to the non-
linear compression. This is done by inverting the MCAT02 matrix to obtain adapted
XY Z tristimulus values and then transforming those into cone signals using the
Hunt-Pointer-Estevez transform, normalized for the reference equal-energy con-
dition. Note that the RGB terminology is used in CIECAM02 rather than LMS
to maintain consistency with older color appearance models. This transform is
given by:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = M_{HPE}\, M_{CAT02}^{-1} \begin{bmatrix} R_c \\ G_c \\ B_c \end{bmatrix}; \tag{11.9a}$$
$$M_{HPE} = \begin{bmatrix} 0.38971 & 0.68898 & -0.07868 \\ -0.22981 & 1.18340 & 0.04641 \\ 0.00000 & 0.00000 & 1.00000 \end{bmatrix}; \tag{11.9b}$$
$$M_{CAT02}^{-1} = \begin{bmatrix} 1.096124 & -0.278869 & 0.182745 \\ 0.454369 & 0.473533 & 0.072098 \\ -0.009628 & -0.005698 & 1.015326 \end{bmatrix}. \tag{11.9c}$$
The cone signals are then processed through a hyperbolic nonlinear compression transform. This transform was designed to behave effectively as a power function over a wide range of luminance levels [302]:
$$R_a = \frac{400\,(F_L R/100)^{0.42}}{27.13 + (F_L R/100)^{0.42}} + 0.1, \tag{11.10}$$
with $G_a$ and $B_a$ computed analogously. These compressed values $R_a G_a B_a$ are then
combined into preliminary opponent signals:
$$a = R_a - \frac{12\,G_a}{11} + \frac{B_a}{11}, \tag{11.11a}$$
$$b = \frac{1}{9}\left(R_a + G_a - 2B_a\right). \tag{11.11b}$$
Hue angle is then calculated from these opponent signals in a manner identical to the calculation in CIELAB:
$$h = \tan^{-1}\left(\frac{b}{a}\right). \tag{11.12}$$
$$e_t = \frac{1}{4}\left[\cos\!\left(h\,\frac{\pi}{180} + 2\right) + 3.8\right]. \tag{11.13}$$
This eccentricity factor is necessary because the four physiological hues, red, green, yellow, and blue, are not perceptually equidistant from each other. This is especially true between the blue and red hues, as there is a large range of perceptible purple colors in that range. Note that, for this calculation, the cosine function expects radians while the hue angle is represented in degrees; hence the $\pi/180$ term.
The data in Table 11.2 are then used to determine hue quadrature through a
linear interpolation between the calculated hue value and the nearest unique hue:
$$H = H_i + \frac{100\,(h - h_i)/e_i}{(h - h_i)/e_i + (h_{i+1} - h)/e_{i+1}}. \tag{11.14}$$
with the absolute luminance level function FL . This accounts for both the Stevens
effect as well as the changes in perceived contrast as a function of the surround:
$$Q = \frac{4}{c}\,\sqrt{\frac{J}{100}}\,(A_w + 4)\,F_L^{0.25}. \tag{11.17}$$
$$a_s = s\,\cos(h), \tag{11.21e}$$
$$b_s = s\,\sin(h). \tag{11.21f}$$
If we have the hue angle h, then we can proceed; otherwise, we can calculate it from the hue quadrature H and Table 11.2.
From there, we calculate the temporary magnitude value t and the eccentricity
value et . It is important to note that for the remainder of these equations we use
the parameters from the output viewing conditions:
$$t = \left(\frac{C}{\sqrt{J/100}\,\left(1.64 - 0.29^n\right)^{0.73}}\right)^{1/0.9}, \tag{11.26a}$$
$$e_t = \frac{1}{4}\left[\cos\!\left(h\,\frac{\pi}{180} + 2\right) + 3.8\right]. \tag{11.26b}$$
$$A = A_w \left(\frac{J}{100}\right)^{1/(cz)}, \tag{11.27a}$$
$$p_1 = \frac{(50000/13)\,N_c N_{cb}\,e_t}{t}, \tag{11.27b}$$
$$p_2 = \frac{A}{N_{bb}} + 0.305, \tag{11.27c}$$
$$p_3 = \frac{21}{20}. \tag{11.27d}$$
$$h_r = h\,\frac{\pi}{180}. \tag{11.28}$$
Note that if any of the values of Ra − 0.1, Ga − 0.1, Ba − 0.1 are negative, then the
corresponding RGB must be made negative.
At this point, we leave the Hunt-Pointer-Estevez cone space and move into
the sharpened cone space for the inverse chromatic-adaptation transform:
$$\begin{bmatrix} R_c \\ G_c \\ B_c \end{bmatrix} = M_{CAT02}\, M_{HPE}^{-1} \begin{bmatrix} R \\ G \\ B \end{bmatrix}; \tag{11.33a}$$
$$M_{HPE}^{-1} = \begin{bmatrix} 1.910197 & -1.112124 & 0.201908 \\ 0.370950 & 0.629054 & -0.000008 \\ 0.000000 & 0.000000 & 1.000000 \end{bmatrix}. \tag{11.33b}$$
Finally we can transform the adapted signals back into CIE XYZ tristimulus
values. These values can then be further transformed into a device-specific space
for accurate color appearance reproduction between devices:
$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = M_{CAT02}^{-1} \begin{bmatrix} R \\ G \\ B \end{bmatrix}. \tag{11.35a}$$
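As a sanity check on the matrices quoted above, the forward matrix of Equation (11.5) and the inverse of Equation (11.9c) should compose to the identity; a short round trip in Python (a check we add for illustration, not part of the model itself):

```python
# Round-trip a sample stimulus through the CAT02 matrix and its
# quoted inverse; the result should recover the input to within the
# precision of the printed coefficients.
M_CAT02 = [[ 0.7328, 0.4296, -0.1624],
           [-0.7036, 1.6975,  0.0061],
           [ 0.0030, 0.0136,  0.9834]]
M_CAT02_INV = [[ 1.096124, -0.278869, 0.182745],
               [ 0.454369,  0.473533, 0.072098],
               [-0.009628, -0.005698, 1.015326]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

xyz = [19.01, 20.00, 21.78]           # a sample stimulus
rgb = mat_vec(M_CAT02, xyz)           # Equation (11.4)
xyz_back = mat_vec(M_CAT02_INV, rgb)  # Equation (11.35a)
print([round(v, 4) for v in xyz_back])
```

Checks of this kind are worthwhile whenever a model's matrices are transcribed by hand: a single swapped sign or digit would show up immediately as a failed round trip.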
of chromatic adaptation and color constancy serve some of the same purposes in
image rendering and certainly provide some of the critical groundwork for recent
research in image appearance modeling.
This section will focus on two models that have their pedigree in CIE col-
orimetry and color appearance modeling: S-CIELAB and iCAM. S-CIELAB is
a simple spatial extension to the CIELAB color space, and was designed to mea-
sure color differences of complex images. Fairchild and Johnson based iCAM on
S-CIELAB, combined with a spatially variant color appearance model [291,292].
11.4.1 S-CIELAB
The S-CIELAB model was designed as a spatial pre-processor to the standard CIE
color difference equations [1292]. The intention of this model was to account for
color differences in complex color stimuli, such as those between a continuous
tone image and its half-tone representation.
The spatial pre-processing uses separable convolution kernels to approximate the contrast-sensitivity functions (CSF) of the human visual system. These kernels behave as a band-pass function on the luminance channel and as low-pass functions on the chromatic opponent channels. It is important to emphasize that the CSF kernels are closely tied to the color space in which they were designed to be applied. The opponent color space chosen for S-CIELAB is based upon color appearance data from a series of visual experiments on pattern color performed by Poirson and Wandell [908, 909]. This space and the corresponding measured contrast-sensitivity functions were both fit to the visual data and should be used together.
For S-CIELAB, the CSF serves to remove information that is imperceptible to
the human visual system and to normalize color differences at spatial frequencies
that are perceptible. This is especially useful when comparing images that have
been half-toned for printing. For most common viewing distances, these dots are
not resolvable and tend to blur into the appearance of continuous colors. A pixel-
by-pixel color-difference calculation between a continuous image and a half-tone
image results in extremely large errors, while the perceived difference can actually
be small. The spatial pre-processing stage in S-CIELAB blurs the half-tone image
so that it more closely resembles the continuous tone image.
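The pre-processing idea can be sketched in a few lines of NumPy. This is an illustrative sketch, not the calibrated S-CIELAB implementation: the Gaussian widths are placeholder values, and the input is assumed to already be in an opponent representation with the achromatic channel first.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel whose weights sum to 1, so the interior of a
    flat patch passes through the filter unchanged."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur_separable(channel, sigma):
    """Separable convolution: filter each row, then each column."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, channel)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, tmp)

def spatial_preprocess(opponent_image, sigmas=(1.0, 2.0, 3.0)):
    """Blur each opponent channel before computing color differences.
    The achromatic channel is blurred least and blue-yellow most; the
    sigmas here are illustrative, not the published S-CIELAB values."""
    return np.stack([blur_separable(opponent_image[..., i], s)
                     for i, s in enumerate(sigmas)], axis=-1)
```

Because each kernel sums to one, a solid patch is left untouched away from the image border, so a pixel-wise color difference computed after this stage reduces to an ordinary color-difference calculation.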
The original S-CIELAB implementation uses separable convolution kernels
to perform the spatial filtering, along with the traditional CIELAB color dif-
ference equations. Johnson and Fairchild introduced a slight refinement to this
model based upon spatial filtering in the frequency domain and the use of the
CIEDE2000 color-difference metrics [555]. This implementation has been folded into the iCAM framework discussed in the following section.
[Figure: the S-CIELAB processing flow. Input Image → Convert to AC1C2 Opponent Color Space → Spatial Filter → Convert to XYZ → Calculate CIELAB coordinates.]
Figure 11.12. The convolution kernels of S-CIELAB (achromatic, red-green, and yellow-blue) represented in the frequency domain, plotted as sensitivity against spatial frequency in cycles per degree. Notice the general approximation of the human contrast-sensitivity functions.
Each convolution kernel is normalized so that its weights sum to 1.0. This assures that when solid patches are filtered and used as stimuli, the S-CIELAB model reduces to a standard color-difference equation.
The final kernel size is determined to be a function of the viewing conditions,
primarily the resolution of the display, the number of pixels in the image, and
the viewing distance. Equation (11.38) shows this relationship for an image size
expressed in pixels, a display resolution expressed in dots-per-inch, and a viewing
distance expressed in inches. The spatial frequency of the kernel size is expressed
as a function of samples/pixels per degree of visual angle. In practice, the spread
of the Gaussian σi is multiplied by samples per degree to generate the convolution
kernel:

samples per degree = tan^(−1)( ( size / dpi ) / distance ) · ( 180 / π ) .  (11.38)
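The relationship in (11.38) can be transcribed directly; the function and argument names below are ours, while the formula itself follows the text verbatim.

```python
import math

def samples_per_degree(size_px, dpi, distance_in):
    # Transcription of Equation (11.38): the angle subtended by size_px
    # pixels rendered at dpi dots per inch, viewed from distance_in
    # inches, converted from radians to degrees.
    return math.atan((size_px / dpi) / distance_in) * 180.0 / math.pi
```

For example, a 512-pixel image at 72 dpi viewed from 18 inches gives roughly 21.6; the Gaussian spread σ_i is then multiplied by this value to generate the convolution kernel.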
We can get a general idea of the type of spatial filtering performed by these
convolution kernels by transforming them into the frequency domain using a 1D
Fourier transform. If we do this for a large enough kernel, greater than 120 sam-
ples per degree, we can see the full extent of the filtering. This is shown in Fig-
ure 11.12. We can see that the achromatic filter behaves in a band-pass nature,
while the two chromatic filters are low-pass in nature. The achromatic filter essen-
tially blurs less than the red-green, which in turn blurs less than the blue-yellow.
This follows the generally known behavior of the contrast-sensitivity functions of
the human visual system. Thus the S-CIELAB pre-processing provides a rela-
tively simple way to take into account the spatial properties of the visual system,
while also relying on the strength of the CIELAB color difference equations.
[Figure: left, the spectral responsivities of the CIECAM02 chromatic-adaptation transform, plotted as relative response against wavelength (390-690 nm); right, the CIECAM02 luminance-level factor FL as a function of adapting luminance.]
The iCAM framework takes colorimetric images as input for the stimulus and surround in absolute luminance units.
Relative colorimetry can be used, along with an absolute luminance scaling or ap-
proximation of absolute luminance levels. Images are typically specified in terms
of CIE XYZ tristimulus values, or another well-understood device-independent
RGB space such as sRGB.
The adapting stimulus used for the chromatic-adaptation transform is then cal-
culated by low-pass filtering of the CIE XYZ image stimulus itself. This stimulus
can also be tagged with absolute luminance information to calculate the degree of
chromatic adaptation, or the degree of adaptation can be explicitly defined. A sec-
ond low-pass filter is performed on the absolute luminance channel, CIE Y, of the
image and is used to control various luminance-dependent aspects of the model.
This includes the Hunt and Stevens effects, as well as simultaneous contrast ef-
fects caused by the local background. This blurred luminance image representing
the local neighborhood can be identical to the low-pass filtered image used for
chromatic adaptation, but in practice it is an image that is less blurred.
In an ideal implementation, another Gaussian filtered luminance image of sig-
nificantly greater spatial extent representing the entire surround is used to control
the prediction of global image contrast. In practice, this image is generally taken
to be a single number indicating the overall luminance of the surrounding viewing
conditions, similarly to CIECAM02. In essence, this can be considered a global
contrast exponent that follows the well-established behavior of CIECAM02 for
predicting the Bartleson and Breneman equations.
The specific low-pass filters used for the adapting images depend on viewing
distance and application. A typical example might be to specify a chromatic-
adaptation image as a Gaussian blur that spans 20 cycles per degree of visual
angle, while the local surround may be a Gaussian blur that spans 10 cycles per
degree [1267]. Controlling the extent of the spatial filtering is not yet fully un-
derstood, and it is an active area of research. For instance, recent research in
high dynamic range rendering has shown that the low-pass filtering for the local
chromatic adaptation and contrast adaptation may be better served with an edge-
preserving low-pass function, such as the bilateral filter as described by Durand
and Dorsey, and also later in Section 17.5 [270, 625, 626]. Research in HDR ren-
dering is one example of application-dependence in image appearance modeling.
A strong local chromatic adaptation might be generally appropriate for predicting
actual perceived image-differences or image-quality measurements, but inappro-
priate for image-rendering situations where generating a pleasing image is the
desired outcome.
A step-by-step process of calculating image appearance attributes using the
iCAM framework as described above follows. The first stage of processing in
iCAM is calculating chromatic adaptation, as is the first stage for most gen-
eral color appearance models. The chromatic-adaptation transform embedded in
CIECAM02 has been adopted in iCAM, since it has been found to have excellent
performance with all available visual data. This transform is a relatively sim-
ple, and easily invertible, linear chromatic-adaptation model amenable to image-
processing applications.
The general CIECAM02 chromatic-adaptation model is a linear von Kries
normalization of sharpened RGB image signals from a sharpened RGB adaptation
white. In traditional chromatic-adaptation and color appearance modeling that we
have discussed up to this point, the adapting white is taken to be a single set of
tristimulus values, often assumed to be the brightest signal in the image or the
XYZ tristimulus values of the scene measured off a perfect reflecting diffuser.
It is at this stage that image appearance deviates from traditional color ap-
pearance models. Instead of a single set of tristimulus values, the adapting signal
is taken to be the low-pass filtered adaptation image at each pixel location. This
means that the chromatic adaptation is actually a function of the image itself.
These adapting white signals in the blurred image can also be modulated by the
global white of the scene, if that is known.
The actual calculations of the adapted signals are identical to those in
CIECAM02, but will be presented again here for clarity. The von Kries nor-
malization is modulated with a degree-of-adaptation factor D that can vary from
0.0 for no adaptation to 1.0 for complete chromatic adaptation. The calculation of
D is also identical to the CIECAM02 formulation and can be used in iCAM as a
function of adapting luminance LA for various viewing conditions. Alternatively,
as in CIECAM02, the D factor can be established explicitly.
The chromatic-adaptation model is used to compute pixel-wise corresponding
colors for CIE Illuminant D65 that are then used in the later stages of the iCAM
model. It should be noted that, while the chromatic-adaptation transformation is
identical to that in CIECAM02, the iCAM model is already significantly different
since it uses the blurred image data itself to spatially-modulate the adaptation
white point. Essentially, any given pixel is adapted to the identical pixel location
in the blurred image.
( R  G  B )^T = M_CAT02 ( X  Y  Z )^T ;  (11.39a)

D = F [ 1 − (1/3.6) exp( (−L_A − 42) / 92 ) ] ;  (11.39b)

R_c(x, y) = [ D R_D65 / R_W(x, y) + (1 − D) ] R(x, y) ;  (11.39c)

G_c(x, y) = [ D G_D65 / G_W(x, y) + (1 − D) ] G(x, y) ;  (11.39d)

B_c(x, y) = [ D B_D65 / B_W(x, y) + (1 − D) ] B(x, y) .  (11.39e)
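The per-pixel adaptation of (11.39) can be sketched as follows; the function names are ours, and a single channel is shown since the three channels are treated identically.

```python
import numpy as np

def degree_of_adaptation(L_A, F=1.0):
    # Degree of chromatic adaptation D from Equation (11.39b);
    # F = 1.0 is the CIECAM02 value for an average surround.
    return F * (1.0 - (1.0 / 3.6) * np.exp((-L_A - 42.0) / 92.0))

def adapt_channel(ch, ch_white, ch_d65, D):
    # Per-pixel von Kries scaling, Equations (11.39c)-(11.39e): every
    # pixel is normalized by the blurred adapting white at the same
    # location and scaled toward the D65 reference.
    return (D * ch_d65 / ch_white + (1.0 - D)) * ch
```

With complete adaptation (D = 1) and a local white already equal to the D65 white, the channel passes through unchanged, as expected.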
The use of the blurred XYZ image as a spatially modulated adapting white
point implies that the content of an image itself, as well as the color of the overall
illumination, controls our state of chromatic adaptation. In this manner, iCAM
behaves similarly, with regard to color constancy, to the spatial modulations of the
Retinex approach to color vision. This behavior can result in a decrease in overall
colorfulness or chroma for large uniform areas, such as a blue sky. While this may
be the correct prediction for the overall image appearance, it may produce unde-
sirable results when using this type of model for image-rendering applications.
Another example of the localized spatial behavior inherent in an image ap-
pearance model is the modulation of local and global contrast using the absolute
luminance image and surround luminance image or value. This modulation is
accomplished using the luminance level control function FL from CIECAM02.
This function is essentially a compressive function that slowly varies with abso-
lute luminance; it has been shown to predict a variety of luminance-dependent
appearance effects in CIECAM02 and earlier models [302]. As such, it was also
adopted for the first incarnation of iCAM. However, the global manner in which
the FL factor is used in CIECAM02 and the spatially localized manner used in
iCAM is quite different; more research is necessary to establish its overall appli-
cability for image appearance.
The adapted RGB signals, which have been converted to corresponding col-
ors for CIE Illuminant D65, are transformed into LMS cone responses. These
LMS cone signals are then compressed using a simple nonlinear power function
that is modulated by the per-pixel FL signal. Then, the cone signals are trans-
formed into an opponent color space, approximating the achromatic, red-green,
and yellow-blue space analogous to higher-level encoding in the human visual
system. These opponent channels are necessary for constructing a uniform per-
ceptual color space and correlates of various appearance attributes. In choosing
this transformation for the iCAM framework, simplicity, accuracy, and applica-
bility to image processing were the main considerations.
The uniform color space chosen was the IPT space previously published by
Ebner and Fairchild and introduced in Section 8.7.4 [274]. The IPT space was
J = I ,  (11.41a)

C = ( P^2 + T^2 )^(1/2) ,  (11.41b)

h = tan^(−1)( P / T ) ,  (11.41c)

Q = F_L^(1/4) J ,  (11.41d)

M = F_L^(1/4) C .  (11.41e)
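These correlates translate directly into code; the sketch below uses the quadrant-aware arctan2 for the hue angle of (11.41c), and the function name is ours.

```python
import numpy as np

def appearance_correlates(I, P, T, F_L):
    # Appearance correlates from IPT coordinates, Equations (11.41a-e).
    J = I                              # lightness
    C = np.hypot(P, T)                 # chroma
    h = np.degrees(np.arctan2(P, T))   # hue angle, quadrant-aware tan^-1(P/T)
    Q = F_L ** 0.25 * J                # brightness
    M = F_L ** 0.25 * C                # colorfulness
    return J, C, h, Q, M
```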
Figure 11.14. The spatial filters (achromatic, red-green, and yellow-blue) used to calculate image differences in the iCAM framework, plotted as sensitivity against spatial frequency in cycles per degree.
The spatial filters are designed to be used in the frequency domain, rather than
as convolution kernels in the spatial domain. The general form for these filters is

CSF_lum( f ) = a f^c e^(−b f) ,
CSF_chrom( f ) = a_1 e^(−b_1 f^c_1) + a_2 e^(−b_2 f^c_2) .  (11.43)

The parameters a, b, and c in (11.43) are set to 0.63, 0.085, and 0.616,
respectively, for the luminance CSF for application on the Y channel. In (11.43),
f is the two-dimensional spatial frequency defined in terms of cycles per degree
(cpd) of visual angle. To apply these functions as image-processing filters, f is
described as a two-dimensional map of spatial frequencies of identical size to
the image itself. For the red-green chromatic CSF, applied to the C1 dimension,
the parameters (a1 , b1 , c1 , a2 , b2 , c2 ) are set to (91.228, 0.0003, 2.803, 74.907,
0.0038, 2.601). For the blue-yellow chromatic CSF, applied to the C2 dimension,
the parameters are set to (5.623, 0.00001, 3.4066, 41.9363, 0.083, 1.3684). The
resulting spatial filters for a single dimension are shown in Figure 11.14.
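With the parameters above, the filters can be evaluated as follows. The analytic forms are a sketch consistent with the quoted parameters and the described band-pass/low-pass behavior (in particular, the luminance form exceeds 1.0 between roughly 3 and 15 cpd), and the filters are normalized before use in practice.

```python
import numpy as np

def csf_lum(f, a=0.63, b=0.085, c=0.616):
    # Band-pass luminance CSF; assumed form a * f**c * exp(-b * f),
    # consistent with the quoted parameters.
    return a * np.power(f, c) * np.exp(-b * f)

def csf_chrom(f, a1, b1, c1, a2, b2, c2):
    # Low-pass chromatic CSF as a sum of two exponentials (assumed form).
    return (a1 * np.exp(-b1 * np.power(f, c1))
            + a2 * np.exp(-b2 * np.power(f, c2)))
```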
The band-pass nature of the luminance contrast-sensitivity function, as well
as the low-pass nature of the two chromatic channels can be seen in Figure 11.14.
These filters should look similar to those used in S-CIELAB (see Figure 11.12),
though slightly smoother. The smoothness is a result of defining the filters in the
frequency domain rather than as discrete convolution kernels of limited extent.
Two other important features can be seen with regard to the luminance CSF: its
behavior at 0 cycles per degree (the DC component) and that the response goes
above 1.0.
Care must be taken with the DC component when performing spatial filter-
ing in the frequency domain. The DC component contains the mean value of the
image for that particular channel. As with S-CIELAB, we would like the image-
difference metric to collapse down into a traditional color-difference metric for
solid patches. For this to happen, it is important that the mean value of any given
channel does not change. The luminance spatial filter described by (11.43) and
shown in Figure 11.14 goes to zero at the DC component. Therefore, it is neces-
sary to first subtract the mean value of the luminance channel, apply the spatial
filter, and then add the mean value back to the image.
The other important feature of the luminance CSF is that it goes above 1.0 for
a band of frequencies ranging roughly between 3 and 15 cycles per degree. This
is where the visual system is most sensitive to color differences, and, as such,
these regions are more heavily weighted. Care must be taken when applying
a frequency-domain filter that goes above 1.0, as this can often lead to severe
ringing. However, when the filter is sufficiently broad, this is often not a problem.
When the filter itself becomes very narrow, such as when applied to a large high-
resolution image, it may be necessary to normalize the luminance CSF such that
the maximum is at 1.0. Even then, care must be taken to minimize ringing artifacts
when using a narrow image-processing filter.
The images are filtered by first transforming into the frequency domain by
use of a Fast Fourier Transform, FFT, multiplying with the CSF filter, and then
inverting the FFT to return to the spatial domain:
FiltImage_lum = FFT^(−1){ ( FFT{ Image − mean(Image) } ) · CSF_lum } + mean(Image) ,  (11.44a)

FiltImage_chrom = FFT^(−1){ ( FFT{ Image } ) · CSF_chrom } .  (11.44b)
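The two filtering paths of (11.44) can be sketched directly with NumPy's FFT routines; the function names are ours, and the CSF is assumed to be supplied as a frequency-domain array of the same shape as the image.

```python
import numpy as np

def filter_luminance(img, csf):
    # Equation (11.44a): subtract the mean, filter, add the mean back,
    # so the DC level of the channel is preserved even though the
    # luminance CSF is zero at the DC component.
    m = img.mean()
    return np.real(np.fft.ifft2(np.fft.fft2(img - m) * csf)) + m

def filter_chroma(img, csf):
    # Equation (11.44b): the low-pass chromatic CSFs pass the DC
    # component, so no mean subtraction is needed.
    return np.real(np.fft.ifft2(np.fft.fft2(img) * csf))
```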
that the CSF functions be allowed to change shape as a function of the adapting
field, though this is clearly an indicator of the existence of multi-scale mecha-
nisms in the visual system.
Since spatial frequency adaptation cannot be avoided in real-world viewing
conditions, several models of spatial frequency adaptation or contrast-gain con-
trol have been described for practical applications [229, 290, 310, 311, 884, 1220].
These models generally alter the nature or behavior of the CSF based upon either
assumptions on the viewing conditions, or based upon the information contained
in the images themselves. A simplified image-dependent mechanism for spatial
frequency adaptation is given by
CSF_adapt = CSF / ( α · FFT(Image) + 1 ) ,  (11.45a)

α = 1 / ( D · X_size · Y_size ) .  (11.45b)
This model essentially normalizes the contrast-sensitivity function based upon the
amount of information present in the image itself. It can almost be thought of as
a von Kries-style frequency adaptation. More details on the development of this
spatial frequency adaptation model can be found in [294].
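A minimal sketch of (11.45) follows; the text writes FFT(Image) without specifying, so the magnitude of the spectrum is used here as an assumption so that the adapted CSF stays real-valued. Function and variable names are ours.

```python
import numpy as np

def adapted_csf(csf, image, D=1.0):
    # Image-dependent spatial frequency adaptation, Equations (11.45a-b).
    # The CSF is divided down where the image spectrum has energy,
    # lowering sensitivity at frequencies the observer has adapted to.
    ysize, xsize = image.shape
    alpha = 1.0 / (D * xsize * ysize)
    return csf / (alpha * np.abs(np.fft.fft2(image)) + 1.0)
```

For a uniform image, only the DC term of the spectrum is nonzero, so sensitivity is reduced at DC and left untouched elsewhere.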
When using this type of model, the frequency representation of the image
itself is typically blurred to represent spatial frequency channels. This blurring
can be done by performing a Gaussian convolution on the frequency image, which
can be thought of as multiplying the spatial image by a Gaussian envelope. The
scaling function α converts the frequency representation into absolute units of
contrast at each spatial frequency.
The D factor is similar to the degree of chromatic-adaptation factor found in
CIECAM02 and is set to 1.0 for complete spatial frequency adaptation. Spatial
frequency adaptation is important when calculating image differences between
images that may have regular periodic patterns, such as stochastic half-tone patterns or JPEG-compressed images with an 8-pixel blocking pattern. The regular period
of these patterns reduces the visual sensitivity to the spatial frequency of the pat-
tern itself, thus making the pattern less visible.
One potential benefit of spatial frequency adaptation is the ability to predict
visual masking without the need for multi-scale approaches. If a masking fre-
quency is present in an image, the contrast-sensitivity function for that particular
frequency region will become less sensitive.
Figure 11.15. Original image encoded in sRGB. (Photo courtesy of Ron Brinkmann.)
The next thing to notice is that both the contrast and the overall colorfulness
of the reproduced image are markedly decreased. This is because our original
was viewed at a very low luminance level in a darkened environment. When we
view the printed reproduction in an environment with a very high luminance level
and surround, the Hunt and Stevens effects, as well as the surround effects, will
increase the perceived contrast and colorfulness. It is necessary to decrease the
actual image contrast to account for these changes in the perceived contrast.
Figure 11.16. Image processed in CIECAM02 from a D65 monitor viewed in a dark sur-
round to a D50 print viewed in an average surround. (Photo courtesy of Ron Brinkmann.)
This example is meant to illustrate the predictive power of using a color ap-
pearance model for cross-media color reproduction. In general, CIE colorimetric
matches will not be adequate for reproducing accurate colors across a wide va-
riety of viewing conditions. In these situations, color appearance matches are a
much more desirable outcome. CIECAM02 was designed specifically for color
management applications, such as the example described here, and is an excellent
choice for many color reproduction or rendering systems.
Figure 11.18. Processing the image through the CIECAT02 global chromatic-adaptation
transform using the tristimulus values of the building itself as the adapting white.
Figure 11.19. The low-pass filtered image used in a per-pixel chromatic-adaptation trans-
form.
The image is still very blue. Since we know that the building is white, we can
take the CIE XYZ tristimulus values averaged across a number of pixels in the
building and use those as our adapting white in a standard CIECAT02 chromatic-
adaptation transform. The results of this calculation are shown in Figure 11.18.
We can see immediately that the building does indeed look white, but at the same
time the sky looks far too blue. This is because the sky is also being adapted by
the orange illumination, when in reality an observer would adapt locally when
viewing this scene.
In a situation like this, we would like to use a local chromatic-adaptation trans-
form, similar to that used in the iCAM image appearance framework. This can be
Figure 11.20. The adapted image using a per-pixel chromatic-adaptation transform with
complete adaptation to the low-passed image.
accomplished by using (11.39), along with the blurred adapting image shown in
Figure 11.19. The results of this calculation are shown in Figure 11.20.
We can see immediately that the resulting image shows some of the desired
localized adaptation results, as the building looks mostly white while the sky looks
less blue. Unfortunately, there are also severe haloing artifacts around the edges
of the image, as well as undesirable color de-saturation in areas such as the roof.
It would be better to combine some aspects of the global chromatic-adaptation
transform with some aspects of the per-pixel chromatic adaptation. This can be
accomplished by modifying the adaptation equations slightly to include a global
adaptation factor. The modified chromatic adaptation formulae are then
R_c(x, y) = [ D R_D65 / ( R_W(x, y) + R_g ) + (1 − D) ] R(x, y) ,  (11.46a)

G_c(x, y) = [ D G_D65 / ( G_W(x, y) + G_g ) + (1 − D) ] G(x, y) ,  (11.46b)

B_c(x, y) = [ D B_D65 / ( B_W(x, y) + B_g ) + (1 − D) ] B(x, y) .  (11.46c)
Figure 11.21. The adapted image using a combination of the localized per-pixel
chromatic-adaptation transform as well as a global RGBg equal energy white.
Figure 11.22. The adapted image using a combination of the localized per-pixel
chromatic-adaptation transform as well as a global RGBg derived from the tristimulus val-
ues of the building itself. In essence, a combination of Figure 11.18 and Figure 11.20.
Figure 11.21 shows the result when the image was locally adapted to the low-passed image shown in Figure 11.19 and
globally adapted to an equal energy illumination, where RGBg equaled 1.0. Fig-
ure 11.22 was calculated with the same localized adaptation, but with the global
white point set to the tristimulus values of the building itself, the same tristimulus
values used to calculate Figure 11.18.
We can see from these results that combining the per-pixel chromatic-
adaptation transform with the global adaptation based on (11.46) allows for a
wide variety of white balancing effects.
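The modified adaptation of (11.46) differs from (11.39) only by the global term in the denominator; a one-channel sketch (function names ours) makes the effect of that term explicit.

```python
def adapt_channel_mixed(ch, ch_white_local, ch_g, ch_d65, D=1.0):
    # Per-pixel adaptation with a global term, Equations (11.46a-c).
    # Adding the global white ch_g to the local blurred white tempers
    # the haloing and de-saturation of purely local adaptation; a larger
    # ch_g pulls the result toward a purely global white balance.
    return (D * ch_d65 / (ch_white_local + ch_g) + (1.0 - D)) * ch
```

Setting ch_g to zero recovers the purely local transform of (11.39c); setting the local white to zero recovers a purely global von Kries scaling.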
The noise image has an average color difference of about 1.05 units from the original, while the red arch image has
Figure 11.24. The same image as Figure 11.23 altered by changing the color of the arch.
(Photo courtesy of Ron Brinkmann.)
an average color difference of about 0.25 units. This means that the noise image,
though barely perceptible, has on average a four times larger color difference.
We can process these images through the iCAM image difference framework,
using (11.43) and (11.44), for a viewing distance corresponding to 80 cycles per degree.
Figure 11.25. The same image as Figure 11.23 altered by adding noise such that the
colorimetric error is the same as Figure 11.24. (Photo courtesy of Ron Brinkmann.)
When we do that, the mean color difference in IPT space for the red arch image
becomes approximately 0.75, while the mean color difference for the noise image
becomes 0.25. These units have been scaled by 100 on the I dimension, and 150
on the P and T dimensions, to match the units of CIELAB [274]. We see that by
applying the spatial filters as a pre-processing stage, the image difference calcu-
lations match up with what we would intuitively expect, while a color difference
calculation without the filters does not.
Part III
Digital Color
Imaging
Chapter 12
Image Capture
Previous chapters presented the theory of light propagation and the perception
of color, as well as its colorimetric representation. The final part of this book
details how man-made devices capture and display color information in the form
of images.
When devices have sensors that individually respond to the same range of
wavelengths of the electromagnetic spectrum to which human sensors are re-
sponsive, they generate images that are similar to what humans perceive when
they view their environment. Such devices enable the preservation of scenes of
value in people’s lives, in the form of images and movies. Images and videos
are also used for art, for entertainment, and as information sources. This chapter
presents image-capture techniques for electromagnetic radiation in the visible part
of the spectrum. Although we focus predominantly on digital cameras and the ra-
diometry involved with image capture, we also briefly describe photographic film,
holography, and light field techniques. We begin with a description of the optical
imaging process which enables images to be projected onto sensors.
An image starts with a measurement of optical power incident upon the cam-
era’s sensor. The camera’s response is then correlated to the amount of incident
energy for each position across the two-dimensional surface of its sensor. This is
followed by varying levels of image processing, dependent on the type of camera
as well as its settings, before the final image is written to file. A typical diagram of
the optical path of a digital single-lens reflex camera (DSLR) is shown in Figure 12.1.
The main optical components of a DSLR camera are:
Lens. The lens focuses light onto an image plane, where it can be recorded.
Mirror. The mirror reflects light into the optical system for the view-finder. This ensures that the image seen through the view-finder is identical to the image that will be captured by the sensor. During image capture, the mirror is temporarily lifted to allow light to reach the sensor; in that position, it also blocks stray light from entering the camera through the view-finder.

Figure 12.1. The components of a typical digital single-lens reflex camera: the objective lens, main and sub mirrors, shutter, autofocus module, exposure-control sensor, penta-prism with finder screen and eyepiece, and the sensor stack at the image plane (infrared-rejection filter, optical low-pass filter, cover glass, micro-lens array, color filter array, sensor, and package) (after [1141]).
View-finder. Light reflected off the mirror is passed through a penta-prism to di-
rect it toward the finder eyepiece.
Infrared rejection filter. Since humans are not sensitive to infrared light, but cam-
era sensors are, an infrared rejection filter is typically present to ensure that
the camera only responds to light that can be detected by the human visual
system.
Optical low-pass filter. This filter ensures that high frequencies in the scene are
removed before spatial sampling takes place, in cases where these frequen-
cies would cause aliasing.
Micro-lens array. Since for each picture element (pixel) the light-sensitive area is
smaller than the total area of the pixel, a micro-lens can help ensure that all
light incident upon the pixel surface is directed towards the light-sensitive
part. A micro-lens therefore improves the efficiency of the camera.
Color filter array. To enable a single sensor to capture a color image, the sensor
is fitted with a color filter array, so that each pixel is sensitive to its own
set of wavelengths. The pattern repeats itself over small clusters of pixels,
of, for instance, 2 × 2 or 4 × 4 pixels. The camera's firmware will perform
interpolation (demosaicing) to reconstruct a color image. The nature of the
filter, as well as the signal processing, determine the primaries and the white
point of the camera.
Sensor. The sensor is the light sensitive device that is placed in the image plane.
It records the incident number of photons for each of its pixels and converts
it to charge. After an exposure is complete, the charge is converted, pixel
by pixel, to a voltage and is then quantized. The camera’s firmware then ap-
plies a variety of image processing algorithms, such as demosaicing, noise
reduction, and white balancing.
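The demosaicing step mentioned above can be illustrated with a deliberately crude sketch for an assumed RGGB Bayer layout; real camera firmware uses far more sophisticated, edge-aware interpolation, and the function name is ours.

```python
import numpy as np

def demosaic_nearest(mosaic):
    # Crude demosaic for an RGGB Bayer mosaic: each 2x2 block yields one
    # RGB output pixel, with the two green samples averaged. Halves the
    # resolution rather than interpolating missing samples in place.
    r = mosaic[0::2, 0::2]
    g = (mosaic[0::2, 1::2] + mosaic[1::2, 0::2]) / 2.0
    b = mosaic[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)
```

The spectral transmittances of the filter elements, together with this reconstruction step, are what determine the primaries and white point of the camera.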
Some camera types, such as point-and-shoot cameras, have no mirrors, and in-
stead have either an electronic view-finder, or have separate optics for the view-
finder. Cameras for motion imaging largely consist of the same components, bar
the mirrors, and may rely on an electronic shutter to determine the length of light
measurement for each frame. Differences can also be found in the in-camera im-
age processing algorithms. Digital video cameras, for instance, may use noise reduction techniques that work over a sequence of frames rather than having to rely on the information present in a single frame [1000]. Digital
cinema cameras do not include elaborate signal processing circuitry and rely on
off-camera post-processing to shape and optimize the signal.
When the optical system is rotationally symmetric with respect to this central axis, further simplifications
are possible. The study of such systems is called Gaussian, first-order, or paraxial
optics.
According to Maxwell’s and Carathéodory’s theorems, in Gaussian optics
imaging may be approximated as a projective transformation. An object point
P = (p_x p_y p_z)^T generally projects to a point P' = (p'_x p'_y p'_z)^T [666].
Using homogeneous coordinates (see Section A.10) and applying symmetry ar-
guments, this projection can be represented with the following transformation:
[ p'_x ]   [ f    0    0       0               ] [ p_x ]
[ p'_y ] = [ 0    f    0       0               ] [ p_y ]
[ p'_z ]   [ 0    0    z'_0    f f' − z'_0 z_0 ] [ p_z ]
[ p'_w ]   [ 0    0    1       −z_0            ] [ 1   ] ,  (12.2)

where we have chosen two focal points z_0 and z'_0 and two focal lengths f and f'.
The 3D position of the transformed point is found by applying the homogeneous
divide, yielding P' = ( p'_x / p'_w   p'_y / p'_w   p'_z / p'_w )^T.
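The matrix of (12.2) and the homogeneous divide can be sketched in a few lines; the function names are ours, and the numerical values in the usage note are arbitrary test inputs.

```python
import numpy as np

def imaging_matrix(f, f_prime, z0, z0_prime):
    # The projective transformation matrix of Equation (12.2).
    return np.array([
        [f,   0.0, 0.0,      0.0],
        [0.0, f,   0.0,      0.0],
        [0.0, 0.0, z0_prime, f * f_prime - z0_prime * z0],
        [0.0, 0.0, 1.0,      -z0],
    ])

def project(point, M):
    # Apply M to (px, py, pz, 1)^T and perform the homogeneous divide.
    p = M @ np.append(point, 1.0)
    return p[:3] / p[3]
```

For instance, with f = 2, f' = 3, z_0 = −1, z'_0 = 5, a point at p_z = 4 projects so that p'_y = f p_y / (p_z − z_0) and p'_z − z'_0 = f f' / (p_z − z_0), matching (12.3) and (12.4).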
As we are still discussing a general transformation effected by an arbitrary
rotationally symmetric optical system, we will only assume that this system sits
somewhere between the object point P and the image point P' and is centered
around the z-axis. This geometric configuration is shown in Figure 12.2. Note that
we use a right-handed coordinate system with the positive z-axis, i.e., the optical
axis, pointing toward the right. The positive y-axis points up. The x = 0-plane
is called the meridional plane, and rays lying in this plane are called meridional
rays. All other rays are called skew rays. Meridional rays passing through an
optical system will stay in the meridional plane.
Due to the fact that the optical system is assumed to be circularly symmet-
ric around the z-axis, we can drop the x-coordinate and focus on the y- and z-
coordinates. The former is given by
p'_y = f p_y / ( p_z − z_0 ) ,  (12.3)
Figure 12.2. Projective transformation from P = ( p_y  p_z )^T to P' = ( p'_y  p'_z )^T. The transformation is characterized by the object focal point at z_0, the image focal point at z'_0, the object focal length f, and the image focal length f'.
p'_z − z'_0 = f f' / ( p_z − z_0 ) ,  (12.4)
yielding the formula for the perspective transformation of a pinhole camera. A
pinhole camera is an optical system consisting of a small hole in a surface that sep-
arates image space from object space. It is a camera model that is frequently used
in computer graphics. In addition, a real pinhole camera can be built
cheaply [956].
Associated with an optical imaging system that effects this transformation are
several important points located along the z-axis. These points are collectively
referred to as cardinal points [666]:
Image focal point. Located at F' = (0 0 z'_0)^T, it is also known as the back focal point or the second focal point. From (12.4), we see that for an image located at P' = F', the conjugate object point is located at infinity.
In a physical system, the radii of the lens elements are limited in spatial extent.
This means that only a fraction of the amount of light emitted by a point source
can ever be imaged. Further, the smallest diameter through which light passes
is either determined by fixed lens elements, or by an adjustable diaphragm. The
limiting element in this respect is called the aperture stop. Correspondingly, the
element limiting the angular extent of the objects that can be imaged is called the
field stop. This element determines the field of view of the camera.
Further important parameters characterizing an optical imaging system are the
entrance pupil and exit pupil. The entrance pupil is the aperture stop as seen from
a point both on the optical axis and on the object. An example with a single lens
and an aperture stop is shown in Figure 12.3. This means that the size of the
entrance pupil is determined both by the size of the aperture stop as well as the
lenses that may be placed between the object and the aperture stop. The axial
object point forms the apex of a cone whose base is determined by the entrance
pupil. The entrance pupil therefore determines how much of the light emanating
from this point is imaged by the system.
The exit pupil is the aperture stop as seen from the image plane through any
lenses that may be located between the aperture stop and the image plane (see
Figure 12.3). The exit pupil forms the base of a second cone, with the axial point
in the image plane as the apex. The entrance and exit pupils are generally not
the same, given that the presence of lenses may magnify or minify the apparent
size of the aperture stop. The ratio of the entrance and exit pupil diameters is
called the pupil magnification or pupil factor, a parameter used in determining
the radiometry associated with the optical system.
Figure 12.3. The entrance and exit pupils of an optical system with a single lens and an aperture stop. Also shown is a chief ray, which passes through the center of the aperture stop.
Finally, a ray that starts from an off-axis point on the object and passes
through the center of the aperture stop is called a chief ray [447]. An example
is shown in Figure 12.3. The chief ray is aimed at the center of the entrance
pupil M_ep and passes through the center of the exit pupil M_xp. The marginal ray,
on the other hand, starts from the on-axis object point and passes through the rim
of the aperture stop (or entrance pupil).
Figure 12.4. A differential area dA is imaged onto the area dA' on the image plane by a system characterized by its entrance and exit pupils (after [666]).
Following Lee [666], we analyze a differential area dA, located off-axis in the
object plane, which projects to a corresponding differential area dA' on the image
plane. In between these two areas is the optical system, which is characterized by
the entrance and exit pupils. The geometric configuration is shown in Figure 12.4.
The chief ray starting at dA makes an angle θ with the optical axis. The distance
between dA and the entrance pupil along the optical axis is indicated by s; this
patch is located a distance h from the optical axis.
Further, d is the radius of the entrance pupil, and the differential area dΨ on
the entrance pupil is located a distance r from the optical axis. We are interested in
integrating over the entrance pupil, which we do by summing over all differential
areas dΨ. The vector v from dA to dΨ is given by
\mathbf{v} = \begin{bmatrix} r \cos(\Psi) \\ r \sin(\Psi) - h \\ s \end{bmatrix}.  (12.5)
This vector makes an angle α with the optical axis, which can be computed from

\cos(\alpha) = \frac{s}{\|\mathbf{v}\|}.  (12.6)
Assuming that the differential area dA is Lambertian (see Section 2.9.2), the flux incident upon the entrance pupil is given by [666]

d\Phi_0 = L_e \, dA \int_{r=0}^{d} \int_{\Psi=0}^{2\pi} \frac{s}{\|\mathbf{v}\|} \, \frac{s}{\|\mathbf{v}\|} \, \frac{r \, d\Psi \, dr}{\|\mathbf{v}\|^2}  (12.7a)

= L_e \, dA \int_{r=0}^{d} \int_{\Psi=0}^{2\pi} \frac{r s^2 \, d\Psi \, dr}{\left( r^2 \cos^2(\Psi) + (r \sin(\Psi) - h)^2 + s^2 \right)^2}  (12.7b)

= L_e \, dA \int_{r=0}^{d} \frac{2\pi \left( s^2 + h^2 + r^2 \right) r s^2 \, dr}{\left( (s^2 + h^2 + r^2)^2 - 4 h^2 r^2 \right)^{3/2}}  (12.7c)

= \frac{\pi}{2} L_e \, dA \left( 1 - \frac{s^2 + h^2 - d^2}{\left( (s^2 + h^2 + d^2)^2 - 4 h^2 d^2 \right)^{1/2}} \right).  (12.7d)
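The closed form (12.7d) can be checked numerically against the double integral (12.7b). The sketch below uses a simple midpoint rule; the values of s, h, and d are arbitrary illustrative distances, not taken from the text.

```python
import math

def flux_closed_form(s, h, d):
    # Closed-form factor from (12.7d), per unit L_e * dA.
    num = s * s + h * h - d * d
    den = math.sqrt((s * s + h * h + d * d) ** 2 - 4.0 * h * h * d * d)
    return 0.5 * math.pi * (1.0 - num / den)

def flux_numeric(s, h, d, nr=300, npsi=300):
    # Midpoint-rule evaluation of the double integral in (12.7b),
    # again per unit L_e * dA.
    total = 0.0
    dr = d / nr
    dpsi = 2.0 * math.pi / npsi
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(npsi):
            psi = (j + 0.5) * dpsi
            dist2 = (r * math.cos(psi)) ** 2 + (r * math.sin(psi) - h) ** 2 + s * s
            total += r * s * s / dist2 ** 2 * dr * dpsi
    return total

# Hypothetical geometry: pupil radius 5 mm at 50 mm distance, patch 10 mm off-axis.
print(flux_closed_form(50.0, 10.0, 5.0))  # ~0.0288
print(flux_numeric(50.0, 10.0, 5.0))
```

The two results agree to several decimal places, which confirms the analytic integration over Ψ and r.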
As shown in Figure 12.4, all quantities used to compute the flux over the
entrance pupil have equivalent quantities defined for the exit pupil; these are
all indicated with a prime. Thus, the flux over the exit pupil can be computed
analogously:
d\Phi_1 = \frac{\pi}{2} L'_e \, dA' \left( 1 - \frac{s'^2 + h'^2 - d'^2}{\left( (s'^2 + h'^2 + d'^2)^2 - 4 h'^2 d'^2 \right)^{1/2}} \right).  (12.8)
Under the assumption that the optical system does not suffer light losses, the
flux at the entrance and exit pupils will be the same. The irradiance at the image
plane is then given by
E_e = \frac{d\Phi_0}{dA'}  (12.9a)

= \frac{\pi}{2} L_e \frac{dA}{dA'} \left( 1 - \frac{s^2 + h^2 - d^2}{\left( (s^2 + h^2 + d^2)^2 - 4 h^2 d^2 \right)^{1/2}} \right),  (12.9b)

which is equivalent to

= \frac{\pi}{2} L'_e \left( 1 - \frac{s'^2 + h'^2 - d'^2}{\left( (s'^2 + h'^2 + d'^2)^2 - 4 h'^2 d'^2 \right)^{1/2}} \right).  (12.9c)
If the index of refraction at the object plane is n, and the index of refraction at the image plane is n', then it can be shown that the image irradiance is given by

E_e = \frac{\pi}{2} \left( \frac{n'}{n} \right)^2 L_e \left( 1 - \frac{s'^2 + h'^2 - d'^2}{\left( (s'^2 + h'^2 + d'^2)^2 - 4 h'^2 d'^2 \right)^{1/2}} \right)  (12.10a)

= \frac{\pi}{2} \left( \frac{n'}{n} \right)^2 L_e \, G,  (12.10b)

where

G = 1 - \frac{s'^2 + h'^2 - d'^2}{\left( (s'^2 + h'^2 + d'^2)^2 - 4 h'^2 d'^2 \right)^{1/2}}.  (12.10c)
Equation (12.10a) is called the image irradiance equation. While this equation
is general, the quantities involved are difficult to measure in real systems. However,
the equation can be simplified for certain special cases, in particular for on-axis
imaging, and for off-axis imaging in which the object distance is much larger
than the entrance-pupil diameter. These are discussed next.
the image will be. Thus, it is a value that can be used to indicate the speed of the
optical system.
A related measure is the relative aperture F, which is given by

F = \frac{1}{2 n' \sin(\beta')}.  (12.14)

The relative aperture is also known as the f-number. In the special case that the
image point is at infinity, we may assume that the distance between image plane
and exit pupil s' equals f', i.e., the image focal length. The half-angle β' can then
be approximated by tan^{-1}(d'/f'), so that the relative aperture becomes

F_\infty \approx \frac{1}{2 n' \sin\left( \tan^{-1}(d'/f') \right)}  (12.15a)

\approx \frac{1}{n'} \frac{f'}{2 d'}.  (12.15b)

In terms of the entrance-pupil radius d and the pupil magnification m_p = d'/d, this is

F_\infty \approx \frac{1}{m_p \, n'} \frac{f'}{2 d}.  (12.15c)
If the object and image planes are both in air, then the indices of refraction
at both sites are approximately equal to 1. If the magnification factor is assumed
to be close to 1 as well, then the relative aperture for an object at infinity may be
approximated by
F_\infty = \frac{f}{D},  (12.16)
where D is the diameter of the entrance pupil. An alternative notation for the f -
number is f /N, where N is replaced by the ratio f /D. Thus, for a lens with a focal
length of 50 mm and an aperture of 8.9 mm, the f -number is written as f/5.6. The
image irradiance Ee can now be written as
E_e = \frac{\pi}{4} \frac{D^2 m_p^2}{f^2} L_e.  (12.17)
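The f/5.6 example above is easy to reproduce; the 50 mm focal length and 8.9 mm aperture are the values quoted in the text.

```python
# f-number N = f / D (12.16): focal length over entrance-pupil diameter.
focal_length = 50.0     # mm
pupil_diameter = 8.9    # mm
N = focal_length / pupil_diameter
print(f"f/{N:.1f}")  # f/5.6
```

Doubling the focal length at the same pupil diameter doubles the f-number, which is why long lenses with wide apertures require large front elements.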
E_e \approx \pi L_e \frac{s^2 d^2}{(s^2 + d^2 + h^2)^2} \frac{dA}{dA'}  (12.18a)

\approx \pi L_e \frac{s^2 d^2}{(s^2 + h^2)^2} \frac{dA}{dA'}.  (12.18b)
This equation can be rewritten by noting that the cosine of the off-axis angle θ
(see Figure 12.4) is given by

\cos(\theta) = \frac{s}{\sqrt{s^2 + h^2}}.  (12.19)

The image irradiance is therefore approximated with

E_e \approx \pi L_e \cos^4(\theta) \left( \frac{d}{s} \right)^2 \frac{dA}{dA'}.  (12.20)
The ratio dA'/dA is related to the lateral magnification m of the lens through

m^2 = \frac{dA'}{dA},  (12.21)

so that we have

E_e \approx \pi L_e \cos^4(\theta) \left( \frac{d}{s} \right)^2 \frac{1}{m^2}.  (12.22)

The lateral magnification satisfies the following relation:

\frac{m}{m - 1} = \frac{f}{s}.  (12.23)
The image irradiance can therefore be approximated with

E_e \approx \pi L_e \cos^4(\theta) \left( \frac{d}{(m - 1) f} \right)^2,  (12.24)
or, in terms of the f-number,

E_e \approx \frac{\pi L_e}{4} \cos^4(\Theta) \frac{1}{F^2 \, n'^2 \, (m - 1)^2 \, m_p^2}.  (12.25)

For an object at infinity in air, so that m \to 0, n' = 1, and m_p = 1, this reduces to

E_e \approx \frac{\pi L_e}{4 F^2} \cos^4(\Theta) \approx \frac{\pi L_e D^2}{4 f^2} \cos^4(\Theta).  (12.26)
Note that the fall-off related to Θ behaves as a cosine to the fourth power. This
fall-off is a consequence of the geometric projection. Even in systems without
vignetting, this fall-off can be quite noticeable, the edges of the image appearing
significantly darker than the center [666]. However, lens assemblies may be de-
signed to counteract these effects, so that in a typical camera system the fall-off
with eccentricity does not follow the cosine to the fourth power fall-off.
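To get a feel for the magnitude of this effect, the cos⁴ law is evaluated below for a few field angles. This is a sketch of the idealized law only; as noted above, real lens assemblies deviate from it.

```python
import math

# Relative image irradiance at field angle theta, per the cos^4 law,
# normalized to the on-axis value.
def relative_falloff(theta_deg):
    return math.cos(math.radians(theta_deg)) ** 4

for deg in (0, 10, 20, 30):
    print(deg, relative_falloff(deg))
# At 30 degrees off-axis the irradiance drops to (sqrt(3)/2)^4 = 9/16,
# i.e. about 56% of the axial value, even for an otherwise ideal lens.
```

Hence a wide-angle lens covering ±30° loses almost half its illumination at the image corners unless the design compensates for it.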
12.1.5 Vignetting

Figure 12.5. In this optical system that consists of two lenses, the effective aperture of the off-axis ray bundle is smaller than the effective aperture of the on-axis ray bundle.
becomes

E_e(x', y') = \frac{\pi}{2} \left( \frac{n'}{n} \right)^2 L_e(x, y) \, T \, V(x', y') \, G,  (12.27)

where G is defined in (12.10c). We have also introduced the transmittance T of
the lens assembly to account for the fact that lenses absorb some light.
Figure 12.6. Example of an image showing veiling glare; Bristol Balloon Fiesta, Ashton
Court Estate, August 2007.
that may even be outside the field of view may result in a spatially uniform in-
crease in image irradiance and thereby in a corresponding reduction of contrast.
This effect is called veiling glare.
Specific undesirable patterns may be imaged as a result of reflections inside
the lens assembly. These patterns are collectively called flare sources. For in-
stance, light reflected off the blades of the diaphragm may produce an image of
the diaphragm, resulting in iris flare.
An example of an image suffering from flare is shown in Figure 12.6. Under
normal exposure settings, and with good quality hardware, flare should remain
invisible. In this particular example, the sun was directly visible, causing reflec-
tions inside the camera, which are seen here as bright patches that are not part of
the environment.
Accounting for flare sources with a spatially dependent function g(x', y'), the
image irradiance is therefore more generally modeled with

E_e(x', y') = \frac{\pi}{2} \left( \frac{n'}{n} \right)^2 L_e(x, y) \, T \, V(x', y') \, G + g(x', y').  (12.28)
The more components a lens has, the more surfaces there are that could reflect
and refract light. Zoom lenses, which can focus light onto the image plane at a
variable range of magnifications [545, 680, 873], as shown in Figure 12.7, tend
to have more internal surfaces than other lens types, so that they are also more
liable to produce flare. Within a camera, an extra source of flare is the sensor
itself, which may reflect as much as 30% of all light incident upon it. This is due
to the different indices of refraction of the Si (nSi ≈ 3) and SiO2 (nSiO2 = 1.45)
layers in the sensor itself [812]. Substituting these values into (2.138a) yields a
reflectance of R = 0.35. Most of this reflected light will re-enter the lens, possibly
to be reflected and refracted again.
To minimize the occurrence of flare, lenses can be coated with an anti-reflective coating, as discussed further in Section 12.6.1. Similarly, an anti-reflective coating can be applied at the Si/SiO2 interface to reduce reflections off the sensor. Photographers may further reduce the effect of lens flare by employing a lens hood, so that light entering the optical system from oblique angles is blocked.

Figure 12.7. From left to right: the same scene was captured with different magnification settings; Long Ashton, Bristol, May 2007.
Finally, a related phenomenon is known as glare. Glare is caused by stray
reflections entering the lens directly due to strongly reflective surfaces being im-
aged [335]. If these reflections are off dielectric materials, then they will be po-
larized and can therefore be subdued by using polarization filters.
12.2 Lenses
So far we have examined optical systems more or less as a black box and under
several assumptions, such as Gaussian optics, derived the image irradiance that is
the basis of optical image formation. The components of optical systems typically
consist of refractive elements as well as an aperture. In this section we discuss
lenses as well as some of their attributes.
If a lens consists of a single element, it is called a simple lens. If it is made
up of multiple components, it is called a compound lens. Simple lenses are char-
acterized by their index of refraction as well as the shape of their front and back
surfaces. They may also be coated to improve their optical properties. For now
we concentrate on their shape.
The front and back surfaces of a lens may be planar, spherical, or curved in
some other way; in the latter case, such lens elements are called aspherics.
Lenses with spherical surfaces are important because they are relatively straight-
forward to produce and polish, while still allowing high accuracy and minimal
artifacts [447]. If we denote the radius of the surface on the side of the object
plane by d_0 and the radius of the surface facing the image plane by d_1, then
depending on the signs of these radii, different lens types may be distinguished,
as shown in Figure 12.8.
Note that an infinite radius means that the surface is planar. The sign of the
radius indicates the direction of curvature in relation to the vector along the optical
axis that points from the object plane to the image plane. A positive radius is for
a spherical surface that has its center on the side of the object plane, whereas a
negative radius is for a surface with its center on the side of the image plane.
Lenses that are thicker at the center, such as the three convex lenses in Fig-
ure 12.8, are variously called convex, converging, or positive. They tend to direct
light towards the optical axis. If the opposite holds, i.e., they are thinner at the
center, they are called concave, diverging, or negative [447].
Figure 12.9. Refraction of a ray from the axial object point P_0 at a spherical surface with vertex V and center C, toward the axial image point P_1. Here ||P_0 - V|| = s_o, ||V - P_1|| = s_i, ||P_2 - P_0|| = l_o, and ||P_1 - P_2|| = l_i; n_0 and n_1 are the indices of refraction on either side of the surface.
According to Fermat's principle (see Section 2.9.1), the optical path length of
this ray is given by

n_0 l_o + n_1 l_i.  (12.29)

The path lengths of the two ray segments l_o and l_i can be expressed in terms of
the radius of the sphere d, the object distance s_o, and the image distance s_i:

l_o = \sqrt{d^2 + (s_o + d)^2 - 2 d (s_o + d) \cos(\phi)},  (12.30a)

l_i = \sqrt{d^2 + (s_i - d)^2 + 2 d (s_i - d) \cos(\phi)}.  (12.30b)

Substituting this result into the optical path length (12.29) yields

n_0 \sqrt{d^2 + (s_o + d)^2 - 2 d (s_o + d) \cos(\phi)} + n_1 \sqrt{d^2 + (s_i - d)^2 + 2 d (s_i - d) \cos(\phi)}.  (12.31)

Applying Fermat's principle by using the angle φ as the position variable results
in [447]

\frac{n_0 d (s_o + d) \sin(\phi)}{2 l_o} - \frac{n_1 d (s_i - d) \sin(\phi)}{2 l_i} = 0,  (12.32)

which can be rewritten to read

\frac{n_0}{l_o} + \frac{n_1}{l_i} = \frac{1}{d} \left( \frac{n_1 s_i}{l_i} - \frac{n_0 s_o}{l_o} \right).  (12.33)

Any ray that travels from P_0 to P_1 by means of a single refraction through
the spherical surface obeys this relationship. Of course, there exist angles φ (and
therefore intersection points P_2) for which the point P_1 is never reached.

If we apply the principle of Gaussian optics, i.e., cos(φ) ≈ 1, and only con-
sider paraxial rays, then we can simplify (12.30) to approximate

l_o \approx s_o,  (12.34a)

l_i \approx s_i.  (12.34b)
Figure 12.10. Two facing spherical surfaces form a thin lens, which projects light emanating from on-axis point P_0 to point P_1. Here ||P_0 - V_1|| = s_o, ||V_2 - P_1|| = s_i, ||P' - V_1|| = s_n, and ||V_1 - V_2|| = t; n_0 and n_1 are the indices of refraction.
In the limit that the object distance s_o becomes infinite, the image distance
becomes the image focal length s_i = f_i. Conversely, a similar argument can be
made for the object focal length for points projected onto an infinitely far-away
image plane: s_o = f_o. But since we are dealing with a thin lens, we may set
f_i = f_o. Replacing either with a generic focal length f, we have

\frac{1}{f} = (n_1 - 1) \left( \frac{1}{d_0} - \frac{1}{d_1} \right).  (12.41)

Note that all light rays passing through the focal point, located a distance f in
front of the lens, will travel through space in parallel after passing through the
lens. Such light is called collimated. Combining (12.40) with (12.41), we find the
well-known Gaussian lens formula:

\frac{1}{f} = \frac{1}{s_o} + \frac{1}{s_i}.  (12.42)

The image is called virtual if the rays leaving the optical system are divergent.
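Equations (12.41) and (12.42) combine into a small numeric sketch. The glass index and surface radii below are illustrative values, not taken from the text.

```python
def focal_length(n, d0, d1):
    # Lensmaker's formula (12.41) for a thin lens in air;
    # d0 and d1 are the signed radii of the front and back surfaces.
    return 1.0 / ((n - 1.0) * (1.0 / d0 - 1.0 / d1))

def image_distance(f, s_o):
    # Gaussian lens formula (12.42): 1/f = 1/s_o + 1/s_i.
    return 1.0 / (1.0 / f - 1.0 / s_o)

# Symmetric biconvex lens, n = 1.5, surface radii of magnitude 100 mm:
f = focal_length(1.5, 100.0, -100.0)   # 100.0 mm
s_i = image_distance(f, 300.0)         # object at 300 mm images at 150 mm
print(f, s_i)
```

As the object approaches the focal plane (s_o → f), the computed image distance grows without bound, consistent with collimated output for an object at the focal point.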
12.3 Aberrations
Much of the analysis of lens systems presented so far uses approximations that
are only accurate to a first order. A full lens design requires more complicated
analysis that is beyond the scope of this book [1061, 1062]. Lens design pro-
grams have taken much of the burden of analysis out of the hands of the lens
designer. In addition, ray tracing, discussed in the context of computer graphics
in Section 2.10.1, can now be effectively used to evaluate lens design.
Even so, deviations from idealized conditions do occur in practice. They are
collectively known as aberrations and can be classified into two main groups:
chromatic aberrations and monochromatic aberrations.
Chromatic aberrations occur when the material used to manufacture the lens
has a wavelength-dependent index of refraction. Such materials show dispersion
of light (see Section 3.6.2), which is clearly unwanted in lens design. Monochro-
matic aberrations form a group of distortions that make the image either unclear
(spherical aberration, astigmatism, and coma), or warped (distortion, Petzval field
curvature). Both classes of aberrations are discussed next.
The assumption that ray paths make only small angles with the optical axis,
i.e., the limitation to paraxial rays, enabled sine and cosine functions to be approx-
imated as linear and constant, respectively. Thus, we used only the first terms of
their respective Taylor expansions:

\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} x^{2n+1};  (12.45a)

\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} x^{2n}.  (12.45b)
If we were to use the first two terms of these expansions, we would get the so-
called third-order theory, which is more accurate, but more complicated. The
aberrations discussed here, however, are those that result from a deviation from
first-order (Gaussian) optics.
With respect to (12.35), the extra term varies with h^2, where h is the distance
from the optical axis to the point where the ray of interest strikes the lens. This
term causes light passing through the periphery of the lens to be focused nearer the
Figure 12.12. Spherical aberration: rays through the lens periphery cross the optical axis in front of the paraxial focus, giving longitudinal and transversal spherical aberration, a caustic, and a circle of least confusion.
lens than light that passes through the lens nearer the optical axis. This process
is called spherical aberration and is illustrated in Figure 12.12. Here, the rays
intersect the optical axis over a given length, and this phenomenon is therefore
called longitudinal spherical aberration. These same rays intersect the image
plane over a region, and this is called transversal spherical aberration.
Finally, in the presence of spherical aberration, the rays form a convex hull
which is curved. This shape is called a caustic. The diameter of the projected
spot is smallest when the image plane is moved to the location associated with the
so-called circle of least confusion, which in Figure 12.12 is in front of the focus
point that would be computed if only paraxial rays were taken into account.
Figure 12.13. Negative coma (top) and positive coma (bottom).
through the principal points), then the result is a positive coma. If the marginal
rays hit the image plane closer to the optical axis than the principal ray, the coma
is called negative.
12.3.3 Astigmatism
A further aberration, called astigmatism, occurs for off-axis object points. Recall
that the meridional plane is defined by the object point and the optical axis. The
chief ray lies in this plane, but refracts at the lens boundaries. The sagittal plane
is orthogonal to the meridional plane and is made up of a set of planar segments,
each of which intersects part of the chief ray.
We consider a ray bundle lying in the meridional plane, as well as a separate
ray bundle lying in the sagittal plane. In the absence of other forms of aberrations,
all the rays in the meridional plane will have the same optical path length and will
Figure 12.15. A flat bundle of rays located in the meridional plane has a different focal
point than a flat bundle of rays lying in the sagittal plane, leading to astigmatism.
Figure 12.16. The meridional focal point is elongated, as is the sagittal focal point, albeit in orthogonal directions. In between these focal points lies the circle of least confusion, which has the smallest diameter.
therefore converge at the same focal point. However, this optical path length can
be different from the optical path length of the rays lying in the sagittal plane.
In turn, this causes the focal point of sagittal rays to lie either before or after the
focal point associated with meridional rays, as shown in Figure 12.15.
At the sagittal focal point, the meridional rays will still not have converged,
leading to an elongated focal point. Similarly, the meridional focal point will be
elongated. For rays that are neither sagittal nor meridional, the focal point will be
in between the sagittal and meridional focal points. Somewhere midway between
the two focal points, the cross-section of all rays passing through the lens will be
circular (see Figure 12.16). In the presence of astigmatism, this circular cross-
section will be the sharpest achievable focal point; it is called the circle of least
confusion.
12.3.4 Petzval Field Curvature
For spherical lenses, the object and image planes, which were hitherto considered
to be planar, are in reality curving inwards for positive lenses and outwards for
negative lenses. In fact, both “planes” are spherical and centered around the lens.
This is known as Petzval field curvature. The planar approximation is only valid
in the paraxial region. If a flat image plane is used, for instance because the
image sensor is flat, then the image will only be sharp near the optical axis, with
increasing blur evident in the periphery.
The occurrence of field curvature can be counteracted by using a combination
of positive and negative lenses. For instance, the inward curvature of the image
plane associated with a positive lens can be corrected by placing a negative lens
near the focal point of the positive lens. The negative lens is then known as a field
flattener.
Figure 12.17. The effect of pincushion and barrel distortions upon the appearance of a
regularly shaped object.
12.3.5 Distortions
Another source of monochromatic aberrations is called distortion and relates to
the lateral magnification of the lens (defined for thin lenses in Section 12.2.2).
While this measure was defined as a constant for the whole lens, it is only valid
under paraxial conditions, i.e., near the optical axis. In real spherical lenses, the
lateral magnification is a function of distance to the optical axis.
If the lateral magnification increases with distance to the optical axis, then a
positive distortion or pincushion distortion results. For lenses with a decreasing
lateral magnification, the result is a negative distortion or barrel distortion. Both
distortions affect the appearance of images. In the case of a pincushion distortion,
straight lines on the object appear to be pulled toward the edge of the image.
A barrel distortion causes straight lines to be pushed toward the center of the
image as shown in the diagram of Figure 12.17 and the photographic example of
Figure 12.18 [1235].
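Outside this text, pincushion and barrel distortion are commonly modeled by a radial polynomial, r' = r(1 + k r²), where k > 0 yields pincushion and k < 0 barrel distortion. The coefficient below is a hypothetical value for illustration only.

```python
def distort_radius(r, k):
    # One-coefficient radial distortion model (a common approximation,
    # not the book's own formulation): r' = r * (1 + k * r^2).
    return r * (1.0 + k * r * r)

# k > 0: magnification grows with radius -> pincushion (lines pulled outward);
# k < 0: magnification shrinks with radius -> barrel (lines pushed inward).
print(distort_radius(1.0, 0.1))   # 1.1
print(distort_radius(1.0, -0.1))  # 0.9
```

Since the model is radially symmetric about the optical axis, straight lines through the image center remain straight, while off-center lines bow outward or inward, matching the patterns in Figure 12.17.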
Single thin lenses typically show very little distortion, with distortion being
more apparent in thick lenses. Pincushion distortion is associated with positive
lenses, whereas barrel distortion occurs in negative lenses. The addition of an
aperture stop into the system creates distortion as well. The placement of the
aperture stop is important; in a system with a single lens and an aperture stop, the
least amount of distortion occurs when the aperture stop is placed at the position
of the principal plane. The chief ray is then essentially the same as the principal
ray.
Placing the aperture stop after a positive lens will create pincushion distortion,
whereas placing the aperture stop in front of the lens at some distance will create
barrel distortion. For a negative lens, the position of the aperture stop has the
opposite effect. The effects of distortion in a compound lens can be minimized by
using multiple single lenses which negate each other’s effects and by placing the
aperture half way between the two lenses.
Other types of geometric distortion are decentering distortion and thin prism
distortion; the former is caused by optical elements that are not perfectly centered
with respect to each other, and the latter is due to inadequate lens design, man-
ufacturing, and assembly and is characterized by a tilt of optical elements with
respect to each other.
One can correct for chromatic aberration by applying a pair of thin lenses having
different refractive indices, a configuration termed a thin achromatic doublet.
The wavelength-dependent focal length f(λ) of the pair of lenses is then given by

\frac{1}{f(\lambda)} = \frac{1}{f_1(\lambda)} + \frac{1}{f_2(\lambda)} - \frac{d}{f_1(\lambda) f_2(\lambda)},  (12.47)
assuming that the lenses are a distance d apart, their respective focal lengths are
f1 (λ ) and f 2 (λ ), and their indices of refraction are n1 (λ ) and n2 (λ ). If the index
of refraction of the surrounding medium is taken to be 1, then the focal lengths of
the two lenses are given by
\frac{1}{f_1(\lambda)} = k_1 (n_1(\lambda) - 1),  (12.48a)

\frac{1}{f_2(\lambda)} = k_2 (n_2(\lambda) - 1).  (12.48b)
In these equations, we have replaced the factor that depends on the radius of the
front and back surfaces of the two lenses by the constants k1 and k2 , which do not
depend on wavelength. Substituting (12.48) into (12.47) gives

\frac{1}{f(\lambda)} = k_1 (n_1(\lambda) - 1) + k_2 (n_2(\lambda) - 1) - d \, k_1 (n_1(\lambda) - 1) \, k_2 (n_2(\lambda) - 1).  (12.49)
For the focal lengths f(λ_R) and f(λ_B) to be identical, the two thin lenses
should be spaced a distance d apart. This distance can be found by solving (12.49)
for d, giving

d = \frac{k_1 \left( n_1(\lambda_B) - n_1(\lambda_R) \right) + k_2 \left( n_2(\lambda_B) - n_2(\lambda_R) \right)}{k_1 k_2 \left[ (n_1(\lambda_B) - 1)(n_2(\lambda_B) - 1) - (n_1(\lambda_R) - 1)(n_2(\lambda_R) - 1) \right]}.  (12.50)
In the special case that the two lenses are touching, we have that d = 0, giving

\frac{k_1}{k_2} = - \frac{n_2(\lambda_B) - n_2(\lambda_R)}{n_1(\lambda_B) - n_1(\lambda_R)}.  (12.51)
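Using (12.51), the ratio k1/k2 for a cemented (d = 0) doublet follows directly from the dispersion of the two glasses. The index values below are illustrative crown- and flint-like numbers, not data from the text.

```python
# Hypothetical dispersion data at the blue and red design wavelengths:
n1_B, n1_R = 1.522, 1.514   # crown-like glass (low dispersion)
n2_B, n2_R = 1.632, 1.615   # flint-like glass (high dispersion)

# Cemented doublet condition (12.51): k1/k2 = -(n2_B - n2_R)/(n1_B - n1_R).
ratio = -(n2_B - n2_R) / (n1_B - n1_R)
print(ratio)  # negative: one element converging, the other diverging
```

The negative ratio means the two elements must have powers of opposite sign, which is why achromatic doublets pair a converging crown element with a diverging flint element.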
It is now possible to specify the focal length of the compound lens as that of
yellow light, which has a wavelength nearly half way between the wavelengths
for blue and red light. The focal lengths for the two thin lenses are given by
\frac{1}{f_1(\lambda_Y)} = k_1 (n_1(\lambda_Y) - 1),  (12.52a)

\frac{1}{f_2(\lambda_Y)} = k_2 (n_2(\lambda_Y) - 1).  (12.52b)
\frac{k_1}{k_2} = \frac{n_2(\lambda_Y) - 1}{n_1(\lambda_Y) - 1} \cdot \frac{f_2(\lambda_Y)}{f_1(\lambda_Y)}.  (12.53)
The quantities w1 and w2 are the dispersive powers associated with the indices of
refraction n_1 and n_2. If we take wavelengths that coincide with the Fraunhofer
F, D, and C spectral lines (see Table 3.8 on page 169), then we can define the
dispersive power of an optical material more precisely. Rather than dispersive
power, its reciprocal V = 1/w is often used, which is variously known as the
V-number, the Abbe number, the dispersive index, or the constringence of the
material. The Abbe number V of a material is thus defined as

V = \frac{n_D - 1}{n_F - n_C},  (12.56)
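As a numeric sketch of (12.56), with approximate indices for a BK7-like crown glass (assumed values, not from the text):

```python
# Abbe number V = (n_D - 1) / (n_F - n_C), equation (12.56).
# Approximate BK7-like crown glass indices at the F, D, and C lines:
n_F, n_D, n_C = 1.5224, 1.5168, 1.5143
V = (n_D - 1.0) / (n_F - n_C)
print(V)  # roughly 64
```

A large Abbe number (small n_F − n_C) indicates low dispersion; crown glasses sit around 50 to 65, while high-dispersion flint glasses fall well below that.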
Figure 12.19. The geometric configuration assumed for computing the blur circle.
circle is then

b = h_2 - h_1  (12.57a)

= d \, i \left( \frac{1}{s} - \frac{1}{s_o} \right)  (12.57b)

= d \, i \, \frac{s_o - s}{s \, s_o}.  (12.57c)

The Gaussian lens formula (12.42) for this configuration is

\frac{1}{f} = \frac{1}{s} + \frac{1}{i},  (12.58)

and therefore we have

i = \frac{s f}{s - f}.  (12.59)

Substitution into (12.57c) yields

b = \frac{s f d}{s - f} \cdot \frac{s_o - s}{s \, s_o} = \frac{p f d}{s - f}.  (12.60)

In this equation, the quantity p = (s_o - s)/s_o can be seen as the percentage focus
error. We use the blur circle next in the calculation of depth of field.
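Equation (12.60) can be sketched numerically; the focal length, aperture, and distances below are hypothetical values in millimeters, consistent with the derivation above.

```python
def blur_circle(f, d, s, s_o):
    # Blur circle size (12.60): b = p * f * d / (s - f),
    # with fractional focus error p = (s_o - s) / s_o.
    p = (s_o - s) / s_o
    return p * f * d / (s - f)

# 50 mm lens with an 8.9 mm aperture, focused at 2 m, object at 3 m:
b = blur_circle(50.0, 8.9, 2000.0, 3000.0)
print(b)  # about 0.076 mm on the image plane
```

Note that b shrinks with the aperture d, which anticipates the depth-of-field discussion that follows: stopping down makes misfocused points blur less.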
Figure 12.20. The limited depth of field causes a narrow range of depths to appear in
focus. Here, the background is out of focus as a result. Macro lenses, in particular, have
a very narrow depth of field, sometimes extending over a range of only a few millimeters;
Orlando, FL, May 2004.
there is a region of distances between the lens and object points that are all focused
reasonably sharply. This range of depths is called depth of field. As an example,
an image with a limited depth of field is shown in Figure 12.20.
What constitutes an acceptably sharp focus depends on the size of the image
plane, the resolution of the sensor, the size at which the image is reproduced, and
all of this in relation to the angular resolution of the human visual system. As
such, there is no simple calculation of the range of distances that are considered to
be in focus.
Suppose, however, that for some system it is determined that a circle with a radius b on
the image plane leads to a reproduction that displays this circle as a single point.
We then assume that the camera is focused on an object at distance s_o and that the
blur circle is much smaller than the aperture, i.e., d \gg b. The distance from the
lens to the nearest point s_near that has acceptable focus is then

s_{near} = \frac{s_o f}{f + \dfrac{b}{d} (s_o - f)}.  (12.61a)
The corresponding farthest point s_far with acceptable focus is

s_{far} = \frac{s_o f}{f - \dfrac{b}{d} (s_o - f)}.  (12.61b)
With snear < so < sfar , the depth of field is now simply the distance between the
near and far planes that produce an acceptable focus, i.e., sfar − snear .
We see from (12.61b) that the far plane will become infinite if the denominator
goes to zero, which happens if

f = \frac{b}{d} (s_o - f).  (12.62)

Solving for s_o gives the hyperfocal distance:

s_o = f \left( \frac{d}{b} + 1 \right).  (12.63)

The corresponding near plane is given by

s_{near} = \frac{f}{2} \left( \frac{d}{b} + 1 \right).  (12.64)
Thus, if the camera is focused at the hyperfocal distance, all objects from this
value of s_near out to infinity will be in acceptably sharp focus.
Finally, we see that the size of the aperture d affects the depth of field. An
example of a pair of images photographed with different aperture sizes is shown
in Figure 12.21. In particular in macro photography the depth of field can become
very limited—as small as a few millimeters in depth.
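The limits (12.61a,b) and the hyperfocal distance (12.63) can be sketched as follows; the lens parameters and blur-circle tolerance are illustrative values in millimeters.

```python
def near_far(f, d, b, s_o):
    # Depth-of-field limits (12.61a, 12.61b); returns (s_near, s_far).
    # f: focal length, d: aperture, b: acceptable blur circle,
    # s_o: focused distance.
    s_near = s_o * f / (f + (b / d) * (s_o - f))
    denom = f - (b / d) * (s_o - f)
    s_far = float("inf") if denom <= 0.0 else s_o * f / denom
    return s_near, s_far

def hyperfocal(f, d, b):
    # Hyperfocal distance (12.63): s_o = f * (d/b + 1).
    return f * (d / b + 1.0)

# 50 mm lens, aperture d = 8.9 mm (f/5.6), blur tolerance b = 0.03 mm:
H = hyperfocal(50.0, 8.9, 0.03)
s_near, s_far = near_far(50.0, 8.9, 0.03, H)
print(H, s_near, s_far)  # focusing at H keeps roughly H/2 to infinity sharp
```

Consistent with (12.64), focusing at the hyperfocal distance puts the near limit at half that distance, while the far limit extends to infinity.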
Figure 12.21. Images photographed with a large aperture (f/3.2, left) and a small aperture
(f/16, right). The depth of field is reduced to less than a centimeter in the left image, due
to the use of a 60 mm macro lens. The figurine is a Loet Vanderveen bronze.
While depth of field is a concept related to the object side of the lens, a similar
concept exists for the image side of the lens. Here, it is called depth of focus or
lens-to-film tolerance and measures the range of distances behind the lens where
the image plane would be in sharp focus. The depth of focus is typically measured
in millimeters, or fractions thereof.
Figure 12.22. A diaphragm with three size settings. This diaphragm does not allow
more than three settings and is suitable for budget cameras. More typical diaphragms are
constructed using a set of blades which together form a variable-sized opening.
Focal plane shutters are normally made of a pair of curtains that slide across
the sensor at some distance from each other, forming a moving slit. The closer
this type of shutter is mounted to the sensor, the less diffraction off the edge of
the shutter will have an effect on the image quality. Focal plane shutters are often
used in single-lens reflex (SLR) cameras.
where n is the refractive index of the glass used in the fabrication of the lens. We
also assume that the thickness of the coating is a quarter of a wavelength. To
reduce the reflectance to 0, we would have to choose a coating with a refractive
index of n_c = \sqrt{n}. For glass with a refractive index of around 1.5, this means that
a coating with an index of refraction of around 1.22 should be sought. Unfortu-
nately, such coatings do not exist. The most suitable coatings have an index of
refraction of around 1.33 [613].
For a two-layer coating, the reflectance is given by

R = \left( \frac{n_{c,1}^2 \, n - n_{c,2}^2}{n_{c,1}^2 \, n + n_{c,2}^2} \right)^2,  (12.66)
where the coating applied to the lens has a refractive index of nc,2 and the second,
outer layer has an index of refraction of nc,1 . Once more, the coatings are assumed
to have a thickness of a quarter of a well-chosen wavelength. To guarantee no
reflections, this equation shows that we should seek a combination of coatings for
which the following relation holds:
n = (nc,2 / nc,1)². (12.67)
In practice, solutions of this form are feasible, and lenses employed in digital
cameras are routinely coated with multiple coats to suppress unwanted reflections.
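A quick check of (12.66) and the zero-reflectance condition (our sketch; the coating indices are example values):

```python
def two_layer_reflectance(n, nc1, nc2):
    # Equation (12.66): nc1 is the outer coating, nc2 the coating on the
    # glass; each layer is a quarter wave thick.
    r = (nc1**2 * n - nc2**2) / (nc1**2 * n + nc2**2)
    return r * r

n, nc1 = 1.5, 1.38
nc2 = nc1 * n**0.5          # choose nc2 so that n = (nc2 / nc1)**2
print(two_layer_reflectance(n, nc1, nc2))   # effectively zero
print(two_layer_reflectance(n, 1.38, 1.7))  # realistic pair, tiny residual
```

Unlike the single-layer case, the two-layer condition can be met with available materials (an outer index near 1.38 and an inner index near 1.7), which is why multi-coated lenses achieve very low reflectance.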
Δx = Δy = Δd (n²e − n²o) / (2 ne no). (12.68)
The displacement Δy is achieved by the second filter, which is rotated 90◦ with
respect to the first filter.
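Equation (12.68) can be used to size such a filter; the sketch below (our addition) solves it for the plate thickness that displaces the extraordinary ray by one pixel. The quartz indices are approximate values near 589 nm, and the 6 μm pixel pitch is an assumed example:

```python
def birefringent_shift(thickness, n_e, n_o):
    # Equation (12.68): lateral displacement of the extraordinary ray
    # after passing through a birefringent plate of the given thickness.
    return thickness * (n_e**2 - n_o**2) / (2 * n_e * n_o)

# Crystalline quartz near 589 nm (approximate indices).
n_o, n_e = 1.5443, 1.5534
pixel_pitch = 6e-6  # assumed 6 um sensor pitch

# Plate thickness needed to displace the ray by exactly one pixel:
thickness = pixel_pitch * 2 * n_e * n_o / (n_e**2 - n_o**2)
print(f"{thickness * 1e3:.2f} mm")  # roughly a millimeter of quartz
```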
Unfortunately, high-quality birefringent elements are expensive. An alterna-
tive is to base the anti-aliasing filter on phase delay. Remember that the speed
2 Note that the “c” is pronounced as “k”.
of light propagation depends on the type of dielectric material the light passes
through. By etching a shallow pattern into a dielectric element, some of the
light passes through a thicker layer of material, emerging with some phase de-
lay. By varying the depth of the etched pattern, the amount of phase delay can be
controlled. The different amounts of phase-delayed light will interfere, thereby
canceling high-frequency content more than low-frequency content. The type of
pattern etched into the dielectric element can be a set of discs which are randomly
placed and have randomly chosen diameters. The advantage of this approach is
cost of manufacture, whereas the disadvantage is that unless the filter is carefully
designed, this type of phase-delay filter can act as a diffraction grating, causing
significant artifacts [5].
Figure 12.24. A CCD sensor with the infrared filter fitted (left) and removed (right).
placement of the infrared filter is typically between the lens and the sensor. An
example of an infrared rejection filter is shown in Figure 12.24.
the minimum spacing between two points that can just be distinguished. It
is known as the Rayleigh limit. As the Rayleigh limit depends largely on
the f -number, this result indicates that for high resolutions, small apertures
should be avoided.
Angular response. Lenses focus light on the image plane, which is where the
sensor is placed. Ideally, all light arrives at the sensor at a right angle. In
practice, this can be difficult to realize, and light will be incident upon each
position of the sensor over a given solid angle. As a consequence, it is
desirable to design sensors such that their off-axis response is matched to
the lens system in front of the sensor.
Dynamic range. Noise sources place a lower limit on the magnitude of the signal
that can be measured by a sensor. On the other hand, the ability to collect
charge by each pixel is also limited, causing pixels to saturate if more light
is incident than can be accommodated. The range of irradiance values that
can be usefully detected by a sensor is known as its dynamic range, defined
as the ratio between the optical power at which the sensor saturates and the
noise-equivalent power. It depends on many different factors, including the
size of the pixel. As pixel dimension reduces, the supply voltages must be
lower, and this places a limit on the dynamic range [688].
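The definition above reduces to a simple ratio; the following sketch (our addition, with invented example powers) expresses it in the units commonly quoted for sensors:

```python
import math

def dynamic_range(p_saturation, p_noise_equivalent):
    """Dynamic range as a plain ratio, in decibels, and in photographic
    stops, given the saturation and noise-equivalent optical powers."""
    ratio = p_saturation / p_noise_equivalent
    return ratio, 20 * math.log10(ratio), math.log2(ratio)

# Hypothetical sensor: saturates at 1e-9 W, noise-equivalent power 1e-13 W.
ratio, db, stops = dynamic_range(1e-9, 1e-13)
print(ratio, db, stops)  # roughly 1e4, 80 dB, 13.3 stops
```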
Linearity. The collection of charge in response to incident photons, and the sub-
sequent conversion to a digital signal should maintain a linear relationship
between the number of photons and the value of the resulting signal.3 For
most sensors, this relationship holds to a high degree of accuracy. How-
ever, the linearity of transistors may be compromised in sensor designs that
utilize ultra-thin electrical paths (0.35 μm and below) [688].
Read-out procedure:
1 - Shift charge under light shield
2 - Shift columns into horizontal output register
3 - Shift accumulated row into convertor
Figure 12.25. The read-out procedure on an inter-line CCD sensor: first, all charges
are moved underneath the opaque strip. Next, all columns are shifted down in parallel
by one position, moving the bottom scan-line into the horizontal read-out register. This
register is then read step-by-step. The process is repeated until all scan-lines are read.
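The read-out order can be sketched as a toy simulation (our addition; the function name and array values are illustrative only):

```python
import numpy as np

def interline_readout(charge):
    """Toy model of the inter-line read-out of Figure 12.25: charges are
    first shifted under the light shield, then rows are moved one at a
    time into the horizontal register and read out pixel by pixel."""
    shielded = charge.copy()          # step 1: shift under the light shield
    samples = []
    while shielded.shape[0] > 0:
        horizontal = shielded[-1]     # step 2: bottom row into the register
        shielded = shielded[:-1]
        samples.extend(horizontal)    # step 3: read the register stepwise
    return np.array(samples)

img = np.arange(12).reshape(4, 3)
print(interline_readout(img))  # rows leave the sensor bottom-up
```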
to both high system complexity and high power consumption [103], although the
trend in CCD design is towards lower voltages and less power consumption. Ad-
ditionally, it is possible to design CCD sensors with a single power supply volt-
age, which is used to internally derive other required voltages using an integrated
charge pump [154].
Finally, when the light incident upon a pixel is very bright, the potential well
may fill up and charge may leak away into neighboring pixels. When this happens,
the result is a streak of light called bloom. Modern CCDs have anti-blooming
mechanisms at each charge detection node; these allow extraneous charge to leak
away in a controlled manner. Such gates typically reduce the fill factor by about
30%, reducing sensor sensitivity.
Bloom can also be avoided manually, by avoiding long exposures. To obtain
the desired image, several short exposures, each without bloom, may be stacked
in software. The noise characteristics of the resulting image are not affected by
this method.
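The stacking idea can be sketched as follows (our addition; the scene values, full-well capacity, and Poisson photon model are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: one long exposure of the brightest site would
# exceed the full-well capacity and bloom.
scene = np.array([100.0, 5000.0, 20000.0])  # expected electrons, long exposure
full_well = 10000.0

# Four short exposures at a quarter of the time each stay below full well.
exposures = [np.minimum(rng.poisson(scene / 4), full_well) for _ in range(4)]
stacked = np.sum(exposures, axis=0)   # software sum of bloom-free frames
print(stacked)                        # approximates the long exposure
```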
power is used, then the heat generated by the A/D converter may create temper-
ature variations across the chip’s surface, which in turn has a negative effect on
the sensor’s noise characteristics. Finally, the A/D converter should not introduce
noise into the system through cross-talk.
To improve the dynamic range of CMOS systems, a variety of techniques may
be incorporated. For instance, the sensor may be designed to produce a logarith-
mic response [177, 330], employ conditional resets [1273], switch between pixel
circuit configurations [850], change saturation characteristics [241], use negative
feedback resets [525], employ an overflow capacitor per pixel [22], or use mul-
tiply exposed pixels [2, 748, 809, 995, 1266, 1272]. While the dynamic range of
image sensors continues to improve, photographers can also create high dynamic
range images with current digital cameras by employing the multiple exposures
techniques discussed in Section 13.1.
A range of other image processing functionalities may be included on the
same chip, including motion detection, amplification, multi-resolution imaging,
video compression, dynamic range enhancement, discrete cosine transforms, and
intensity sorting [95, 330], making CMOS technology suitable for a wide variety
of applications.
Figure 12.27. Kodak’s proposal for a new color filter array, including unfiltered panchro-
matic pixels to efficiently derive an estimated luminance signal.
this particular correlation, the edge-detection process will have more values in a
small neighborhood of pixels to consider, so that a more accurate result can be
obtained [5].
In a camera design, the luminance channel (often approximated by just us-
ing the green channel) may be augmented by two chrominance channels, which
are usually taken to be red minus green (CR ) and blue minus green (CB ). Note
that the green value was computed using the CFA interpolation scheme. This
very simple solution is intended to minimize firmware processing times. To com-
pute the missing chrominance values, linear interpolation is employed between
neighboring chrominance values (the nearest neighbors may be located along the
diagonals).
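A minimal sketch of this chrominance-interpolation step, reduced to one dimension for brevity (our addition; the function name and sample values are hypothetical):

```python
import numpy as np

def interp_chroma(sparse, mask):
    """Linearly interpolate chrominance (e.g., CR = R - G) known only at
    the sites flagged in mask; shown in 1D for brevity."""
    out = sparse.astype(float).copy()
    xs = np.flatnonzero(mask)
    missing = np.flatnonzero(~mask)
    out[missing] = np.interp(missing, xs, sparse[xs])
    return out

r_minus_g = np.array([4.0, 0.0, 8.0, 0.0, 12.0])   # known at even sites
mask = np.array([True, False, True, False, True])
print(interp_chroma(r_minus_g, mask))  # missing sites filled linearly
```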
Luminance and chrominance values are then converted to RGB. This con-
version, as discussed in Section 8.1, is device-dependent as a result of the dyes
chosen in the manufacture of the CFA, as well as the firmware processing applied
to the image.
The process of demosaicing is intimately related to the choice of color filter
array. While the Bayer pattern is currently favored in most camera designs, it is
not optimal in the sense that the filters absorb much of the incident light, leading
to a reduced quantum efficiency of the camera system. In the summer of 2007,
Kodak announced an alternative color filter array which consists of four types of
filter, namely red, green, blue, and panchromatic. The latter is effectively an un-
filtered pixel. The advantage of this approach is that a luminance channel can be
derived from the CFA with a much higher efficiency. In addition, the pattern is
constructed such that when the panchromatic pixels are ignored, the result is a
Bayer pattern that allows conventional color interpolation algorithms to be em-
ployed. The Kodak CFA is shown in Figure 12.27.
The placement of a CFA in the optical path significantly reduces the amount
of light that reaches the sensor, requiring an increase in aperture size, exposure
time, or sensor sensitivity to compensate. An alternative to the use of CFAs is to
use multiple sensors. A beam splitter is then placed in the optical path to split the
light beam into three directions so that light can be focused onto three sensors, one
for each of the red, green, and blue ranges of wavelengths. The result is a higher
quantum efficiency and better color separation. Due to cost factors associated
with the perfect registration of the three light beams, this approach is used only in
high-end equipment such as studio video cameras.
Figure 12.28. Both images were captured with a Nikon D2H; the left image with ISO 200
(exposure 1/20 s at f/20) and the right image using the Hi-2 setting, which is equivalent
to ISO 6400 (1/1000 s at f/20). The increase in sensor sensitivity is paired with a marked
increase in noise; Long Ashton, Bristol, May 2007.
photosite is then
Q(x, y) = Δt ∫λ ∫x^(x+u) ∫y^(y+v) Ee(x, y, λ) Sr(x, y) ck(λ) dx dy dλ. (12.71)
Here, Δt is the integration time in seconds, Ee is the spectral irradiance incident
upon the sensor (W/m2 ), Sr is the spatial response of the pixel, and ck (λ ) is the
ratio of charge collected to the light energy incident during the integration (C/J).
In an ideal system, there is no noise and there are no losses, so that the charge
collected at a photosite is converted to a voltage, which may be amplified with a
gain of g, leading to a voltage V:
V = g Q. (12.72)
The A/D converter then converts this voltage into a digital count n. Assuming
that there are b bits, n will be in the range
0 ≤ n ≤ 2^b − 1. (12.73)
If the quantization step is given by s, this means that the voltage is rounded to
D = n s, relating n, s, and V as follows:
n = round(V / s). (12.74)
The digital image n(x, y) can then be further processed in the camera firmware.
Of course, in practice, sensors do not follow this ideal model, and, in partic-
ular, noise is introduced at every stage of processing. In the following sections,
we discuss the different sources of noise, followed by an outline of a procedure to
characterize a camera’s noise response.
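The ideal charge-to-digital-count model described above can be sketched as follows (our addition; the gain, quantization step, and charge values are illustrative assumptions, not from the text):

```python
import numpy as np

def ideal_pipeline(Q, gain, step, bits):
    """Noise-free sensor model: charge Q, gain g, quantization step s,
    clipped to a b-bit count 0 <= n <= 2^b - 1."""
    V = gain * Q                       # charge-to-voltage with gain g
    n = np.round(V / step)             # quantization with step s
    return np.clip(n, 0, 2**bits - 1)  # b-bit digital counts

Q = np.array([0.0, 1e-15, 5e-15, 1e-12])   # collected charge (C)
print(ideal_pipeline(Q, gain=1e12, step=1e-3, bits=8))
```

Note how the brightest photosite saturates at the maximum count, foreshadowing the dynamic range limits discussed earlier.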
where the subscript n is introduced to indicate that this electron/charge count in-
cludes a noise factor. The fixed pattern noise k(x, y) is taken to have a mean of
1 and a variance of σk2 which depends on the quality of the camera design. This
model is valid in the absence of bloom; i.e., we assume that charge collection at
each photosite is independent from collection at neighboring sites.
Comparing with (12.72), we see that the fixed pattern noise is multiplied by the
desired signal, whereas the remaining noise sources are added to the signal.
The combined signal and noise is then amplified by a factor of g.
D(x, y) = (Q(x, y) k(x, y) + Qdc (x, y) + Qs (x, y) + Qr (x, y)) g + Qq (x, y). (12.79)
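Equation (12.79) lends itself to a small Monte-Carlo sketch (our addition; all parameter values and distributional choices here are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

def sensor_response(Q, g=2.0, q=1.0, sigma_k=0.02, dark=5.0, sigma_r=1.0):
    """Monte-Carlo sketch of (12.79); parameter values are illustrative."""
    k = 1.0 + sigma_k * rng.standard_normal(Q.shape)   # fixed pattern noise
    Q_dc = rng.poisson(dark, Q.shape)                  # dark current
    Q_s = rng.poisson(Q) - Q                           # photon shot noise
    Q_r = sigma_r * rng.standard_normal(Q.shape)       # amplifier noise
    Q_q = rng.uniform(-q / 2, q / 2, Q.shape)          # quantization noise
    return (Q * k + Q_dc + Q_s + Q_r) * g + Q_q

Q = np.full(100000, 1000.0)          # many identical photosites
D = sensor_response(Q)
print(D.mean())                      # near g * (Q + mean dark current)
```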
Figure 12.29. Signal and noise as a function of image irradiance Ee, plotted on logarithmic
axes with, for this example, arbitrary units: the signal Q increases with slope 1.0, photon
shot noise Qs with slope 0.5, and the other noise sources with slope 0.0, up to saturation.
The standard deviation of photon shot noise, however, increases as the square
root of the signal level, i.e., in Figure 12.29 the amount of photon shot noise is
indicated with a straight line with a slope of 0.5. To limit the effect of photon shot
noise, it is therefore desirable to design sensors with a large full-well capacity,
since at the saturation point, the impact of photon shot noise is smallest. On the
other hand, the full-well capacity of a photosite depends directly on its surface
area.
With the exception of high-end digital cameras, where full-frame sensors are
used, the trend in sensor design is generally toward smaller sensors, as this en-
ables the use of smaller optical components which are cheaper. This is useful for
applications such as cell phones and web cams. In addition, the trend is toward
higher pixel counts per sensor. This means that the surface area of each photosite
is reduced, leading to a lower full-well capacity, and therefore saturation occurs
at lower image irradiances.
In addition, the dynamic range of a sensor is related to both the full-well
capacity and the noise floor. For the design of high dynamic range cameras, it
would be desirable to have a high full-well capacity, because, even in the case
that all other noise sources are designed out of the system as much as possible,
photon shot noise will limit the achievable dynamic range.
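Since photon shot noise grows as the square root of the signal, the peak signal-to-noise ratio at full well is √(full-well capacity); the sketch below (our addition, with invented electron counts) combines this with a noise floor to give a stop count:

```python
import math

def shot_limited_dynamic_range(full_well, noise_floor_electrons):
    """Dynamic range in stops when the bright end is set by the full-well
    capacity and the dark end by the noise floor; the peak SNR is limited
    by photon shot noise to sqrt(full_well)."""
    peak_snr = math.sqrt(full_well)
    stops = math.log2(full_well / noise_floor_electrons)
    return peak_snr, stops

# Illustrative values: 40,000 e- full well, 10 e- noise floor.
print(shot_limited_dynamic_range(40000, 10))  # SNR 200, about 12 stops
```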
Ee = ks ρd L. (12.82)
To model the (small) variation of the illumination and reflectance as a function
of position on the image plane, the factors in this equation are parameterized as a
function of both wavelength λ and position (x, y) on the sensor:
Ee (x, y, λ ) = ks ρd (x, y, λ ) L(x, y, λ ). (12.83)
The average illumination on the test target is L̄(λ ) and the average reflectance
is given by ρ̄d (λ ). The illumination onto the test card surface that is ultimately
projected onto pixel (x, y) is then
L(x, y, λ ) = L̄(λ ) + Lr (x, y, λ ), (12.84)
where Lr is the deviation from the average L̄(λ ) for this pixel. The expected value
of L(x, y, λ ) is then
E (L(x, y, λ )) = L̄(λ ), (12.85)
Similarly, the reflectance of the test card at the position that is projected onto
sensor location (x, y) can be split into an average ρ̄d and a zero mean deviation
ρr (x, y, λ ):
ρd(x, y, λ) = ρ̄d(λ) + ρr(x, y, λ). (12.86)
The spectral image irradiance can then be written as
Ee(x, y, λ) = ks (L̄(λ) ρ̄d(λ) + ε(x, y, λ)), (12.87)
where
ε(x, y, λ) = L̄(λ) ρr(x, y, λ) + Lr(x, y, λ) ρ̄d(λ) + Lr(x, y, λ) ρr(x, y, λ). (12.88)
The expected value of ε is then 0. The charge collected by the sensor is given
by (12.71) and can now be split into a constant component Qc and a spatially
varying component Qv(x, y):
Q(x, y) = Qc + Qv(x, y), (12.89)
where
Qc = ks Δt ∫λ ∫x^(x+u) ∫y^(y+v) L̄(λ) ρ̄d(λ) Sr(x, y) ck(λ) dx dy dλ; (12.90a)
Qv(x, y) = ks Δt ∫λ ∫x^(x+u) ∫y^(y+v) ε(x, y, λ) Sr(x, y) ck(λ) dx dy dλ. (12.90b)
As E(ε (x, y, λ )) = 0, we have that Qv (x, y) has a zero mean and a variance that
depends on the variance in L̄(λ ) and ρ̄d (λ ). During a calibration procedure, it
The noise sources can be split into a component that does not depend on the
level of image irradiance and a component that does. In particular, the photon
shot noise, modeled as a Poisson process, increases with irradiance (see Sec-
tion 12.10.4):
Qs (x, y) g. (12.94)
Accounting for the gain factor g, the variance associated with this Poisson process
is given by
g2 (Q(x, y) k(x, y) + E(Qdc (x, y))) . (12.95)
The signal-independent noise sources, amplifier noise and quantization noise, are
given by
Qr(x, y) g + Qq(x, y), (12.96)
with a combined variance of
g² σr² + q² / 12, (12.97)
where σr² is the variance of the amplifier noise. The total variance σ² in the noise
introduced by the sensor is then the sum of these two variances:
σ² = g² (Q(x, y) k(x, y) + E(Qdc(x, y))) + g² σr² + q² / 12. (12.98)
The expected value of the dark current for a given pixel can be replaced by the
sum of the average expected value over the whole sensor and the deviation from
this expected value for a given pixel. Introducing the notation QE(dc)(x, y) =
E(Qdc(x, y)), we can write
QE(dc)(x, y) = Q̄E(dc) + QdE(dc)(x, y), (12.99)
where Q̄E(dc) is the average expected value of the dark current and QdE(dc)(x, y) is
the deviation from the expected value. Equation (12.98) can now be rewritten as
follows:
σ² = g² (Q(x, y) + Q̄E(dc)) + g² (k(x, y) − 1) Q(x, y)
   + g² QdE(dc)(x, y) + g² σr² + q² / 12. (12.100)
Under the assumptions that |k(x, y) − 1| ≪ 1 and |QdE(dc)(x, y)| ≪ Q̄E(dc),
the variance of the sensor noise can be approximated with
σ² ≈ g² (Q(x, y) + Q̄E(dc)) + g² σr² + q² / 12. (12.101)
By subtracting the two images, we get a difference image with zero mean and a
variance of 2σ². The expected value of either image, D1 or D2, is given by
where the sum is over the set of image pairs which are indexed by i. The variance
in σ̂i² can be estimated by
var(σ̂i²) ≈ 2 (σ̂i²)² / (XY − 1), (12.109)
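The two-exposure estimation idea can be sketched numerically (our addition; the scene, noise level, and image size are synthetic stand-ins):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two exposures of the same static scene differ only in temporal noise;
# their difference has zero mean and variance 2*sigma^2, so sigma^2 can
# be estimated without knowing the scene content.
scene = rng.uniform(100, 200, size=(256, 256))
sigma = 3.0
D1 = scene + sigma * rng.standard_normal(scene.shape)
D2 = scene + sigma * rng.standard_normal(scene.shape)

sigma2_hat = np.var(D1 - D2) / 2.0
print(sigma2_hat)   # close to sigma**2 = 9
```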
with variance σi2 (x, y)/n2 . From Section 12.11.3 we already have an estimate of
the dark current, which we can subtract. This yields the corrected estimate
This estimate varies over the sensor surface as a result of fixed pattern noise as
well as variations in reflectance and illumination. However, for relatively small
pixel neighborhoods, the variation due to reflectance and illumination of the test
card may be assumed to be small. It is therefore possible to compute the mean
d̄(x, y) over small windows for each pixel. Regions of 9 × 9 pixels tend to work
well [439], leading to estimates of
d̄(x, y) ≈ Qi(x, y) g. (12.113)
The ratio between a single pixel estimate and the windowed estimate is then a
rough approximation of the fixed pattern noise ke(x, y):
ke(x, y) ≈ di(x, y) / d̄i(x, y). (12.114)
To refine this approximation, the average ratio over the n1 imaging conditions is
computed:
ke(x, y) ≈ (1 / n1) Σ(i=1..n1) di(x, y) / d̄i(x, y). (12.115)
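The windowed fixed-pattern-noise estimate of (12.113)–(12.115) can be sketched as follows (our addition; the flat-field levels, noise magnitudes, and image size are synthetic):

```python
import numpy as np

rng = np.random.default_rng(3)

def box_mean(a, size=9):
    """Local mean over a size x size window (edges are less reliable)."""
    k = np.ones(size) / size
    a = np.apply_along_axis(np.convolve, 0, a, k, mode='same')
    a = np.apply_along_axis(np.convolve, 1, a, k, mode='same')
    return a

# Flat-field frames d_i carry multiplicative fixed pattern noise k; each
# pixel is divided by its 9x9 windowed mean, and the ratio is averaged
# over the n1 imaging conditions.
k_true = 1.0 + 0.05 * rng.standard_normal((64, 64))    # fixed pattern
frames = [level * k_true + rng.standard_normal((64, 64))
          for level in (200.0, 400.0, 800.0)]          # n1 = 3 conditions

k_est = np.mean([d / box_mean(d) for d in frames], axis=0)

inner = (slice(4, -4), slice(4, -4))   # ignore window edge effects
corr = np.corrcoef(k_true[inner].ravel(), k_est[inner].ravel())[0, 1]
print(corr)                            # estimate tracks the true pattern
```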
responses to these representative stimuli can then be used to calibrate the device
for input stimuli that were not measured.
To find the transformation between RGB and XYZ values, several techniques
have been developed [480], including look-up tables which can be used to in-
terpolate or extrapolate new values [501, 502], least squares polynomial model-
ing [480, 574], neural networks [572], and fitting data to models of cameras and
image formation [65, 162, 439].
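Of these techniques, least-squares fitting is the simplest to illustrate; the sketch below (our addition) fits a purely linear RGB-to-XYZ mapping to synthetic chart measurements, whereas practical calibrations often use polynomial terms as well. The matrix and data are invented stand-ins:

```python
import numpy as np

rng = np.random.default_rng(4)

# Ground-truth mapping and synthetic measurements for 24 chart patches.
M_true = np.array([[0.49, 0.31, 0.20],
                   [0.18, 0.81, 0.01],
                   [0.00, 0.01, 0.99]])
rgb = rng.uniform(0, 1, size=(24, 3))              # camera responses
xyz = rgb @ M_true.T + 0.001 * rng.standard_normal((24, 3))

# Least-squares solution of rgb @ M = xyz.
M_fit, *_ = np.linalg.lstsq(rgb, xyz, rcond=None)
print(np.abs(rgb @ M_fit - xyz).max())             # small residual
```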
Table 12.1. CIE 1931 chromaticity coordinates (xy) and luminous reflectance factor Y
for each of the 24 patches of the Gretag-Macbeth ColorChecker, which is shown in Fig-
ure 12.30. Also given is the equivalent Munsell notation. All values are for patches illumi-
nated with CIE illuminant C.
4 Note that the photograph of Figure 12.30 was taken under D65 illumination, which was not evenly
distributed over the chart. This particular photograph is only intended as an example of a chart that
can be used for calibration—not to allow accurate color reproduction on the basis of this figure. To
achieve good color reproduction, a color rendition chart must be placed in the scene that is to be
photographed.
where pn is the pixel value recorded by the camera for the nth stimulus, Pn is the
corresponding measured response, N is the total number of samples, and fk is
the transformation being estimated for the kth color channel. The notation ||·||
is introduced to refer to the ΔE*ab color-difference metric defined in the CIELAB
color space, as discussed in Section 8.7.1. The objective function above could be
augmented with additional constraints, for instance to enforce smoothness of the
function fk [65].
Typical techniques for finding the mapping from known data include linear
and polynomial regression, as well as neural networks [575].
Once the camera is calibrated, the recovered function fk can be applied to
all images captured with this camera. Color calibration is necessary for users of
digital cameras to generate high quality images that are faithful representations of
actual input data and are free of any unwanted effects of various processes inside
the capturing device. The data after the mapping is termed device independent.
Data captured from two different devices can be compared after their appropriate
mappings, which may be different for the two devices.
Figure 12.31. Rays emanating from the surface of the cube can represent the radiance at
any point within the scene (after [11]).
with rays emanating from it in all directions, as shown in Figure 12.31. Now for
any point in the cube, the value of rays may be found by intersecting them with
the two-dimensional faces of the environment cube. This simplified plenoptic
function is only four-dimensional. Rays originating from each face of the cube
Figure 12.32. A ray parameterized by its intersection (s, t) with a cube face and its
direction coordinates (u, v), written ray(s, t, u, v).
can be represented by L(u, v, s,t), where s and t specify the location on the cube
face, and u and v parameterize the direction of the ray, as shown in Figure 12.32.
This four-dimensional function is variously called a lumigraph, or a light
field [381, 676]. If this function is captured, then new images of the scene from
different viewpoints may be generated from a single capture. To obtain the in-
tensity of a ray emanating from a point in the scene, the intersection of the ray is
found with the (u, v) and (s,t) planes, and the value of the function L(u, v, s,t) is
the intensity of that ray.
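A minimal version of this lookup can be sketched as follows (our addition; the function name, grid resolution, and nearest-neighbor sampling are simplifying assumptions, whereas practical renderers interpolate in all four dimensions):

```python
import numpy as np

def ray_value(L, s, t, u, v):
    """Look up one ray in a light field L sampled on a 4D grid indexed by
    (s, t, u, v) in [0, 1]; nearest-neighbor sampling for brevity."""
    ns, nt, nu, nv = L.shape
    idx = lambda x, n: int(np.clip(round(x * (n - 1)), 0, n - 1))
    return L[idx(s, ns), idx(t, nt), idx(u, nu), idx(v, nv)]

rng = np.random.default_rng(5)
L = rng.uniform(size=(8, 8, 4, 4))        # a tiny captured light field
print(ray_value(L, 0.5, 0.5, 0.25, 0.75)) # intensity of one ray
```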
Standard cameras are designed so that rays from several different directions
are incident on each sensor location, dependent on the focusing achieved with the
lens assembly. The sensor integrates light over a small solid angle. By modifying
standard camera designs, it is possible to reduce the solid angle over which a
sensor records, thereby separately recording each ray of light passing through
the aperture. Devices that do this are able to capture light field data, i.e., the
four-dimensional plenoptic function. Such devices are briefly introduced in the
following sections.
12.14 Holography
Conventional image-recording techniques, for instance photography, record light
for a given exposure time. During that time, light waves impinge on the sensor,
which thereby records intensity information. The recording medium is only sen-
sitive to the power of the light wave. In the particle model of light, the number of
photons is counted. In the wave model of light, amplitude information is retained.
An image sensor does not record where the light came from, other than the
focusing effected by the lens. The image is therefore a flat representation of
the 3D world it captures. Holographic imaging is a technique that attempts to
photographically capture 3D information by recording both amplitude and phase
information.
Recall that most light sources emit incoherent light, i.e., the phase of each
of the superposed waves is uncorrelated (see Section 2.7.1). Thus, during an
exposure, light waves arrive at the photographic plate with random phases. The
exception to this are lasers, which hold the key to holographic imaging. Shining
a laser beam at a photographic plate will cause all the light to arrive in phase.
Such a beam is called the reference beam. As the cross-section of a laser beam is
narrow and non-diverging, it illuminates only one small spot on a photographic
plate. However, by inserting a lens into its optical path, the beam can be made to
diverge, a process known as decollimation. It then illuminates the entire
photographic plate with coherent light.
If the same laser beam is split, with some part directly aimed at the photo-
graphic plate and the other part reflected by an object, then the path length taken
by the light that is reflected will be longer than the path length of the direct beam.
This second beam is called the object beam. As its path length is longer, the light
will take slightly longer to arrive at the plate and will, therefore, be out of phase
with the reference beam. Of course, each point on the object will create a slightly
different path length.
The phase differences will cause an interference pattern in space, including
on the surface of the photographic plate. Section 2.7.1 discusses interference,
which, in summary, is the change in amplitude that results when light waves sum or
cancel out. As a result, the phase difference between object and reference beams
causes local changes in amplitude to which photographic materials are sensitive.
Hence, phase information is encoded as an interference pattern. Such an encod-
ing, invented by Dennis Gabor in 1948 [342], is called a hologram; an example is
shown in Figure 12.35.
To create a hologram, a reference beam of laser light must be aimed at the
photographic plate directly, as well as at the object to be imaged. To reach the
Figure 12.35. A photograph of a hologram, which was lit with a D65 daylight simulator
at an angle of 45◦ from above.
entire plate, as well as the entire object, lenses are typically used to make the
laser light divergent. A beam splitter, typically a half-silvered mirror, can be used
to split the light into object and reference beams. It is not possible to use a pair of
lasers for this task, as this would create phase differences unrelated to the object.
An example configuration is shown in Figure 12.36.
Beam splitter
Laser
Mirror
Laser beam
Lens
Object
Photographic
plate
Interference
Mirror Lens
Figure 12.36. Configuration for creating a transmission hologram. The object reflects in
many more directions than indicated, but only the directions toward the photographic plate
are indicated.
Figure 12.37. Configuration for viewing a transmission hologram: the laser illuminates
the photographic plate from one side, and the light diffracted by the recorded fringes
reaches the observer on the other side, who perceives a virtual object.
To view the hologram, a laser is once more used to illuminate the photographic
plate, as shown in Figure 12.37. In this case, the observer and the illuminant are
located on opposite sides of the photographic plate; thus, this is called a transmis-
sion hologram [267, 427].
Alternatively, it is possible to create a hologram by using the light that is trans-
mitted through the photographic plate to illuminate the object. The light reflected
off the object will then create an interference pattern with the light directly inci-
dent upon the photographic plate. Such a configuration, shown in Figure 12.38,
requires the illumination and the observer to be located on the same side of the
holographic plate. Holograms created in this manner are called reflection holo-
grams. Instead of using a laser to illuminate the hologram, reflection holograms
can also be viewed under white light, because white light incorporates the wave-
length of the laser used to create the hologram [267, 427].
In all cases, the recording environment needs to be carefully controlled for
vibrations, because even very small movements during the recording process will
create large problems due to the nanometer scale at which interference fringes
occur. Usually, optical workbenches are used to combat vibration. In addition,
it is possible to use pulsed lasers, allowing for very short exposure times. To
enable recording of interference patterns in the first place, the spatial resolving
power of the photographic material needs to be very high. The grain of ordinary
photographic materials is too large to record holograms.
For illustrative purposes, we will assume that the object being imaged is the
size of a single point, and that both the light source and the object are far away
from the photographic plate. Then, both the object and reference beams can be
considered plane waves. As discussed in Section 2.7.1, the interference pattern
created by a pair of coherent monochromatic time-harmonic plane waves Eo and
Er , where the subscripts denote the object or reference beam, is expressed by the
irradiance at any point in space. Related to this quantity is the optical intensity,
Figure 12.38. Configuration for creating (top) and viewing (bottom) a reflection holo-
gram.
which is defined as
Io = Eo Eo* = |Eo|², (12.117a)
Ir = Er Er* = |Er|², (12.117b)
where Eo and Er are the complex amplitudes of their corresponding electric vec-
tors. Adding two waves of the same frequency causes their amplitudes to sum,
yielding the following optical intensity [427]:
I = |Eo + Er|² (12.118a)
  = |Eo|² + |Er|² + Eo Er* + Eo* Er (12.118b)
  = |Eo|² + |Er|² + 2 √(Io Ir) cos(ϕo − ϕr). (12.118c)
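The equality of the two forms in (12.118) can be checked numerically (our addition; the amplitudes and phases are arbitrary example values):

```python
import cmath
import math

# For complex amplitudes E_o and E_r, the intensity of the superposition
# equals |E_o|^2 + |E_r|^2 + 2*sqrt(Io*Ir)*cos(phi_o - phi_r).
E_o = 2.0 * cmath.exp(1j * 0.3)    # object beam, phase 0.3 rad
E_r = 1.5 * cmath.exp(1j * 1.1)    # reference beam, phase 1.1 rad

I_sum = abs(E_o + E_r) ** 2
I_terms = (abs(E_o) ** 2 + abs(E_r) ** 2
           + 2 * math.sqrt(abs(E_o) ** 2 * abs(E_r) ** 2)
           * math.cos(0.3 - 1.1))
print(I_sum, I_terms)   # the two forms agree
```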
The third term in this equation is called the interference term and corresponds
to the interference term discussed in Section 2.7.1. In other words, the optical
intensity at each point depends on the magnitude of each of the two waves, plus
a quantity that depends on the phase difference which is only the result of the
difference in path length between the reference and object beams.
Dependent on the quality of the laser and the total path lengths involved in the
holographic imaging process, some of the laser’s coherence may be lost along the
way. The factor V ∈ [0, 1] accounts for this loss of coherence. It is known as the
van Cittert-Zernike coherence factor and is computed as follows [343]:
V = (Imax − Imin) / (Imax + Imin). (12.119)
The values Imin and Imax are the minimum and maximum optical intensity in an
interference pattern when the complex amplitudes of the two plane waves are
equal. The value of V is 1 in the ideal case when coherence is maximal and 0 in
the absence of coherence.
As reflection off a surface may depolarize the light, the optical intensity at a
given point in space may be reduced by an amount linear in the angle between
the two electrical vectors. This is accounted for by the cos (θ ) factor in (12.120).
If no depolarization occurs, this angle is zero and no attenuation occurs. At right
angles, the irradiance becomes zero. Accounting for the loss of coherence, as well
as depolarization, the optical intensity becomes
I = |Eo|² + |Er|² + 2 V √(Io Ir) cos(ϕo − ϕr) cos(θ) (12.120a)
  = |Eo|² + |Er|² + V (Eo Er* + Eo* Er) cos(θ). (12.120b)
After processing of the plate, the transmittance t(x, y) of each point of the plate is
given by a function linear in the optical intensity [427]:
t(x, y) = t0 + β Δt I, (12.121)
The transmitted amplitude therefore consists of several terms. The first term is the
constant background transmittance, the second term constitutes the unmodified
illuminating beam, the third term is the reconstructed image, and the fourth term
is a twin image [343].
The twin image is undesirable if it overlaps with the reconstructed image.
However, such overlap can be avoided by aiming the reference beam at the pho-
tographic plate at an angle ϑ. The amplitude Er is then given by
Er = |Er| exp(2 π i x sin(ϑ) / λ). (12.123)
The implications of this are that the magnitude of the reference beam's amplitude
remains constant, i.e., |Er|² = constant, but the square of Er becomes
Er² = |Er|² exp(4 π i x sin(ϑ) / λ). (12.124)
This means that the diffracted image is angled away from its twin image by an
angle of 2ϑ . As long as the object being imaged subtends a solid angle smaller
than this, the twin image will not overlap with the desired reconstruction [343].
The interference fringes recorded by the photographic plate cause diffraction
when illuminated with (coherent) light from the same direction as the reference
beam. Many different types of hologram have been developed; they can be clas-
sified according to whether the photographic plate is a thin layer or volumetric
and whether the hologram is exposed once or multiple times. Although compu-
tationally very expensive, interference patterns can be computed with modified
rendering algorithms and followed by transfer to photographic plates.
Chapter 13
High Dynamic Range
Image Capture
The dynamic range that can be acquired using conventional cameras is limited by
their design. Images are typically gamma encoded to eight bits per color channel.
This means that a luminance range of around two orders of magnitude can be
captured and stored. Many camera manufacturers allow Raw data to be exported
from their cameras as well. These are images that have undergone minimal in-
camera processing and are typically linearly encoded to a bit-depth of around 10
to 14. Note, here, that a linear encoding of 10 bits does not present a larger range
of luminance values than an 8-bit non-linear encoding. Even if Raw data gives
a higher dynamic range than conventional 8-bit images, at this point in time, we
cannot speak of a truly high dynamic range. In other words, limitation in current
camera designs do not allow us to capture the full range of intensities available in
typical real-world scenes.
Some of the trade-offs in camera design are outlined in the preceding chapter,
and include noise characteristics, the capacity to hold a limited amount of charge
before a pixel saturates, veiling glare, and the fact that as new sensor designs
incorporate smaller pixel sizes, quantum efficiency is reduced. All these factors
contribute to a limit on the range of intensities that can be captured in any given
exposure.
To capture the vast range of light present in many scenes, we must therefore
resort to either using dedicated, unconventional hardware, or to reconstructing a
high dynamic range (HDR) image from a set of differently exposed conventional
images. The techniques used for the acquisition of high dynamic range images
are discussed in this chapter.
Figure 13.1. A sequence of exposures taken with different exposure times; St. Barnabas
Monastery, North Cyprus, July 2006.
Figure 13.2. The final result, obtained after merging the exposures of Figure 13.1 into a
single high dynamic range image, followed by applying dynamic range reduction to enable
the image to be printed (discussed in Chapter 17); St. Barnabas Monastery, North Cyprus,
July 2006.
ment and the camera, and recombining multiple high dynamic range images shot
through the mesh with small lateral offsets [1115].
One of the disadvantages of multiple exposure techniques is that the set of
exposures should vary only in exposure time. As such, the scene is assumed to
be static, and the camera position is assumed to be fixed. In practice, these re-
quirements may be difficult to maintain. Ensuring the camera is steady during the
sequence of exposures is relatively straightforward. The use of a tripod mitigates
the worst artifacts associated with moving the camera, while image alignment
techniques can be used as a post-process (see Section 13.5). In the case that hand-
held exposures are taken, aside from image alignment, the effects of camera shake
should be taken into account [309].
A more complicated problem is that the scene needs to remain static. This
requirement can be met in the studio, but especially outdoor scenes can be difficult
to control. In particular, movement of clouds, vegetation, as well as humans and
animals, causes each of the exposures to depict a different scene. Recombining
multiple exposures may then lead to ghosting artifacts. These problems can be reduced by employing a ghost-removal algorithm (see Section 13.4), although given the current state of the art, it is still preferable to ensure the environment being captured is, in fact, static.
Figure 13.3. Camera response curves (green channel only) compared with sRGB and
linear response functions. The Nikon response curve is for a Nikon D2H, whereas the
Minolta curve is for a DiMAGE A1 camera.
The product Ee Δt is also referred to as the exposure. The camera response func-
tion g represents the processing of the irradiance values within the camera. As
camera manufacturers typically do not make the nature of this processing
available to their customers, the camera response curve g needs to be reverse-
engineered from the captured images.
To find Ee from the corresponding pixel values p, the function g needs to be
inverted to give
E_e = \frac{g^{-1}(p)}{\Delta t}. (13.2)
The following section discusses how high dynamic range imagery may be ob-
tained using multiple linearized images of the same scene; the remaining sections
discuss techniques to compute the camera response function.
the exposure time is increased and more light is allowed to go through the aper-
ture, darker regions of the scene will be exposed correctly while brighter regions
will appear as burnt-out areas in the image. If enough images are captured of the
same scene with different exposure times, then each region of the scene will be
correctly exposed in at least one of the captured images.
Information from all the individual exposures is used to compute the high dynamic range image for the scene. The process of image generation is as follows:
E_e(x, y) = \frac{\sum_{n=1}^{N} w(p_n(x, y)) \, g^{-1}(p_n(x, y)) / \Delta t_n}{\sum_{n=1}^{N} w(p_n(x, y))}. (13.3)
For each pixel, the pixel values pn from each of the N exposures are linearized
by applying the inverse of the camera response g. The resulting values are then
normalized by dividing by the corresponding exposure time Δt_n. The irradiance value Ee of the
high dynamic range image is a weighted average of these normalized values.
Both the weight function and the camera response need to be determined be-
fore this equation can be used effectively. In the following section, we discuss
possible choices for the weight function. Different algorithms to recover the cam-
era response g are discussed afterward.
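To make the assembly step concrete, Equation (13.3) can be sketched in a few lines of code. The inverse response and the hat-shaped weight below are illustrative placeholder choices (a simple gamma model and a weight falling to zero at both ends of the pixel range), not the specific functions recovered by the algorithms discussed next:

```python
import numpy as np

def merge_exposures(images, times, inv_response, weight):
    """Assemble an HDR irradiance map per Equation (13.3).

    images: list of 8-bit exposures of the same static scene
    times:  exposure time of each image (seconds)
    inv_response: the linearizing inverse camera response g^{-1}(p)
    weight: the per-pixel weight function w(p)
    """
    num = np.zeros(images[0].shape, dtype=np.float64)
    den = np.zeros_like(num)
    for img, dt in zip(images, times):
        w = weight(img)
        num += w * inv_response(img) / dt  # linearize, divide by exposure time
        den += w
    return num / np.maximum(den, 1e-12)    # guard against zero total weight

# Placeholder response model and hat weight (assumptions for illustration):
inv_gamma = lambda p: (p / 255.0) ** 2.2
hat = lambda p: 1.0 - np.abs(p / 127.5 - 1.0)
```

Pixels near 0 or 255 carry little information and receive small weights, so every usable exposure contributes to the final irradiance estimate.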
where σN (pn ) is the standard deviation of the measurement noise. If the assump-
tion is made that the noise is independent of pixel value pn (a valid assumption
for all noise sources bar photon shot noise, see Section 12.10.4), then the weight
function can be defined as [784]:
w(p_n) = \frac{g^{-1}(p_n)}{\mathrm{d}g^{-1}(p_n)/\mathrm{d}p_n}. (13.6)
Figure 13.4. An illustration of the range-range plot defined by Mann and Picard (af-
ter [731]).
Here, α is 0 (from the assumption that g(0) = 0), and β and γ are scalar values.
The pixel values in each pair can be related to each other as follows:
p_{n+1} = \beta (k E_e \Delta t)^{\gamma} (13.9a)
        = k^{\gamma} p_n. (13.9b)
The value of γ may be computed by applying regression to the known points in
the range-range plot. There is now sufficient information to compute the camera’s
response function. Note that this solution will give results accurate up to a scale
factor.
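This regression step can be sketched as follows; the synthetic data is generated under the model's own power-law assumption, with illustrative values for β, γ, and the exposure ratio k:

```python
import numpy as np

def estimate_gamma(p_short, p_long, k):
    """Estimate gamma from corresponding pixels of two exposures.

    p_long is taken with k times the exposure of p_short, so the
    range-range plot of p_long against p_short is a line through the
    origin with slope k**gamma (Equation (13.9b)).
    """
    # Least-squares slope of a line through the origin
    slope = np.dot(p_short, p_long) / np.dot(p_short, p_short)
    return np.log(slope) / np.log(k)

# Synthetic exposures under the model p = beta * (E * dt)**gamma
rng = np.random.default_rng(0)
E = rng.uniform(0.1, 1.0, 100)   # scene irradiances
beta, gamma, k = 1.0, 0.45, 2.0  # illustrative parameter choices
p1 = beta * (E * 1.0) ** gamma   # base exposure
p2 = beta * (E * k) ** gamma     # k times longer exposure
```

With noise-free data the slope equals k^γ exactly, so γ is recovered up to floating-point precision; real pixel data would require clipping under- and over-exposed values first.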
monotonically increasing function, its inverse becomes well defined and may be
computed given the specified input.
The linear system of equations is derived as follows. Taking the logarithm of both sides of (13.1), the following equation is obtained:

\ln(g^{-1}(p_n(x, y))) = \ln(E_e(x, y)) + \ln(\Delta t_n). (13.10)

In this equation, the unknowns are ln(g−1) and ln(Ee (x, y)). As the function g−1
is only applied to 256 values, essentially it represents 256 unknowns. The other
set of unknowns, ln(Ee (x, y)), has as many unknowns as the number of pixels in a
single image from the input. All these unknowns may be estimated by minimizing
the following objective function:
O = \sum_{x} \sum_{y} \sum_{n=1}^{N} \left[ \ln(g^{-1}(p_n(x, y))) - \ln(E_e(x, y)) - \ln(\Delta t_n) \right]^2 + \alpha \sum_{p=0}^{255} \left[ (\ln(g^{-1}))''(p) \right]^2, (13.11)

where the second derivative is taken by finite differences:

(\ln(g^{-1}))''(p) = \ln(g^{-1}(p-1)) - 2\ln(g^{-1}(p)) + \ln(g^{-1}(p+1)). (13.12)
This algorithm can be used to obtain the function g−1 for a single channel. The
above process, however, needs to be repeated for each of the three color channels.
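A sketch of this least-squares solve, in the spirit of Debevec and Malik's gsolve, follows. The synthetic log-linear response, the sample count, and the choice of fixing the curve's scale at the middle pixel level are illustrative assumptions, not prescriptions from the original method:

```python
import numpy as np

def recover_log_response(p, log_dt, smoothing=100.0, levels=256):
    """Solve the linear system behind Equation (13.11).

    p: (num_pixels, num_exposures) integer pixel values
    log_dt: ln of each exposure time
    Returns ln(g^{-1}) sampled at all pixel levels, and ln(E_e) per pixel.
    """
    n_pix, n_exp = p.shape
    A = np.zeros((n_pix * n_exp + levels - 2 + 1, levels + n_pix))
    b = np.zeros(A.shape[0])
    row = 0
    for i in range(n_pix):
        for j in range(n_exp):
            A[row, p[i, j]] = 1.0      # ln g^{-1}(p_ij) term
            A[row, levels + i] = -1.0  # -ln E_i term
            b[row] = log_dt[j]         # = ln dt_j
            row += 1
    for v in range(1, levels - 1):     # smoothness: second differences
        A[row, v - 1:v + 2] = smoothing * np.array([1.0, -2.0, 1.0])
        row += 1
    A[row, levels // 2] = 1.0          # fix the scale: ln g^{-1}(128) = 0
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:levels], x[levels:]

# Synthetic data from a known log-linear inverse response
true_curve = 0.02 * (np.arange(256) - 128.0)
rng = np.random.default_rng(0)
log_E = rng.uniform(-1.2, 1.2, 30)
log_dt = np.array([-0.7, 0.0, 0.7])
p = np.clip(np.rint(128.0 + (log_E[:, None] + log_dt[None, :]) / 0.02),
            0, 255).astype(int)
```

As noted in the text, the solution is defined only up to a scale factor; the single extra constraint on the middle level removes that ambiguity.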
\rho = \frac{\cos^4(\Theta)}{f^2}, (13.17)

e = \frac{\pi d^2}{4} \Delta t. (13.18)
The ratio R of the linearized values is used to generate the objective function. For
a pixel at location (x,y) in the nth exposure, this ratio can be written as
\frac{E_{e,n}(x, y) \, \Delta t_n}{E_{e,n+1}(x, y) \, \Delta t_{n+1}} = \frac{L_e(x, y) \, \rho(x, y) \, e_n}{L_e(x, y) \, \rho(x, y) \, e_{n+1}}. (13.19)
Substituting the inverse camera response with the polynomial model (13.14) into
(13.21), the following relationship is obtained:
R_{n,n+1} = \frac{\sum_{k=0}^{K} c_k \, p_n^k(x, y)}{\sum_{k=0}^{K} c_k \, p_{n+1}^k(x, y)}. (13.22)
The minimum of the function occurs when the derivative of the function O with
respect to ck equals zero:
\frac{\partial O}{\partial c_k} = 0. (13.24)
As with previous techniques, this gives a solution up to a scale factor. However,
when a single constraint is added, the system has a unique solution.
Thus, given approximate exposure ratios Rn,n+1 , the unknown parameters of
the inverse response function may be computed. After this computation, the actual
values of the exposure ratios may be found by searching in the neighborhood of
the initial ratio estimates. When more than two images are provided, searching
for the ratios can be time consuming, and, therefore, an iterative version of the
above process is used. First, user-specified approximate ratio estimates are used
to compute function parameters as discussed above. Next, these parameters are
used to compute an updated value of the ratio, using the following equation:
R_{n,n+1} = \sum_{x} \sum_{y} \frac{\sum_{k=0}^{K} c_k \, p_n^k(x, y)}{\sum_{k=0}^{K} c_k \, p_{n+1}^k(x, y)}. (13.25)
The updated ratio is again used to compute function parameters, and this cycle
continues until the change in linearized values (caused by the change in func-
tion parameter estimates of successive iterations) for all pixels in all the different
images is smaller than a small threshold:
\left| g_{(i)}^{-1}(p_n(x, y)) - g_{(i-1)}^{-1}(p_n(x, y)) \right| < \varepsilon, (13.26)
where ε is the threshold on the change across iterations, and i is the iteration
number.
The order K of the polynomial representation of the camera response function
can be computed by repeating the above process several times, each time with a
different test value of K. For instance, the function parameters may be estimated
for all K values less than 10, and the solution with the smallest error is chosen.
For color images, the algorithm may be applied separately for each channel.

Figure 13.5. The response curves recovered for each of the red, green, and blue channels of a Nikon D2H and a Konica Minolta DiMAGE A1 camera.

Per-channel response curves for the Nikon D2H and Konica Minolta DiMAGE A1 cameras generated with this technique are shown in Figure 13.5.
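The core of the iteration — evaluating the polynomial model, re-estimating exposure ratios as in Equation (13.25) (here averaged over pixels), and the convergence test of Equation (13.26) — can be sketched as follows; the function names and the averaging are choices of this sketch:

```python
import numpy as np

def poly_g_inv(c, p):
    """Evaluate the polynomial model of g^{-1}: sum_k c_k p^k."""
    return np.polyval(c[::-1], p)  # np.polyval expects highest order first

def update_ratio(c, p_n, p_next):
    """Re-estimate R_{n,n+1} from the current coefficients, averaging the
    per-pixel ratios of Equation (13.25) over the image."""
    return float(np.mean(poly_g_inv(c, p_n) / poly_g_inv(c, p_next)))

def converged(c_new, c_old, p, eps=1e-6):
    """Convergence test of Equation (13.26): linearized values of all
    pixels change by less than eps between successive iterations."""
    return bool(np.all(np.abs(poly_g_inv(c_new, p) - poly_g_inv(c_old, p)) < eps))
```

For instance, with a true response p = (E Δt)^{1/2}, i.e., g^{-1}(p) = p², and an exposure ratio of 2, every per-pixel ratio equals 1/2, so the update returns the correct exposure ratio immediately.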
Here, the summation is over each of the N exposures; tn is the exposure time for
the nth exposure, and g is the camera’s response curve. This approach is motivated
by the fact that longer exposures tend to produce a better signal-to-noise ratio.
This weighting therefore tends to produce less noise in the resulting HDR image.
The camera response curve can be recovered by minimizing the following objective function O, for instance using Gauss-Seidel iteration:
O = \sum_{n=1}^{N} \sum_{x} \sum_{y} \bar{w}(p_n(x, y)) \left[ g^{-1}(p_n(x, y)) - t_n E_e(x, y) \right]^2, (13.28)

\bar{w}(p_n(x, y)) = \exp\left( -4 \, \frac{(p_n(x, y) - 127.5)^2}{127.5^2} \right). (13.29)
The weight function w(pn (x, y)) can then be set to the derivative of the camera
response function:
w(p_n(x, y)) = g'(p_n(x, y)). (13.30)
To facilitate differentiating the camera response curve, a cubic spline is fit to
the camera response function. The appropriate selection of the spline’s knot
locations is crucial to obtain sensible results. As under- and over-exposed pixels
do not carry information, it is desirable to set the weight function for these
pixels to zero. In Robertson’s method, this means that the derivative of the
camera response needs to be close to zero for both small and large pixel
values.
To achieve this, the spline that is fit to the camera response curve must have a
slope of zero near 0 and 255. This makes the shape of the camera response curve
recovered with Robertson’s method markedly different from the curve obtained
with the techniques presented in the preceding sections. Each camera can only
have one response curve, and it is therefore reasonable to expect that different
response curve recovery algorithms produce qualitatively similar results. As a re-
sult, HDR assembly may proceed according to (13.27), provided that the weights
w(pn (x, y)) are not computed as the derivative of the camera response curve.
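The Gaussian-like weight of Equation (13.29) is simple to state in code. Note that it peaks at the middle of the pixel range but does not quite reach zero at the extremes (its value at 0 and 255 is exp(-4)), which is why the spline fit described above must enforce near-zero endpoint slopes separately:

```python
import math

def hat_weight(p):
    """Robertson's weight, Equation (13.29):
    exp(-4 (p - 127.5)^2 / 127.5^2) for 8-bit pixel values p."""
    return math.exp(-4.0 * (p - 127.5) ** 2 / 127.5 ** 2)
```

In practice, under- and over-exposed pixels are additionally forced to zero weight, as discussed in the text.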
Here, the noise averaging is carried out for each pixel in each exposure over a
set of corresponding pixels in s subsequent exposures. In general, using a larger
value of s will give better noise reduction without introducing artifacts. However,
Figure 13.6. The dynamic range obtained for different values of s, the parameter that
determines how many exposures are averaged in the noise-reduction step.
Figure 13.7. The multiplier reducing the weight for over-exposed pixels.
In such a distribution, the mean equals the variance. Assuming that a 1-second
exposure results in the recording of N photons, then the mean μ equals N, and the
variance σ also equals N. For an exposure time t times longer, the mean will
be μ = tN, and the variance is σ = tN. During HDR assembly, pixel values
are divided by the exposure time. This means that for a t-second exposure, after
division the mean and variance are [116]
\mu = \frac{tN}{t} = N, (13.33)

\sigma = \frac{tN}{t^2} = \frac{N}{t}. (13.34)
As a result, increasing the exposure time by a factor of t reduces the variance, and,
therefore, the noise, by a factor of t. This means that the weight factor applied to
each pixel should be linear in the exposure time. The function h(pn , a) is therefore
given by
h(p_n(x, y), a) = \begin{cases} t_n & n = a, \\ \tau(p_n(x, y)) \, t_n & n \neq a. \end{cases} (13.35)
Thus, if the pixel pn (x, y) is from exposure a, the exposure time itself is taken
as the weight function. For pixels from a different exposure than a, the exposure
time is multiplied by a factor τ (pn ) which accounts for over-exposed pixels:
\tau(x) = \begin{cases} 1 & 0 \le x < 200, \\ 1 - 3k(x)^2 + 2k(x)^3 & 200 \le x < 250, \\ 0 & 250 \le x \le 255, \end{cases} (13.36)

k(x) = 1 - \frac{250 - x}{50}. (13.37)
The multiplier τ is plotted in Figure 13.7.
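Equations (13.36)–(13.37) translate directly into code; the falloff is a standard smoothstep over the pixel range [200, 250):

```python
def tau(x):
    """Over-exposure multiplier of Equations (13.36)-(13.37)."""
    if 0 <= x < 200:
        return 1.0
    if 200 <= x < 250:
        k = 1.0 - (250.0 - x) / 50.0              # Equation (13.37)
        return 1.0 - 3.0 * k ** 2 + 2.0 * k ** 3  # smoothstep, Equation (13.36)
    return 0.0  # 250 <= x <= 255
```

At x = 200 the multiplier is still 1 (k = 0), and it reaches exactly 0 at x = 250 with zero slope at both ends of the transition.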
Equation (13.32) uses a weight function w that depends on both the pixel value p_n(x, y), which is taken to be either a red, green, or blue value, and a representation of the pixel's luminance Y_cn, derived from the recorded pixel values using Equation (8.9). The luminance term is included to avoid saturated pixels that might give an unnatural color cast. The weight function itself is given by:

w(p_n(x, y), Y_{cn}) = g^{-1}(p_n(x, y)) \, g'(p_n(x, y)) \left[ 1 - \left( \frac{Y_{cn} - 127.5}{127.5} \right)^{12} \right]. (13.38)
Figure 13.8. The image on the left was created without noise averaging, whereas the
frames were first noise averaged for the image on the right.
Figure 13.9. Movement of scenery between the capture of multiple exposures (shown at top) leads to ghosting artifacts in the assembled high dynamic range image (bottom). This is apparent in the lack of definition of the sheep's head, but also note that the movement in the long grass has resulted in an impression of blurriness; Carnac, France, July 2007.
Figure 13.10. High dynamic range action photography (top left) is not yet feasible using
multi-exposure techniques. Windy conditions spoiled the high dynamic range photograph
on the bottom left. Note that in either case, a single exposure would lose detail in the
sky, as shown in the images on the right. The images on the left were tonemapped with
the photographic operator (see Section 17.3). Top images: Bristol Balloon Fiesta, Ashton
Court Estate, August 2007. Bottom images: Carnac, France, July 2007.
the scene must remain stationary while the images are being captured, so that the
scene radiance entering the camera remains a constant for each pixel across the
images. If there is movement in the scene at the time of capture, then for at least
some pixels, scene radiance will be different across the image set, and the values
will no longer be comparable (Figure 13.9). Further artifacts due to motion in the
scene at the time of image capture are illustrated in Figure 13.10. Such artifacts
are commonly known as ghosts in the image. There are a variety of algorithms
that attempt to eliminate such errors, and these are discussed here.
Figure 13.11. The process used by Kang et al. to compute frames estimating the unknown
frame Sk . Dotted lines indicate that a warp is being computed, while solid lines labeled
with a warp indicate that the warp is being applied (after [573]).
short exposures; we denote these by S_{k−1} and S_{k+1}. These two frames are used to generate an approximation of S_k, the kth frame with a short exposure. The process by which this information is estimated is illustrated in Figure 13.11 and described below.
To estimate the information contained in the missing frame S_k, bidirectional flow fields are first computed using a gradient-based approach. The bidirectional flow fields consist of the forward warp f_{k,F}, which is applied to S_{k−1} to give S_{k*}^{F0}, and the backward warp f_{k,B}, which is applied to S_{k+1} to give S_{k*}^{B0}. To compute these flow fields, the exposures of S_{k−1} and S_{k+1} are boosted to match the long exposure. Then, motion estimation between these boosted frames and the kth frame is performed to generate the flow fields. Kang et al. use a variant of the Lucas and Kanade technique [703] to estimate motion between frames.
The two new frames S_{k*}^{F0} and S_{k*}^{B0} are interpolated to give S_{k*}. The exposure of this frame is adjusted to match the exposure of L_k, to give L_{k*}. The flow field f_{k*} between L_{k*} and L_k is computed using hierarchical global registration. After computation, it is applied to S_{k*} to give S'_{k*}. To compute the final HDR image at time t, the following frames are used: L_k, S_{k*}^{F0}, S_{k*}^{B0}, and S'_{k*}. Note that if the frame at time t has a short exposure, the same process may be used to estimate the long exposure.
To compute an HDR image from captured and computed frames, the follow-
ing procedure may be used:
• The frames L_k, S_{k*}^{F0}, S_{k*}^{B0}, and S'_{k*} are converted to radiance values by using the camera response function and known exposure times. The radiance values of the above frames may be denoted by L̂_k, Ŝ_{k*}^{F0}, Ŝ_{k*}^{B0}, and Ŝ'_{k*}, respectively. The response function may be computed using any of the techniques discussed earlier in this section.
• For all pixel values in L̂_k that are higher than a certain threshold, the corresponding values in the HDR image are taken directly from Ŝ'_{k*}, since high values in L̂_k are considered to be saturated and not good enough to provide reliable registration with adjacent frames. Note that in the case where the frame at time t has a short exposure, all values in Ŝ_k below a specified threshold are assumed to provide unreliable registration, and corresponding values in the HDR image are simply taken from L̂_{k*}.
(13.39)
Figure 13.12. The weight functions f_W (plotted against pixel value) and f_M (plotted against δ, falling to zero at δ_max).
where the subscript k has been dropped as it is common to all frames, (x, y) denotes pixel location, f_W denotes a weight function which is illustrated in Figure 13.12, and f_{MW}(a, b) = f_M(|a − b|) × f_W(a), where f_M is given by

f_M(\delta) = \begin{cases} 2 \left( \dfrac{\delta}{\delta_{max}} \right)^3 - 3 \left( \dfrac{\delta}{\delta_{max}} \right)^2 + 1 & \text{if } \delta < \delta_{max}, \\ 0 & \text{otherwise.} \end{cases} (13.40)
This function is also illustrated in Figure 13.12. Its main purpose is to en-
sure that if the warped radiance values are very different from the captured
radiance values, then they should be given a relatively small weight in com-
putation of the final radiance values.
belong to moving objects. Once these are found, a single instance of each moving
object is selected and used in the panorama.
Regions in each input image that belong to moving objects are found by com-
paring each pixel in an image with corresponding pixels in the remaining images.
If a pixel differs from corresponding pixels by more than a threshold amount, it is
identified as representing a moving object. Once all pixels in all the images have
been evaluated in this manner, a region-extraction algorithm is applied to each
image to identify and label contiguous regions. These are known as regions of
difference (ROD).
RODs in different images are assumed to represent the same moving object if
they overlap. Thus, to have a single instance of a moving object in the panorama,
only one of the overlapping RODs is used in the final image. To find a single
instance of each object, RODs are represented by a graph. Each ROD is a vertex
in this graph, and if two RODs overlap, their corresponding vertices in the graph
are connected with an edge. To obtain a single instance of a moving object, the
vertex cover can be found and then removed from this graph. This leaves behind
a set of disconnected vertices, representing a single instance of all moving objects
in the scene.
As there are multiple instances of each moving object, any one of these may
be used in the panorama. To ensure that the best possible instance is used, each
ROD or vertex is given a weight. The closer the ROD is to the center of the image
and the larger its size, the higher its corresponding vertex weight. Thus, finding and removing the vertex cover with the minimum weight will leave behind the most appropriate instance of each object in the panorama.
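A greedy sketch of this step follows. Minimum-weight vertex cover is NP-hard in general, so the heuristic below, which repeatedly removes the lightest vertex still covering an edge, stands in for whatever exact or approximate solver an implementation would use. RODs are vertices, overlaps are edges, and the surviving vertices are the instances kept in the panorama:

```python
def keep_single_instances(edges, weights):
    """Remove a (greedy, low-weight) vertex cover of the ROD overlap graph.

    edges: iterable of (u, v) pairs of overlapping RODs
    weights: dict mapping each ROD to its weight (centrality and size)
    Returns the set of RODs that survive; no two of them overlap.
    """
    removed = set()
    remaining = set(tuple(e) for e in edges)
    while remaining:
        candidates = {v for e in remaining for v in e}
        victim = min(candidates, key=lambda u: weights[u])  # cheapest endpoint
        removed.add(victim)
        remaining = {e for e in remaining if victim not in e}
    return set(weights) - removed
```

Because the removed set covers every edge, the survivors form an independent set: exactly one instance of each chain of overlapping RODs, with high-weight (large, central) instances preferred.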
The probability that a pixel belongs to the static part of the scene may be
computed statistically by representing the static scene with a distribution and then
finding the probability that the candidate pixel belongs to this distribution. If such
a probability is very small, it will indicate that the pixel does not belong to the
static scene and, instead, represents a moving object. Such a pixel is given a
small weight. On the other hand, a pixel with a large probability is given a large
weight. As the probability is correlated with a suitable weight for each pixel, the
weight of a pixel may be set equal to this probability.
If it is assumed that most of the pixels in any local region in the image set
represent the static part of the scene, then the distribution of the static part may be
approximated by the pixels in the immediate neighborhood of the pixel for which
the weight is being computed. Using weighted kernel density estimation [881,
980], this probability is given by
P(x|F) = \frac{\sum_{y \in F} w(y) \, K_H(x - y)}{\sum_{y \in F} w(y)}, (13.41)
While this procedure reduces ghosting artifacts in the image, they are still
apparent in the final HDR image, and further processing is typically required.
The algorithm can improve weights iteratively by using the updated weights from
(13.42) as the initial weights of pixels y belonging to the distribution in (13.41).
The updated probability can be used once more to compute the weights for each
pixel, and the process can be repeated in this manner until the weights converge.
The final weights will show little if any ghosting, as illustrated by Figure 13.13.
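One pass of this weighting scheme can be sketched as follows; the isotropic Gaussian kernel with scalar bandwidth h is an illustrative stand-in for the kernel K_H of Equation (13.41):

```python
import numpy as np

def static_probability(x, neighborhood, weights, h=1.0):
    """P(x|F) of Equation (13.41): weighted kernel density estimate of the
    probability that pixel value x belongs to the static scene F."""
    x = np.asarray(x, dtype=float)
    kern = np.array([np.exp(-0.5 * np.sum((x - np.asarray(y, dtype=float)) ** 2)
                            / h ** 2) for y in neighborhood])
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * kern) / np.sum(w))
```

Iterating, the returned probabilities become the weights w(y) of the next pass, as described above, so pixels belonging to moving objects are progressively suppressed.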
Figure 13.13. Ghost removal using (in reading order): 1, 4, 7, and 10 iterations. (Khan et al., “Ghost Removal in High Dynamic Range Imaging,” IEEE International Conference on Image Processing, pp. 2005–2008, © 2006 IEEE [586].)
Figure 13.14. The image on the left was generated using misaligned images. The image
on the right was created with the same set of images after applying Ward’s image alignment
technique [1214]. Insets of a representative area are shown in the middle. (Image courtesy
of Greg Ward.)
age is the median value from a low resolution histogram over the grayscale image
pixels. This conversion of images to binary values is akin to normalizing the im-
ages and making them comparable with each other. The difference between two
consecutive images can be computed by using an exclusive-or (XOR) operator to
evaluate each pixel and then summing the responses over all pixels.
If there is a misalignment, this difference will be large. An upper limit has to be set on the magnitude of the translation. This process can be sped up significantly by generating image pyramids for
each image. Image alignment may now be performed at various stages, starting
with the smallest images. The final translation obtained in one level can be used
as the starting translation in the next level. Figure 13.14 shows a result obtained
by this technique.
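The core of Ward's method — median-threshold bitmaps compared by XOR over a small search window — can be sketched at a single pyramid level as follows (the pyramid recursion and the exclusion of pixels near the median are omitted from this sketch):

```python
import numpy as np

def mtb(img):
    """Median threshold bitmap: binarize at the image median, which makes
    differently exposed images of the same scene directly comparable."""
    return img > np.median(img)

def best_shift(ref, img, max_shift=2):
    """Integer translation of img that minimizes the XOR difference
    against ref within the search window."""
    bm_ref, bm_img = mtb(ref), mtb(img)
    best, best_err = (0, 0), None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(bm_img, (dy, dx), axis=(0, 1))
            err = int(np.count_nonzero(bm_ref ^ shifted))
            if best_err is None or err < best_err:
                best, best_err = (dy, dx), err
    return best
```

Because the median scales along with any monotonic exposure change, the bitmaps of two differently exposed captures of the same scene agree, and the correct shift drives the XOR count toward zero.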
tedious and time consuming. Furthermore, the scene needs to remain static while
the images are being captured. As this is often not the case, movement in the
scene between captures results in the appearance of artifacts in the reconstructed
high dynamic range image. These artifacts may be removed in many cases with
techniques discussed in Section 13.4.
To avoid such shortcomings altogether, cameras have been constructed that
capture multiple exposures of the scene at the same time (with a single click of
the button). As there is no time lapse between the capture of images, movement
in the scene at the time of capture is less likely to be a problem. The captured
exposures are then used off-line to generate the final HDR image.
Such devices may capture multiple exposures of the scene in the form of mul-
tiple images, or a single image, where the exposure of pixels varies across the
image. In the latter case, the image may be used to construct multiple images
with different, uniform exposures. Multiple images obtained with a single cap-
ture are then used to generate the HDR image as discussed in the previous section.
Another important advantage of these cameras over standard cameras is that
they are amenable to recording high dynamic range video. Techniques that make
HDR video from multiple exposures need to address the issue of image registra-
tion and motion estimation for scenes where there is any motion. Neither of these
problems is completely solved, and existing methods often generate errors. Devices that cap-
ture multiple images with a single capture tend to suffer less from these problems
and can, therefore, be used without modification in a larger set of environments.
We now briefly describe such cameras.
Figure 13.15. Multiple exposures may be captured by placing this filter in front of the
sensing array (after [817]).
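For a filter like the one in Figure 13.15, reconstructing uniformly exposed images amounts to de-interleaving the mosaic. The 2×2 layout assumed below (e0 top-left, e3 top-right, e1 bottom-left, e2 bottom-right) is read off the figure and is an assumption of this sketch:

```python
import numpy as np

def split_sve(mosaic):
    """Split a spatially-varying-exposure capture into four uniformly
    exposed quarter-resolution images, one per filter transmittance."""
    return {
        "e0": mosaic[0::2, 0::2],  # top-left of each 2x2 cell
        "e3": mosaic[0::2, 1::2],  # top-right
        "e1": mosaic[1::2, 0::2],  # bottom-left
        "e2": mosaic[1::2, 1::2],  # bottom-right
    }
```

The four sub-images can then be merged exactly as the multiple exposures of the previous section, since each corresponds to a different effective exposure of the same instant.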
Figure 13.16. The concept of adaptively varying pixels’ exposures. At first, the attenuator uniformly transmits the light (left). Once the sensors have accumulated some irradiance, their response is used by the controller to adjust the optical attenuator (right) (after [816]).
Grass Valley. One of these devices is the Viper FilmStream camera, which is pro-
duced by Grass Valley (a division of Thomson). This film camera is capable
of capturing about three orders of magnitude, which is well matched to the
dynamic range reproducible in movie theaters, but is otherwise known as
medium dynamic range.
Point Grey Research. Point Grey Research (Vancouver, Canada) offer a video
camera that captures six HDR images in a single shot. Six sensors cap-
ture HDR information from different parts of the scene, and the combined
information can be used to form environment maps.
Panavision. Panavision builds the Genesis, a film camera that outputs 10 bits
of log-encoded full resolution film, qualifying the Genesis as a medium
dynamic range camera. It serves the same market as the Viper. The sensor is a 12.4-megapixel, wide-gamut CCD device that does not employ a Bayer pattern.
Pixim. Pixim (Mountain View, California) offer two CMOS sensors that can each
capture a dynamic range of roughly four orders of magnitude.
Figure 13.17. A full 360◦ scan taken with the SpheroCam HDR camera. The result was
tone mapped with the photographic operator (see Section 17.3). Sunrise in Tübingen, Ger-
many, taken from the rooftop of the Max Planck Institute (MPI), November 2006. (Pho-
tograph taken by Narantuja Bujantogtoch as part of the MPI for Biological Cybernetics
Database of Natural Illuminations.)
Figure 13.18. An interactive brush showing a histogram of the pixels under the circle.
This user interface allows the user to accurately select absolute luminance values, even if
they are outside the display range. The area under the brush is displayed directly, after ap-
propriate exposure control is applied, whereas the remainder of the image is tone-mapped
prior to display. (Background image courtesy of Paul Debevec. This image was pub-
lished in Mark Colbert, Erik Reinhard, and Charles E. Hughes, “Painting in High Dynamic Range,” Journal of Visual Communication and Image Representation, © Elsevier 2007.)
On the other hand, new challenges emerge. For instance, how can one ma-
nipulate high dynamic range images when they cannot be displayed directly on
conventional display devices? This is a new issue that arises in drawing programs
that offer support for high dynamic range images. With a luminance range that
exceeds the display range, drawing directly onto an HDR canvas can be likened
to key-hole surgery: only part of the display range can be visualized at any given
time. Typically, one may solve such an issue by drawing on tone-mapped repre-
sentations of images (see Chapter 17 on dynamic range reduction). Unfortunately,
it is not clear that this approach is either intuitive or accurate.
In such cases, an alternative visualization may help, for instance by attaching
a histogram visualization for the pixels under a region of interest (indicated by the
size and position of a brush), allowing accurate selection of absolute luminance
values. An example is shown in Figure 13.18. A qualitative approach may exploit
visual phenomena familiar to artists, such as glare to allow the user to intuitively
map a visual representation of glare back to the luminance value with which the
brush paints [202].
Figure 13.19. The pixels representing the bronze statue (left) were processed to pro-
duce the impression of transparency (right); Loet Vanderveen bronze, Native Visions Art
Gallery, Winter Park, FL. (© 2006 ACM, Inc. Included here by permission [587].)
the pixels representing the background. The object shape as well as the environment map are then plugged into the rendering equation (2.207). The user is then free
to choose an arbitrary BRDF fr , thereby changing the apparent material of the
object.
In many cases, it is beneficial to simplify this rendering procedure by filter-
ing the environment directly in several different ways before texture mapping the
result onto the object. This has the added benefit of introducing desirable high
frequency content which helps mask inaccuracies in the recovered shape, as well
as inaccuracies in the environment. It allows objects to be converted to either
transparent or translucent objects. The result is that plausible material transfor-
mations can be achieved (shown in Figure 13.19), without requiring 3D scanning
equipment or multi-view techniques to derive more physically accurate 3D geom-
etry. However, it is noted that without high dynamic range images as input, this
approach would fail.
Chapter 14
Display Technologies
Figure 14.1. Schematic diagram of a CRT, showing the heater, the control grid, the accelerating anode, and the horizontal deflection coils.
Figure 14.2. Shadow mask and aperture grill used in color CRTs ensure that each of the
three types of phosphor receives light from the correct beam.
14.2.1 Back-Lights
The light sources used in LCD displays are most typically cold-cathode fluores-
cent lamps (CCFLs), although LED-based back-lights are just entering the mar-
ket. Both types are discussed here.
A CCFL consists of a glass tube with electrodes at both ends. The tube con-
tains an inert gas and a small amount of mercury, which emits ultraviolet radiation
when stimulated by an applied field. The inside of the tube is coated with a blend
of phosphors which luminesce when excited by the ultraviolet radiation, such that
the spectrum produced by the light source is seen as white, although the spectrum
may consist of bands that are spectrally distinct (see Figure 14.3) [170]. A dif-
fuser is placed between the light source and the LCD screen to ensure uniform
illumination.
1 Note that in our definition, a pixel is a single unit which emits light according to a single spectral
power distribution. Another definition is possible, whereby a pixel consists of three units, peaked at
long, medium, and short wavelengths.
[Plot: spectral power Le (W/m²) against wavelength (400–700 nm), showing the distinct R, G, and B emission bands of the phosphor blend.]
Figure 14.3. The spectral power distribution for typical phosphors used in CCFL back-
lighting (adapted from a plot provided by Louis Silverstein, VCD Sciences, Inc., Scottsdale
AZ).
Figure 14.4. Example diagrams of edge-lit (left) and direct-view back-lights (right).
Figure 14.5. The different phases that liquid crystals can assume depend on temperature.
Between temperatures Tm and Tc, the material behaves as a liquid, albeit with ordered
structure, and has a milky appearance.
Figure 14.6. In the cholesteric phase, the liquid crystal molecules are aligned parallel to
the front and back substrates (i.e., the tilt angle is 0), and form a helical structure.
the smectic phases are oriented alike and additionally form layers. The smectic
C phase is characterized by molecules showing small random deviations from a
preferred direction, which is angled with respect to the orientation of the layers.
In the smectic A phase, this preferred direction becomes perpendicular to the
surfaces of the layers. In the nematic phase, the molecules are oriented alike, but do not form
layers. All phases are anisotropic, except for the liquid phase, which due to the
random orientation of the molecules is isotropic.
It is possible to add chiral compounds, which change the nematic phase to
the cholesteric phase. In this phase, the directors of calamitic (i.e., rod-shaped)
molecules in subsequent layers form a helical structure, as shown in Figure 14.6.
The director is the vector describing the long axis of a liquid crystal molecule.
Additional phases may be distinguished, but these are less important for the man-
ufacture of liquid crystal displays.
The director of each molecule makes an angle θ with the average direction.
The wider the spread of angles around the average, the more chaotic the ordering
of the molecules, and the more isotropic the appearance of the material. The order
parameter S of a phase is a measure of the optical properties that may be expected,
and is given by [706]
S = (1/2) ⟨ 3 cos²(θ) − 1 ⟩, (14.1)
where θ is the angular deviation of individual liquid crystal molecules from the
average director, and ⟨·⟩ indicates an average over a large number of molecules.
In a fully ordered state, the molecules all perfectly align, yielding S = 1. In a fully
unordered state, we obtain S = 0. In a typical nematic phase, the order parameter
S is between 0.4 and 0.7 [706]. The ordering of calamitic molecules in smectic
and nematic phases gives rise to birefringence, as discussed in Section 2.6. This
optical property is exploited in liquid crystal displays.
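The order parameter in (14.1) is easy to check numerically; the following is a minimal sketch (the angle samples are illustrative, not values from the text):

```python
import math
import random

def order_parameter(thetas):
    # S = <(3 cos^2(theta) - 1) / 2>, averaged over the angular deviations
    # theta of the individual molecules from the average director (14.1).
    return sum((3 * math.cos(t)**2 - 1) / 2 for t in thetas) / len(thetas)

# Perfect alignment (all deviations zero) gives the fully ordered S = 1:
print(order_parameter([0.0] * 1000))  # 1.0

# Isotropic directions (uniform over the sphere) give S near 0:
random.seed(1)
isotropic = [math.acos(random.uniform(-1.0, 1.0)) for _ in range(100000)]
print(abs(order_parameter(isotropic)) < 0.01)  # True
```

For deviations drawn from a narrow cone around the director, the same function returns the intermediate values (0.4 to 0.7) quoted for the nematic phase.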
Figure 14.7. The orientation of liquid crystal molecules in the presence of a rubbed
surface.
14.2.3 Rubbing
The orientation of liquid crystal molecules is non-random in the mesophases. In
the construction of a liquid crystal display, it is important to force the alignment
of the molecules with respect to each other, but also with respect to a substrate.
Liquid crystals are normally sandwiched between glass plates. Before any electric
field is applied, the molecules nearest to the glass plates can be oriented in a
preferred direction by creating microscopic grooves. This is achieved by coating
the glass plates with a layer of polyimide material, which is then rubbed with
a fine cloth with short fibers [1112]. The coated and rubbed polyimide layer is
called the alignment layer.
This rubbing procedure creates microscopic grooves to which liquid crystal
molecules will align. The alignment is in the plane of the glass substrate, but the
rubbing also causes the molecules to tilt slightly away from the substrate, by what
is known as the pre-tilt angle α0 . The orientation of liquid crystal molecules in
the rest state is shown in Figure 14.7.
The next step towards understanding the operation of a twisted nematic liquid
crystal display is to consider how polarized light propagates through a twisted
nematic liquid crystal layer. We assume that the light is incident at a right angle,
i.e., along the z-axis in Figure 14.9. The birefringence of the aligned molecules
polarizes light into two orientations. By placing linear polarizing filters before
the first and after the second alignment layer, light entering and exiting the liquid
crystal layer can be linearly polarized.
The first polarizing filter is known as the polarizer, whereas the second polar-
izing filter is known as the analyzer. The polarizer is required, as without it, the
liquid crystal cell by itself would not show the desired optical effect. The amount
of light transmitted through the analyzer depends on the angle of polarization pro-
duced by the liquid crystal layer with respect to this filter (as given by Malus’ law
in (2.86)).
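Malus' law, as referenced from (2.86), gives the transmitted fraction as the squared cosine of the angle between the light's polarization and the analyzer; a minimal sketch:

```python
import math

def malus_transmitted_fraction(angle_deg):
    # Malus' law: I/I0 = cos^2(angle) for linearly polarized light
    # passing through an ideal linear polarizer at the given angle.
    return math.cos(math.radians(angle_deg)) ** 2

print(malus_transmitted_fraction(0.0))              # 1.0
print(round(malus_transmitted_fraction(45.0), 3))   # 0.5
print(round(malus_transmitted_fraction(90.0), 12))  # 0.0
```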
Figure 14.9. The orientation of liquid crystal molecules in the presence of a pair of
orthogonally oriented rubbed surfaces.
Figure 14.10. The stack of LCs takes a twisted shape when sandwiched between two glass
plates with orthogonal surface etching (top). They rotate the polarization of light passing
through them. Under an electric field, liquid crystal molecules can be realigned, and light
transmission can be blocked as a result of the light having polarization orthogonal to the
second polarizer (bottom). This example shows the normally white mode.
When propagating through the liquid crystal layer, which is assumed to have
a thickness of d (typically in the range of 3.5 to 4.5 μm), the birefringence of
the material changes the light's polarization from linear to circular to linear again,
provided that the layer thickness is chosen to be
d = √3 λ / (2 Δn), (14.2)
where λ is a typical wavelength, normally chosen to be around 505 nm, and
Δn is the difference of the indices of refraction parallel and perpendicular to the
director, as discussed in Section 2.6 and introduced in Equation (2.141).
If the direction of the polarizing filters is the same as their corresponding
alignment layers, then light can pass through. This is called the normally white
mode, an example of which is shown in Figure 14.10. If the polarizer and align-
ment layers are crossed, then light cannot pass through, and this is called the
normally black mode.
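Equation (14.2) can be evaluated directly; in the sketch below, the value Δn = 0.1 is an assumed, typical figure, not one given in the text:

```python
import math

def tn_cell_thickness_um(wavelength_nm, delta_n):
    # First-minimum condition d = sqrt(3) * lambda / (2 * delta_n) of (14.2),
    # converted from nanometers to micrometers.
    return math.sqrt(3) * wavelength_nm / (2.0 * delta_n) / 1000.0

# For lambda = 505 nm and an assumed delta_n = 0.1, the thickness lands in
# the 3.5-4.5 um range quoted above:
print(round(tn_cell_thickness_um(505.0, 0.1), 2))  # 4.37
```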
[Figure 14.11: transmittance T (0 to 0.5) plotted against the reduced driving voltage V/Vth for the normally black mode.]
ηd = F d / (A v). (14.3)
The kinematic viscosity η , measured in mm²/s, equals the dynamic viscosity ηd
divided by the density δ of the material:
η = ηd / δ. (14.4)
However, the density of liquid crystal materials hovers around 1 Ns²/mm⁴, so
that the values of both types of viscosity in liquid crystal displays are close to
identical. The viscosity of liquid crystal materials is typically in the 20 to 45
mm²/s range [706]. The effect of the viscous properties of liquid crystal materials
is that realigning the molecules of a pixel is not instant, but takes
a little while, typically at least a few milliseconds and perhaps tens of milliseconds.
In the presence of electric and magnetic fields, the equilibrium that is reached
depends on three types of deformation to the liquid crystal material. These are
splay, twist, and bend, and their torques are characterized by the elastic constants
K11 , K22 , and K33 , respectively (see Figure 14.12). The values associated with
these torques are on the order of 10⁻³ N [706].
Figure 14.12. Splay, twist, and bend in liquid crystal materials (after [706]).
Figure 14.13. Coordinate system at z = 0 (left) and z = d, where d is the thickness of the
cell.
The assumption made here is that the twist angle changes linearly along the z-
axis. This is a valid assumption when no voltage is applied to the cell. When
a voltage is applied, then the total twist angle (as well as the tilt angle) remains
the same. However, the twist angle then varies non-linearly with z. The model
discussed below can be extended to deal with this situation [1270].
Under the assumption of a linearly progressing twist angle, the pitch p of the
helix is constant and is given by
p = 2π / β0 . (14.9)
Although we omit the derivation, it is now possible to compute the direction of the
linear polarization with respect to the analyzer after light has passed through the
cell, which has thickness d. Assume that the analyzer is oriented in direction P,
as shown in Figure 14.13. The Jones vector, given in the s′-t′ coordinate system,
which is the coordinate system associated with the analyzer, is [706, 1270]
(Js , Jt )ᵀ = R(−Ψ) exp(−2 π i n̄ d / λ) M R(−β0) (E0 , 0)ᵀ, (14.10)
Equation (14.10) is a general expression that is valid for several types of ne-
matic display. In particular, if the tilt angle α is chosen to be 0 and the twist
angle is β = π /2, then we have a classic twisted nematic display. For α = 0 and
β > π /2, the display is called a super-twist nematic display. Finally, α = 0 and
β < π /2 yields a so-called mixed-mode nematic display.
p = 4 d. (14.13)
a = 2 d Δn / λ, (14.14a)
γ = (π/2) √(1 + (2 d Δn / λ)²). (14.14b)
Further, the s′ and t′ axes are rotated by 90° with respect to the x- and y-axes. By
substitution of (14.14) into (14.11) and subsequently into (14.10), we obtain [706]
Js = e [ cos γ + (i a / √(1 + a²)) sin γ ] E0 ,
Jt = −e [ sin γ / √(1 + a²) ] E0 . (14.15)
In this equation, we have
e = exp(−2 π i n̄ d / λ), (14.16)
where Δn is the difference between the ordinary and extraordinary indices of re-
fraction of the birefringent material and is taken from (2.141). The value n̄ is the
average of these indices of refraction:
n̄ = (n∥ + n⊥) / 2. (14.17)
Figure 14.14. The reduced irradiance Ee /⟨E0²⟩_T plotted against a = (2 d Δn)/λ for the
normally black mode.
where the time average of E0 is taken. As can be seen from this equation, as
well as from Figure 14.14, which shows the reduced irradiance (effectively the
transmittance), the amount of light transmitted reaches a minimum when the square
root becomes a multiple of 2. The first minimum is obtained for
√(1 + (2 d Δn / λ)²) = 2, (14.19)
so that we find the relation
d = √3 λ / (2 Δn), (14.20)
a result which was already anticipated in (14.2). Thus, a given wavelength, to-
gether with the material-dependent parameter Δn, determines the optimal cell
thickness to block all light. As every wavelength in the visible range would require
Figure 14.15. The reduced irradiance Ee /⟨E0²⟩_T plotted against a = (2 d Δn)/λ for the
normally white mode.
a somewhat different thickness of the liquid crystal cell, the normally black mode
can only be fully realized for one wavelength. As a result, this mode leaks light
which is seen as a bluish-yellowish color.
For the normally white mode, the analyzer is aligned with the s′-axis, so that
the irradiance becomes
Ee = (1/2) |Js |² (14.21a)
= (1/2) [ cos² γ + (a² / (1 + a²)) sin² γ ] ⟨E0²⟩_T . (14.21b)
Here, maxima occur for the same values of d for which minima occur in the nor-
mally black mode. As a result, the thinnest cells for which maximal transmittance
happens have a thickness of d = √3 λ/(2 Δn). A plot of the reduced irradiance (i.e.,
Ee /⟨E0²⟩_T ) in the normally white mode is shown in Figure 14.15.
In both the normally black and the normally white mode, the irradiance trans-
mitted through the cell depends on wavelength, as can be determined from (14.21).
This results in color shifts that depend on the selected gray level. However, when
the maximum voltage is applied in the normally white mode, the cell becomes
Figure 14.16. Shown here is a Toshiba 32-inch LCD television and a 12-inch Apple iBook
with LCD screen, photographed at different angles. The iBook is several years older than
the television and has a pronounced angular dependence. The television is virtually free of
angular artifacts.
black independent of wavelength. As the black state of the normally white mode
does not have a colored tinge, it is the preferred mode [706].
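The normally black and normally white transmittances derived above can be sketched numerically. This is a small illustration written from (14.15) and (14.21), using a and γ as in (14.14); it is not code from the book:

```python
import math

def gamma(a):
    # gamma = (pi/2) * sqrt(1 + a^2), with a = 2 d delta_n / lambda (14.14).
    return (math.pi / 2.0) * math.sqrt(1.0 + a * a)

def normally_black(a):
    # Reduced irradiance Ee / <E0^2>_T from |Jt|^2 in (14.15).
    return math.sin(gamma(a))**2 / (2.0 * (1.0 + a * a))

def normally_white(a):
    # Reduced irradiance Ee / <E0^2>_T from (14.21b).
    g = gamma(a)
    return 0.5 * (math.cos(g)**2 + (a * a / (1.0 + a * a)) * math.sin(g)**2)

# The two modes are complementary (their sum is 0.5 for every a), and the
# first dark state of the normally black mode occurs at a = sqrt(3),
# i.e., at d = sqrt(3) * lambda / (2 * delta_n):
for a in (0.0, 1.0, math.sqrt(3.0), 3.0):
    assert abs(normally_black(a) + normally_white(a) - 0.5) < 1e-12
print(normally_black(math.sqrt(3.0)) < 1e-12)  # True
```

Sweeping a over a range of wavelengths (at fixed d and Δn) reproduces the wavelength-dependent leakage described above.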
The equations shown in this section are valid for light incident at right an-
gles. For obliquely incident light, the Jones matrix method can be refined and
extended [168, 398, 684, 706, 1274, 1284], showing that, depending on the type
of cell, the transmittance shows a more or less pronounced fall-off with viewing
angle, as shown in Figure 14.16.
Moreover, these equations assume that the cell is in the off-state, i.e., no volt-
age is applied. Under this assumption, the twist angle varies linearly along the
z-axis, and the tilt angle can be assumed to be constant (see Figure 14.17). When
a voltage is applied, the total twist angle remains the same, due to the anchoring
effect imposed by the rubbed substrates. However, the distribution of twist and
tilt angles along the z-axis will vary non-linearly, as shown in Figure 14.18. A
computational model of these phenomena is discussed by Yamauchi [1270].
Figure 14.17. Twist and tilt angles in the off-state of a twisted nematic cell (after [1270]).
Figure 14.18. Twist and tilt angles in the on-state of a twisted nematic cell (after [1270]).
Figure 14.20. In-plane switching cells are constructed with both electrodes on one side of
the cell, so that the tilt angle remains constant, irrespective of the applied voltage.
advantage of this configuration is that the tilt angle is constant and close to zero,
independent of the voltage applied to the electrodes. As a result, the transmittance
T varies little with viewing angle. With
γ = 2 π Δn d / λ, (14.23)
the maximum transmittance of 0.5 is reached when β = π /4 and γ = π /2,
whereas for β = 0 the transmittance is 0.
While in-plane switching solves the attenuation of transmittance with viewing
angle, there are two problems associated with this technique. First, the torque
required to rotate the liquid crystal molecules is larger than with other techniques,
and this leads to a longer response time. Further, the placement of the electrodes
reduces the aperture of the system, resulting in less light being transmitted.
[Figure 14.21: matrix addressing circuit. Rows n and n+1 are select lines driven by the gate voltage Vg; columns m and m+1 carry the video signal Vvideo = Vd. Each pixel comprises a TFT, the liquid crystal capacitance CLC, and a storage capacitor CS.]
Figure 14.22. A comparison of LCD and CRT addressing times (after [84]). Note that
the emission time for a CRT dot is Tp ≈ 50 μs, whereas an LCD pixel transmits light for
Tf ≈ 16 ms.
using thin film transistor (TFT) technology. Pixels are addressed one row at a time
by raising the control voltage Vg for the row (n in Figure 14.21). The transistors
for the pixels in this row are now conductive.
The image signal voltage Vd applied to each of the columns then charges the
liquid crystal cells CLC for the pixels in row n only. The auxiliary capacitor CS
is also charged, which helps keep the pixel in its state during the time when all
other rows are addressed. As a consequence of charge storage in the capacitance
of each LCD pixel, the pixel modulates light until it is next addressed (unlike a
CRT, which emits light at each phosphor dot only for a brief instant).
The time to address a full frame, i.e., N rows, is Tf . As a result, the time avail-
able for addressing one row of pixels is Tr = Tf /N. A TFT-addressed twisted ne-
matic display reaches the desired voltage over the liquid crystal cell, and therefore
the desired transmittance, more slowly than in the per-pixel addressing of a CRT
display. The relative timing of the two technologies is shown in Figure 14.22.
In Figure 14.11 the transmittance of the liquid crystal cell against driving
voltage is shown. The relationship between the two is more or less linear over
a voltage swing of around 2 V. To construct a display that can handle 256 gray
levels, the voltage difference between two consecutive gray levels is around 7.8
mV [706]. Thus, for a stable, high quality display, the accuracy of the driving
voltages must be high.
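The timing and voltage-resolution figures above follow from simple divisions; a quick check (the 16 ms frame time is taken from Figure 14.22, while the 1080-row count is an illustrative assumption):

```python
def row_address_time_us(frame_time_ms, n_rows):
    # Tr = Tf / N: time available for addressing one row of pixels.
    return frame_time_ms * 1000.0 / n_rows

def gray_step_mV(voltage_swing_V=2.0, levels=256):
    # Voltage difference between consecutive gray levels over the roughly
    # linear part of the transmittance-voltage curve.
    return voltage_swing_V / levels * 1000.0

print(gray_step_mV())                             # 7.8125 (the ~7.8 mV above)
print(round(row_address_time_us(16.0, 1080), 1))  # 14.8 us per row
```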
Figure 14.23. Cross-section of a color filter layer for LCD displays (after [706]).
Figure 14.24. Spatial layout schemes for color filters used in LCD display devices
(after [706]).
The spatial layout of pixels depends on the application. For moving images,
a triangular or diagonal arrangement is preferred, whereas for still images (i.e.,
computer displays), a striped configuration is better. These layouts are shown in
Figure 14.24. A photograph of an LCD display is presented in Figure 14.25 and
shows a striped pattern.
Figure 14.26. Transmission of a set of representative LCD color filters (adapted from a
plot provided by Louis Silverstein, VCD Sciences, Inc., Scottsdale, AZ).
[Chromaticity diagram (x, y) comparing the gamuts obtained with dyeing, printing, and pigment-dispersion color filter technologies.]
Figure 14.27. Color gamuts for different color filter technologies (after [855]).
λ = π d Δn / √3. (14.25)
By controlling the tilt angle, and thereby Δn, the wavelength of maximum trans-
mittance can be controlled. This phenomenon is called electronically controlled
birefringence (ECB) and enables the formation of color by shifting the peak wave-
length of light transmitted through the cell.
In dual cell gap types, the gap between the front and back substrates is halved
for the pixel fragment serving reflective mode to account for the fact that light
passes through this section twice. The areas of the pixel devoted to the transmis-
sive and reflective modes are given as a ratio, which can be varied depending on
the application. It typically ranges from 1:4 to 4:1 [1294]. The response time in
reflective mode will be much faster than for transmissive mode for a dual cell gap
type transflective display, as the time to set a pixel depends on the square of the
thickness of the liquid crystal layer (i.e., the cell gap). Typically, the response
time is four times slower for the transmissive mode of operation.
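The factor of four quoted above follows from the quadratic dependence of switching time on the cell gap; a one-line sketch (gap values are illustrative):

```python
def relative_response_time(cell_gap_um, reference_gap_um):
    # Switching time scales with the square of the liquid crystal layer
    # thickness, so halving the gap gives a four-times-faster response.
    return (cell_gap_um / reference_gap_um) ** 2

# Reflective part of a dual cell gap pixel, with half the gap of the
# transmissive part:
print(relative_response_time(2.0, 4.0))  # 0.25, i.e., four times faster
```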
Single cell gap transflective liquid crystal displays have liquid crystal layers
with the same thickness for both transmissive and reflective layers. This makes
manufacturing more straightforward and creates a uniform response time for both
modes of operation. However, the light efficiency is now lower.
ible region of the spectrum. If three phosphors with peaks around the red, blue,
and green parts of the spectrum are used, their combination yields a sufficiently
large color gamut.
Figure 14.28 shows the structure of a pixel in a plasma display panel. Here,
the phosphor-coated cavities contain the gas mixture. The barrier ribs minimize
light leakage between pixels; differently shaped barriers have been developed for
improved efficiency [999, 1276]. The sustain electrodes are made from trans-
parent indium tin oxide (ITO) to enable as much light to escape the cavity as
possible [276]. The phosphors are at the back side of the display. The MgO layer
protects the sustain electrodes and the dielectric layer into which these electrodes
are embedded [1160]. Due to the path taken by the light, this configuration is
known as the reflected phosphor geometry [1159].
The screen of a plasma display is composed of a large array of pixels laid out
in a matrix fashion. Each pixel is sandwiched between row and column electrodes
and can be turned on by applying a voltage to the corresponding row and column.
The display can be driven by using alternating or direct current [1120]. Pixel pitch
is typically around 800–1000 μm. Plasma displays are predominantly built in
larger frames (up to a 102-inch diagonal with 1920 × 1080 pixels). PDPs typically
have peak luminance values of 300 cd/m² with contrast ratios of 300:1, although
displays having a maximum luminance of 1000 cd/m² and a contrast ratio of
2000:1 have been demonstrated [276].
Plasma displays are typically power-limited: not all cells can produce maxi-
mum luminance simultaneously. Typical video material has an average luminance
of about 20% of peak; this average luminance can be distributed arbitrarily across the panel.
projectors.
[Figure 14.30 diagram: a forward-biased p-n junction. Electrons (−) in the conduction band and holes (+) in the valence band meet near the junction; radiative recombination across the band gap Eg emits light of wavelength λ = h c / Eg, where h is Planck's constant and c is the speed of light, while non-radiative recombination produces heat.]
Figure 14.30. An electron and a hole may recombine radiatively or non-radiatively. Ra-
diative recombination results in light whose wavelength depends on the band-gap energy
of the semiconductor. Non-radiative recombination produces heat.
4 In this diagram, the electrons flow from the negative post of the power supply to the positive post,
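The relation λ = h c / Eg from Figure 14.30 is easily evaluated; in this sketch, the band-gap values are illustrative assumptions:

```python
# Emission wavelength of radiative recombination: lambda = h * c / Eg.
H = 6.626e-34   # Planck's constant (J s)
C = 2.998e8     # speed of light (m/s)
EV = 1.602e-19  # joules per electron-volt

def emission_wavelength_nm(band_gap_eV):
    # Peak emission wavelength in nanometers for a given band gap in eV.
    return H * C / (band_gap_eV * EV) * 1e9

# A band gap of about 2.0 eV corresponds to red light, about 2.76 eV to blue:
print(round(emission_wavelength_nm(2.0)))   # 620
print(round(emission_wavelength_nm(2.76)))  # 449
```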
LEDs are usually used in the construction of very large scale displays, such
as interactive advertisement boards. They are also used as back-lights for liquid
crystal displays. For instance, Apple’s iPod uses an LCD display with LED back-
light. Another use of LEDs to back-light a display is afforded by HDR displays,
as discussed in Section 14.16.
5 Organic materials are those primarily made of compounds of carbon, hydrogen, oxygen, and
nitrogen.
In an OLED, electrons and holes injected from the cathode and anode diffuse
toward the emissive layer under an electrical field, where they form electron-
hole pairs called excitons [1275]. An exciton may recombine radiatively, resulting
in emission of a photon whose wavelength depends on the band-gap energy of the
organic material in the emissive layer [1168]. To increase the emission efficiency,
the emissive layer may be doped with other organic materials [1040].
OLEDs can be made of small organic molecules or larger conjugated poly-
mers [143]. In the former type, organic layers are prepared by depositing each
layer on a transparent substrate by thermal evaporation. As polymers are too large
to evaporate and may disintegrate upon excessive heat, polymer-based OLEDs
(PLEDs) are usually created by spin-coating the liquid polymer onto indium tin
oxide-coated substrates, after which it is solidified by heating [328]. The fabrication
process of PLEDs is therefore simpler than that for small-molecule OLEDs, although
the manufacturing of the top electrodes for PLEDs is more complicated. In addition, the
production of PLEDs is more wasteful of materials [924].
OLED displays are usually fabricated on a glass substrate, but they can also
be produced on flexible plastic [400, 1103]. In addition, OLED displays can be
transparent when in the off-state, making them useful for augmented reality ap-
plications and heads-up displays [399]. In addition, transparency is necessary in
the stacked configuration, as shown at the bottom of Figure 14.32 [401]. In either case, the anodes will also
have to be transparent [740].
A full-color OLED display device can be realized using a matrix layout simi-
lar to that of an LCD display. Different from an LCD display, OLEDs themselves
emit light and, therefore, they do not require a back-light. An OLED display
can use passive or active matrix addressing. In a passive configuration, the in-
tensity emitted for each pixel needs to be very high and depends on the number
of scanlines in the display—more scanlines means that each flash will have to be
correspondingly brighter due to the fact that only one line is active at any given
time. If there are N scanlines, then the intensity of each pixel will be N times
higher than the average perceived intensity [1026]. This places an upper limit on
the size of the display that can be addressed in passive matrix addressing.
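The N-times-brighter requirement of passive matrix addressing is a direct consequence of only one row being lit at a time; a minimal sketch (the luminance and row count are illustrative):

```python
def required_flash_luminance(avg_luminance_cd_m2, n_scanlines):
    # With only one of N scanlines active at a time, each pixel must flash
    # N times brighter to reach the target time-averaged luminance [1026].
    return avg_luminance_cd_m2 * n_scanlines

# A 240-row passive matrix panel targeting an average of 100 cd/m^2 needs
# pixels that can flash at 24,000 cd/m^2:
print(required_flash_luminance(100.0, 240))  # 24000.0
```

The linear growth of the required flash luminance with N is what caps the practical size of passive matrix panels.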
Full-color can be produced by subdividing each pixel into three OLEDs emit-
ting red, green, and blue colors. Alternatively, an OLED emitting white light can
be filtered by three color filters, or an OLED emitting blue or ultraviolet light may
be used to illuminate three fluorescent materials which re-emit light with the de-
sired spectral composition [328]. One final method is to stack up OLEDs on top
of each other by separating them with transparent films. This allows for a higher
spatial resolution in the same display area and is, therefore, a favorable technique.
Different color reproduction schemes are illustrated in Figure 14.32.
Aging effects in OLED displays can be classified as either life-time aging
or differential aging. Currently, the organic molecules used in the fabrication of
OLEDs tend to degenerate over time, which means that OLED technology is not
yet suitable for use in displays that need to last for many years. Useful life is
normally measured as the length of time that elapses before the display becomes
half as bright as it was when new. The life span of an OLED micro-display also
depends on the luminance value at which it is operated. The product life cycle
for displays hovers around four years, and OLED displays currently fulfill the
requirement of lasting for this amount of time under normal conditions.
Differential aging results from the fact that the remaining life time of each
pixel depends on how it has been switched on. If pixels have had different lengths
of on time, then their brightness may be affected differently. The threshold of ac-
ceptable pixel differences across the display is typically set to 2–3% for graphics
applications and 5–8% for video applications [924].
OLEDs offer significant advantages that are likely to result in their widespread
use in the near future. They consume very little power, are luminously efficient
(on the order of 20 lm/W [894, 1029]), exhibit wide viewing angles, and have fast
response times (on the order of a few microseconds). OLEDs are also very thin
and lightweight, and can be deposited on flexible surfaces. This opens up the
eventual possibility for display devices that can be rolled-up or stuck onto curved
surfaces.
[Figure 14.33 diagram: gate electrodes on an insulator over a substrate carrying nanotube emitters, facing a phosphor-coated anode on a glass panel; spacers keep the two panels apart.]
[Figure 14.34 diagram: electrons tunnel across a nanometer-scale gap at the emitter, driven by the voltage Vf, and are accelerated through the vacuum by the voltage Vs toward the phosphors on the front glass substrate.]
Figure 14.34. The structure of an SED pixel. The tunneling effect and the attraction of
electrons toward the phosphors are depicted.
several nanometers. Some of the freed electrons are accelerated toward the phos-
phors as a result of the large potential difference (≈10 kV) applied at Vs . When
these electrons strike the phosphor molecules, photons are emitted due to cathodo-
luminescence. The efficacy of current prototypes reaches that of conventional
CRT displays and is on the order of 5 lm/W.
Since phosphors emit light over a wide angle, SED image quality does not
degrade at wide viewing angles. Since phosphor technology has been success-
fully used for accurate color reproduction in standard CRTs, the SED technology
inherits this desired property as well. Finally, owing to its slim and ergonomic de-
sign, the SED technology may become an interesting competitor in the flat panel
display market in the near future. Displays with a diagonal of 36 inches have been
announced [845]. However, there remain significant challenges to high-volume
manufacturing.
[Figure 14.36 diagram: thin-film mirrors beneath a transparent substrate form cavities of heights hb, hg, and hr, with hr > hg > hb.]
Figure 14.36. Three primary colors can be obtained by constructing cavities of different
heights.
To construct a full color IMOD pixel, three types of subpixels are used with
each type having a different gap size. To reproduce various shades of color, each
subpixel type is further divided into sub-subpixels, or subpixel elements. The de-
sired intensity for each primary color can then be obtained by spatial or temporal
dithering.
The chief advantages of IMOD displays over other flat panels include a sig-
nificantly reduced power consumption and the ability to reproduce bright colors
under direct sunlight. These factors may render IMOD technology an attractive
alternative particularly for hand-held applications. In dark environments, supple-
mental illumination is provided by a low power front light. At the time of writing,
IMOD displays have a reflectivity of around 50% and a contrast ratio of 8:1.
This compares favorably with a typical newspaper, which has approximately 60%
reflectivity and a 4:1 contrast ratio [527].
[Plot: emitted power against wavelength (400–750 nm) for bulb pressures of 120 bar and 290 bar.]
Figure 14.37. The emission spectra of an UHP lamp at different pressures (after [249]).
A UHP lamp relies on mercury discharge only: no rare earth gases are used.
This gives high brightness and high luminous efficiency [249]. The emission
spectrum of a UHP lamp depends on the pressure realized within the bulb. At higher
pressures, the spectrum becomes flatter, giving a better color-balanced output with
a higher color rendering index. The emission spectra of a UHP lamp at different
pressures are plotted in Figure 14.37.
ti = 2^i / (2^n − 1), 0 ≤ i < n, (14.26)
whereby each ti constitutes the duration of one of the n sub-frames. This approach
is feasible due to the very short switching times that are possible, combined with
the comparatively slow response of the human visual system which effectively
integrates the flashes of light over periods exceeding the frame time.
In the following sections, an overview of the most commonly used projector
technologies is given.
[Figure 14.38 diagram: white light from the source passes a hot mirror (removing IR) and is split by dichroic prisms and mirrors into red, green, and blue beams; each beam is modulated by a liquid crystal panel, and the beams are recombined by a dichroic beam combiner into a color image behind the projection lens.]
The red, green, and blue beams are then directed toward individual liquid
crystal panels, which modulate their intensity according to the pixel values of
the frame that is to be displayed. The modulated beams are then combined by a
prism and reflected toward a lens that focuses the image at a certain distance. The
operating principle of an LCD projector is shown in Figure 14.38.
Figure 14.39. A DMD pixel: a micro-mirror mounted on a torsion hinge above switches on the substrate.
In a digital micromirror device (DMD), each mirror is hinged and can be tilted by
±10° to change the direction of the reflected light (see Figure 14.39). When tilted
to +10°, the incident light is reflected toward a lens which projects the light onto
a screen. This is called the on-state. If tilted to −10°, the incident light is reflected
toward a light absorber (the off-state).
The intensity of the reflected light can be controlled by alternating between
the “on” and “off” states very rapidly, up to 50,000 alternations per second. For
each tiny mirror, the ratio of the number of on- and off-states across a frame time
determines the effective intensity of light produced for that pixel.
DLP projectors can produce color using one of three different approaches that
are characterized by the number of DMDs employed. In the first approach, using a
Figure 14.40. Compared with a three-color filter wheel, the extra white improves the
luminance output (after [1096]).
Figure 14.41. The optical components of a 1-chip DLP projector: arc lamp, lenses, integrator rod, color wheel, prism, DMD, and projection lens (adapted from [489]).
single DMD, a color filter wheel rotates rapidly in front of a broadband bright light
source. When the blue region of the wheel is in alignment with the light source, it
only allows blue light to reach the DMD. Synchronously, the micromirrors align
themselves according to the blue channel of the image to be displayed. These
steps are repeated rapidly for the red and green channels, resulting in a full color
output. This RGB sequential 1-DMD design is generally not able to reproduce
very high luminances, due to the absorption of energy by the red, green, and blue
color filters.
To achieve higher luminance, the color wheel is frequently fitted with an ad-
ditional white channel (see Figure 14.40), and thus this approach uses a total of
four channels (RGBW) [92]. Since one DMD is used in both RGB and RGBW de-
signs, these set-ups are also called 1-chip DLP projection systems (Figure 14.41).
Their advantages include portability and low cost.
The disadvantage of using a white channel is that the color gamut has an
unusual shape, in that the volume spanned by the red, green, and blue primaries,
is augmented with a narrow peak along the luminance axis. This means that
colors reproduced at high luminances will be desaturated. At low luminances,
saturated colors can be reproduced. The lack of accuracy of RGBW systems at
high luminance values makes this approach useful for business graphics, but less
so for color-critical applications.
A second alternative, which has a higher light efficiency and optimizes the
lifetime of the light bulb, uses a 2-chip design. Bulbs with a long lifetime may
be deficient in the red end of the spectrum. To obtain a high light efficiency, red
light is continuously directed towards one DMD, whereas the second DMD al-
Figure 14.42. The optical components of a 2-chip DLP projector: arc lamp, lenses, integrator rod, color wheel, prism, two DMDs, and projection lens (adapted from [489]).
ternates between reflecting green and blue light. This is achieved by designing
a color wheel with yellow (red + green) and magenta (red + blue) segments.
The dichroic prism spatially separates the red component from the alternating
green and blue components. A 2-chip projection system is outlined in
Figure 14.42.
In general, 1- and 2-chip DLP devices suffer from artifacts which stem from
the rotation of the color wheel and the corresponding sequential reproduction of
the three color channels. If this speed is too low, then the time for a boundary
between two neighboring colors of the filter to pass through the optical path is
relatively long. When a projection is viewed, the viewer’s eye-tracking (see Sec-
tion 4.2.1) may coincide with the movement of the chromatic boundary that is
passing across the image [252]. Depending on the relative speeds of the rotation
of the color wheel and the eye movement itself, this may be seen as a colored
fringe. The leading and trailing edges will exhibit color fringes. These color
separation artifacts are known as the rainbow effect.
The third alternative involves the use of three chips, with each chip controlling
a different color channel (Figure 14.43). Similar to an LCD projector, a broadband
light can be separated into three components using a prism, or, alternatively, red,
green, and blue LEDs can be used to illuminate each chip. As this approach
requires three DMDs, this alternative is usually more costly. However, 3-DMD
systems do not exhibit color fringing. They are used in DLP cinema projectors.
Figure 14.43. The optical components of a 3-chip DLP projector: integrator rod, prism, three DMDs, and projection lens (adapted from [489]).
Figure 14.44. The optical components of an LCoS projector: white light from the source is split by dichroic prisms and mirrors into red, green, and blue beams, each modulated by a reflective LCoS panel; the beams are recombined into a color image and focused by the projection lens.
Figure 14.45. The components of an LCoS pixel: liquid crystals between a mirror on a silicon substrate and a glass panel with an indium tin oxide anode. No polarizers are required in this configuration, as polarization has typically already occurred in the beamsplitter.
chromaticity plane drawn in Figure 14.46. However, the aim of multi-primary display devices is to
increase the coverage of reproducible colors in the chromaticity plane, rather than extend the dynamic
range of reproducible luminance values.
Figure 14.46. Several standard gamuts (EBU; sRGB / Rec. 709; Adobe RGB 1998) are shown in a CIE 1931 chromaticity diagram.
Figure 14.47. The concept of extending gamut with multiple primaries. Each vertex of
the red polygon represents the chromaticity coordinates of one of the six primaries of an
experimental six-primary display device. This approach alleviates the gamut limitations
imposed by a three-primary display. The sRGB gamut is shown for reference.
A different approach is to use display devices that use more than three pri-
maries. Such devices are called multi-primary display devices, and they have
been actively studied in recent years [131, 981, 1269]. The advantage of using
more than three primaries is that the color gamut ceases to be restricted to a trian-
gle, but may take arbitrary polygonal shapes. This affords the flexibility necessary
to better cover the horseshoe shape of the visible color gamut (see Figure 14.47).
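The increase in coverage can be quantified with the shoelace formula for the area of a polygon in the chromaticity plane. The sketch below uses the standard sRGB primaries, but the six-primary set is invented purely for illustration:

```python
def polygon_area(vertices):
    # Shoelace formula for the area of a simple polygon, given its
    # vertices in order (here: chromaticity coordinates of primaries).
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

# sRGB primaries (CIE 1931 xy chromaticities):
srgb = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]
# Hypothetical six-primary set, listed counter-clockwise:
six = [(0.68, 0.31), (0.45, 0.54), (0.21, 0.71),
       (0.12, 0.40), (0.15, 0.06), (0.45, 0.15)]
```

With these hypothetical six primaries, the hexagon covers roughly 1.8 times the area of the sRGB triangle; the actual gain depends, of course, on the primaries of the device.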
A multi-primary display device can be constructed by superposing the images
produced by two conventional LCD projectors7 . Each projector is fitted with three
additional interference filters placed in the optical paths of the red, green, and blue
Figure 14.48. Low-pass filters (shown with dashed lines) are used to select the lower
bandwidth regions of the original primaries of the first LCD projector (top). High-pass
filters (bottom) are used to select the higher bandwidth regions of the primaries of the
second projector (after [21]).
Figure 14.49. The spectral power distributions of six primaries of an experimental multi-
primary display device (after [21]).
beams. The first projector is fitted with three low-pass interference filters, while
the second projector is fitted with three high-pass filters [21]. This set-up allows
the selection and transmittance of light of only a narrow bandwidth from the spec-
trum of each original projector primary, as shown in Figure 14.48. These selected
regions then become the primaries of the composite multi-primary display device.
The primaries for a six-primary display device are shown in Figure 14.49.
The spectrum C(λ) of the light emitted by an N-primary display device is
given by

         N
C(λ) =   ∑ α_j S_j(λ) + β(λ),   (14.27)
        j=1

where S_j is the spectral distribution of the jth primary, α_j is its weight, and β(λ)
is the spectral intensity of the background light [21] (i.e., residual light emitted
for a black image). The corresponding tristimulus values are then given by
⎡X⎤    N     ⎡∫ S_j(λ) x̄(λ) dλ⎤   ⎡∫ β(λ) x̄(λ) dλ⎤
⎢Y⎥ =  ∑ α_j ⎢∫ S_j(λ) ȳ(λ) dλ⎥ + ⎢∫ β(λ) ȳ(λ) dλ⎥ ,   (14.28a)
⎣Z⎦   j=1    ⎣∫ S_j(λ) z̄(λ) dλ⎦   ⎣∫ β(λ) z̄(λ) dλ⎦

       N     ⎡P_Xj⎤   ⎡X_β⎤
    =  ∑ α_j ⎢P_Yj⎥ + ⎢Y_β⎥ ,   (14.28b)
      j=1    ⎣P_Zj⎦   ⎣Z_β⎦

where P_Xj, P_Yj, and P_Zj are the tristimulus values of each primary and X_β, Y_β, and Z_β
are the tristimulus values of the background light.
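Once the per-primary tristimulus values P_j and the background term have been measured, evaluating (14.28b) is a weighted sum. A minimal sketch, with made-up numbers in the test below:

```python
def display_xyz(alphas, primaries_xyz, background_xyz):
    # Eq. (14.28b): XYZ = sum_j alpha_j * P_j + XYZ_beta, where P_j are
    # the tristimulus values of primary j and XYZ_beta is the residual
    # background light.
    x, y, z = background_xyz
    for a, (px, py, pz) in zip(alphas, primaries_xyz):
        x += a * px
        y += a * py
        z += a * pz
    return (x, y, z)
```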
Figure 14.50. Typical luminance levels found in natural environments, with Le ranging from about 10⁻⁵ to 10⁵ cd/m², and the approximate range of luminances a typical monitor can display.
us to view HDR images on standard monitors, the full potential of an HDR image
is not realized in this manner.
This problem has led to the development of experimental high dynamic range
display devices [1021]. Such devices can be constructed with a two-layered ar-
chitecture, where the first layer is emissive and the second layer is transmissive.
Both layers can be spatially modulated, resulting in a good trade-off between spa-
tial resolution and dynamic range.
Using this principle, Seetzen et al. produced two types of HDR displays
[1021]. In the first type, a DLP device is used as a back projector to provide
the required light. The front layer consists of an LCD panel used to modulate the
light from the projector. Between the layers, a Fresnel lens is installed to couple
the two layers. The projector-based HDR display provides a proof of concept for
the system. In addition, the use of a projector means that high contrasts can be
produced at high frequencies, limited only by the resolution of the projector and
the LCD panel, and the accuracy of the alignment between them. This is a distinct
advantage in applications such as psychophysical experiments. However, its large
depth makes the device impractical in more mainstream applications.
To reduce the depth, a second model was constructed by replacing the projec-
tor with an array of ultra-bright LEDs. In the first prototypes, a hexagonal grid of
760 white LEDs is used, although the currently commercialized 37-inch display
contains 1380 white LEDs. Each LED can be individually controlled. The LCD
front panel has a resolution of 1920 by 1080 pixels (Figure 14.51).
In a darkened room the minimum and maximum display luminance were mea-
sured to be 0.015 cd/m2 and 3450 cd/m2 , respectively. In theory, the minimum
luminance can be zero when all LEDs are turned off. However, in practical situ-
ations, some stray light will always reflect off the screen, thereby increasing the
black level. The minimum value corresponds to the smallest driving signal of
LEDs larger than zero. The dynamic range of the device is therefore 230,000 : 1,
i.e., more than five orders of magnitude.
Figure 14.51. The BrightSide HDR display at the Max Planck Institute for Biological
Cybernetics; Tübingen, Germany, November 2006.
The HDR display is addressed by deriving two sets of signals from an input
linear-light HDR image. The first set of signals drives the LEDs while the second
set controls the LCD. In principle, there are many combinations of LED and LCD
values that lead to the same net emitted luminance. It is desirable to choose the
lowest LED level possible and adjust the LCD level accordingly. This minimizes
power consumption of the display as well as aging of the LEDs.
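A minimal sketch of this splitting policy (the quantized LED levels and the LCD transmittance range below are assumptions, not measured values): for a target luminance, pick the lowest LED level that can still reach it through the LCD, then set the LCD accordingly.

```python
def split_luminance(target, led_levels, t_max=0.95, t_min=0.001):
    # Choose the lowest backlight level that can still reach the target
    # through the LCD (transmittance <= t_max); this minimizes power
    # consumption and LED aging. Levels and transmittances are assumed.
    if target <= 0.0:
        return min(led_levels), t_min
    for led in sorted(led_levels):
        if led > 0.0 and led * t_max >= target:
            return led, max(target / led, t_min)
    return max(led_levels), t_max  # target beyond display range: clip
```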
The fact that a two-layer system allows for different combinations of LED
and LCD levels to yield the same output can also be leveraged to calibrate the
display [1023]. By adding a lightguide, some of the emitted light can be guided
towards the side of the screen, where image sensors can be fitted. A calibration
procedure can then measure the light output of each individual LED and correct
for variations due to the manufacturing process, including LED chromaticity and
non-uniformities in the LCD screen, as well as operating variations, including the
differential aging of LEDs, and thermal effects.
The low spatial resolution of the back-light has two important implications.
First, the full dynamic range can only be achieved at low spatial frequencies. In
other words, in every local neighborhood covered by a single LED, the dynamic
range will be limited to that of the LCD panel. However, this problem is sig-
nificantly alleviated by the fact that the human eye is not sensitive to extreme
dynamic ranges over small visual angles [1021, 1186].
Second, the image data presented to the LCD panel needs to be adjusted for
the low resolution produced by the LED array. This is performed by artificially
enhancing the edges of the input sent to the LCD layer, where the amount of
enhancement depends on the point spread function of the LED elements.
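A one-dimensional sketch of this compensation (the LED layout and point spread function are invented for illustration): the backlight expected from the LED values and their point spread functions is predicted at the LCD resolution, and the LCD image is the target divided by that prediction.

```python
def predict_backlight(leds, psf, n):
    # Each LED adds its point spread function, centered at the LED's
    # position, to the backlight estimate at the LCD resolution n.
    out = [0.0] * n
    step = n // len(leds)
    half = len(psf) // 2
    for j, level in enumerate(leds):
        center = j * step + step // 2
        for k, w in enumerate(psf):
            i = center + k - half
            if 0 <= i < n:
                out[i] += level * w
    return out

def lcd_compensation(target, backlight):
    # Dividing by the blurry backlight boosts edges that the
    # low-resolution LED array cannot render on its own.
    return [min(t / b, 1.0) if b > 0.0 else 1.0
            for t, b in zip(target, backlight)]
```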
The most direct, albeit impractical, method to achieve this would be to measure
the distribution of emitted light by a spectroradiometer for all pixel values. The
CIE XYZ tristimulus values can then be found using
X_RGB = ∫_λ x̄(λ) f(λ; R, G, B) dλ,   (14.29a)
Y_RGB = ∫_λ ȳ(λ) f(λ; R, G, B) dλ,   (14.29b)
Z_RGB = ∫_λ z̄(λ) f(λ; R, G, B) dλ,   (14.29c)

where the emitted spectrum may be assumed to decompose into independent
per-channel contributions:

f(λ; R, G, B) = f_r(λ; R) + f_g(λ; G) + f_b(λ; B) + f_0.   (14.30)
Here, f0 denotes the sum of the light reflected by the display surface due to un-
wanted external light and the light emitted when all pixels are zero. This as-
sumption reduces a three-dimensional problem to three simpler one-dimensional
problems. Equation (14.30) can be rewritten using tristimulus values instead of
spectral distributions:
⎡X_RGB⎤   ⎡X_R⎤   ⎡X_G⎤   ⎡X_B⎤   ⎡X_0⎤
⎢Y_RGB⎥ = ⎢Y_R⎥ + ⎢Y_G⎥ + ⎢Y_B⎥ + ⎢Y_0⎥ .   (14.31)
⎣Z_RGB⎦   ⎣Z_R⎦   ⎣Z_G⎦   ⎣Z_B⎦   ⎣Z_0⎦
scaling:

f_r(λ; R) = v_r(R) f_r(λ; R_max).   (14.32)

Here, f_r(λ; R_max) represents the emission spectrum when the red channel is fully
on, and v_r(R) is the tone response curve when the channel is driven with R instead
of R_max. With this formulation, all tone response curves must lie in the interval
[0, 1]. It is possible to rewrite (14.32) using the tristimulus values:
⎡X_R⎤          ⎡X_Rmax⎤
⎢Y_R⎥ = v_r(R) ⎢Y_Rmax⎥ ,   (14.33a)
⎣Z_R⎦          ⎣Z_Rmax⎦

⎡X_G⎤          ⎡X_Gmax⎤
⎢Y_G⎥ = v_g(G) ⎢Y_Gmax⎥ ,   (14.33b)
⎣Z_G⎦          ⎣Z_Gmax⎦

⎡X_B⎤          ⎡X_Bmax⎤
⎢Y_B⎥ = v_b(B) ⎢Y_Bmax⎥ ,   (14.33c)
⎣Z_B⎦          ⎣Z_Bmax⎦
where the subscripts Rmax , Gmax , and Bmax are used to distinguish the tristimulus
values produced when each channel is driven by the maximum signal value. If
a monitor satisfies both of these assumptions, a forward model can be written to
simulate its operation. For this, we can combine (14.31) and (14.33):
⎡X_RGB⎤   ⎡X_Rmax  X_Gmax  X_Bmax⎤ ⎡v_r(R)⎤   ⎡X_0⎤
⎢Y_RGB⎥ = ⎢Y_Rmax  Y_Gmax  Y_Bmax⎥ ⎢v_g(G)⎥ + ⎢Y_0⎥ ,   (14.34a)
⎣Z_RGB⎦   ⎣Z_Rmax  Z_Gmax  Z_Bmax⎦ ⎣v_b(B)⎦   ⎣Z_0⎦

⎡X_RGB⎤     ⎡v_r(R)⎤   ⎡X_0⎤
⎢Y_RGB⎥ = M ⎢v_g(G)⎥ + ⎢Y_0⎥ ,   (14.34b)
⎣Z_RGB⎦     ⎣v_b(B)⎦   ⎣Z_0⎦
where the matrix M is the 3 × 3 tristimulus matrix. The forward model is useful
for finding the tristimulus values corresponding to an (R, G, B) triplet used to drive
the display. In most situations, though, an inverse model is required to find the
(R, G, B) values that yield the desired tristimulus values [1035]. This model can
be obtained by inverting (14.34b):

⎡v_R⎤       ⎛⎡X_RGB⎤   ⎡X_0⎤⎞
⎢v_G⎥ = M⁻¹ ⎜⎢Y_RGB⎥ − ⎢Y_0⎥⎟ ,   (14.35a)
⎣v_B⎦       ⎝⎣Z_RGB⎦   ⎣Z_0⎦⎠

R = v_r⁻¹(v_R),  G = v_g⁻¹(v_G),  B = v_b⁻¹(v_B).   (14.35b)
Note that the inverse model is valid only if the matrix M as well as the tone re-
sponse curves are invertible. For M to be invertible, the individual color channels
of the monitor must be independent. This condition is satisfied by most typical
desktop computer monitors [1034, 1035], but it is not necessarily satisfied by LCD
monitors to be used in demanding graphics arts, studio video, or digital cinema
production work. Nevertheless, 3 × 3 linear models have been found to work well
with both CRT and LCD displays [74, 92]. However, they are not good predictors
for four-channel DLP devices [1096, 1259, 1260]. In addition, tone response
curves are usually monotonically increasing functions and are therefore
invertible [1035].
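The forward model (14.34b) and its inverse can be sketched with numpy; the matrix and black level below are hypothetical, and a single 2.2 exponent stands in for the measured tone response curves:

```python
import numpy as np

# Hypothetical tristimulus values of the primaries (columns of M) and
# black level; these are invented numbers, not measurements.
M = np.array([[41.2, 35.8, 18.0],
              [21.3, 71.5,  7.2],
              [ 1.9, 11.9, 95.0]])
xyz0 = np.array([0.2, 0.2, 0.3])
gamma = 2.2  # stand-in for the measured tone response curves

def forward(rgb):
    # Eq. (14.34b): apply the tone response curves, then the 3x3 matrix.
    v = np.asarray(rgb, dtype=float) ** gamma
    return M @ v + xyz0

def inverse(xyz):
    # Invert the matrix, then the (monotonic) response curves.
    v = np.linalg.solve(M, np.asarray(xyz, dtype=float) - xyz0)
    return np.clip(v, 0.0, 1.0) ** (1.0 / gamma)
```

Round-tripping a drive triplet through `forward` and then `inverse` recovers it, provided M is invertible and the response curves are monotonic.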
As (14.35b) shows, the three key components necessary for display character-
ization are

1. the tone response curves of the individual color channels;

2. the matrix M holding the tristimulus values of the primaries;

3. the sum of the emitted light when all pixels are zero and the light reflected
off the surface of the monitor.

In the following we focus on the details of how each component can be obtained.
Before carrying out any measurements, the first step must be to set the bright-
ness and contrast controls of the monitor since these controls affect the tone re-
sponse curves. For a CRT monitor, the brightness control affects the black level
of the monitor, which is the amount of light emitted when displaying a black im-
age. It would therefore be better to refer to this control as the black level setting.
It operates on non-linear RGB values and, therefore, slides the range of values
along the electro-optic conversion curve (EOCF).
This control should first be set to minimum and then slowly increased until
the display just starts to show a hint of gray for a black input [923]. The setting is
then reduced one step to full black. Setting the brightness control to a lower value
causes information to be lost in the dark regions of the displayed material. Setting
Figure 14.53. The contrast control of CRTs and the brightness control of LCDs are used
to maximize the luminance range output by the display device (after [923]).
The tone response curves of CRT monitors are often modeled by a gain-offset-
gamma (GOG) model. For the red channel, this is expressed as

v_r(R) = (k_g,r (R / R_max) + k_o,r)^γ_r ,   (14.37)

where k_g,r, k_o,r, and γ_r denote the gain, offset, and gamma for the red chan-
nel [92]. In this equation, the gamma term γ_r models the inherent non-linearity
in the cathode ray tube. It is caused by the space-charge effect due to the accu-
mulation of electrons between the cathode and the accelerating anode (see Sec-
tion 14.1) [173, 652]. This parameter, which is typically between 2.2 and 2.4,
cannot be altered by the user.
However, the other two parameters, gain and offset, can usually be adjusted by
the user by altering the contrast and brightness knobs, respectively. Under proper
settings, (14.37) simplifies to

v_r(R) = (R / R_max)^γ_r .   (14.38)
Similar equations can be written for the green and blue channels. With this sim-
plification, we can rewrite (14.35b) for a CRT monitor as follows:

⎡R⎤   ⎡(R_XYZ)^(1/γ_r)⎤
⎢G⎥ = ⎢(G_XYZ)^(1/γ_g)⎥ .   (14.39)
⎣B⎦   ⎣(B_XYZ)^(1/γ_b)⎦
In computer graphics, this operation is commonly called gamma correction. In
video and HDTV, gamma correction takes place at the camera and incorporates
a power function exponent that effects not only pre-compensation for the non-
linearity of the eventual display device, but also includes compensation for ap-
pearance effects at the display.
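Under the simplified model, gamma correction per (14.39) is a per-channel power function; a minimal sketch, with linear intensities normalized to [0, 1]:

```python
def gamma_correct(rgb_linear, gammas=(2.2, 2.2, 2.2)):
    # Eq. (14.39): each drive value is the desired linear intensity
    # raised to 1/gamma for that channel.
    return tuple(c ** (1.0 / g) for c, g in zip(rgb_linear, gammas))
```

The display's power-law response then undoes the exponent: (c^(1/γ))^γ = c.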
Figure 14.54. Measured electro-optic conversion curve (EOCF) of the red, green, and
blue channels of a Dell UltraSharp 2007FP 20.1-inch LCD monitor. A typical gamma of
2.2 is shown for comparison.
One of the important differences between LCDs and CRTs lies in their tone re-
sponse curves. These functions are usually called electro-optic conversion curves
(EOCFs). The native response of LCDs is shown in Figure 14.11, indicating
that the intrinsic, physical electro-optic conversion curves are essentially sig-
moidal [368, 1035].
However, to make LCD monitors compliant with the existing image, video,
and PC content, and to make them compatible with CRT display devices, they
are typically equipped with look-up tables to transform their response curves to
follow a 2.2 power law [297, 1035], as standardized by the sRGB standard (IEC
61966-2-1). In Figure 14.54, the measured curves of the individual color chan-
nels of a current LCD monitor are shown. Note that both axes in this graph are
normalized, so that the black level of this display cannot be inferred from this
figure. This specific monitor closely approximates the response curve of a typical
CRT with an EOCF of 2.2. For some monitors though, critical deviations from
the gain-offset gamma model are sometimes observed, and therefore the use of
external look-up tables is recommended for color critical applications [237, 297].
values sent to the display with the measured values, the display is characterized.
The output of the calibration procedure is typically an ICC profile, which is used
either by the operating system, or by application software to adjust imagery for
display on the characterized display. Such profiles are commonly in use to ensure
appropriate color reproduction (see Chapter 16).
Chapter 15
Image Properties and
Image Display
In this book, the focus is on light and color, and its implications for various ap-
plications dealing with images. In previous chapters, some of the fundamentals
pertaining to image capture and image display were discussed. This leaves us with
several attributes of images that are worth knowing about. In particular, images of
existing environments tend to have statistical regularities that can be exploited in
applications such as image processing, computer vision, and computer graphics.
These properties are discussed in the following section.
Further, we discuss issues related to the measurement of dynamic range. Such
measurements are becoming increasingly important, especially in the field of high
dynamic range imaging, where knowledge about the dynamic range of images can
be used to assess how difficult it may be to display such images on a given display
device.
The third topic in this chapter relates to cross-media display. As every display
has its own specification and properties and is located in a differently illuminated
environment, the machinery required to prepare an image for accurate display on
a specific display device should involve some measurement of both the display
device and its environment. One of the tasks involved in this process is matching
the gamut of an image to the gamut of the display device. Finally, we discuss
gamma correction and possible adjustments for ambient light which is reflected
off the display device.
First-order statistics treat each pixel independently, so that, for example, the dis-
tribution of intensities encountered in natural images can be estimated.
Higher-order statistics are used to extract properties of natural scenes which can
not be modeled using first- and second-order statistics. These include, for
example, lines and edges.
For outdoor scenes, the average image color remains relatively constant from
30 minutes after dawn until within 30 minutes of dusk. The correlated color
temperature then increases, giving an overall bluer color.1
The average image color tends to be distributed around the color of the dom-
inant scene illuminant according to a Gaussian distribution, if measured in log
space [665, 666]. The eigenvectors of the covariance matrix of this distribution
follow the spectral responsivities of the capture device, which for a typical color
negative film are [666]
L = (1/√3) (log(R) + log(G) + log(B)),   (15.1a)

s = (1/√2) (log(R) − log(B)),   (15.1b)

t = (1/√6) (log(R) − 2 log(G) + log(B)).   (15.1c)
For a large image database, the standard deviations in these three channels are
found to be 0.273, 0.065, and 0.030. Note that this space is similar to the Lαβ
color space discussed in Section 8.7.5. Although that space is derived from LMS
rather than RGB, both operate in log space, and both have the same channel
weights. The coefficients applied to each of the channels are identical as well,
with the only difference being that the role of the green and blue channels has
been swapped.
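The transform of (15.1) is easy to state in code; the sketch below assumes natural logarithms and strictly positive RGB values:

```python
import math

def rgb_to_lst(r, g, b):
    # Eq. (15.1): decorrelated log-space axes (luminance, and two
    # opponent chromatic channels); inputs must be strictly positive.
    lr, lg, lb = math.log(r), math.log(g), math.log(b)
    L = (lr + lg + lb) / math.sqrt(3.0)
    s = (lr - lb) / math.sqrt(2.0)
    t = (lr - 2.0 * lg + lb) / math.sqrt(6.0)
    return L, s, t
```

For achromatic inputs (R = G = B), the two chromatic channels s and t are zero, as expected of opponent axes.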
S(u, v) = |F(u, v)|² / M²,   (15.2)
where F is the Fourier transform of the image. By representing the two-dimen-
sional frequencies u and v in polar coordinates (u = f cos φ and v = f sin φ) and
averaging over all directions φ and all images in the image ensemble, it is found
that, on a log-log scale, amplitude as a function of frequency f lies approximately
on a straight line [144, 314, 983, 986, 1001]. This means that spectral power as a
function of spatial frequency behaves according to a power law. More-
over, fitting a line through the data points yields a slope α of approximately 2 for
1 Despite the red colors observed at sunrise and sunset, most of the environment remains illumi-
nated by a blue skylight, which increases in importance as the sun loses its strength.
natural images. Although this spectral slope varies subtly between different stud-
ies [144, 259, 315, 983, 1130], and with the exception of low frequencies [651],
it appears to be extremely robust against distortions and transformations. It is
therefore concluded that this spectral behavior is a consequence of the images
themselves, rather than of particular methods of camera calibration or exact com-
putation of the spectral slope. We denote this behavior by
S(f) ∝ A / f^α = A / f^(2−η),   (15.3)
where A is a constant determining overall image contrast, α is the spectral slope,
and η is its deviation from 2. However, the exact value of the spectral slope
depends somewhat on the type of scenes that make up the ensembles. The natural
image statistics community is mostly interested in scenes containing trees and
shrubs. Some studies show that the spectral slope for scenes containing
man-made objects is slightly different [1295, 1296]. Even in natural scenes, the
statistics vary, dependent on what is predominantly in the images. The second-
order statistics for sky are, for example, very different from those of trees.
One of the ways in which this becomes apparent is when the power spectra are
not circularly averaged, but when the log average power is plotted against angle.
For natural image ensembles, all angles show more or less straight power spectra,
although most power is concentrated along horizontal and vertical directions [986,
1001] (see also Figure 15.3). The horizon and the presence of tree trunks are said
to be factors, although this behavior also occurs in man-made environments.
The power spectrum is related to the auto-correlation function through the
Wiener-Khintchine theorem, which states that the auto-correlation function and
the power spectrum form a Fourier transform pair [840]. Hence, power spectral
behavior can be equivalently understood in terms of correlations between pairs of
pixel intensities.
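The slope estimate described above can be sketched as follows, assuming a square grayscale image: compute the power spectrum per (15.2), average it over rings of constant radius, and fit a line to log power versus log frequency.

```python
import numpy as np

def spectral_slope(image):
    # Estimate alpha in S(f) ~ A / f^alpha from a radially averaged
    # power spectrum; assumes a square image.
    n = image.shape[0]
    F = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    power = np.abs(F) ** 2 / image.size ** 2
    y, x = np.indices(image.shape)
    r = np.hypot(x - n // 2, y - n // 2).astype(int)
    counts = np.bincount(r.ravel())
    radial = np.bincount(r.ravel(), weights=power.ravel()) / np.maximum(counts, 1)
    f = np.arange(1, n // 2)  # skip DC, stay below the Nyquist frequency
    slope, _ = np.polyfit(np.log(f), np.log(radial[1:n // 2]), 1)
    return -slope  # alpha, close to 2 for natural images
```

Applied to synthetic noise with a 1/f amplitude spectrum, the estimate comes out near 2, as the text predicts for natural images.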
A related image statistic is contrast, normally defined as the standard deviation
of all pixel intensities divided by the mean intensity (σ/μ). This measure can
either be computed directly from the image data, or it can be derived from the
power spectrum through Parseval's theorem [1001]:

σ² / μ² = ∑_(u,v) S(u, v).   (15.4)
The above second-order statistics are usually collected for luminance images
only, as luminance is believed to carry the greatest amount of information. How-
ever, chrominance channels have been shown to exhibit similar spectral behav-
ior [878], and therefore all subsequent qualitative arguments are expected to hold
for color as well.
The fact that the power spectral behavior of natural images, when plotted in
log-log space, yields a straight line with a slope of around -2 is particularly impor-
tant, since recent unrelated studies have found that in image interpretation tasks,
the human visual system performs best when the images conform to the above
second-order statistic. In one such study, images of a car and a bull were morphed
into each other, with varying distances between the images in the sequence [879].
Different sequences were generated with modified spectral slopes. The minimum
distance at which participants could still distinguish consecutive images in each
morph sequence was then measured. This distance was found to be smallest when
the spectral slope of the images in the morph sequence was close to 2. Deviation
of the spectral slope in either direction resulted in deteriorated performance in
distinguishing between morphed images.
In a different study, the effect of spectral slope on the detection of mirror
symmetry in images was assessed [933]. Here, white noise patterns with varying
degrees of vertical symmetry were created and subsequently filtered to alter the
spectral slope. An experiment, in which participants had to detect if symmetry
was present, revealed that performance was optimal for images with a spectral
slope of 2.
These studies confirm the hypothesis that the HVS is tuned to natural images.
In fact, the process of whitening (i.e., flattening the power spectrum to produce a
slope α of 0) is consistent with psychophysical measurements [52], which indi-
cates that the HVS expects to see images with a 1/ f 2 power spectrum.
Figure 15.1. Example images drawn from the van Hateren database [433]. (Images cour-
tesy of Hans van Hateren.)
Here, I₀ is the modified zero-order Bessel function of the first kind and N is
the window size (512 pixels). In addition, this weight function was normalized
by letting

∑_(x,y) w(x, y)² = 1.   (15.7)
This windowing function was chosen for its near-optimal trade-off between
side-lobe level, main-lobe width, and computability [429]. The resulting images
F(u, v) = ∑_(x,y) [(L(x, y) − μ) / μ] e^(2πi(ux+vy)).   (15.8)
Finally, the power spectrum was computed as per (15.2) and the resulting
data points plotted. Although frequencies up to 256 cycles per image are com-
puted, only the 127 lowest frequencies were used to estimate the spectral slope.
Higher frequencies may suffer from aliasing, noise, and low modulation trans-
fer [1001]. The estimation of the spectral slope was performed by fitting a straight
line through the logarithm of these data points as a function of the logarithm of
1/ f . This method was chosen over other slope estimation techniques, such as
the Hill estimator [472] and the scaling method [215], to maintain compatibility
with [1001]. In addition, the number of data points (127 frequencies) is insuf-
ficient for the scaling method, which requires at least 1,000 data points to yield
reasonably reliable estimates.
With this method, second-order statistics were extracted. The 1.87 spectral
slope reported for the van Hateren database was confirmed (we found 1.88 for
our subset of 133 images). The deviations from this value for the artificial image
ensemble are depicted in Figure 15.2. The angular distribution of power tends to
show peaks near horizontal and vertical angles (Figure 15.3). Finally, the distribu-
tion of spectral slopes for the 133 images in this ensemble is shown in Figure 15.4.
Figure 15.2. Spectral slope for the image ensemble. The double lines indicate ±2 standard
deviations for each ensemble.
Figure 15.3. The log average power of the image ensemble plotted against orientation (0–360 degrees).
Figure 15.4. A histogram of spectral slopes, derived from an image ensemble of 133
images taken from the van Hateren database.
and the median of a data set. Kurtosis is based on the size of a distribution’s tails
relative to a Gaussian distribution. A positive value, associated with long tails in
the distribution of intensities, is usually associated with natural image ensembles.
This is for example evident when plotting log-contrast histograms, which plot
the probability of a particular contrast value appearing. These plots are typically
non-Gaussian with positive kurtosis [986].
Thomson [1126] has pointed out that for kurtosis to be meaningful for nat-
ural image analysis, the data should be decorrelated, or whitened, prior to any
computation of higher-order statistics. This can be accomplished by flattening the
power spectrum, which ensures that second-order statistics do not also capture
higher-order statistics. Thus, skew and kurtosis become measures of variations
in the phase spectra. Regularities are found when these measures are applied to
the pixel histogram of an image ensemble [1125, 1126]. This appears to indicate
that the HVS may exploit higher-order phase structure after whitening the image.
Understanding the phase structure is therefore an important avenue of research in
the field of natural image statistics. However, it appears that so far it is not yet
possible to attach meaning to the exact appearance of higher-order statistics.
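A minimal sketch of whitening followed by the computation of skew and kurtosis may clarify the procedure. The spectrum-flattening step follows the description above; the exact normalization (dividing the spectrum by its amplitude) is an assumption for illustration.

```python
import numpy as np

def whiten(image):
    """Flatten the power spectrum of an image (decorrelate it), so that
    subsequent skew/kurtosis measurements reflect phase structure only."""
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    amplitude[amplitude == 0] = 1.0  # guard against division by zero
    return np.real(np.fft.ifft2(spectrum / amplitude))

def skew_kurtosis(values):
    """Sample skew and excess kurtosis of a flattened pixel array.

    Positive excess kurtosis indicates tails longer than a Gaussian's,
    as is typical for natural image ensembles."""
    z = values.ravel()
    z = (z - z.mean()) / z.std()
    return (z ** 3).mean(), (z ** 4).mean() - 3.0
```

After whitening, the power spectrum is flat by construction, so any remaining regularity in these measures stems from the phase spectrum.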
Figure 15.5. Example images used for computing gradient distributions, which are shown in Figure 15.6. From left to right, top to bottom: Abbaye de Beauport, Paimpol, France, July, 2007; Combourg, France, July, 2007; Fougères, France, June, 2007; Kermario, Carnac, France, July, 2007.
Figure 15.6. The probability distribution of gradients (log2 probability as a function of gradient value) for the four images depicted in Figure 15.5: Abbaye de Beauport, Combourg, Fougères, and Kermario (Carnac).
Decorrelation of the image data can be achieved using a PCA algorithm. Filters can then be found that produce extrema of the
kurtosis [1001]. A kurtotic amplitude distribution is produced by cortical simple
cells, leading to sparse coding. Hence, ICA is believed to be a better model than
PCA for the output of simple cortical cells.
The receptive fields of simple cells in the mammalian striate cortex are lo-
calized in space, oriented, and bandpass. They are therefore similar to the basis
functions of wavelet transforms [315, 853]. For natural images, strong correla-
tions between wavelet coefficients at neighboring spatial locations, orientations,
and scales, have been shown using conditional histograms of the coefficients’
log magnitudes [1049]. These results were successfully used to synthesize tex-
tures [919, 1047] and to denoise images [1048].
The file size of high dynamic range images is therefore generally larger as well, although at least one standard
(the OpenEXR high dynamic range file format [565]) includes a very capable
compression scheme, and JPEG-HDR essentially occupies the same amount of
space as an equivalent low dynamic range image would [1209, 1210]. To con-
struct an image in this file format, the HDR image is tone-mapped and encoded as
a standard JPEG. The ratio between the original and the JPEG image is subsam-
pled and stored in a meta-tag. HDR-aware viewers can reconstruct the HDR from
this data, whereas other image viewers simply ignore the meta-tag and display the
JPEG directly. Encoding of a residual image, such as the ratio image, can also be
used to store an extended gamut [1073].
Recently, the problem of HDR video encoding has received some attention.
This has led to MPEG-HDR, a high dynamic range video encoder/decoder
promising high compression ratios [733], as well as a second video-encoding
scheme that is based on a model of human cones [435], which was briefly dis-
cussed in Section 4.8.
In general, the combination of smallest step size and ratio of the smallest and
largest representable values determines the dynamic range that an image encoding
scheme affords. Images have a dynamic range that is bounded by the encoding
scheme used. The dynamic range of environments is generally determined by the
smallest and largest luminance found in the scene. Assuming this range is smaller
than the range afforded by the encoding scheme, such an image can be captured
and stored. The question then remains: What is the actual dynamic range of such
a scene?
A simplistic approach to measure the dynamic range of an image may, there-
fore, compute the ratio between the largest and smallest pixel value of an im-
age. Sensitivity to outliers may be reduced by ignoring a small percentage of the
darkest and brightest pixels. Alternatively, the same ratio may be expressed as a
difference in the logarithmic domain.
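The simplistic measure just described might be sketched as follows; the outlier fraction `exclude` is an illustrative parameter.

```python
import numpy as np

def dynamic_range_db(luminance, exclude=0.01):
    """Simplistic dynamic range measure: the ratio between the brightest
    and darkest pixel values, made robust to outliers by ignoring a small
    fraction `exclude` of the darkest and brightest pixels, and expressed
    as a difference in the logarithmic domain (decibels)."""
    v = luminance.ravel()
    lo = np.quantile(v, exclude)
    hi = np.quantile(v, 1.0 - exclude)
    return 20.0 * np.log10(hi / lo)
```

Note that this measure still ignores noise; the signal-to-noise-based measures discussed next address that shortcoming.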
However, the recording device or rendering algorithm may introduce noise
that will lower the useful dynamic range. Thus, a measurement of the dynamic
range of an image should factor in noise. A better measure of dynamic range
is, therefore, a signal-to-noise ratio, expressed in decibels (dB), as used in signal
processing. This concept is discussed further in the following section.
Assuming we have a reasonable measure of dynamic range, it then becomes
possible to assess the extent of the mismatch between the dynamic range of an
image and the dynamic range of a given display device. A good tone-reproduction
operator will be able to reduce the dynamic range of an image to match that of the
display device without producing visible artifacts.
The dependent signal-to-noise ratio is computed as SNRdep = 20 log10 (μs / σn), where μs is the mean of the signal, and σn is the standard deviation of the noise. In the case that signal and noise are independent, the SNR may be
computed by
SNRindep = 20 log10 (σs / σn). (15.12)
For high dynamic range images, which typically have long-tailed histograms,
using the mean signal to compute dynamic range may not be the most sensible
approach, since small but very bright highlights or light sources tend to have little
impact on this measure. Thus, rather than use σs or μs to compute one of the above
SNR measures, it may be a better approach to compute the peak signal-to-noise
ratio, which depends on the maximum luminance Lmax of the image:
SNRpeak = 20 log10 ((Lmax)² / σn). (15.13)
While μs , σs , and Lmax are each readily computed for any given image, the
estimation of the noise floor σn in an image is a notoriously difficult task, for
which many solutions exist. In particular, assumptions will have to be made about
the nature of the noise.
In some specific cases, the noise distribution may be known. For instance,
photon shot noise, which is characterized by the random arrival times of photons
at image sensors, has a Poisson distribution. For a sensor exposed for T seconds,
and a rate parameter of ρ photons per second, the probability P of detecting p
photons is given by
P(p) = (ρT)^p e^(−ρT) / p!. (15.14)
The mean μn and standard deviation σn are then
μn = ρT, (15.15a)
σn = √(ρT). (15.15b)
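A short Monte-Carlo check of these shot-noise statistics; the values of ρ and T are chosen arbitrarily for illustration.

```python
import numpy as np

# Photon counts are Poisson distributed with mean rho*T and standard
# deviation sqrt(rho*T), so the shot-noise-limited SNR in decibels
# grows as 10*log10(rho*T).
rng = np.random.default_rng(42)
rho, T = 1000.0, 0.5                # photons per second, exposure in seconds
counts = rng.poisson(rho * T, size=200000)

mean_n = counts.mean()              # close to rho*T = 500
std_n = counts.std()                # close to sqrt(rho*T)
snr_db = 20.0 * np.log10(mean_n / std_n)
```

Doubling the exposure time T thus improves the shot-noise-limited SNR by about 3 dB.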
Thus, for photon shot noise, the signal-to-noise ratio can be computed by [1282]
SNRshot = 20 log10 (μn / σn) = 10 log10 (ρT). (15.16)
The accuracy of the noise floor and, therefore, the measure of dynamic range,
is dependent on the noise reduction technique, as well as on the filter parame-
ter settings chosen. This makes these techniques limited to comparisons of the
dynamic range between images that are computed with identical techniques and
parameter settings.
On the other hand, the signal-to-noise ratio may be computed by relying on
the minimum and maximum luminance (Lmin and Lmax ):
SNRmax = 20 log10 (Lmax / Lmin). (15.18)
Rσ(x, y) = 1/(2πσ²) exp(−(x² + y²)/(2σ²)). (15.19)
Figure 15.7. Example images used for assessing the impact of filter-kernel parameters
on SNR measures; Tower of Hercules Roman Lighthouse, A Coruña, Spain, August 2005;
Rennes, France, July 2005; Insel Mainau, Germany, June 2005.
As any Gaussian filter has infinite non-zero extent, a second parameter determines
into how many pixels the kernel is discretized. This number, k, is typically odd.
While the continuous Gaussian filter has unit area, after discretization, the filter
no longer sums to 1. We therefore normalize the discretized filter kernel before
starting the computations.
As an example, for k = 3, 5, 7, the filter kernels are
0.075 0.124 0.075
0.124 0.204 0.124
0.075 0.124 0.075

0.003 0.013 0.022 0.013 0.003
0.013 0.060 0.098 0.060 0.013
0.022 0.098 0.162 0.098 0.022
0.013 0.060 0.098 0.060 0.013
0.003 0.013 0.022 0.013 0.003

0.000 0.000 0.001 0.002 0.001 0.000 0.000
0.000 0.003 0.013 0.022 0.013 0.003 0.000
0.001 0.013 0.059 0.097 0.059 0.013 0.001
0.002 0.022 0.097 0.159 0.097 0.022 0.002
0.001 0.013 0.059 0.097 0.059 0.013 0.001
0.000 0.003 0.013 0.022 0.013 0.003 0.000
0.000 0.000 0.001 0.002 0.001 0.000 0.000
Figure 15.8. Different values for Gaussian filter-kernel parameters k and σ (σ = 1, ..., 6) computed using the Tower of Hercules image for the SNRdep metric (dB RMS).
Figure 15.9. Different values for Gaussian filter-kernel parameters k and σ (σ = 1, ..., 6) computed using the Tower of Hercules image for the SNRindep metric (dB RMS).
Figure 15.10. Different values for Gaussian filter-kernel parameters k and σ (σ = 1, ..., 6) computed using the Tower of Hercules image for the SNRpeak metric (dB).
Figure 15.11. SNR measures (SNRpeak, SNRdep, and SNRindep, in dB) plotted as a function of the median-filter parameter k for the Tower of Hercules image.
Figure 15.12. SNR measures (SNRpeak, SNRdep, and SNRindep, in dB) plotted as a function of the median-filter parameter k for the Church Tower image.
Figure 15.13. SNR measures (SNRpeak, SNRdep, and SNRindep, in dB) plotted as a function of the median-filter parameter k for the Tree image.
Figure 15.14. Luminance values after filtering with a median filter with kernel size k = 3
(left) and k = 15 (right).
Figure 15.15. SNR measures (dB) compared for the Tower of Hercules, Church Tower, and Trees images, showing the Gaussian- and median-filter-based variants of the SNRindep, SNRdep, SNRpeak, and SNRmax measures.
On the other hand, the RMS-based measures rely on average values in the
image. This results in a tendency to underestimate the dynamic range, as small
but very bright features do not contribute much to this measure. However, it is
precisely these features that have a significant impact on the dynamic
range.
The best approaches therefore appear to be afforded by the peak signal-to-
noise measures. Note that each of these measures, whether guided by a Gaussian
or a median filter, rate the night scene to have the highest dynamic range. This is
commensurate with visual inspection, which would rank this scene first, followed
by the Tower of Hercules, and then the tree scene (which is expected to have the
lowest dynamic range of the three example images).
The Gaussian-based peak-SNR measures provide very similar results, inde-
pendent of whether the image is first blurred by a small filter kernel prior to deter-
mining Lmax . The median-based peak-SNR metric shows a somewhat higher dy-
namic range, presumably because this filter is better able to leave edges
intact.
As such, of the filters and measures included in this comparison, we would advocate the use of a median-filter-based measure of dynamic range, employing a peak-SNR measure rather than an RMS measure.
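The advocated measure might be sketched as follows. The residual-based estimate of the noise floor σn is our assumption for illustration; the text notes that noise estimation admits many solutions.

```python
import numpy as np

def peak_snr_db(luminance, k=3):
    """Median-filter-based peak-SNR measure of dynamic range (a sketch).

    The image is median filtered with a k x k kernel, which suppresses
    noise while leaving edges largely intact; the noise floor sigma_n is
    estimated as the standard deviation of the residual (original minus
    filtered), and the peak SNR then follows (15.13)."""
    assert k % 2 == 1
    pad = k // 2
    padded = np.pad(luminance, pad, mode="edge")
    h, w = luminance.shape
    # Gather every k*k neighborhood and take its median.
    stack = np.stack([padded[dy:dy + h, dx:dx + w]
                      for dy in range(k) for dx in range(k)])
    filtered = np.median(stack, axis=0)
    sigma_n = (luminance - filtered).std()
    return 20.0 * np.log10(filtered.max() ** 2 / sigma_n)
```

The maximum is taken from the filtered image so that a single noisy pixel cannot masquerade as a bright feature.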
Figure 15.16. The processing steps necessary to prepare a low dynamic range image for
display on a specific display device.
undo all the device-dependent encoding steps and bring the image into a common
subspace where further processing may occur to account for a specific rendering
intent. In addition, gamut mapping may occur in this subspace. The image is
then ready to be transformed into the device-dependent space associated with the
display device for which the image is being prepared. An overview of the steps
involved is given in Figure 15.16 [795].
The first step in the transform from original low dynamic range image to the
device-independent subspace is to linearize the data by undoing the gamma en-
coding. With known primaries and white point, the image can then be transformed
into a device-independent color space (see Chapter 8). A color appearance model
may then be applied to account for the viewing environment for which the image
was originally intended (see Chapter 11).
The image is now in a common subspace, where further image processing may
be applied, such as enhancing for accurateness, pleasantness, etc., depending on
the reason for displaying a particular image; this step may be omitted depending
on rendering intent.
The next step in the common subspace is to account for the difference in
color gamut between the display device and the gamut defined by the image’s
primaries. In particular, if the color gamut of the display device is smaller than
the image’s, as for example in printing devices, the range of colors in the image
must be shoe-horned into a smaller gamut. It should be noted here that source
and target gamuts may overlap, so that in practice gamut compression may be
applied for some colors and gamut expansion may be applied for other colors,
even within the same image. Algorithms to accomplish this are called gamut
mapping algorithms and are discussed in Section 15.4.
After gamut mapping, the linear, enhanced, and gamut-mapped image must
be encoded for the display device on which the image is to be displayed. The first
step is to apply a color appearance model in backward direction, using appearance
parameters describing the viewing environment of the display. The image is then
transformed into the display’s device-dependent color space, followed by gamma
correction to account for the non-linear relationship between input values and
light intensities reproduced on the display.
The human visual system’s adaptability is such that if one or more of the above
steps are omitted, chances are that inaccuracies go largely unnoticed. Many ap-
plications are not color critical, and this helps ameliorate the effects of inaccurate
color reproduction. However, if two color reproductions are shown side-by-side,
for instance when a print is held next to a computer monitor, large to very large
differences may become immediately obvious. For accurate cross-media color
reproduction, all the steps summarized in Figure 15.16 need to be executed.
Figure 15.17. The processing steps involved to display a high dynamic range image on a
specific display device.
Figure 15.18. The processing steps involved to display a high dynamic range image on a
high dynamic range display device.
Figure 15.19. The processing steps involved to display a low dynamic range image on a
high dynamic range display device.
where Lpeak is the display’s peak luminance level (in cd/m2 ) and Q is the associ-
ated perceived image quality. The constants c1 and c2 relate to the optimal peak
luminance level Lopt and the corresponding maximum perceived image quality
Qmax as follows:
Lopt = 1/b, (15.21a)
Qmax = exp(−1) a/b. (15.21b)
Values of Lopt and Qmax for various contrast ratios are given in Table 15.1. Further,
it was found that the optimal contrast ratio ΔLopt as a function of peak luminance
Lopt can be broadly modeled with the following relation:
ΔLopt ≈ 2862 ln Lopt − 16283. (15.22)
All studies suggest that simply emulating LDR display hardware when dis-
playing legacy content on HDR displays would be a sub-optimal solution. In-
creasing luminance and contrast would be preferred. Algorithms to achieve ap-
propriate up-scaling are termed inverse tone reproduction operators. Fortunately,
provided that the LDR image is properly exposed and does not contain compres-
sion artifacts, the procedure for up-scaling can be straightforward. In fact, user
studies indicate that linear up-scaling is adequate and is preferred over non-linear
scaling [26].
In line with these findings, Rempel et al. propose a linear scaling for pixels
representing reflective surfaces and a non-linear scaling for pixels representing
light sources and other scene elements that are perceived as bright [955]. A similar
technique was proposed specifically for the treatment of light sources [776, 778].
Table 15.1. Optimal display peak luminance values and corresponding maximum per-
ceived image quality for various contrast ratios (after [1022]).
In video encodings, values over 235 (the white level) are typically assumed to
represent light sources or specular highlights, and this value is therefore used to
determine which type of up-scaling is used. Given that the conversion from low
dynamic range to high dynamic range values contains two segments, there is a risk
of introducing spatial artifacts when neighboring pixels are scaled by different
methods. These effects can be mitigated by applying spatial filtering [955]. If
the input image was not linear to begin with, it is first linearized by undoing the
gamma encoding.
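A sketch of such a two-segment up-scaling, in the spirit of Rempel et al. [955]: pixels below the white level are scaled linearly, while brighter pixels receive an additional boost. The peak luminance, threshold, and boost values are illustrative parameters, not values from the original paper.

```python
import numpy as np

def expand_dynamic_range(ldr, l_peak=3000.0, threshold=235.0 / 255.0, boost=4.0):
    """Two-segment inverse tone reproduction (a sketch).

    `ldr` holds linearized pixel values in [0, 1] (gamma already undone).
    Values below `threshold` (the video white level, 235/255) are scaled
    linearly to the display peak `l_peak`; values above it, assumed to
    represent light sources or specular highlights, receive a smooth
    extra gain of up to `boost` to avoid a hard seam between segments."""
    linear = ldr * l_peak
    t = np.clip((ldr - threshold) / (1.0 - threshold), 0.0, 1.0)
    gain = 1.0 + (boost - 1.0) * t * t   # quadratic ease-in of the boost
    return linear * gain
```

The quadratic ease-in stands in for the spatial filtering used in [955] to mitigate seams between the two segments; a full implementation would filter spatially as well.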
Input images are not always well-behaved; for instance they may contain
under- or over-exposed areas. In certain cases, it may be possible to apply texture
synthesis techniques to infer and re-synthesize the missing image areas [1207].
In addition, images may have been encoded using lossy compression. In this
case, the up-scaling to a higher display range may amplify the artifacts that were
initially below visible threshold to values that are easily perceived. Additional
processing is therefore likely to be necessary to deal with problem images.
Figure 15.20. The images on the left are assumed to be specified in the Adobe RGB color
space. Conversion to the sRGB color space creates some pixels that are out-of-gamut.
These are encoded as red pixels on the right. Top: Castle Coombe, UK, September 2006.
Bottom: Stained glass window in Merchant Venturers Building, Bristol, UK, March 2006.
In general, it may be necessary to compress some colors and expand others. The discussion in this section outlines the issues related to gamut mapping, and follows the work by Ján Morovič [795, 797, 798].
To achieve a consistent output across media it is therefore necessary to map an
image from one color space into a different one. Simple color space conversion
is appropriate for colors that are within the overlapping region in both source and
target color spaces. Colors that are outside the target color gamut need to be
mapped to colors that are within the target device’s gamut. The gray region in
Figure 15.21 indicates the set of colors that cannot be reproduced given that the
two example gamuts only partially overlap.
Remapping out-of-gamut colors to colors that are exactly at the boundary of
the target device’s gamut is called gamut clipping. This approach has the ad-
vantage that most colors remain unaffected and will therefore be accurately re-
produced. However, if only out-of-gamut colors are remapped, then undesirable
visible discontinuities may be introduced at the boundaries.
An alternative approach is to warp the entire gamut, thereby distorting all
colors. This is called gamut mapping. The aim is then to keep the distortion
of all colors to a minimum and, hopefully, below the visible threshold. Such an
approach avoids jarring discontinuities; however, the trade-off is that all colors will be distorted by some amount.
Figure 15.21. Partially overlapping gamuts (left) cause the colors located in the gray
region to be out-of-gamut.
Currently, models to quantify the appearance of images are still in their in-
fancy, although progress in this area is being made (see Chapter 11). This makes
a direct implementation of the above definition of gamut mapping more compli-
cated. Instead, a set of heuristics for gamut mapping have been identified; they
are listed here in decreasing order of importance [721, 795]:
Ensure that gray values are mapped to gray values. This means that achromatic
values in the original gamut should remain achromatic in the reproduction
gamut.
Reduce the number of out-of-gamut colors. Although ideally all the colors are
brought within range, the overall appearance of an image can be improved
by clipping some values against the boundaries of the reproduction gamut.
In the extreme case, all the out-of-gamut colors are clipped, although this
is unlikely to be the optimal case.
Increase saturation. Given that the reproduction gamut is smaller than the origi-
nal gamut, it is often preferable to maximize the use of this smaller gamut,
even if this means increasing saturation. This is similar to maximizing lu-
minance contrast, albeit along a different dimension.
It should be noted that this list of requirements, and their ordering, is largely based
on practice and experience in color reproduction.
Methods for computing the boundaries of media gamuts or image gamuts have also been discussed in the literature [176, 528, 621, 796, 989].
To achieve effective gamut clipping, as well as gamut mapping, an algorithm
to intersect a line along which the mapping is to be carried out with the calcu-
lated gamut must be applied. Such problems can be solved with techniques akin
to ray tracing [1042], although specific gamut intersection algorithms have been
proposed as well [124, 467, 468, 794].
Figure 15.22. A simple gamut clipping scheme, whereby out-of-gamut colors are moved toward the reproduction gamut boundary along the axes of the color space in which gamut clipping is executed.
If the gamut boundary happens to align with the color space's axes, then this constitutes the smallest possible shift and, therefore, the smallest possible distortion in the final result.
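When the reproduction gamut is simply the RGB unit cube of a display, clipping along the color space's axes reduces to clamping each channel independently; a minimal sketch:

```python
import numpy as np

def clip_to_gamut(rgb):
    """Axis-aligned gamut clipping for a unit-cube RGB reproduction
    gamut: each out-of-range channel is moved to the gamut boundary
    independently, while in-gamut colors are left untouched."""
    return np.clip(rgb, 0.0, 1.0)
```

Note that clamping channels independently can shift the hue of an out-of-gamut color, which is one reason more careful clipping directions are considered next.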
Gamut boundaries, however, rarely ever align with the axes of any color space.
A more accurate gamut clipping approach therefore shifts out-of-gamut colors
along a direction orthogonal to the nearest gamut boundary, as shown in Fig-
ure 15.23.
Figure 15.23. Gamut clipping where out-of-gamut colors are moved toward the nearest point on the reproduction gamut boundary.
Figure 15.24. Example remapping functions (gamut clipping, linear, piece-wise linear, and polynomial), plotted as reproduction value against original value; out-of-gamut values lie beyond the reproduction boundary.
Gamut compression algorithms allow colors inside the reproduction gamut to maintain their relationship with neighboring colors which may be outside the reproduction gamut. This means that all colors
are remapped by some amount. Typically, colors that are outside the reproduc-
tion gamut are remapped over larger distances than colors inside the reproduction
gamut. Mappings can be classified either as linear, piece-wise linear, polynomial,
or a combination of these methods. Example remapping functions are shown in
Figure 15.24.
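The clipping, linear, and piece-wise linear mappings of Figure 15.24 might be sketched as follows, applied to a single channel such as chroma; the gamut extents `m` and `g` and the knee position are illustrative values.

```python
import numpy as np

def clip_map(v, g=0.8):
    """Gamut clipping: values beyond the reproduction boundary g are
    moved onto the boundary; in-gamut values are unchanged."""
    return np.minimum(v, g)

def linear_map(v, m=1.0, g=0.8):
    """Linear compression: every value in the original gamut [0, m] is
    scaled into [0, g], so all colors are distorted but relationships
    between neighboring colors are preserved."""
    return v * (g / m)

def piecewise_linear_map(v, knee=0.6, m=1.0, g=0.8):
    """Piece-wise linear compression: values below the knee are kept,
    values above it are compressed into the remaining range [knee, g],
    so out-of-gamut colors move farther than in-gamut ones."""
    slope = (g - knee) / (m - knee)
    return np.where(v <= knee, v, knee + (v - knee) * slope)
```

A polynomial mapping would round off the knee of the piece-wise linear curve, trading a smoother derivative for slightly more distortion near the boundary.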
Figure 15.25. Gamut mapping illustrated in the lightness-chroma plane, showing the original gamut, the reproduction gamut, image colors and their reproduction colors, and the cusp of the reproduction gamut.
Table 15.2. Percentage of out-of-gamut errors accumulated for different color spaces
(after [577]).
An alternative is to choose primaries that span a wider gamut, which would result in fewer out-of-gamut problems. For instance, using monochromatic primaries with xy chromaticity coordinates (0.7117, 0.2882), (0.0328, 0.8029), and (0.1632, 0.0119) for red, green, and blue, respectively, yields
5.3% of out-of-range colors with average and maximum errors of 1.05 and 28.26,
respectively. Although the average error is now much smaller, the maximum error
remains large. This is due to the fact that the choice of primaries is important. In
this case, the primaries are such that the line between the green and red primaries
excludes a significant portion of yellow colors [577].
Finally, it is possible to select primaries that are outside the spectral locus. An
example is the RIMM-ROMM space [1072]. In this color space, the red primary is
at 700 nm, but the green and blue primaries are outside the spectral locus, having
xy chromaticity coordinates of (0.7347, 0.2653), (0.1596, 0.8404), and (0.0366,
0.0001). In the above experiment, no patches were out-of-gamut, and the average
and maximum ΔEab errors are 0.71 and 4.29, respectively [577]. Although it is
clear that such a color space cannot be realized with any physical device, it is a
useful intermediate color space, as it largely avoids gamut conversion problems.
Figure 15.26. Gamma chart. Displaying this chart on a monitor enables the display
gamma to be estimated (after an original idea by Paul Haeberli).
Display devices typically exhibit a power-law relationship between input value and emitted luminance; the exponent γ is the gamma value associated with the display device [923].
The EOCF of a monitor can be measured with a photometer. However, under
the assumption of power law behavior, it is also possible to estimate the gamma
exponent by displaying the image shown in Figure 15.26. The alternating black
and white scanlines on one side should produce a middle gray as these lines are
fused by the human visual system. This middle gray can be matched to an entry in
the chart on the other side, which consists of patches of gray of varying intensities.
The best match in the chart is the gray value that the monitor maps to middle gray.
From the actual value in that patch p ∈ [0, 1] the gamma exponent of the monitor
can be derived:
γ = log(0.5) / log(p). (15.24)
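Equation (15.24) translates directly into code; `p` is the value of the chart patch that the observer selects as matching the fused scanline pattern.

```python
import math

def display_gamma(p):
    """Estimate display gamma from a Haeberli-style gamma chart reading.

    `p` in (0, 1) is the patch value that visually matches the fused
    black-and-white scanlines, which average to a linear middle gray of
    0.5; the gamma exponent then follows from (15.24)."""
    return math.log(0.5) / math.log(p)
```

For example, a reading of p = 0.5^(1/2.2) ≈ 0.73 corresponds to a display gamma of 2.2, while p = 0.5 indicates a linear display.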
where LR is the amount of reflected ambient light. While this component can once
more be measured with a photometer with the display device switched off, there
also exists a technique to estimate the reflections off a screen, similar to reading a
gamma chart, albeit slightly more complicated [251].
This technique is based on Weber’s law (see Section 5.5.2). It requires a
method to determine a just noticeable difference (JND) ΔLd for a particular dis-
play value Ld , obtained with all the light sources in the room switched off. We
can repeat this experiment with the room lights on, yielding a second JND ΔLb .
To determine the just noticeable contrast, a test pattern with a set of patches
increasing in contrast can be displayed. By choosing 1/ f noise as the test pattern
(as shown in Figure 15.27), the pattern matches the second-order image statistics
of natural scenes. It is therefore assumed that the calibration is for tasks in which
natural images are involved.
By reading this chart, the viewer is able to select which patch holds the pattern
that is just visible. It was found that this task is simple enough to be fast, yet
accurate enough to be useful in the following computations [251].
This pattern is read twice, once with the lights on, and once with the lights
off. With the two JNDs obtained using the above method, Weber’s law then states
that
ΔLd / Ld = ΔLb / (Ld + LR). (15.26)
Figure 15.27. Test pattern used to determine the smallest amount of contrast visible
on a particular display device. (Image from [251], © 2006 ACM, Inc. Reprinted with permission.)
From this equation, we can compute how much light LR is reflected off the screen:
LR = Ld (ΔLb / ΔLd − 1), (15.27)
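Equation (15.27) in code form; the two JND readings are assumed to be taken at the same display luminance Ld, once with the room lights off and once with them on.

```python
def ambient_reflection(L_d, dL_dark, dL_bright):
    """Estimate the ambient light LR reflected off the screen, per
    (15.27): `dL_dark` is the JND measured with the room lights off,
    `dL_bright` the JND with the lights on, both at luminance `L_d`."""
    return L_d * (dL_bright / dL_dark - 1.0)
```

If the two readings are equal, no ambient light is reflected; a larger bright-room JND indicates proportionally more reflected light.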
(Here m denotes the maximum display value.) The constraints on this function f : [0, m] → [0, m] can be summarized as follows:
f(0) = 0, (15.28a)
f(m) = m, (15.28b)
f(L) = L − LR,  LR ≤ L ≤ m, (15.28c)
f′(L) = 1, (15.28d)
f′(x) ≥ 0. (15.28e)
By splitting this function into two domains, the first spanning [0, L] and the second
spanning [L, m], these constraints can be met by setting [1006]
f(x) = px / ((p − 1)x + 1). (15.29)
We first make the appropriate substitutions to yield the following pair of functions:
f[0,L](x) = (L − LR) px / (x(p − 1) + L), (15.30a)
f[L,m](x) = (m − L + LR) · p(x − L)/(m − L) / ((p − 1)(x − L)/(m − L) + 1) + L − LR. (15.30b)
We can now solve for the free parameter p by using the restrictions on the deriva-
tive of f . For the [0, L] range we have
f′[0,L](x) = p(L − LR) / (L((p − 1)x/L + 1)) − (p − 1)px(L − LR) / (L²((p − 1)x/L + 1)²) (15.31a)
= p(L − LR)L / (xp − x + L)² (15.31b)
= 1. (15.31c)
Similarly, for the [L, m] range we have
f′[L,m](x) = p(m − L + LR) / ((m − L)((p − 1)(x − L)/(m − L) + 1)) − (p − 1)p(x − L)(m − L + LR) / ((m − L)²((p − 1)(x − L)/(m − L) + 1)²) (15.32a)
= p(m − L + LR)(m − L) / ((p − 1)(x − L) + m − L)² (15.32b)
= 1. (15.32c)
Figure 15.28. Example remapping functions f(x), plotted for pivot values L = 12.75, 25.50, 38.25, 51.00, and 76.5 with m = 255, corresponding to JNDs of 5%, 10%, 15%, and 20% of the maximum display value m. (Image from [251], © 2006 ACM, Inc. Reprinted with permission.)
Choosing the maximum value m = 255 and the pivot point L at one third the
maximum, example mappings are plotted in Figure 15.28. It will be assumed that
the pivot point L is the same as the luminance value for which the JNDs were
computed, i.e., L = Ld . As the correction algorithm is optimal for this particular
value, and necessarily approximate for all others, the choice of pivot point, and
therefore the display value Ld for which the JNDs are computed, should be chosen
such that the displayed material is optimally corrected. For very dark images, the
JNDs should be measured for lower values of Ld , whereas for very light images,
higher values of Ld could be chosen. In the absence of knowledge of the type of
images that will be displayed after correction, the JNDs can be computed for a
middle gray, i.e., L = Ld ≈ 0.2 m.
In summary, by applying this function to display values, we get new display values which are corrected for ambient reflections off of the screen. The light that reaches the retina is now given by
L′d = f(Ld) + LR.
For the pivot point L = Ld, we have f(Ld) = Ld − LR. The light that reaches the eye is therefore exactly L′d = Ld, which is the desired result. For values larger or smaller than Ld, the correction is only partial.
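A sketch of the full correction function, combining the two segments of (15.30) with a value of p chosen per segment so that the derivative constraint f′(L) = 1 of (15.28d) holds at the pivot; this per-segment choice of p is our reading of the derivation above.

```python
import numpy as np

def ambient_correction(x, L, LR, m=255.0):
    """Two-segment ambient-reflection correction of (15.30) (a sketch).

    `x` is the display value, `L` the pivot at which the JNDs were
    measured, `LR` the reflected ambient light, and `m` the maximum
    display value. In each segment the free parameter p is set so that
    f'(L) = 1, making the correction exact at the pivot."""
    x = np.asarray(x, dtype=float)
    p_lo = (L - LR) / L                 # from f'(L) = 1 on [0, L]
    p_hi = (m - L) / (m - L + LR)       # from f'(L) = 1 on [L, m]

    lo = (L - LR) * p_lo * x / (x * (p_lo - 1.0) + L)
    u = (x - L) / (m - L)
    hi = (m - L + LR) * p_hi * u / ((p_hi - 1.0) * u + 1.0) + (L - LR)
    return np.where(x <= L, lo, hi)
```

By construction, f(0) = 0, f(m) = m, f(L) = L − LR, and the slope is one at the pivot, so the corrected pivot value reaches the eye unchanged once the reflected light LR is added back.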
Chapter 16
Color Management
Figure 16.1. Device-independent color management: device units are brought into a device-independent CIE XYZ reference space through input device characterization, and converted to output device units through output device characterization.
In earlier chapters we were introduced to these types of color spaces, based upon CIE XYZ colorimetry and extended to include color appearance spaces such as CIELAB and CIECAM02.
Recall our introduction to device-independent color from Chapter 7 using the CIE
XYZ system of colorimetry, redrawn here in Figure 16.1.
For this type of color management system, we bring the individual device
units into a common reference color space. The common reference color space,
or connection space, must be well understood and unambiguously defined in order
to be considered device-independent. We bring the colors into this space through
a process known as device characterization, which involves careful measurement
of the individual device properties. Conceptually, we can think of converting the
device units into the reference color space as a forward characterization. From
the reference color space we can then move to any output device using an inverse
files [523]. As of this printing, the current ICC profile format stands at Version
4.2.0.0, which is defined in the ICC.1:2004-10 profile specification and also at
their website (http://www.color.org).
Much like our generic color management example given above, the profile-
based system recommended by the ICC is grounded in the use of a reference
color space which is "unambiguously defined" [523]. The reference color space
chosen by the ICC is the CIE XYZ system of colorimetry, based upon the 1931
Standard Observer. Since we know the CIE XYZ system still contains some am-
biguity with regard to the appearance of colors, further assumptions based upon
a reference viewing condition and standard color measurement techniques were
also defined. The reference viewing condition follows the recommendations of
ISO 13655:1996, Graphic Technology—Spectral Measurement and Colorimetric
Computation for Graphic Arts Images [533]. The reference medium is defined
to be a reflection print, viewed under CIE D50 illumination at 500 lux, with a
20% surround reflectance. Additionally, the reflectance media should be mea-
sured using a standard 45/0 or 0/45 measurement geometry with a black backing
placed behind the media substrate. This measurement geometry places the light
source and color measurement sensor 45 degrees apart to minimize specular re-
flectance contributions. In addition, the ICC and ISO standards define the relative
reflectance of the reflectance substrate to be 89% relative to a perfect reflecting
diffuser, with the darkest value having a reflectance of 0.30911% [523]. These
values correspond to a linear dynamic range of approximately 290:1. Additionally,
there are specifications for viewing flare, or stray light hitting the reference
medium, of about 1.06 candelas per square meter. These standardized viewing
conditions, along with the reference color space of the 1931 CIE XYZ tristimu-
lus values, are called the Profile Connection Space (PCS). The PCS is the basis
for all the color profile based transformations used in ICC color management, as
shown in Figure 16.2. Additionally, the ICC allows the use of CIELAB as a PCS,
calculated based upon the standard CIE XYZ tristimulus values as well as on the
reference viewing conditions.
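These reference numbers can be sanity-checked with a few lines of arithmetic. The sketch below is ours, not part of any ICC data structure; the constants come from the viewing-condition description above, and the flare comparison assumes a perfect diffuser under 500 lux:

```python
import math

# Reflectances from the ICC/ISO reference viewing conditions described above.
white_reflectance = 0.89        # substrate, relative to a perfect diffuser
black_reflectance = 0.0030911   # darkest value (0.30911%)

# The spec's "approximately 290:1" dynamic range follows directly:
dynamic_range = white_reflectance / black_reflectance
assert 280 < dynamic_range < 300

# The 1.06 cd/m^2 viewing-flare figure in context: a perfect diffuser under
# 500 lux has luminance 500/pi cd/m^2, so the flare is well under one
# percent of the media white (our inference, not a figure from the spec).
white_luminance = (500 / math.pi) * white_reflectance
assert 0.005 < 1.06 / white_luminance < 0.01
```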
The color transforms, represented by the circled T symbols in Figure 16.2, provide
a road map for converting between specific device units and the reference CIE
XYZ color space. These transforms form the basis of the ICC color profiles.
These transforms could be based entirely on color measurements of the particu-
lar imaging devices and conform to generating CIE XYZ colorimetric matches.
Recall once again, from Chapter 7, that colorimetric matches are only guaranteed
to hold up for a single viewing condition. When performing cross-media color
reproduction, we are almost certain to attempt to generate color matches across
vastly disparate viewing conditions, using a wide variety of devices. By providing
Figure 16.2. The general ICC color management workflow, using a Profile Connection
Space (PCS), redrawn from [523].
additional information for the PCS, namely by providing a reference viewing con-
dition and medium as well, the ICC has created a system for color management
that can account for these changes in color appearance. This includes changes in
the color of the viewing environment, using chromatic adaptation transforms, as
well as changes in the surround, media-white and black points, and overall tone
and contrast. How these color transforms are chosen and applied is specified by
the rendering intent of the end-user.
where Xmw is the CIE tristimulus value of the media white, as specified by the
mediaWhitePointTag and Xrel is the relative CIE tristimulus value being normal-
ized. It is important to note that while this is referred to as ICC absolute colori-
metric intent, it is actually closer to relative colorimetric scaling as defined by
the CIE, since we are dealing with normalized units relative to a perfect reflect-
ing diffuser. This alone has probably been a point of confusion for many users.
The ICC absolute colorimetric intent may be most useful for simulating different
output devices or mediums on a single device such as a computer LCD display,
a process generally referred to as color proofing. A similar strategy could be uti-
lized to simulate one type of printing press or technology on another printer. For
practical applications, the ICC states that the media-relative colorimetric intent
can be used to move in and out of the PCS, and absolute colorimetric transforms
can be calculated by scaling these media-relative PCS values by the ratio of the
destination media white to the source media white [523].
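The scaling the ICC describes can be sketched in a few lines. This is our own illustrative helper, not an ICC API; the white-point numbers are hypothetical:

```python
def media_relative_to_absolute(xyz_rel, src_media_white, dst_media_white):
    """Scale media-relative PCS values channel by channel by the ratio of
    the destination media white to the source media white, as the ICC
    describes (function name and signature are our own)."""
    return tuple(v * d / s
                 for v, s, d in zip(xyz_rel, src_media_white, dst_media_white))

src_white = (0.9643, 1.0000, 0.8251)   # hypothetical D50-like source white
dst_white = (0.9000, 0.9400, 0.7800)   # hypothetical dimmer destination white

# The source media white maps exactly onto the destination media white:
out = media_relative_to_absolute(src_white, src_white, dst_white)
assert all(abs(a - b) < 1e-12 for a, b in zip(out, dst_white))
```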
The ICC colorimetric intents should be based upon actual color measurements
of the imaging devices, allowing for chromatic adaptation to the reference view-
ing conditions. It is possible that these color measurements do not represent actual
color measurements but rather measurements that may have been altered in order
to minimize color error for important colors, such as skin tones, or even to cre-
ate preferred color reproduction of the in-gamut colors. Although these types of
color measurements may not represent true CIE XYZ colorimetry, they will be
interpreted as colorimetric transformations in an ICC workflow.
The saturation and perceptual intents are much less stringently defined than
the two colorimetric rendering intents. These are purposely left to be vendor-
specific, to allow for proprietary color rendering strategies. Of these two, the
saturation intent is more specific. This intent was designed to allow for the re-
production of simple images where color saturation, vividness, or chroma is most
important, such as business graphics and charts. In these situations, it may be
Figure 16.3. An ICC perceptual rendering intent workflow with a single input and output
device.
Once transformed into the PCS, we then have another color profile transform
to an output device. This is generally another re-rendering into the viewing con-
ditions of the output display device. If we are moving to a device that also is
capable of reproducing a large dynamic range, we may wish to undo some of
the tone compression that was performed on the input, provided this can be done
in a visually lossless manner. Additionally, the perceptual re-rendering can take
into account properties of the viewing conditions, adjusting the overall bright-
ness or contrast as a function of illumination level and the surround. Again, these
re-renderings are performed into the PCS space, assuming the reference viewing
conditions, and then back out of that PCS space into the viewing conditions of the
output device.
Color measurement and color appearance models may be used to generate a
baseline perceptual rendering transform, but they may still not generate an overall
desired color reproduction. As such, it is possible to manipulate the color trans-
formations in order to maximize overall image pleasantness. Just as this could
be done with the colorimetric transforms, for example, to minimize errors in spe-
cific regions of color space, the PCS will assume that color appearance matches
are being generated. As such, the creation of perceptual transforms used in the
ICC color management system can almost be classified as part science and tech-
nology and part creative or artistic choice. Commercial software is available to
generate colorimetric profiles and transforms for most common color imaging
devices, though there are far fewer resources available for generating these per-
ceptual transforms.
To reemphasize this point, remember that the perceptual rendering intent is
used to generate desired or pleasing color appearance matches, and the specific
definition of these transforms is left to the actual manufacturers, vendors, and im-
plementers of the ICC profiles. It is quite likely that the color-managed image
reproduction between multiple devices using the perceptual intent will be differ-
ent based upon the different manufacturers’ choices of implementation. Although
this does sound slightly ominous with regard to accurate color reproduction across
different device manufacturers, by forcing all the perceptual rendering transforms
to first go into the viewing conditions of the reference PCS and then using these
same reference conditions on output, the ICC system does, in practice, mitigate
manufacturer differences. It should also be noted that although the reference
viewing conditions are designed to emphasize the graphic arts and printing indus-
try, similar concepts can be utilized or extended for pure soft-copy reproduction,
such as digital camera and computer display, or a film and video workflow.
All of the color transforms for these different rendering intents are actually
encoded into the color profiles themselves. Many, but not all, profiles supply all
of these transforms internally and allow the color management engine to choose
the appropriate one based upon user input. This potentially makes the generation
of color profiles a complicated, and even artistic, procedure. Many end users of
color management systems are also often confused about which rendering intent
to choose. It is our goal that this chapter will aid them in making their
decisions. The ICC also provides some guidance, generally recommending the
perceptual rendering intent when strict adherence to colorimetric matching is not
necessary, such as when creating pleasing natural image reproductions [523].
They also recommend using the perceptual transform when the input and out-
put devices have largely disparate gamuts or properties, as the perceptual intent
should do a better job of gamut-mapping and taking into account changes in view-
ing conditions. When working in a proofing environment, e.g., when simulating
one type of device on another, the ICC recommends using the colorimetric intents.
Remember that the colorimetric intents are only attempting to generate colorimet-
ric matches for colors that are within the gamut of the devices; they do not have
defined gamut-mapping algorithms associated with them.
Figure 16.4. A general overview architecture of the ICC color management framework,
redrawn from [523].
Rather than building the color transforms into each of the applications and
libraries directly, the system allows dynamic swapping of the actual CMM used.
When used at the operating system level, this type of architecture
allows all computer applications to receive identical color management and not
worry about applying the color transforms themselves.
The profile header contains useful information about the profile itself, as well
as the type of transforms defined. Among other elements, it contains information
about the canonical input space, e.g., RGB or CMY, and the desired profile
connection space or canonical output space, e.g., XYZ or CIELAB. It can also
include information about the preferred CMM, the manufacturer and model of
the device being profiled, the profile creator, etc. Additionally the profile header
contains the class of the profile. There are seven distinct classes of profiles, and
these are defined to make it easier for the CMM to implement certain types of
color transforms. The seven classes are:
• input profile;
• display profile;
• output profile;
• DeviceLink profile;
• ColorSpace conversion profile;
• abstract profile;
• named color profile.
Of these classes, the input, output, and display profiles are by far the most
common among standard color imaging systems. The device link profile is more
exotic, and is used to link two devices together without using the PCS. This may
be beneficial when performing color reproduction in a closed system, such as ty-
ing a scanner or camera directly to a printer. Additionally these classes of profiles
may be used for color proofing of multiple printers, while avoiding any cross-
contamination, for example, between the CMYK primaries of the two printers.
The color space conversion profile is designed for transforming between a given
color space that does not describe a specific device and the PCS. For example,
it would be possible to use this type of profile to convert from images encoded
in an LMS cone space into the PCS XYZ space. The abstract profile and named
profile are even more rare. The abstract profile can contain color transformations
from the PCS to the PCS and could be used as a form of color re-rendering. The
named color profile is used to transform from device units into a specific categor-
ical color space, corresponding to color names. This could be used for a specific
device that has a set of named colorants or primaries.
Following the profile header is the tag table, which serves as an index into
the actual transform information contained in the profile. The tag table includes
the number of tags and, for each tag, its signature, size, and location in the
binary file. The number of tags required by the ICC specification varies
by the type and class of the profile. There are a few common requirements for
all the classes, except for the device link: profileDescriptionTag, copyrightTag,
mediaWhitePointTag, and chromaticAdaptationTag.
The first two required tags provide information about the profile itself. As
we already learned in Section 16.2.1, the mediaWhitePointTag is used to convert
from the media-relative colorimetric intent into the ICC absolute colorimetric in-
tent, through a normalization of the XYZ or CIELAB values. Remember, the ICC
absolute colorimetric intent should be thought of as a type of relative colorime-
try, as defined by the CIE. The chromaticAdaptationTag encodes a 3 × 3 color
matrix that can be used to convert a CIE XYZ tristimulus value measured at the
device’s viewing conditions into the D50 illumination of the reference viewing
condition. Although any chromatic adaptation matrix could be used, the ICC rec-
ommends using the linearized Bradford matrix, as discussed in Section 10.4.5.
The chromaticAdaptationTag is encoded as a nine-element array, read in
row-major order as a 3 × 3 matrix:
\[
\begin{bmatrix} a_0 & a_1 & a_2 \\ a_3 & a_4 & a_5 \\ a_6 & a_7 & a_8 \end{bmatrix}. \tag{16.2}
\]
The equations for calculating a chromatic adaptation matrix using the linearized
Bradford transform for the ICC D50 reference condition are
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
M_{\mathrm{BFD}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix}
0.8951 & 0.2664 & -0.1614 \\
-0.7502 & 1.7135 & 0.0367 \\
0.0389 & -0.0685 & 1.0296
\end{bmatrix} \cdot
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \tag{16.4a}
\]
\[
M_{\mathrm{adapt}} = M_{\mathrm{BFD}}^{-1} \cdot
\begin{bmatrix}
\dfrac{R_{D50}}{R_{\mathrm{Source}}} & 0.0 & 0.0 \\
0.0 & \dfrac{G_{D50}}{G_{\mathrm{Source}}} & 0.0 \\
0.0 & 0.0 & \dfrac{B_{D50}}{B_{\mathrm{Source}}}
\end{bmatrix} \cdot M_{\mathrm{BFD}}. \tag{16.4b}
\]
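Equation (16.4) can be exercised directly. The pure-Python sketch below builds the adaptation matrix for an assumed D65 source white; by construction, the adapted source white lands exactly on the D50 reference white:

```python
# Linearized Bradford matrix from Equation (16.4a).
M_BFD = [[ 0.8951,  0.2664, -0.1614],
         [-0.7502,  1.7135,  0.0367],
         [ 0.0389, -0.0685,  1.0296]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv3(m):
    # Inverse via the adjugate; adequate for well-conditioned 3x3 matrices.
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

white_D50 = [0.9642, 1.0000, 0.8249]
white_D65 = [0.9504, 1.0000, 1.0888]   # assumed source white for this example

# Diagonal gain matrix of Equation (16.4b), then M_adapt.
rgb_dst = matvec(M_BFD, white_D50)
rgb_src = matvec(M_BFD, white_D65)
gain = [[rgb_dst[i] / rgb_src[i] if i == j else 0.0 for j in range(3)]
        for i in range(3)]
M_adapt = matmul(inv3(M_BFD), matmul(gain, M_BFD))

# By construction, the source white adapts exactly to D50:
adapted = matvec(M_adapt, white_D65)
assert all(abs(a - b) < 1e-9 for a, b in zip(adapted, white_D50))
```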
Figure 16.5. Workflow for converting colors from an input or display device into the PCS
using a MatTRC transform, redrawn from [523].
Profile transforms fall into two broad types: matrix and tone-reproduction
curves (MatTRC) and color look-up tables (cLUT or LUT). The
matrix and tone-reproduction curves are used primarily to represent the colori-
metric transforms for some input and display classes of profiles. The look-up
table type transforms can be used for the colorimetric intents for the three main
classes of profiles, as well as for the perceptual and saturation intents. It should
be noted that these transform types are not defined for the named color profiles,
which have their own special characteristics.
The MatTRC transforms should be familiar to readers, as they are similar
transforms to those discussed for converting different types of additive RGB color
spaces in Chapter 8. An example workflow for these transformations is shown in
Figure 16.5. The specific tags in the ICC profile that relate to these transforms
are redTRCTag, redMatrixColumnTag, greenTRCTag, greenMatrixColumnTag,
blueTRCTag, and blueMatrixColumnTag. The three MatrixColumnTags repre-
sent the measured XYZ values of the three red, green, and blue sensors of an input
device, such as a camera or scanner, or the RGB primaries of a display device.
As we learned from Chapters 7 and 8, additive devices such as LCD and CRT
displays can be characterized by measuring the individual CIE XYZ values of
the three primaries. Using Grassmann’s laws, we can calculate a 3 × 3 color
matrix transformation from an RGB device into CIE XYZ by adding the individual
CIE XYZ values of the primaries. The red, green, and blue MatrixColumnTag
elements form the first, second, and third columns of this matrix, respectively:
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix}
X_{\mathrm{red}} & X_{\mathrm{green}} & X_{\mathrm{blue}} \\
Y_{\mathrm{red}} & Y_{\mathrm{green}} & Y_{\mathrm{blue}} \\
Z_{\mathrm{red}} & Z_{\mathrm{green}} & Z_{\mathrm{blue}}
\end{bmatrix} \cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{device}}. \tag{16.5}
\]
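As a sketch of Equation (16.5), the matrix can be assembled column by column from the three MatrixColumnTag measurements. The numbers below are the sRGB columns quoted later in Equation (16.6); the helper names are our own:

```python
# Measured XYZ of each primary (the red/green/blueMatrixColumnTag values,
# here the D50-adapted sRGB columns from Equation (16.6)).
red_xyz   = (0.4361, 0.2225, 0.0139)
green_xyz = (0.3851, 0.7169, 0.0971)
blue_xyz  = (0.1431, 0.0606, 0.7141)

# Each primary becomes one column of the matrix (Grassmann additivity).
M = [[red_xyz[i], green_xyz[i], blue_xyz[i]] for i in range(3)]

def to_xyz(rgb):
    return [sum(M[i][j] * rgb[j] for j in range(3)) for i in range(3)]

# R = G = B = 1 reproduces the device white, i.e., the channel-wise sum of
# the three primaries:
white = to_xyz((1.0, 1.0, 1.0))
assert abs(white[0] - 0.9643) < 1e-4
assert abs(white[1] - 1.0000) < 1e-4
```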
The red, green, and blue TRCTags represent transforms for undoing any nonlinear
encoding of the device color space, and as seen in Figure 16.5, they are applied
before the linear transform to CIE XYZ. The curve types allowed by the TRCTags
include a simple power or gamma function of the form Y = X^γ. These values are
stored as 16-bit fixed-point numbers with eight fractional bits. In practice, this
means that the stored integer is divided by 2^8 = 256 to obtain the appropriate
gamma value.
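The fixed-point encoding is easy to sketch. The helper names below are our own; the underlying format is simply an unsigned 16-bit integer with eight fractional bits:

```python
def decode_u8fixed8(raw):
    # Stored integer divided by 2**8 gives the gamma exponent.
    return raw / 256.0

def encode_u8fixed8(gamma):
    return round(gamma * 256.0)

# A gamma of 2.2 encodes to 563 (2.2 * 256 = 563.2, rounded) and decodes
# back to within half a least-significant bit:
raw = encode_u8fixed8(2.2)
assert raw == 563
assert abs(decode_u8fixed8(raw) - 2.2) < 1 / 512
```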
Alternatively, the TRCTags can encode sampled one-dimensional functions of
up to 16-bit unsigned integer values. These sampled functions can be thought
of as one-dimensional look-up tables, or 1D-LUTs. Perhaps the easiest way to
think about 1D-LUTs is to represent the table as an array of numbers. The in-
put color, in device space, is then the index into this array, and the output is the
value contained at that index. A simple example is given in Table 16.1, using an
11-element LUT to approximate a gamma function of 2.2. The ICC specification
states that, for input values falling between table entries, linear interpolation
should be used to calculate the intermediate values. We can plot the 11-element
1D-LUT, linearly interpolated to 256 entries, against an actual gamma function,
as shown in Figure 16.6.
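The construction is easy to reproduce. The sketch below samples Y = X^2.2 at intervals of 0.1, as in Table 16.1, and looks values up with linear interpolation; because the gamma curve is convex, the interpolated chords always sit on or above it:

```python
# 11-entry 1D-LUT sampling y = x^2.2 at x = 0, 0.1, ..., 1.0.
lut = [(i / 10) ** 2.2 for i in range(11)]

def lookup(x, table):
    # Linearly interpolate between the two nearest table entries.
    pos = x * (len(table) - 1)
    lo = min(int(pos), len(table) - 2)
    frac = pos - lo
    return table[lo] * (1 - frac) + table[lo + 1] * frac

# Errors against the true gamma curve over 256 sample points:
errs = [lookup(i / 255, lut) - (i / 255) ** 2.2 for i in range(256)]
assert max(errs) > 0.003   # worst-case overshoot between the nodes
assert min(errs) >= -1e-9  # the chords never fall below a convex curve
```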
This figure suggests that there is a reasonable relationship between the small
1D-LUT approximating the 2.2 gamma function and the actual function. This re-
Figure 16.6. The differences between an 11-element 1D-LUT interpolated to 256 ele-
ments and an actual gamma curve. Although the functions appear similar, the differences
between the two can lead to visual errors.
sult is actually slightly misleading, as we can find out by calculating the color dif-
ferences with an 8-bit image encoded with a 1/2.2 gamma function. An example
image is shown in Figure 16.7. We can convert this image into XYZ by apply-
ing the two different TRC functions, followed by a matrix transformation. From
there we can calculate CIE ΔE*ab color differences to get an idea of how large
the error may be. Assuming Figure 16.7 is encoded using the sRGB primaries,
with a gamma correction of 1/2.2, we find that the maximum color difference be-
tween using a real gamma function and an 11-element 1D-LUT is about 3.7. The
complete range of errors is shown in Figure 16.8. We can see that there are some
quantization artifacts that would be clearly visible. This suggests that when using
1D-LUTs, care should be taken to use as large a LUT as possible to avoid errors
introduced by the interpolation. This idea will be revisited later in the chapter
with regards to 3D-LUTs.
Again, these MatTRC transformations are only really valid for certain types
of devices and are only defined for input and display classes of profiles. They
are also only used for colorimetric transformations into and out of the CIE XYZ
version of the PCS. For the other rendering intents, or for other color imaging
devices, the transforms used can be generically referred to as the AToB and BToA
transforms, for those going into and out of the PCS, respectively. The specific tags
Figure 16.7. An example image encoded with sRGB primaries and a 1/2.2 gamma
compression.
Figure 16.8. The CIE ΔE*ab color differences between using a small interpolated 1D-LUT
and an actual gamma function to move into the ICC PCS.
Table 16.2. The different color transformation tags and rendering intents available in ICC
profiles.
used by the ICC are summarized in Table 16.2, for the input, display, output, and
color space classes of transform. The three other classes have specific rules and
tags and are beyond the scope of this text.
Conceptually, we can refer to all of the AToB and BToA transforms as those
based upon color look-up tables (cLUTs). This is perhaps an oversimplification:
as we can see in Figure 16.9, these transforms can contain 1D-LUTs, matrix
transforms, multi-dimensional cLUTs, or combinations of these types of transforms.
Specifically, the AToB transforms contain up to five processing elements
that are applied in a specific order: 1D "A" curves, an n-dimensional color LUT,
1D "M" curves, a 3×3 matrix with an offset (represented by a 3×4 matrix), and
finally 1D "B" curves. The four combinations of these, shown in Figure 16.9,
can be made by setting specific individual transforms to identity.
We have already discussed how the one-dimensional look-up tables can be
applied in the MatTRC ICC transforms. Conceptually, the "A", "M", and "B"
curves work the same way. It is important to note that the "A" curves, of which
there are as many as there are color channels in our imaging device, can only
be used in conjunction with the multi-dimensional cLUT. The matrix transform
is very similar to that used in the MatTRC, though there is an additional offset
element. This additive offset could be useful to account for differences in the
black of the device space and that of the PCS, or it could be used to account for
flare in the viewing conditions. The multi-dimensional look-up table (cLUT) is
perhaps the most important element of the AToB transforms. As such, a quick
overview of how multi-dimensional look-up tables work is in order. We should
also note that when our device space has more than three primaries, it is the cLUT
that reduces the dimensionality of those channels down to the three dimensions
of the CIE XYZ or, more often, CIELAB PCS.
We can think of a cLUT as a dictionary that translates specific device pri-
maries from the device space into another color space. Consider a simple color
Figure 16.9. Example AToB LUT transforms from device coordinates to the ICC PCS,
redrawn from [523].
printer that has cyan, magenta, and yellow primaries (CMY). As we know, the
goal of the ICC profile is to provide a transform from those specific device pri-
maries into the PCS. We could accomplish this using mathematical and empirical
models, such as the Kubelka-Munk model discussed in Chapter 8. We could also
accomplish this using a brute-force color measurement method along with a color
look-up table. In an extreme case, if we state that our printer is capable of gen-
erating 256 levels (8 bits) of each of the cyan, magenta, and yellow colors, then
we need to print each of those colors separately, and then all of the combinations
of colors. We could then take a color measuring device such as a spectropho-
tometer and measure each one of the printed combinations. With three channels
that would equal 256 × 256 × 256, or over 16 million samples to measure! As-
suming we have the time and energy necessary to accomplish this task, however,
we would then have a direct map that could tell us for any given combination of
CMY what the associated XYZ or CIELAB values are.
Obviously, this is not the most practical approach as the amount of time, let
alone printer media, would not be conducive to such an exercise. If we consider
devices with more than three channels, we can see that the number of measure-
ments necessary to completely fill our LUT would grow exponentially. For prac-
tical implementations we can use a lower number of measurements in each of the
channels and perform some type of interpolation for values in between. As we
saw with the simple 1D-LUTs, if we use too few samples the color error can be
very large. Therefore, it is desirable to achieve a balance between the number of
samples, type of interpolation, and acceptable color error.
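The measurement counts involved are easy to check (a small sketch; the helper is our own):

```python
def node_count(levels, channels):
    # A full factorial sampling grid grows exponentially in channel count.
    return levels ** channels

assert node_count(256, 3) == 16_777_216  # exhaustive 8-bit CMY sampling
assert node_count(5, 3) == 125           # a 5x5x5 LUT, as in Figure 16.10
assert node_count(5, 4) == 625           # one more channel: 5x the work
```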
Figure 16.10 shows an example of building a cLUT using five sample points
for each of the CMY levels. This creates what is known as a 5×5×5 LUT, with
a total of 125 samples. We could print each of these samples and measure the
corresponding CIELAB or CIE XYZ values to populate the table. Alternatively
we can use a printer model to estimate what the sample values would be. Essentially,
at each of the colored bulbs, or nodes, in Figure 16.10, we can store the
corresponding measured PCS values.
Figure 16.10. Mapping a device CMY space using a 5 × 5 × 5 color look-up table.
Figure 16.12. An example of sub-dividing the color space cube for prism interpolation.
Figure 16.13. An example of subdividing the color space cube for pyramid interpolation.
Prism interpolation uses only six nodes or vertices of our sampled color space and
is more capable of following local curvature in the color space.
Similarly, pyramid interpolation splits the color cube into three pyramids, each
containing five vertices, as shown in Figure 16.13. These pyramids all come
together at a single point, and that point should be chosen to correspond to neutral
colors. Again tests are performed to determine in which of the pyramids the point
lies, and from there the colors are interpolated.
Tetrahedral interpolation subdivides the space into six tetrahedra, each us-
ing only four vertices. All of the tetrahedra share a common edge, which ideally
should be pointed along the neutral axis. An example of this sub-division is shown
in Figure 16.14. Due to its computational simplicity, as well as its ability to
maintain the neutral axis, this is a commonly used form of interpolation; it will be
discussed in further detail later in this chapter. It should be noted that tetrahedral
interpolation for color space transformation has been covered by certain patents
since the 1970s and 1980s, as detailed by Kang [575]. More details on specific
interpolations can also be found in Kang and Sharma et al. [575, 577, 1035].
Color look-up tables can be powerful tools for transforming colors from de-
vice specific spaces into the ICC reference color space. These cLUTs can be
constructed in such a way as to provide color re-rendering for perceptual ren-
dering intents, or they can also be used to guide certain regions of color space
to other regions. For instance, it would be possible to use cLUTs to shift most
“green” colors to a more saturated green, to create pleasing grass and tree repro-
duction. Likewise, we could shift the “blue” colors away from purple to create
pleasing sky colors. These types of manipulations may be valid for the perceptual
intent. Color look-up tables are not limited to just three dimensions, as we have
illustrated in these examples. The same concepts and interpolation techniques can
be extended to device spaces with four or more channels.
Figure 16.14. An example of subdividing the color space cube for tetrahedral interpola-
tion.
Many commercial profiling tools have been developed, often in association with
color measuring devices. These
tools handle all the appropriate look-up table generation, color measurement, and
binary encoding of the profiles.
Figure 16.15. Two iconic projections of device-gamuts with different blacks in the
CIELAB space.
Figure 16.16. An image captured with a digital camera and tagged with an sRGB profile.
Using the sRGB profile that ships with most computer operating systems, we can
examine the transforms used to move into the PCS.
First we consider the nonlinear red, green, and blue TRCTags. Recall from
Chapter 8 that the transfer function defined for sRGB is broken into
two distinct parts: a linear segment for dark colors and a gamma-offset-gain func-
tion for the remaining parts. This function is encoded as 16-bit values in a 1024
element 1D-LUT in the ICC profile as shown in Figure 16.17. We first apply this
LUT to our input sRGB data to get linear RGB units. Next we can examine the
red, green, and blue MatrixColumnTags and combine these into our 3×3 matrix
to transform into CIE XYZ:
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{D50} =
\begin{bmatrix}
0.4361 & 0.3851 & 0.1431 \\
0.2225 & 0.7169 & 0.0606 \\
0.0139 & 0.0971 & 0.7141
\end{bmatrix} \cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{sRGB\ linear}}. \tag{16.6}
\]
Note that this transform appears to be significantly different from the standard
sRGB transform. This is because it has been chromatically adapted from the
standard white of CIE D65 to the D50 reference white using a linearized Bradford
transform, as shown in Equation (16.4). We apply this matrix to our linearized
Figure 16.17. The red, green, and blue TRCTags from an sRGB profile.
Figure 16.18. An image transformed into the ICC XYZ reference color space.
RGB data, and our image is now in the reference CIE XYZ PCS as shown in
Figure 16.18.
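The forward path just described can be sketched in a few lines. The analytic two-part sRGB decoding from Chapter 8 stands in here for the profile's 1024-entry 1D-LUT; the helper names are our own:

```python
def srgb_to_linear(v):
    # sRGB decoding: linear segment for dark values, gamma-offset-gain above.
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

# D50-adapted sRGB matrix of Equation (16.6).
M = [[0.4361, 0.3851, 0.1431],
     [0.2225, 0.7169, 0.0606],
     [0.0139, 0.0971, 0.7141]]

def srgb_to_pcs_xyz(rgb):
    lin = [srgb_to_linear(v) for v in rgb]
    return [sum(M[i][j] * lin[j] for j in range(3)) for i in range(3)]

# White maps to the D50 PCS white (Y = 1); mid grey stays inside [0, 1].
assert abs(srgb_to_pcs_xyz((1.0, 1.0, 1.0))[1] - 1.0) < 1e-4
assert 0.0 < srgb_to_pcs_xyz((0.5, 0.5, 0.5))[1] < 1.0
```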
Once we are in the reference color space, we need to move back out onto a
computer display. Essentially, we want to figure out what digital counts should
be displayed on our monitor to achieve a colorimetric match to our input image.
In order to do this, we need to characterize our monitor. We can do this using
commercial software and a spectrophotometer. We first measure the red, green,
and blue primaries at their full intensity. This will help generate the RGB to
XYZ color matrix as shown in Equation (16.6). We also measure the combination
of the three primaries at maximum value to get a measurement of the white of
our computer display. Typically, the native white will be around CIE D65. It is a
good idea to perform a check of your monitor, adding up the individual tristimulus
values of each of the three channels (equivalent to passing R = G = B = 1 into the
3×3 color matrix) to see if they match the measured tristimulus values of your
white. This will determine if your display is behaving in an additive way. From
there, we need to measure the individual tone-reproduction curves of each of the
primaries. This is done by presenting individual ramps on each of the channels.
We can fit a simple gamma function to these measurements, or we can use the
measurements themselves as a 1D-LUT.
For this particular display, the profile is included on the DVD; it was found
that a simple gamma of approximately 1.8 was adequate to describe all three color
channels. The measured color conversion matrix is
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{D50} =
\begin{bmatrix}
0.4735 & 0.3447 & 0.1460 \\
0.2517 & 0.6591 & 0.0891 \\
0.0058 & 0.0677 & 0.7515
\end{bmatrix} \cdot
\begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{display}}. \tag{16.7}
\]
Note that this matrix also has a native white (R=G=B=1) equivalent to CIE
D50, despite the fact that the monitor was measured to have a white of CIE XYZ
(1.0104, 1.0300, 1.0651). This is because, once again, we applied a linear Brad-
ford chromatic adaptation transform to get into the PCS color space. This chro-
matic adaptation transform is given by
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{PCS}} =
\begin{bmatrix}
1.042633 & 0.030930 & -0.052887 \\
0.022217 & 1.001877 & -0.021103 \\
-0.001160 & -0.003433 & 0.761444
\end{bmatrix} \cdot
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{display}}. \tag{16.8}
\]
Recall that the ICC specification requests that these transforms are defined
for moving into the PCS and that they must be invertible. As such, we would
actually use the inverse of Equations (16.8) and (16.7), applied in that order. We
would then have to apply the inverse of the display gamma, essentially RGB^{1/1.8}.
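The outbound path can be sketched by inverting the display matrix of Equation (16.7) numerically and then applying the inverse gamma. This is a simplification of the two-step inverse the text describes, and the helper names are our own:

```python
# D50-adapted display matrix of Equation (16.7).
M_display = [[0.4735, 0.3447, 0.1460],
             [0.2517, 0.6591, 0.0891],
             [0.0058, 0.0677, 0.7515]]

def inv3(m):
    # Inverse via the adjugate; adequate for well-conditioned 3x3 matrices.
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

M_inv = inv3(M_display)

def pcs_to_display(xyz):
    # Invert the matrix, then apply the inverse display gamma RGB^(1/1.8).
    lin = [sum(M_inv[i][j] * xyz[j] for j in range(3)) for i in range(3)]
    return [max(v, 0.0) ** (1 / 1.8) for v in lin]

# The adapted display white (R = G = B = 1 through Equation (16.7)) comes
# back out as full-scale display white:
rgb = pcs_to_display([sum(M_display[i]) for i in range(3)])
assert all(abs(v - 1.0) < 1e-6 for v in rgb)
```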
Figure 16.19. An sRGB image transformed into a particular monitor display color space.
After that, the image is ready to be displayed. An example of the image that
would be displayed is shown in Figure 16.19. Remember, with the ICC workflow
the underlying encoded image bits do not change; the only thing that changes is
how we interpret those bits. We can see that in order to display an sRGB image
correctly on this particular display, we must make the image bluer and give it
higher contrast.
Figure 16.20. A bounding cube in our device color space used in a color look-up table.
To calculate the values in the PCS, you replace the p*** vertex values in the
trilinear interpolation equation with the measured CIELAB or CIE XYZ values at
those nodes.
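For reference, the trilinear blend over one cube of the cLUT can be written out as follows (a sketch; `p` is indexed by corner coordinates `p[x][y][z]` and `(dx, dy, dz)` are the fractional positions within the cube):

```python
def trilinear(p, dx, dy, dz):
    # Interpolate along x, then y, then z.
    p00 = p[0][0][0] * (1 - dx) + p[1][0][0] * dx
    p10 = p[0][1][0] * (1 - dx) + p[1][1][0] * dx
    p01 = p[0][0][1] * (1 - dx) + p[1][0][1] * dx
    p11 = p[0][1][1] * (1 - dx) + p[1][1][1] * dx
    p0 = p00 * (1 - dy) + p10 * dy
    p1 = p01 * (1 - dy) + p11 * dy
    return p0 * (1 - dz) + p1 * dz

# For a function that is linear in all three axes, sampled at the eight
# corners, the interpolation is exact:
f = lambda x, y, z: 2 * x + 3 * y + 5 * z
cube = [[[f(x, y, z) for z in (0, 1)] for y in (0, 1)] for x in (0, 1)]
assert abs(trilinear(cube, 0.25, 0.5, 0.75) - f(0.25, 0.5, 0.75)) < 1e-12
```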
For tetrahedral interpolation, as shown in Figure 16.14, we first must figure
out in which of the six tetrahedra our point lies. This is done using a series of
inequality tests on the (x, y, z)-coordinates of our color space. The interpolation
is then calculated based upon the four specific vertices on that tetrahedron. Note,
that to maintain the neutral axis as best we can, all the tetrahedra share the p000
vertex. Table 16.3 shows the inequality tests and the corresponding weights used
for the interpolation. Once we have those weights, we can interpolate our values.
Test Wz Wy Wx
x<z<y p110 − p010 p010 − p000 p111 − p110
z<y<x p111 − p011 p011 − p001 p001 − p000
z≤x≤y p111 − p011 p010 − p000 p011 − p010
y≤z<x p101 − p001 p111 − p101 p001 − p000
y<x≤z p100 − p000 p110 − p100 p111 − p110
x≤y≤z p100 − p000 p111 − p101 p101 − p100
Table 16.3. The test cases and vertex weights for tetrahedral interpolation.
This simplicity is what makes tetrahedral interpolation a popular choice for color
look-up tables.
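The cell walk can be sketched as follows. This is a generic formulation of tetrahedral interpolation in Python/numpy; the corner-index convention is illustrative and may differ from the exact bit ordering used in Table 16.3:

```python
import numpy as np

def tetrahedral_interp(p, x, y, z):
    """Tetrahedral interpolation inside one LUT cell.  p maps corner
    bit-triples (bx, by, bz) to node values; (x, y, z) are fractional
    coordinates in [0, 1].  The cell is walked from p000 toward p111,
    taking the largest coordinate first, so every tetrahedron shares
    p000 (and p111), which keeps the neutral axis exact.  This is a
    generic formulation; the corner-index convention may differ from
    the bit ordering used in Table 16.3."""
    order = sorted(zip((x, y, z), ((1, 0, 0), (0, 1, 0), (0, 0, 1))),
                   reverse=True)
    corner = (0, 0, 0)
    value = np.asarray(p[corner], dtype=float).copy()
    for frac, axis in order:
        nxt = tuple(c + a for c, a in zip(corner, axis))
        value += frac * (np.asarray(p[nxt], float) - np.asarray(p[corner], float))
        corner = nxt
    return value

# A cell whose node values equal their own coordinates: interpolation
# must reproduce the input point, and the neutral axis is exact.
p = {(bx, by, bz): np.array([bx, by, bz], float)
     for bx in (0, 1) for by in (0, 1) for bz in (0, 1)}
assert np.allclose(tetrahedral_interp(p, 0.2, 0.7, 0.4), [0.2, 0.7, 0.4])
assert np.allclose(tetrahedral_interp(p, 0.5, 0.5, 0.5), [0.5, 0.5, 0.5])
```

Because every tetrahedron in this split contains both p000 and p111, any input with x = y = z interpolates along the neutral diagonal exactly, which is the property the text highlights.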
Chapter 17
Dynamic Range Reduction
Figure 17.1. Photographs taken in “good light” contain no under- or over-exposed areas.
This can be achieved by keeping the sunlight behind the camera. (Images courtesy of Kirt
Witte, Savannah School of Art and Design, http://www.TheOtherSavannah.com).
In addition, capturing the full dynamic range of a scene implies that, in many
instances, the resulting high dynamic range image cannot be directly displayed,
since its range is likely to exceed the two orders of magnitude range afforded by
conventional display devices. Examples of images that are typically difficult to
capture with conventional photography are shown in Figure 17.3.
For such images, the photographer normally chooses whether the indoor or
the outdoor part of the scene is properly exposed. Sometimes, neither can be
achieved within a single image, as shown in Figure 17.4 (left). These decisions
may be avoided by using high dynamic range imaging (Figure 17.4 (right)).
Landscapes tend to have a bright sky and a much darker foreground; such
environments also benefit from high dynamic range imaging. For this particular
type of photography, it is also possible to resort to graduated neutral density filters,
whereby the top half of the filter attenuates more than the bottom half. Positioning
Figure 17.2. Photographs shot into the light. The stark contrast and under-exposed fore-
grounds are created for dramatic effect. (Images courtesy of Kirt Witte, Savannah School
of Art and Design (http://www.TheOtherSavannah.com).)
Figure 17.3. Some environments contain a larger range of luminance values than can
be captured with conventional techniques. From left to right, top to bottom: Harbor of
Konstanz, June 2005; Liebfrauenkirche, Trier, Germany, May 2006; the Abbey of Mont
St. Michel, France, June 2005, hut near Schloß Dagstuhl, Germany, May 2006.
Figure 17.4. With conventional photography, some parts of the scene may be under- or
over-exposed (left). Capturing this scene with nine exposures, and assembling these into
one high dynamic range image, affords the result shown on the right; refectory in Mont St.
Michel, France, June 2005.
Figure 17.5. Linear scaling of high dynamic range images to fit a given display device
may cause significant detail to be lost (left and middle). The left image is linearly scaled.
In the middle image, high values are clamped. For comparison, the right image is tone-
mapped, allowing details in both bright and dark regions to be visible; Mont St. Michel,
France, June 2005.
such a filter in front of the lens allows the photographer to hold detail in both sky
and foreground. Nonetheless, a more convenient and widely applicable solution
is to capture high dynamic range data.
There are two strategies to display high dynamic range images. First, we
may develop display devices that can directly accommodate high dynamic range
imagery [1020, 1021]. Second, high dynamic range images may be prepared for
display on low dynamic range display devices by applying a tone-reproduction
operator. The purpose of tone reproduction is therefore to reduce the dynamic
range of an image such that it may be displayed on a display device.
We can use a simple compressive function to normalize an image (see Fig-
ure 17.5 (left)). This constitutes a linear scaling, which is sufficient only if the
dynamic range of the image is slightly higher than the dynamic range of the dis-
play. For images with a higher dynamic range, small intensity differences will be
quantized to the same display value and visible details are lost. In Figure 17.5
(middle), all large pixel values are clamped. This makes the normalization less
dependent on noisy outliers, but here we lose information in the bright areas of the
image. In comparison, the right image in Figure 17.5 is tone-mapped, showing
detail in both the light and dark regions.
In general, linear scaling will not be appropriate for tone reproduction. The
key issue in tone reproduction is to compress an image, while at the same time
preserving one or more attributes of the image. Different tone-reproduction algo-
rithms focus on different attributes such as contrast, visible detail, brightness, or
appearance.
Ideally, displaying a tone-mapped image on a low dynamic range display de-
vice would create the same visual response in the observer as the original scene.
Given the limitations of display devices, this is, in general, not achievable, al-
though we may approximate this goal as closely as possible.
The small constant δ is introduced to prevent the average from becoming zero in
the presence of black pixels. Note that absolute black tends to occur mostly in
computer-generated images. In the real world, it is difficult to find environments
with a complete absence of photons. The geometric average is normally mapped
to a predefined display value. The effect of mapping the geometric average to
different display values is shown in Figure 17.6.
Alternatively, sometimes the minimum or maximum luminance found in the
image is used. This approach tends to suffer from the same problems as dynamic
range measures that are based on the minimum and maximum luminance: they
are sensitive to outliers.
The main challenge faced in the design of a global operator lies in the choice
of compressive function. Many functions are possible, for instance based on the
image’s histogram (Section 17.7) [1211] or on data gathered from psychophysics
(Section 17.2).
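As a concrete illustration, a global operator built around the geometric average can be sketched as follows. The δ guard, the 0.18 "key," and the L/(1 + L) sigmoid are illustrative choices in the spirit of the photographic operator, not a definitive implementation:

```python
import numpy as np

def log_average(L, delta=1e-6):
    """Geometric (log) average luminance; delta guards against zeros."""
    return float(np.exp(np.mean(np.log(delta + L))))

def global_tonemap(L, key=0.18, delta=1e-6):
    """Scale the image so its geometric average lands on a chosen
    display value (the 'key', e.g. 0.18), then apply L/(1 + L) as a
    simple global compressive function.  Both choices are in the
    spirit of the photographic operator; the constants are
    illustrative."""
    Lm = (key / log_average(L, delta)) * L
    return Lm / (1.0 + Lm)

L = np.array([[0.01, 0.1], [1.0, 100.0]])
Ld = global_tonemap(L)
assert Ld.min() >= 0.0 and Ld.max() < 1.0          # displayable range
assert abs(log_average(np.full((4, 4), 2.0)) - 2.0) < 1e-3
```

Changing the key value reproduces the effect shown in Figure 17.6: a smaller key maps the geometric average to a darker display value.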
On the other hand, local operators compress each pixel according to a specific
compression function that is modulated by information derived from a selection
of neighboring pixels, rather than from the full image [47, 174, 178, 270, 291, 552,
Figure 17.6. Spatial tone mapping applied after mapping the geometric average to differ-
ent display values (left: 0.09, right: 0.18.); Schloß Dagstuhl, Germany, May 2006.
856, 883, 884, 931, 932, 950]. The rationale is that a bright pixel in a bright neigh-
borhood may be perceived differently than a bright pixel in a dim neighborhood.
Design challenges for local operators involve choosing the compressive function,
the size of the local neighborhood for each pixel, and the manner in which lo-
cal pixel values are used. In general, local operators are able to achieve better
compression than global operators (Figure 17.7), albeit at a higher computational
cost.
Both global and local operators are often inspired by the human visual system.
Most operators employ one of two distinct compressive functions; this is orthog-
onal to the distinction between local and global operators. Display values Ld (x, y)
are most commonly derived from image luminance Lv (x, y) by the following two
functional forms:
\[
L_d(x, y) = \frac{L_v(x, y)}{f(x, y)}, \tag{17.2a}
\]
\[
L_d(x, y) = \frac{L_v^n(x, y)}{L_v^n(x, y) + g^n(x, y)}. \tag{17.2b}
\]
In these equations, f (x, y) and g(x, y) may either be constant or a function that
varies per pixel. In the former case, we have a global operator, whereas a spatially
varying function results in a local operator. The exponent n is a constant that is
either fixed, or set differently per image.
Equation (17.2a) divides each pixel’s luminance by a value derived from ei-
ther the full image or a local neighborhood. As an example, the substitution
f (x, y) = Lmax /255 in (17.2a) yields a linear scaling such that values may be di-
Figure 17.8. Halos are artifacts commonly associated with local tone-reproduction opera-
tors. Chiu’s operator is used here without smoothing iterations to demonstrate the effect of
division (left). Several mechanisms exist to minimize halos, for instance by using the scale-
selection mechanism used in photographic tone mapping (right); St. Barnabas Monastery,
North Cyprus, July 2006.
rectly quantized into a byte, and they can therefore be displayed. A different
approach is to substitute f (x, y) = Lblur (x, y), i.e., divide each pixel by a weighted
local average, perhaps obtained by applying a Gaussian filter to the image [174].
While this local operator yields a displayable image, it highlights a classical prob-
lem whereby areas near bright spots are reproduced too dark. This is often seen
as halos, as demonstrated in Figure 17.8.
The cause of halos stems from the fact that Gaussian filters blur across sharp
contrast edges in the same way that they blur small details. If there is a large con-
trast gradient in the neighborhood of the pixel under consideration, this causes
the Gaussian blurred pixel to be significantly different from the pixel itself. By
using a very large filter kernel in a division-based approach, such large contrasts
are averaged out, and the occurrence of halos can be minimized. However, very
large filter kernels tend to compute a local average that is not substantially differ-
ent from the global average. In the limit that the size of the filter kernel tends to
infinity, the local average becomes identical to the global average and, therefore,
limits the compressive power of the operator to be no better than a global operator.
Thus, the size of the filter kernel in division-based operators presents a trade-off
between the ability to reduce the dynamic range and the visibility of artifacts.
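This trade-off can be observed directly with a small division-based sketch; the numpy-only Gaussian blur and all constants here are illustrative:

```python
import numpy as np

def gaussian_blur(L, sigma):
    """Separable Gaussian blur implemented with numpy only."""
    r = max(1, int(3 * sigma))
    k = np.exp(-np.arange(-r, r + 1) ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    p = np.pad(L, r, mode='edge')
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, 'valid'), 1, p)
    return np.apply_along_axis(lambda v: np.convolve(v, k, 'valid'), 0, tmp)

def division_operator(L, sigma=8.0):
    """Divide each pixel by its blurred neighborhood,
    f(x, y) = L_blur(x, y).  Small kernels give strong local
    compression but halos near sharp edges; large kernels approach a
    global scaling.  The kernel size is illustrative."""
    return L / (gaussian_blur(L, sigma) + 1e-6)

# A uniform region divides out to (almost exactly) 1; near a sharp
# edge, the blurred denominator leaks across the edge, producing the
# dark bands described in the text.
L = np.full((32, 32), 5.0)
assert np.allclose(division_operator(L), 1.0, atol=1e-3)
```

Increasing sigma suppresses the halos but, as the text notes, pushes the local average toward the global average and with it the operator toward a global scaling.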
Figure 17.9. Over the middle-range values, sigmoidal compression is approximately log-
arithmic. (The plot compares Ld = Lw/(Lw + 1) with Ld = 0.25 log(Lw) + 0.5 over
log(Lw) ∈ [−2, 2].)
Figure 17.10. The choice of semi-saturation constant determines how input values are
mapped to display values. (The plot compares Ld = Lw/(Lw + 10), Ld = Lw/(Lw + 1),
and Ld = Lw/(Lw + 0.1).)
The interpolation is governed by user parameter a ∈ [0, 1] which has the effect
of varying the amount of contrast in the displayable image (Figure 17.11). More
contrast means less visible detail in the light and dark areas and vice versa. This
interpolation may be viewed as a half-way house between a fully global and a
fully local operator by interpolating between the two extremes without resorting
to expensive blurring operations.
Although operators typically compress luminance values, this particular op-
erator may be extended to include a simple form of chromatic adaptation. It thus
presents an opportunity to adjust the level of saturation normally associated with
tone mapping, as discussed at the beginning of this chapter.
Rather than compress the luminance channel only, sigmoidal compression is
applied to each of the three color channels:
\[
I_{r,d}(x, y) = \frac{I_r(x, y)}{I_r(x, y) + g^n(x, y)}, \tag{17.4a}
\]
\[
I_{g,d}(x, y) = \frac{I_g(x, y)}{I_g(x, y) + g^n(x, y)}, \tag{17.4b}
\]
\[
I_{b,d}(x, y) = \frac{I_b(x, y)}{I_b(x, y) + g^n(x, y)}. \tag{17.4c}
\]
The computation of g(x, y) is also modified to bilinearly interpolate between the
geometric average luminance and pixel luminance, and between each independent
color channel and the pixel’s luminance value. We therefore compute the geomet-
ric average luminance value L̄v , as well as the geometric average of the red, green,
and blue channels (I¯r , I¯g and I¯b ). From these values, we compute g(x, y) for each
pixel and for each color channel independently. The equation for the red channel
(gr (x, y)) is then
The green and blue channels are computed similarly. The interpolation parameter
a steers the amount of contrast as before, and the new interpolation parameter
c ∈ [0, 1] allows a simple form of color correction, as shown in Figure 17.11.
So far we have not discussed the value of the exponent n in Equation (17.2b).
Studies in electro-physiology report values between n = 0.2 and n = 0.9 [485].
Figure 17.11. Linear interpolation of a varies contrast in the tone-mapped image: this
is set to 0.0 (left) and to 1.0 (right). Linear interpolation according to c yields varying
amounts of color correction: c was varied between 0.0 (top) and 1.0 (bottom); Rennes,
France, June 2005.
While the exponent may be user-specified, for a wide variety of images we may
estimate a reasonable value from the geometric average luminance L̄v and the min-
imum and maximum luminance in the image (Lmin and Lmax ) with the following
empirical equation:
\[
n = 0.3 + 0.7 \left( \frac{L_{\max} - \bar{L}_v}{L_{\max} - L_{\min}} \right)^{1.4}. \tag{17.6}
\]
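Equations (17.2b) and (17.6) combine into a compact global operator. A sketch, with the semi-saturation constant g defaulting to the geometric average (an illustrative choice):

```python
import numpy as np

def estimate_exponent(L, delta=1e-6):
    """Empirical exponent n of Equation (17.6), from the geometric
    average and the image minimum and maximum luminance."""
    Lbar = np.exp(np.mean(np.log(delta + L)))
    return 0.3 + 0.7 * ((L.max() - Lbar) / (L.max() - L.min())) ** 1.4

def sigmoid_compress(L, g=None):
    """Global sigmoidal compression, Equation (17.2b):
    Ld = L^n / (L^n + g^n).  The semi-saturation constant g defaults
    to the geometric average, an illustrative choice."""
    n = estimate_exponent(L)
    if g is None:
        g = np.exp(np.mean(np.log(1e-6 + L)))
    Ln = L ** n
    return Ln / (Ln + g ** n)

L = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
Ld = sigmoid_compress(L)
assert np.all(np.diff(Ld) > 0) and Ld.max() < 1.0  # monotonic, displayable
assert 0.3 <= estimate_exponent(L) <= 1.0          # range implied by (17.6)
```

Because the bracketed ratio in (17.6) lies in [0, 1], the estimated exponent always falls between 0.3 and 1.0, consistent with the electro-physiological range quoted above.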
The different variants of sigmoidal compression shown above are all global
in nature. This has the advantage that they are fast to compute, and they are very
suitable for medium to high dynamic range images. Their simplicity makes these
operators suitable for implementation on graphics hardware as well. For very high
dynamic range images, however, it may be necessary to resort to a local operator
since this may give somewhat better compression.
A straightforward method to extend sigmoidal compression replaces the
global semi-saturation constant by a spatially varying function that also can be
computed in several different ways. Thus, g(x, y) then becomes a function of a
spatially localized average. Perhaps the simplest way to accomplish this is to
again use a Gaussian blurred image. Each pixel in a blurred image represents
a locally averaged value that may be viewed as a suitable choice for the semi-
saturation constant.2
As with division-based operators discussed in the previous section, we have
to consider haloing artifacts. If sigmoids are used with a spatially varying semi-
saturation constant, the Gaussian filter kernel is typically chosen to be very small
in order to minimize artifacts. In practice, filter kernels of only a few pixels wide
are sufficient to suppress significant artifacts while at the same time producing
more local contrast in the tone-mapped images. Such small filter kernels can
be conveniently computed in the spatial domain without losing too much per-
formance. There are, however, several different approaches to compute a local
average; these are discussed in the following section.
It is therefore important that the local average is computed over pixel values that
are not significantly different from the pixel that is being filtered.
This suggests a strategy whereby an image is filtered such that no blurring
over such edges occurs. A simple, but computationally expensive, way is to com-
pute a stack of Gaussian blurred images with different kernel sizes, i.e., an image
pyramid. For each pixel, we may choose the largest Gaussian that does not over-
lap with a significant gradient. The scale at which this happens can be computed
as follows.
In a relatively uniform neighborhood, the value of a Gaussian blurred pixel
should be the same regardless of the filter-kernel size. Thus, in this case, the dif-
ference between a pixel filtered with two different Gaussians should be around
zero. This difference will only change significantly if the wider filter kernel over-
laps with a neighborhood containing a sharp contrast step, whereas the smaller
filter kernel does not. A difference of Gaussians (DoG) signal $L_i^{\mathrm{DoG}}(x, y)$ at scale
$i$ can be computed as follows:
\[
L_i^{\mathrm{DoG}}(x, y) = R_{i\sigma}(x, y) - R_{2i\sigma}(x, y). \tag{17.7}
\]
It is now possible to find the largest neighborhood around a pixel that does not
contain sharp edges by examining differences of Gaussians at different kernel
sizes i [950]:
" DoG "
" Li (x, y) "
" "
" Ri σ (x, y) + α " > t i = 1 . . . 8. (17.8)
Figure 17.12. Scale selection mechanism. The left image shows the tone-mapped result.
The image on the right encodes the selected scale for each pixel as a gray value—the darker
the pixel, the smaller the scale. A total of eight different scales were used to compute this
image; Clifton Suspension Bridge, Bristol, UK, April 2006.
Images tone mapped with both forms are shown in Figure 17.13. In addition,
this figure shows the color difference as computed with the CIE94 color difference
metric, presented in Section 8.8.2. This image shows that the main differences oc-
cur near (but not precisely at) high-frequency high-contrast edges, predominantly
Figure 17.13. This image was tone mapped with both global and local versions of the
photographic tone-reproduction operator (top left and top right, respectively). Below, for
each pixel, the CIE ΔE∗ab color difference is shown; Stonehenge, UK, October 2006.
seen in the clouds. These are the regions where more detail is produced by the
local operator.
Image pyramids may also be used to great effect to tone map an image directly.
Careful design of the filter stack may yield results that are essentially free of halos,
as discussed in the following section.
An alternative approach includes the use of edge-preserving smoothing oper-
ators that are designed specifically for removing small details while keeping sharp
contrasts intact. Such filters have the advantage that sharp discontinuities in the
filtered result coincide with the same sharp discontinuities in the input image, and
they may therefore help to prevent halos [253]. Several such filters, for example,
the bilateral filter, trilateral filter, Susan filter, the LCIS algorithm, and the mean
shift algorithm are suitable [178, 204, 270, 883, 1153], although some of them
are expensive to compute. Edge-preserving smoothing operators are discussed in
Section 17.5.
To avoid distortions in the reconstructed image, the effective gain should have
frequencies no higher than the frequencies present in the sub-band signal. This
can be achieved by blurring the effective gain map G(x, y) before applying it to
the sub-band signal. This approach leads to a significant reduction in artifacts,
and it is an important tool in the prevention of halos.
The filter bank itself may also be adjusted to limit distortions in the recon-
structed signal. In particular, to remove undesired frequencies in each of the sub-
bands caused by applying a non-linear function, a second bank of filters may be
applied before summing the sub-bands to yield the reconstructed signal. If the
first filter bank that splits the signal into sub-bands is called the analysis filter
bank, then the second bank is called the synthesis filter bank. The non-linearity
described above can then be applied in between the two filter banks. Each of
the synthesis filters should be tuned to the same frequencies as the corresponding
analysis filters.
An efficient implementation of this approach, which produces excellent
artifact-free results, is described by Li et al. [681], and an example image is shown
in Figure 17.14. The source code for this operator is available at
http://www.mit.edu/~yzli/hdr_companding.htm. The image benefits from clamping the bottom
2% and the top 1% of the pixels; this is discussed further in Section 17.10.2.
Although this method produces excellent artifact-free images, it has the ten-
dency to over-saturate the image. This effect was ameliorated in Figure 17.14 by
desaturating the image using the technique described in Section 17.10.1 (with a
value of 0.7, which is the default value used for the sub-band approach). How-
ever, even after desaturating the image, its color fidelity remained a little too sat-
urated. Further research would be required to determine the exact cause of this
effect, which is shared with gradient-domain compression (Section 17.6). Finally,
it should be noted that the scene depicted in Figure 17.14 is a particularly chal-
lenging image for tone reproduction. The effects described here would be less
pronounced for many high dynamic range photographs.
Figure 17.14. Tone reproduction using a sub-band architecture, computed here using a
Haar filter [681]; Gymnasium of Salamis, North Cyprus, July 2006.
Figure 17.15. Bilateral filtering removes small details, but preserves sharp gradients (left).
The associated detail layer is shown on the right; Bristol, UK, August 2005.
Figure 17.16. An image tone mapped using bilateral filtering. The base and detail lay-
ers shown in Figure 17.15 are recombined after compressing the base layer; Bristol, UK,
August 2005.
sion of the base layer may be achieved by linear scaling. Tone reproduction on
the basis of bilateral filtering is executed in the logarithmic domain.
Aside from bilateral filtering, other operators may be applicable, such as the
mean-shift filter [204], the trilateral filter [178], anisotropic diffusion [890], an
adaptation of a Retinex-like filter [775], or even the median filter (Section 15.2.3).
Edge-preserving smoothing operators may also be used to compute a local
adaptation level for each pixel, to be applied in a spatially varying or local tone-
reproduction operator. A local operator based on sigmoidal compression can, for
instance, be created by substituting Lblur (x, y) = LB (x, y) in (17.9).
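The base/detail decomposition can be sketched with a brute-force bilateral filter (adequate for small images; the kernel parameters and the base-compression factor are illustrative):

```python
import numpy as np

def bilateral(L, sigma_s=2.0, sigma_r=0.4):
    """Brute-force bilateral filter (fine for small images): a spatial
    Gaussian multiplied by an intensity Gaussian, so sharp edges are
    preserved while small details are averaged away."""
    r = max(1, int(2 * sigma_s))
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    w_s = np.exp(-(dy ** 2 + dx ** 2) / (2.0 * sigma_s ** 2))
    P = np.pad(L, r, mode='edge')
    out = np.empty_like(L)
    for y in range(L.shape[0]):
        for x in range(L.shape[1]):
            win = P[y:y + 2 * r + 1, x:x + 2 * r + 1]
            w = w_s * np.exp(-(win - L[y, x]) ** 2 / (2.0 * sigma_r ** 2))
            out[y, x] = (w * win).sum() / w.sum()
    return out

def bilateral_tonemap(L, compression=0.5):
    """Split log luminance into a bilateral-filtered base layer and a
    detail layer, compress only the base, and recombine; the
    compression factor is illustrative."""
    logL = np.log10(L + 1e-6)
    base = bilateral(logL)
    detail = logL - base
    return 10.0 ** (compression * base + detail)

# On a uniform patch the detail layer is zero, so only the base is
# compressed: 100 at compression 0.5 becomes 10.
assert np.allclose(bilateral_tonemap(np.full((8, 8), 100.0)), 10.0)
```

Working in the log domain, as the text prescribes, turns the linear scaling of the base layer into a reduction of its dynamic range in orders of magnitude while the detail layer passes through untouched.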
Figure 17.17. The image on the left is tone mapped using gradient domain compression.
The magnitude of the gradients ∇L is mapped to a greyscale in the right image (white is
a gradient of 0; black is the maximum gradient in the image); Mont St. Michel, June 2005.
Figure 17.18. A display mapping (display value Ld versus log10 L) generated by the
histogram-adjustment technique.
Figure 17.19. Image tone mapped using the histogram-adjustment technique; St. Barn-
abas monastery, North Cyprus, July 2006.
to never attain a slope that is too large. The magnitude of the slope that can
be maximally attained is determined by a threshold-versus-intensity curve (TVI)
curve (See Section 5.5.2). The method is then called histogram adjustment, rather
than histogram equalization [1211]. An example of a display mapping generated
by this method is shown in Figure 17.18. This mapping is derived from the image
shown in Figure 17.19.
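A simplified sketch of histogram adjustment follows; a constant slope ceiling stands in for the TVI-derived ceiling of the full method:

```python
import numpy as np

def histogram_adjust(L, nbins=100, max_slope=2.0, iters=10):
    """Simplified histogram adjustment: build a histogram of log
    luminance, repeatedly trim bins that would give the display
    mapping more than max_slope times the slope of a linear mapping,
    then use the cumulative histogram as the tone curve.  (The full
    method derives the per-bin ceiling from a TVI curve; a constant
    ceiling is used here for illustration.)"""
    logL = np.log10(L + 1e-6).ravel()
    hist, edges = np.histogram(logL, bins=nbins)
    hist = hist.astype(float)
    for _ in range(iters):
        hist = np.minimum(hist, max_slope * hist.sum() / nbins)
    cdf = np.cumsum(hist) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.interp(logL, centers, cdf).reshape(L.shape)

# Six decades of luminance map monotonically into [0, 1].
L = 10.0 ** np.random.default_rng(0).uniform(-2, 4, (64, 64))
Ld = histogram_adjust(L)
assert 0.0 <= Ld.min() and Ld.max() <= 1.0
order = np.argsort(L.ravel())
assert np.all(np.diff(Ld.ravel()[order]) >= -1e-9)
```

Trimming the bins is what bounds the slope of the cumulative mapping: densely populated luminance ranges no longer claim an arbitrarily large share of the display range.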
This method can be extended to include the observer’s state of adaptation
and a time course of adaptation [529]. Such an operator, optimized to preserve
visibility, is then able to simulate what a visually-impaired person would see. The
TVI curve that is used in this algorithm is extended to account for adaptation,
which in turn is governed by a temporal factor modeling the amount of bleaching
occurring in the retina.
tion. The influence of each framework on the total lightness needs to be estimated,
and the anchors within each framework must be computed [618–620].
It is desirable to assign a probability of belonging to a particular framework
to each pixel. This leaves the possibility of a pixel having non-zero participation
in multiple frameworks, which is somewhat different from standard segmentation
algorithms that assign a pixel to at most one segment. To compute frameworks
and probabilities for each pixel, a standard K-means clustering algorithm may be
applied. This algorithm results in a set of n centroids dotted around the image. For
a centroid with a given luminance value LC , the probability of a pixel belonging
to the framework defined by this centroid is given by
\[
P_C(x, y) = \exp\left( -\frac{\left( \log_{10}\left( L_C / L_v(x, y) \right) \right)^2}{2\sigma^2} \right), \tag{17.16}
\]
where σ is the maximum difference between adjacent centroids. If no pixels
belong to a centroid with a probability higher than 0.95, this centroid is merged
with a neighbor.
The probability map thus obtained is the first step towards defined frame-
works. The concept of proximity, however, is not yet accounted for. This feature
may be implemented by filtering the probability map with a bilateral filter (17.13),
with σ1 set to half of the smaller of the image width and height and σ2 set to 0.4.
Once frameworks are computed, their impact on the image’s total lightness
can be assessed. The strength of a framework depends on both its articulation
and its relative size (see Section 5.8.5). Articulation can be estimated from the
minimum and maximum luminance within each framework:
\[
A_C = 1 - \exp\left( -\frac{\left( \log_{10}\left( L_C^{\max} / L_C^{\min} \right) \right)^2}{2\sigma^2} \right), \tag{17.17}
\]
where σ = 1/3. The size factor of the relative framework XC is computed from
the number of pixels in the framework NC as follows:
\[
X_C = 1 - \exp\left( -\frac{N_C^2}{2\sigma^2} \right), \tag{17.18}
\]
where σ = 0.1. The probability maps PC are multiplied by their respective artic-
ulation and size factors to obtain the final decomposition into frameworks:
Afterwards, these probabilities are normalized so that, for each pixel, they sum
to 1.
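Equations (17.16)–(17.18) are straightforward to evaluate. A sketch, in which N_C is interpreted as the fraction of image pixels in the framework (an assumption; the small σ = 0.1 suggests a relative rather than absolute count):

```python
import numpy as np

def framework_probability(Lc, Lv, sigma):
    """Equation (17.16): probability that a pixel of luminance Lv
    belongs to the framework anchored at centroid luminance Lc;
    sigma is the maximum difference between adjacent centroids."""
    return np.exp(-np.log10(Lc / Lv) ** 2 / (2.0 * sigma ** 2))

def articulation(Lmax, Lmin, sigma=1.0 / 3.0):
    """Equation (17.17): articulation from the framework's range."""
    return 1.0 - np.exp(-np.log10(Lmax / Lmin) ** 2 / (2.0 * sigma ** 2))

def size_factor(Nc, sigma=0.1):
    """Equation (17.18).  Nc is interpreted here as the *fraction* of
    image pixels in the framework (an assumption; the small sigma
    suggests a relative rather than absolute count)."""
    return 1.0 - np.exp(-Nc ** 2 / (2.0 * sigma ** 2))

assert framework_probability(5.0, 5.0, 0.5) == 1.0   # pixel at the centroid
assert articulation(100.0, 1.0) > 0.99               # two decades: articulate
assert size_factor(0.05) < size_factor(0.5)          # larger frameworks weigh more
```

As in the text, the final per-pixel weights would be the products of these three factors, renormalized so that they sum to 1 at each pixel.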
For each framework, the highest luminance rule may now be applied to find
an anchor. However, direct application of this rule may result in the selection of
a luminance value of a patch that is perceived as self-luminous. As the anchor
should be the highest luminance value that is not perceived as self-luminous, se-
lection of the highest luminance value should be preceded by filtering the area of
the local framework with a large Gaussian filter.
Assuming that WC is the anchor selected for local framework C, the net light-
ness of the whole image can be computed as follows:
\[
L_{\text{lightness}}(x, y) = 0.7 \, \log_{10}\!\left( \frac{L_v(x, y)}{W_0} \right)
+ 0.3 \sum_C P_C(x, y) \, \log_{10}\!\left( \frac{L_v(x, y)}{W_C} \right). \tag{17.20}
\]
Here, W0 is the anchor of the global framework. This then constitutes a com-
putational model of lightness perception that can be extended for the purpose of
tone reproduction. To achieve dynamic range reduction, the above equation can
be augmented with individual scale factors fC that scale the output of each of the
frameworks independently:
The factors fC are chosen such that each framework fits within the display capa-
bilities of the target display device.
One of the strengths of using a computational model of lightness perception
for the purpose of tone reproduction, is that traditionally difficult phenomena such
as the Gelb effect, introduced in Section 5.8.5, can be handled correctly. To
Figure 17.21. Lightness plotted against log10 L for the lightness model, photographic tone
reproduction, and bilateral filtering.
demonstrate, Figure 17.20 shows the output of the lightness model for a set of
stimuli and compares them against the photographic tone-reproduction operator
as well as the bilateral filter. By visual inspection, the lightness model preserves
the Gelb effect, whereas the other two tone-reproduction operators break down
to different degrees. This is confirmed by plotting the lightness values against
luminance input, as shown in Figure 17.21.
This model of lightness perception may have applications beyond tone re-
production. In particular, this model appears to be suitable for detecting pixels
that belong to self-luminous surfaces versus pixels that depict bright reflective
surfaces. It would be interesting to see this model applied within the context of
image appearance modeling, where pixels depicting self-luminous surfaces ought
to be treated differently from pixels depicting bright surfaces.
Further, this model may be used for the purpose of white balancing images
of scenes that have different areas illuminated by different light sources. For in-
stance, shops typically illuminate their displays throughout the day, even when
daylight enters through the shop window. Images taken under these circum-
stances may have areas illuminated by tungsten light and other areas illuminated
by daylight. The approach discussed here would partition these areas into differ-
ent frameworks and treat them separately.
recently appeared [1058]. Here, the image is segmented using K-means segmen-
tation [98,724]. At the edge of each segment, contrast is enhanced with a function
that models contrasts as found in the Cornsweet illusion:
\[
m(x, y) =
\begin{cases}
1 + a \exp\left( -d^2/\sigma^2 \right), & (x, y) \in \text{segment } A, \\
1 - a \exp\left( -d^2/\sigma^2 \right), & (x, y) \in \text{segment } B.
\end{cases} \tag{17.22}
\]
Here, d is the shortest distance to the edge between segment A and segment B,
and σ is a user parameter to specify the width of the enhanced area. It is assumed
that the average luminance in segment A is greater than in segment B. Further,
a is a measure of the amplitude of the scaling and, thus, specifies how strong
the effect is. The value of m is applied as a multiplier to enhance the u* and v*
channels in the CIE L∗ u∗ v∗ color space. Applying counter shading only to the
chromatic channels yields a subtler effect than using the luminance channel. An
example result is shown in Figure 17.22. Note that in this image, halo artifacts are
not obvious, although the overall impression is that of a more saturated image.
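A sketch of the countershading profile, under one reading of Equation (17.22) in which segment A (the brighter side) is raised and segment B lowered; the values of a and σ here are illustrative:

```python
import numpy as np

def countershade_gain(d, in_segment_a, a=0.3, sigma=5.0):
    """Countershading multiplier m(x, y): raised on the brighter side
    (segment A), lowered on the darker side (segment B), decaying to 1
    with distance d from the edge.  Follows one reading of Equation
    (17.22); the values of a and sigma are illustrative."""
    bump = a * np.exp(-np.asarray(d, float) ** 2 / sigma ** 2)
    return np.where(in_segment_a, 1.0 + bump, 1.0 - bump)

assert np.isclose(countershade_gain(0.0, True), 1.3)    # strongest at the edge
assert np.isclose(countershade_gain(0.0, False), 0.7)
assert np.isclose(countershade_gain(100.0, True), 1.0)  # vanishes far away
```

As in the text, m would then multiply the u∗ and v∗ channels of a CIE L∗u∗v∗ image rather than the luminance channel, giving the subtler chromatic counter-shading effect.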
17.10 Post-Processing
After tone reproduction, it is possible to apply several post-processing steps to ei-
ther improve the appearance of the image, adjust its saturation, or correct for the
gamma of the display device. Here, we discuss two frequently used techniques
that have a relatively large impact on the overall appearance of the tone-mapped
results—a technique to desaturate the results and a technique to clamp a percent-
age of the lightest and darkest pixels.
Figure 17.23. Per-channel gamma correction may desaturate the image. The images were
desaturated with values of s = 0.1 through 0.9 with increments of 0.1; Mainau, Germany,
June 2005.
Alternatively, the saturation constant s may be chosen smaller than 1. Such per-
channel gamma correction may desaturate the results to an appropriate level, as
shown in Figure 17.23 [306].
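One common formulation of this per-channel desaturation (a sketch, not necessarily the exact form of [306]) rescales each channel by its ratio to luminance raised to the power s:

```python
import numpy as np

def desaturate(rgb, L, Ld, s=0.6):
    """Rescale each channel as (C / L)**s * Ld: s = 1 keeps the
    original color ratios, smaller s pulls colors toward gray.
    This is one common formulation of per-channel desaturation;
    s around 0.4-0.6 is a typical starting point."""
    return (rgb / L[..., None]) ** s * Ld[..., None]

# s = 1 preserves color ratios relative to the tone-mapped luminance.
out = desaturate(np.array([[[2.0, 4.0, 8.0]]]), np.array([[4.0]]),
                 np.array([[0.5]]), s=1.0)
assert np.allclose(out, [[[0.25, 0.5, 1.0]]])
# Achromatic pixels are unaffected by the choice of s.
gray = desaturate(np.full((2, 2, 3), 4.0), np.full((2, 2), 4.0),
                  np.full((2, 2), 0.5))
assert np.allclose(gray, 0.5)
```

Here L is the luminance before tone mapping and Ld the tone-mapped luminance; lowering s compresses the channel-to-luminance ratios toward 1, which is the desaturation visible in Figure 17.23.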
The results of tone reproduction may appear unnatural, because human color
perception is non-linear with respect to overall luminance level. This means that
if we view an image of a bright outdoor scene on a monitor in a dim environ-
ment, we are adapted to the dim environment rather than the outdoor lighting. By
keeping color ratios constant, we do not take this effect into account. The above
17.10.2 Clamping
A common post process for tone reproduction is clamping. It is for instance part of
the iCAM model, as well as the sub-band encoding scheme. Clamping is normally
applied to both very dark as well as very light pixels. Rather than specify a hard
threshold beyond which pixels are clamped, a better way is to specify a percentile
of pixels that will be clamped. This gives better control over the final appearance
of the image.
By selecting a percentile of pixels to be clamped, inevitably detail will be lost
in the dark and light areas of the image. However, the remainder of the luminance
values is spread over a larger range, and this creates better detail visibility for
large parts of the image.
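Percentile-based clamping is only a few lines; a sketch:

```python
import numpy as np

def clamp_percentiles(L, dark_pct=1.0, light_pct=1.0):
    """Clamp the darkest dark_pct% and lightest light_pct% of the
    pixels, then renormalize the surviving range to [0, 1].  The
    percentages (typically 1-5%) are chosen per image."""
    lo = np.percentile(L, dark_pct)
    hi = np.percentile(L, 100.0 - light_pct)
    return np.clip((L - lo) / (hi - lo), 0.0, 1.0)

Ld = clamp_percentiles(np.linspace(0.0, 1.0, 1000), 2.0, 2.0)
assert Ld.min() == 0.0 and Ld.max() == 1.0
assert (Ld == 0.0).sum() >= 15   # roughly the darkest 2% hit the floor
```

Specifying percentiles rather than hard thresholds is what makes the control image-independent: the same 2% setting adapts to whatever luminance distribution the image has.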
The percentage of pixels clamped varies usually between 1% and 5%, depend-
ing on the image. The effect of clamping is shown in Figure 17.24. All images are
treated with the photographic tone-reproduction technique. The images on the left
show the results without clamping, and therefore all pixels are within the display
range. In the image on the top right, the darkest 7% of the pixels are clamped,
as well as the lightest 2% of the pixels. This has resulted in an image that has
reduced visible detail in the steps of the amphitheater, as well as in the wall and
the bushes in the background. However, the overall appearance of the image has
improved, and the clamped image conveys the atmosphere of the environment
better than the directly tone-mapped image.
The bottom-right image of Figure 17.24 also conveys the atmosphere of the
scene better. This photograph was taken during a bright day. In this case, the
Figure 17.24. Examples of clamping. All images were tone mapped using photographic
tone reproduction. The left images were not clamped, whereas 7% of the darkest pixels and
2% of the lightest pixels were clamped in the top-right image. Top: Salamis amphitheater,
North Cyprus, July 2006. Bottom: Lala Mustafa Pasha Mosque, Famagusta, North Cyprus,
July 2006.
clamping has allowed more detail to become visible in the back wall around the
window.
To preserve the appearance of the environment, the disadvantage of losing
detail in the lightest and darkest areas is frequently outweighed by a better overall
impression of brightness. The photographs shown in Figure 17.24 were taken
during a very bright day, and this is not conveyed well in the unclamped images.
Finally, clamping has a relatively large effect on the results. For typical
applications, it is therefore attractive to add this technique to any
tone-reproduction operator. However, as only a few tone-reproduction operators
incorporate this feature as standard, it also clouds the assessment of the quality
of tone-reproduction operators: the differences between operators appear to be
of similar magnitude to the effect of clamping itself.
17.11.1 Operators
For each of the studies discussed in the following, we refer to the original pa-
pers for detailed descriptions of the experimental set-up and analysis. We aim to
provide an overview as well as the main conclusions of each of the studies. The
algorithms included are summarized as follows:
Linear scaling. A global operator, whereby the image is linearly scaled to fit into
the range of the display.
Time-dependent operator. A global operator that takes the time course of adapta-
tion into account [885].
17.11.2 Comparison
In this section, we present side-by-side comparisons of several recent tone-
reproduction operators. As discussed above, the successful visual comparison
depends on several parameters:
• Choice of test images. The image used in this comparison is only one of
many that we could have chosen. It is well known that some images suit
certain operators better than others. An exhaustive comparison, or even
a visual comparison with a reasonable number of test images, is beyond
the scope of this book. We have therefore tried to select a reasonable test
image, but note that for specific applications, a better test image could have
been chosen.
well preserved in the bright areas, except by photographic tone reproduction. This
operator was designed to let bright areas burn out, to maintain a good impression
of brightness. Hence, this operator affords a different trade-off between detail
visibility and overall perception of brightness.
We applied clamping to the linearly scaled operator with 10% of the bottom
pixels and 5% of the top pixels clamped. This has resulted in significant burn-out
in the windows and loss of detail in the dark areas. However, this operator has
created a well exposed mid-range, presenting the walls and the floor well. Without
clamping, this image would appear to be largely black, with the windows still
burnt-out. The linear operator is effectively a baseline result, the quality of which
all other operators will have to exceed to justify their computational expense.
Both the histogram-adjustment operator and the photoreceptor-inspired operator
present sufficient detail in the dark areas; however, the histogram-adjustment
operator lets the bright area burn out. The bright windows are reasonably well
represented by the photoreceptor-based operator. The overall impression that this
window is very bright in the real environment is well conveyed by both operators.
The Tumblin-Rushmeier operator trades detail visibility for global contrast.
This operator would benefit from additional clamping to improve detail visibility
of the mid-range of luminance values. However, we applied clamping only if
this was part of the original operator, which for Figure 17.25 was only the sub-
band encoding scheme and the linear scaling. Similarly, gamma correction was
applied to all images, except the sub-band encoding technique, as this operator
was designed to handle gamma issues directly.
log10(Ld(x, y)) ≈ f(log10(Lv(x, y))) = G1 log10(Lv(x, y)) + G2, (17.24)
with G1 and G2 coefficients estimated through linear regression over all tone-
mapped pixel values. Here, f is the estimated tone-reproduction function (lin-
ear in the log domain) that is clamped to the minimum and maximum display
luminance. The global contrast Δlog10(Ld) in the brightness domain can then be
derived using the minimum and maximum brightness in the tone-mapped image:

Figure 17.26. Change in global contrast dCg for different tone-reproduction operators
(photoreceptor, photographic (global), photographic (local), lightness perception, bilateral
filtering, gradient-domain compression, adaptive logarithmic mapping, and gamma 2.2;
after Smith, Krawczyk, Myszkowski, and Seidel [1058]).
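The fit in (17.24) is an ordinary least-squares regression in the log domain. A minimal sketch (using NumPy's polyfit; the small offset eps guarding against a log of zero is our own addition):

```python
import numpy as np

def fit_log_tone_curve(hdr_lum, ldr_lum, eps=1e-6):
    # Fit log10(Ld) = G1 * log10(Lv) + G2 over all pixels (Eq. 17.24).
    x = np.log10(np.asarray(hdr_lum, dtype=float).ravel() + eps)
    y = np.log10(np.asarray(ldr_lum, dtype=float).ravel() + eps)
    g1, g2 = np.polyfit(x, y, deg=1)
    return g1, g2
```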
A model of the response of the human visual system to a given contrast can be
approximated with the following transducer function T (Cl ) [733]:
The constants in this equation are chosen such that T (0) = 0 and, for an appro-
priately chosen threshold contrast Ct , we have T (Ct ) = 1. The threshold contrast
was chosen to be 1%, so that Ct = log10 (1.01). For high dynamic range images,
this assumption may be inaccurate. To correct for the unique conditions posed by
high dynamic range imaging, a correction factor may be applied to T (Cl ), leading
to the following model of response to local contrast:
Cresp(x, y) = T(Cl(L, x, y)) · Ct / log10( (LB(x, y) + tvi(LB(x, y))) / LB(x, y) ). (17.29)
Figure 17.27. Relative area for which the change in visible detail dCresp went from > 1
to < 1 due to tone reproduction. This graph is for dark areas of the image (after Smith,
Krawczyk, Myszkowski, and Seidel [1058]).
Figure 17.28. Relative area for which the change in visible detail dCresp went from > 1
to < 1 due to tone reproduction. This graph is for light areas of the image (after Smith,
Krawczyk, Myszkowski, and Seidel [1058]).
The change in hypothetical response that occurs when a high dynamic range
image is tone mapped is thus a relevant metric to evaluate how well local details
are preserved. As the unit of measurement of Cresp is in just noticeable differences
(JNDs), values less than 1 are imperceptible and can therefore be set to 0. Also,
if tone reproduction causes the contrast to be reduced from > 1 JND to < 1 JND,
the change is set to 1. In all other cases, the change in detail visibility is simply
the difference in contrast response:
dCresp(x, y) =
  1                                   if Cresp(Lv, x, y) > 1 > Cresp(Ld, x, y),
  0                                   if |Cresp(Lv, x, y) − Cresp(Ld, x, y)| < 1,
  Cresp(Lv, x, y) − Cresp(Ld, x, y)   otherwise. (17.30)
This metric is typically applied directly to pairs of HDR and tone-mapped images.
However, it is instructive to separate the images into light and dark areas and
to apply the metric to each separately. The threshold between light and dark
areas is arbitrarily set to the log-average luminance of the HDR image.
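Once the two contrast-response maps are available, (17.30) is a per-pixel computation. A minimal sketch (array names are ours; the responses for the HDR input and for the tone-mapped result are assumed to be precomputed):

```python
import numpy as np

def detail_change(resp_hdr, resp_ldr):
    # Change in detail visibility, in JNDs (Eq. 17.30).
    diff = resp_hdr - resp_ldr
    # Sub-JND changes are imperceptible and count as zero.
    d = np.where(np.abs(diff) < 1.0, 0.0, diff)
    # Contrast pushed from visible (> 1 JND) to invisible (< 1 JND) counts as 1.
    lost = (resp_hdr > 1.0) & (resp_ldr < 1.0)
    return np.where(lost, 1.0, d)
```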
The analysis, shown for dark areas in Figure 17.27 and for light areas in Fig-
ure 17.28, measures the relative area of the image where local contrast visible in
the HDR input is mapped to less than one JND by the tone-reproduction operator.
This is the area where visible contrast is lost and should therefore be as small
as possible.
17.11.4 Validation
Tone-reproduction validation studies aim to rank operators in terms of their
suitability for a particular task. The design of such experiments thus involves a
task that observers are asked to perform. The accuracy with which the task is
performed is then taken as a measure of how suitable a tone-reproduction operator
is for that particular task. It is therefore important to select a sensible task,
so that the practicality of tone-reproduction operators can be inferred from task
performance.
This particular requirement causes several difficulties. For instance, the game
industry is currently moving to high dynamic range imaging. This means that
rendered content needs to be tone mapped first before it can be displayed. The
real-world task is then “entertainment.” Maybe the aim of the game designer is
to achieve a high level of involvement of the player; or perhaps the goal is to
enable players to achieve high scores. In any case, it will be difficult to devise a
laboratory experiment that measures entertainment, involvement, or engagement.
For other tasks—medical visualization was mentioned earlier—the validation
study may be easier to define, but such application-specific validation studies are
currently in their infancy. A notable exception is Park and Montag’s study, which
evaluates tone reproduction operators for their ability to represent non-pictorial
data [872]. As a result, validation studies tend to address issues related to prefer-
ence and/or similarity.
Preference ratings, which measure whether one image is liked better than
another, have various advantages [263, 625, 627]. It is, for instance, not neces-
sary to compare tone-mapped images to real scenes, as only pair-wise compar-
isons between tone-mapped images are required. On the downside, not having
seen the original environment can be viewed as a serious limitation [44]. The
most preferred image may still deviate significantly from the real environment
from which it was derived. Preference scalings will not be able to signal such
deviations.
On the other hand, similarity ratings are designed to determine which tone-
mapped result looks most natural, i.e., most like the real environment that it de-
picts [627, 659, 1278]. Such studies benefit from having the real environment
present; it has been shown that including the real environment in the study may
change observers’ opinions [44].
The first validation study that assessed tone-reproduction operators appeared
in 2002 [263]; it measured image preference and, in a second experiment, nat-
uralness. It found that photographic tone reproduction consistently scored
highest, followed by the histogram-adjustment method; LCIS scored low.
In a separate experiment, a set of seven tone-reproduction operators were
compared against two real scenes [1278]. A total of 14 observers rated the tone-
mapped images for overall brightness, overall contrast, visible detail in dark re-
gions, visible detail in light regions, and naturalness. Results of this experiment
are reproduced in Figure 17.29.
Figure 17.29. Ratings for seven tone-reproduction operators (time-dependent, linear,
histogram adjustment, Ashikhmin, photographic (local), bilateral, and adaptive logarithmic)
with respect to five different image attributes, including detail in light regions
(F = 30.36, p < 0.01) and overall contrast (F = 8.74, p < 0.01). The green colors
correspond to the operators also included in Figures 17.26, 17.27, and 17.28 (after
Yoshida, Blanz, Myszkowski, and Seidel [1278]).
This experiment reveals that the global operators (linear, histogram adjust-
ment, time-dependent and adaptive logarithmic) are perceived to be overall
brighter than the local operators (bilateral filtering, Ashikhmin, and photographic).
In addition, the global operators have higher overall contrast than the local oper-
ators. Details in dark regions are best preserved by the Ashikhmin and adaptive
logarithmic operators according to this experiment. Details in light regions are
best preserved by the local operators. The histogram-adjustment, photographic
and adaptive-logarithmic operators are perceived to be the most natural reproduc-
tions of the real scenes.
Correlations were computed between naturalness on the one hand and each of
brightness, contrast, and detail visibility on the other. It was found that none of
these attributes alone sufficiently explains the naturalness results. It is there-
fore hypothesized that a combination of these attributes, and possibly others, is
responsible for naturalness.
A second, large-scale validation study also asked which operator produces
images with the most natural appearance [659]. Images were
ranked in pairs with respect to a reference image, which was displayed on a high
dynamic range display device. Thus, for each trial, the participant viewed a ref-
erence image as well as two tone-mapped images that were created with different
operators. Preference for one of the two tone-mapped images was recorded in the
context of a specific image attribute that varied between experiments. In the first
experiment, overall similarity was tested using 23 scenes. In the next two exper-
iments, visibility and reproduction of detail was tested in light and dark regions
of the image using 10 scenes each. The experiments were carried out for six dif-
ferent tone-reproduction operators, and the analysis produced a ranking for each
image. In Figure 17.30, we summarize the results by showing, for each operator,
how many times it was ranked first, second, and so on.
The overall similarity ranking in this figure indicates that the iCAM model
produces images that were perceived to be closest to the high dynamic range
images, followed by photographic tone reproduction. In this experiment, the
bilateral-filter and the adaptive-logarithmic mapping did not perform as well.
Multiple-comparison scores were computed on the data, revealing that all
operators perform differently in a statistically significant manner, with the excep-
tion of histogram adjustment and the local eye-adaptation model, which together
form one group.
The overall significance test was repeated for gray-scale images. Here it was
found that the photographic operator and the iCAM model perform equivalently.
It is speculated that this change in ranking may result from iCAM’s roots in color
appearance modeling, as it produces particularly natural looking color images.
Figure 17.30. Number of images ranked at each position (votes versus rank, 1–6) for each
tone-reproduction operator (bilateral, photographic (local), adaptive logarithmic, histogram
adjustment, iCAM, and local eye adaptation) in experiments performed by Ledda et al. [659].
The three panels show overall similarity, similarity in dark regions, and similarity in
light regions.
In bright regions, the iCAM model still outperforms the other operators. The
photographic operator does not reproduce details in light regions as well, being
an average performer in this experiment; it performs better in dark regions.
Each of Ledda's experiments provides a ranking, which is summarized in
Figure 17.31, showing that in these experiments, iCAM and the photographic op-
erator tend to produce images that are perceived to be closer to the corresponding
high dynamic range images than can be achieved with other operators.
A further user study, in which observers rated images in terms of naturalness,
was recently presented [44]. This study assessed operators in terms of preference
(Experiment I). Operators were also compared for naturalness in the absence of a
real scene (Experiment II) and in the presence of a real scene (Experiment III). The
Figure 17.31. Ranking of operators in Ledda’s experiments [659]. The boxes around pairs
of operators indicate that these operators were not different within statistical significance.
Figure 17.32. Ratings of naturalness and preference (perceived quality per scene;
Experiment III measured which image was closest to the real scene) obtained with
Ashikhmin and Goral's experiment [44].
results for each of these experiments are reproduced in Figure 17.32 for five dif-
ferent tone-reproduction operators.
As can be seen, in this particular experiment the absence of a real scene
caused operators to perform inconsistently across scenes, suggesting that the
choice of scene has a non-negligible influence on the outcome. The results are
more consistent when the images can be compared against the real environment.
When compared directly against real environments in Experiment III, the
gradient-domain compression technique performs well, followed by adaptive
logarithmic mapping. The trilateral filter does not produce images that are close
to the real environments in this particular experiment.
For the purpose of assessing different tone-reproduction operators, as well as
developing an evaluation protocol for testing such operators, Kuang et al.
conducted a series of experiments centered around visual preference and ren-
dering accuracy in the context of a digital photography task [627]. In all three of
their experiments the bilateral filter performed well, with the photographic oper-
ator scoring similarly in two of the three experiments.
In the presence of a real scene, it was found that visual preference correlates
highly with perceived accuracy, suggesting that testing for visual preference is a
reasonable approach that enables inferences to be made about the accuracy of
tone-reproduction algorithms. Further, the results obtained with rating scales cor-
relate well with the much more time-consuming paired-comparison tests. This
leads to the conclusion that one may use the more convenient rating-scale proce-
dure to assess tone-reproduction operators.
Finally, a recent study proposed not to use real images, but rather a care-
fully constructed stimulus that enables assessment of tone-reproduction opera-
tors in terms of contrast preservation [25]. It is argued that the strength of the
Cornsweet-Craik-O’Brien illusion, discussed in Section 5.4.4, depends on its lu-
minance profile. If such an illusion is constructed in a high dynamic range image,
then tone-reproduction operators are likely to distort this luminance profile and,
therefore, weaken the strength of this illusion.
The strength of the illusion can then be inferred in a two-alternative forced-
choice experiment by comparing the tone-mapped image against a step function
whose contrast is varied between trials. Smaller matched step sizes indicate
weaker illusions. In turn, the strength of the illusion after tone reproduction is
taken as a measure of how well contrast is preserved.
Although many detailed conclusions can be drawn from this experiment, the
results are generally in agreement with previous validation studies. We infer
that the preservation of contrast therefore correlates with people's subjective
preferences.
17.11.6 Discussion
Now that more validation studies are becoming available, some careful overall
conclusions may be drawn. It is clear that operators that perform well in one
validation study are not guaranteed to perform well in all validation studies. This
hints at the possibility that these operators are not general purpose, but are suitable
only for certain specific tasks.
On the other hand, it could also be argued that the choice of task, types of
images, as well as further particulars of the experimental designs, have an impact
on the outcome. Ashikhmin’s study has found that having a real scene included
in the experimental paradigm, for instance, significantly changes the result.
Further, there are many different operators, and each validation study neces-
sarily includes only a relatively small subset of these. There are a few operators
that have been included in most of the user studies to date. Of these, the pho-
tographic operator appears to perform consistently well, except in Ashikhmin’s
study. The adaptive-logarithmic mapping performed well in Yoshida’s study as
well as in Ashikhmin’s study, but performed poorly in Ledda’s study.
Finally, Ashikhmin’s study, as well as Ledda’s study, did not find a clear dis-
tinction between the included local and global operators. On the other hand,
Yoshida’s methodology did allow local operators to be distinguished from global
operators. This suggests that perhaps for some combinations of tasks and images,
a computationally expensive local operator would not yield better results than a
cheaper global operator. However, this does not mean that there does not exist a
class of particularly challenging images for which a local operator would bring
significant benefits. However, such scenes would also be challenging to include
in validation studies.
Part IV
Appendices
Appendix A
Vectors and Matrices
A · B = Ax Bx + Ay By + Az Bz . (A.1)
The dot product is therefore a scalar quantity. The geometric interpretation is that
the dot product represents the cosine of the angle between two vectors times the
length of both vectors (see Figure A.1):
A · B = ‖A‖ ‖B‖ cos(α). (A.2)
This implies that the dot product will be zero for perpendicular vectors. Vectors
pointing in the same direction have a dot product with a value equal to the square
of their length:
A · A = ‖A‖². (A.3)
Another implication is that to find the angle between two vectors, we may nor-
malize both vectors and take the dot product:
cos(α) = (A/‖A‖) · (B/‖B‖). (A.4)
and results in a new vector which is perpendicular to both A and B, which may be
expressed as
A · (A × B) = 0, (A.6a)
B · (A × B) = 0. (A.6b)
‖A × B‖ = ‖A‖ ‖B‖ sin(α). (A.7)
While this identity could be used to compute the angle between two normalized
vectors, it is cheaper to use the dot product as indicated above in (A.4).
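In code, (A.4) amounts to dividing the dot product by the product of the vector lengths. A small sketch (the clamp guards against rounding pushing the cosine slightly outside [−1, 1]):

```python
import math

def angle_between(a, b):
    # Angle in radians between two vectors, via Eq. (A.4).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))
```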
As the two points are close to one another, the higher-order terms can be ne-
glected, so that the difference ΔT between T1 and T2 is given by
ΔT = T2 − T1 = (∂T/∂x)|P1 Δx + (∂T/∂y)|P1 Δy + (∂T/∂z)|P1 Δz. (A.10)
1 We assume throughout that we are dealing with three dimensions.
In the limit that the distance between the two points goes to zero (Δx, Δy, Δz → 0),
this becomes
dT = (∂T/∂x) dx + (∂T/∂y) dy + (∂T/∂z) dz. (A.11)
A.4 Divergence
Considering a small volume Δv of space bounded by a surface Δs, the influence
of a vector field F may cause a flux, i.e., a measurable quantity flows into the
volume (in-flux) or out of the volume (out-flux). If the volume goes to zero,
the differential out-flux is represented by the divergence of a vector field. The
divergence ∇· of a vector F is then defined as
∇·F = div F = ∂Fx/∂x + ∂Fy/∂y + ∂Fz/∂z, (A.15)
provided that F and its first partial derivatives are continuous within Δv. The
divergence operator, applied to a vector, returns a scalar.
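For a vector field sampled on a regular grid, (A.15) can be approximated with finite differences. A sketch using NumPy's gradient (uniform grid spacing assumed; the function name is ours):

```python
import numpy as np

def divergence(fx, fy, fz, h=1.0):
    # Central-difference divergence of a sampled 3D vector field (Eq. A.15).
    return (np.gradient(fx, h, axis=0)
            + np.gradient(fy, h, axis=1)
            + np.gradient(fz, h, axis=2))
```

For the field F = (x, y, z), the divergence is 3 everywhere; the approximation reproduces this exactly because the field is linear.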
Figure A.3. A volume V is split into four subvolumes dV1 through dV4. The flux emanating
from dV1 flows partially into neighboring subvolumes and partially out of volume V. Only
subvolumes at the boundary contribute to flux emanating from the volume.
normal to the surface and a component tangential to the surface. The surface
integral of the component normal to the surface, taken over the surface s, may be
related to the volume integral of the divergence of the vector field, taken over the
volume v:
∬s F · n ds = ∭v ∇·F dv. (A.16)
This relation is known as the divergence theorem or Gauss’ theorem. Its use lies
in the fact that the triple integral over the volume v may be replaced by a double
integral over the surface s that bounds the volume.
If a volume V is broken up into a set of small subvolumes, then the flux ema-
nating from a given subvolume will flow into its neighbors, as illustrated in Fig-
ure A.3. Thus for subvolumes located in the interior of V , the net flux is zero.
Only subvolumes located on the boundary of V will contribute to the flux associ-
ated with the volume. This contribution is related to the normal component of the
vector field at the surface. Hence, Gauss’ theorem, as stated above.
A.6 Curl
Suppose water flows through a river. The drag of the banks causes the water
to flow more slowly near the banks than in the middle of the river. If an object is
released in the river, the water will flow faster along one side of the object than
along the opposite side, causing the object to rotate; the water flow on either side
of the object is non-uniform. The term curl is a measure of this non-uniformity,
i.e., of the ability to cause rotation.
Modeling the flow in the river with a vector field, there will be variation in the
strength of the field dependent on location. At each point in a vector field F, the
curl measures its non-uniformity. Curl is defined as
∇×F = curl F = (∂Fz/∂y − ∂Fy/∂z) ex + (∂Fx/∂z − ∂Fz/∂x) ey + (∂Fy/∂x − ∂Fx/∂y) ez. (A.17)
The curl operator takes a vector and produces a new vector.
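Equation (A.17) can be discretized the same way as the divergence. A finite-difference sketch (uniform grid spacing assumed; the function name is ours):

```python
import numpy as np

def curl(fx, fy, fz, h=1.0):
    # Central-difference curl of a sampled 3D vector field (Eq. A.17).
    d = lambda f, axis: np.gradient(f, h, axis=axis)
    cx = d(fz, 1) - d(fy, 2)
    cy = d(fx, 2) - d(fz, 0)
    cz = d(fy, 0) - d(fx, 1)
    return cx, cy, cz
```

A rigid rotation F = (−y, x, 0), like the river example above, has curl (0, 0, 2): twice the angular velocity.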
Figure A.4. Examples of open (or capped) surfaces with associated contours.
When an open surface is subdivided into small differential surface areas, then
the integral of a vector quantity normal to each surface area may be approximated
with a line integral over the contour. The contribution of each contour cancels
out for adjacent surface areas, as shown in Figure A.5. Thus, only the contour
segments that coincide with the contour of the entire surface do not cancel out.
This leads to Stokes' Theorem:
∬s (∇× F) · n ds = ∮c F · n dc. (A.18)
Figure A.5. Neighboring differential surface areas with associated contours. Integrals over
contours c1 and c2 cancel where they are adjacent and are oriented in opposite directions.
A.8 Laplacian
The Laplacian operator on a scalar function T is given by
∇²T = ∂²T/∂x² + ∂²T/∂y² + ∂²T/∂z². (A.19)
For a vector-valued function F, this operator is defined by applying the above
operator on each of the three x, y, and z components independently, yielding the
Laplacian of vector F.
∇²Fx = ∂²Fx/∂x² + ∂²Fx/∂y² + ∂²Fx/∂z², (A.20a)
∇²Fy = ∂²Fy/∂x² + ∂²Fy/∂y² + ∂²Fy/∂z², (A.20b)
∇²Fz = ∂²Fz/∂x² + ∂²Fz/∂y² + ∂²Fz/∂z². (A.20c)
This Laplacian operator thus takes a vector and produces a new vector.
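Numerically, the scalar Laplacian (A.19) is often evaluated with the seven-point stencil. A sketch (interior points only, uniform grid spacing h; the function name is ours):

```python
import numpy as np

def laplacian(t, h=1.0):
    # Seven-point finite-difference Laplacian of a scalar field (Eq. A.19).
    lap = np.zeros_like(t)
    lap[1:-1, 1:-1, 1:-1] = (
        t[2:, 1:-1, 1:-1] + t[:-2, 1:-1, 1:-1]
        + t[1:-1, 2:, 1:-1] + t[1:-1, :-2, 1:-1]
        + t[1:-1, 1:-1, 2:] + t[1:-1, 1:-1, :-2]
        - 6.0 * t[1:-1, 1:-1, 1:-1]
    ) / (h * h)
    return lap
```

The vector Laplacian (A.20) applies the same computation to each of the three components.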
x → Mx, (A.28)
v = (x y z 0)ᵀ, (A.29)
p = (x y z 1)ᵀ, (A.30)
n → (M⁻¹)ᵀ n. (A.31)
Appendix B
Trigonometry
In this appendix, we show some of the trigonometric relations that are used else-
where in this book. While we assume basic familiarity with trigonometry, we
review some of the lesser-known relations.
Figure B.1. The cosine of the difference between two angles may be found by computing
the dot product between two unit vectors: Pu · Pv = cos(u − v) = cos(u) cos(v) + sin(u) sin(v).
By equating these two expressions, we find the first difference formula (as shown
in Figure B.1):
cos(u − v) = cos(u) cos(v) + sin(u) sin(v). (B.3)
A similar expression may be derived for the sine of the difference of two
angles. Here, we use the definition of the cross product and define two three-
dimensional vectors as
Pu = (cos(u), sin(u), 0)ᵀ, Pv = (cos(v), sin(v), 0)ᵀ. (B.4)
The length of Pu × Pv is then equal to sin(u − v) (see (A.7)). At the same time,
this vector is given by
Pu × Pv = (0, 0, sin(u) cos(v) − cos(u) sin(v))ᵀ. (B.5)
The length of this vector is equal to its z-component. We thus have the following
identity:
sin(u − v) = sin(u) cos(v) − cos(u) sin(v). (B.6)
The above two difference formulas may be extended for the cosine and sine of the
sum of two angles. Here, we use the identities cos(−v) = cos(v) and sin(−v) =
−sin(v):
cos(u + v) = cos(u) cos(v) − sin(u) sin(v), (B.7a)
sin(u + v) = sin(u) cos(v) + cos(u) sin(v). (B.7b)
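The sum and difference formulas are easy to sanity-check numerically; a small self-test sketch:

```python
import math

def check_angle_identities(u, v, tol=1e-12):
    # Verify the difference and sum formulas (B.3), (B.6), (B.7a), (B.7b).
    assert abs(math.cos(u - v) - (math.cos(u) * math.cos(v) + math.sin(u) * math.sin(v))) < tol
    assert abs(math.sin(u - v) - (math.sin(u) * math.cos(v) - math.cos(u) * math.sin(v))) < tol
    assert abs(math.cos(u + v) - (math.cos(u) * math.cos(v) - math.sin(u) * math.sin(v))) < tol
    assert abs(math.sin(u + v) - (math.sin(u) * math.cos(v) + math.cos(u) * math.sin(v))) < tol
```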
Rewriting yields
sin(u) + sin(v) = √(1 − cos(u)) √(1 + cos(u)) + √(1 − cos(v)) √(1 + cos(v)). (B.12b)
This expression can be expanded using sin²(u/2) + cos²(u/2) = 1 once more:
sin(u) + sin(v) = 2 sin(u/2) cos(u/2) [sin²(u/2) + cos²(u/2)] (B.12e)
+ 2 sin(v/2) cos(v/2) [sin²(v/2) + cos²(v/2)]. (B.12f)
We can then apply sum and difference formulae to find the desired expression:
sin(u) + sin(v) = 2 sin((u + v)/2) cos((u − v)/2). (B.13a)
Figure B.2. The definition of solid angle: a conical segment of a sphere with surface area A
subtends ω = A/r² steradians. One steradian is the angle subtended at the center of this
sphere by an area of surface equal to the square of the radius.
Figure B.3. The solid angle spanned by the surface A is equal to Ω. The projected solid
angle is shown by Ω(p).
Figure B.4. Two differential solid angles are shown on a unit sphere (θ: elevation angle;
φ: azimuthal angle). Note that the size of the differential solid angle changes based on its
vertical position on the sphere. The relationship is given by dΩ = sin(θ) dφ dθ.
On the other hand, the projected solid angle is the area of the solid angle
projected onto the base of a unit sphere (Figure B.3). The largest projected solid
angle is π, which is equal to the full area of the base.
The differential solid angle is an infinitesimally small area element on a sphere.
Its value depends on the elevation angle at which it is computed (see Figure B.4).
The sum of all differential solid angles is equal to the surface area of the sphere:
∫₀^{2π} ∫₀^{π} sin(θ) dθ dφ = 4π. (B.14)
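Equation (B.14) can be verified by direct numerical integration; a midpoint-rule sketch (the resolution parameters are arbitrary choices of ours):

```python
import math

def sphere_solid_angle(n_theta=2000, n_phi=2000):
    # Midpoint-rule evaluation of the solid angle of the full sphere (Eq. B.14).
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    # The integrand sin(theta) does not depend on phi.
    theta_sum = sum(math.sin((i + 0.5) * d_theta) for i in range(n_theta))
    return theta_sum * d_theta * n_phi * d_phi
```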
Appendix C
Complex Numbers
In this appendix, we review some basic formulae and theorems relating to com-
plex numbers.
C.1 Definition
A complex number z is defined as
z = x + iy, (C.1)
where x and y are real numbers, and i is the imaginary number (i = √−1). Com-
plex numbers may be represented by points on a complex plane, as shown in
Figure C.1. Here, the point z may either be represented in Cartesian coordinates
(x, y) or in polar coordinates (r, ϕ ). To convert a complex number from Cartesian
Figure C.1. A complex number z = x + iy = r cos(ϕ) + i r sin(ϕ) = r exp(iϕ) may be
represented as a point in the complex plane, with Cartesian coordinates x = r cos(ϕ) and
y = r sin(ϕ).
For a complex number z, the real part is indicated with x = Re{z} and the
imaginary part is given by y = Im{z}. From (C.1), it can be seen that the operators
Re{z} and Im{z} are defined as follows:
Re{z} = (z + z*)/2, (C.3)
Im{z} = (z − z*)/(2i), (C.4)
|z| = √(z z*) = √(Re{z}² + Im{z}²). (C.5)
z = x + iy (C.7a)
= r cos(ϕ) + i r sin(ϕ) (C.7b)
= r e^{iϕ}. (C.7c)
A proof of this result may be obtained by expanding each of the exp, cos and
sin functions into a Taylor series and substituting back into Equations (C.7b)
and (C.7c).
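The equivalence of the three forms in Equation (C.7) can also be checked numerically, e.g., with Python's cmath module (the sample value of z below is arbitrary):

```python
import cmath
import math

# Check of Equations (C.7a)-(C.7c): the Cartesian, trigonometric, and
# exponential forms of a complex number agree.
z = 3.0 + 4.0j
r, phi = abs(z), cmath.phase(z)  # polar coordinates (r, phi)

trig_form = r * (math.cos(phi) + 1j * math.sin(phi))  # (C.7b)
exp_form = r * cmath.exp(1j * phi)                    # (C.7c)

assert abs(z - trig_form) < 1e-12
assert abs(z - exp_form) < 1e-12
```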
Figure C.2. Values of e^{iϕ} may be deduced for specific values of ϕ by observing the unit
circle in the complex plane.
C.3 Theorems
By drawing a unit circle around the origin in the complex plane, we obtain several
useful expressions for complex exponential functions (see Figure C.2):
e^{0} = 1, (C.8a)
e^{iπ/2} = i, (C.8b)
e^{iπ} = −1, (C.8c)
e^{3iπ/2} = −i. (C.8d)
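These unit-circle values can likewise be confirmed with a small numerical check (not from the book), again using Python's cmath module:

```python
import cmath
import math

# Check of Equations (C.8a)-(C.8d): values of exp(i*phi) at quarter
# turns around the unit circle.
quarter_turns = [
    (0.0, 1.0),                   # (C.8a)
    (math.pi / 2.0, 1.0j),        # (C.8b)
    (math.pi, -1.0),              # (C.8c)
    (3.0 * math.pi / 2.0, -1.0j), # (C.8d)
]
for phi, expected in quarter_turns:
    assert abs(cmath.exp(1j * phi) - expected) < 1e-12
```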
Thus, the complex function f(t) may be written as the product of the complex
phasor A = a e^{iϕ} and the time factor e^{iωt}. This is an important result that
allows the time dependency to be eliminated from partial differential equations,
thereby reducing them to ordinary differential equations. These results are used
in finding solutions to the wave equations for plane waves in time-harmonic
fields in Section 2.2.3. To reconstitute a solution, it suffices to take the real part,
f(t) = Re{A e^{iωt}}.
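The phasor construction described above can be illustrated with a short numerical sketch (the signal parameters below are arbitrary, chosen only for the demonstration):

```python
import cmath
import math

# Sketch of the phasor idea: a real harmonic signal a*cos(omega*t + phi)
# is recovered as the real part of the time-independent complex phasor
# A = a*exp(i*phi) multiplied by the time factor exp(i*omega*t).
a, phi, omega = 2.0, 0.3, 5.0
A = a * cmath.exp(1j * phi)  # complex phasor (no time dependency)

for t in (0.0, 0.1, 0.7):
    reconstructed = (A * cmath.exp(1j * omega * t)).real
    direct = a * math.cos(omega * t + phi)
    assert abs(reconstructed - direct) < 1e-12
```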
Appendix D
Units and Constants
In this appendix, we enumerate the constants, symbols, and units used in this book.
Also given are conversions between commonly encountered units, insofar as they
deviate from SI units. The basic quantities are given in Table D.1. Some relevant
units outside the SI system are given in Table D.2. The 22 derived SI units are
enumerated in Table D.3. Constants (in SI units) are given in Table D.4. For further
information on SI units, see the NIST website (http://physics.nist.gov/cuu/Units/
index.html).
Appendix E
The CIE Luminous
Efficiency Functions
Table E.1. The CIE 1924 photopic luminous efficiency function, V(λ).
Table E.2. The CIE 1951 scotopic luminous efficiency function, V′(λ).
Table E.3. The CIE 1988 photopic luminous efficiency function, VM(λ).
Appendix F
CIE Illuminants
Appendix G
Chromaticity Coordinates
of Paints
In this appendix, we list a set of chromaticity coordinates for paints. Table G.1
is after Barnes, who measured the reflectances of both organic and inorganic pig-
ments used for painting [68]. This table is useful for specifying colors in render-
ing equations. It should be noted that these values were obtained under standard
illuminant C.
Pigment Y x y Pigment Y x y
Red lead 32.76 0.5321 0.3695 Burnt umber 5.08 0.4000 0.3540
English vermillion 22.26 0.5197 0.3309 Malachite 41.77 0.2924 0.3493
Cadmium red 20.78 0.5375 0.3402 Chrome green (medium) 16.04 0.3133 0.4410
Madder lake 33.55 0.3985 0.2756 Verdigris 16.67 0.1696 0.2843
Alizarin crimson 6.61 0.5361 0.3038 Emerald green 39.12 0.2446 0.4215
Carmine lake 5.04 0.4929 0.3107 Terre verte 29.04 0.3092 0.3510
Dragon’s blood 4.94 0.4460 0.3228 Cobalt green 19.99 0.2339 0.3346
Realgar 32.35 0.4868 0.3734 Viridian 9.85 0.2167 0.3635
Venetian red 13.12 0.4672 0.3462 Cobalt blue 16.81 0.1798 0.1641
Indian red 10.34 0.3797 0.3194 Genuine ultramarine 18.64 0.2126 0.2016
Cadmium orange (medium) 42.18 0.5245 0.4260 Cerulean blue 18.19 0.1931 0.2096
Gamboge 29.87 0.4955 0.4335 Azurite 9.26 0.2062 0.2008
Yellow ochre 41.15 0.4303 0.4045 Manganese violet 26.99 0.3073 0.2612
Zinc yellow 82.57 0.4486 0.4746 Cobalt violet 9.34 0.2817 0.1821
Indian yellow 40.01 0.4902 0.4519 Smalt 8.25 0.1898 0.1306
Saffron 33.45 0.4470 0.4345 Blue verditer 16.46 0.1899 0.1880
Yellow lake 27.24 0.4410 0.4217 French ultramarine blue 7.84 0.1747 0.1151
Strontium lemon yellow 84.36 0.4157 0.4732 Indigo 3.62 0.3180 0.2728
Hansa yellow 5 G 75.09 0.4447 0.5020 Prussian blue 1.30 0.2883 0.2453
Chrome yellow (medium) 63.07 0.4843 0.4470 Titanium white A 95.75 0.3124 0.3199
Cadmium yellow (light) 76.66 0.4500 0.4819 Titanium white B 95.66 0.3122 0.3196
Orpiment 64.45 0.4477 0.4402 Zinc white 94.88 0.3122 0.3200
Cobalt yellow 50.48 0.4472 0.4652 White lead 87.32 0.3176 0.3242
Raw sienna 20.03 0.4507 0.4043 Lamp black 4.08 0.3081 0.3157
Burnt sienna 7.55 0.4347 0.3414 Ivory black 2.22 0.3055 0.3176
Raw umber 7.44 0.3802 0.3693
Table G.1. Yxy coordinates for a selection of organic and inorganic paints.
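Since Table G.1 lists Yxy coordinates, applications that require tristimulus values must first convert them. A minimal sketch of the standard Yxy-to-XYZ conversion (the helper name yxy_to_xyz is illustrative, not from the book):

```python
# Convert a Yxy entry of Table G.1 to CIE XYZ tristimulus values using
# the standard relations X = x*Y/y and Z = (1 - x - y)*Y/y. The sample
# values below are the "Red lead" row of the table.
def yxy_to_xyz(Y, x, y):
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    return X, Y, Z

X, Y, Z = yxy_to_xyz(32.76, 0.5321, 0.3695)  # red lead

# Round trip: chromaticities recovered from XYZ match the table entry.
assert abs(X / (X + Y + Z) - 0.5321) < 1e-9
assert abs(Y / (X + Y + Z) - 0.3695) < 1e-9
```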
Bibliography
[1] E. Aas and J. Bogen. “Colors of Glacier Water.” Water Resources Research 24:4 (1988), 561–
565.
[2] P. M. Acosta-Serafini, I. Masaki, and C. G. Sodini. “A 1/3” VGA Linear Wide Dynamic Range
CMOS Image Sensor Implementing a Predictive Multiple Sampling Algorithm with Overlap-
ping Integration Intervals.” IEEE Journal of Solid-State Circuits 39:9 (2004), 1487–1496.
[3] A. Adams and G. Haegerstrom-Portnoy. “Color Deficiency.” In Diagnosis and Management in
Vision Care. Boston: Butterworth, 1987.
[4] D. L. Adams and S. Zeki. “Functional Organization of Macaque V3 for Stereoscopic Depth.”
Journal of Neurophysiology 86:5 (2001), 2195–2203.
[5] J. Adams, K. Parulski, and K. Spaulding. “Color Processing in Digital Cameras.” IEEE Micro
18:6 (1998), 20–30.
[6] M. M. Adams, P. R. Hof, R. Gattass, M. J. Webster, and L. G. Ungerleider. “Visual Cortical
Projections and Chemoarchitecture of Macaque Monkey Pulvinar.” Journal of Comparative
Neurology 419:3 (2000), 377–393.
[7] J. E. Adams Jr. “Interactions between Color Plane Interpolation and Other Image Processing
Functions in Electronic Photography.” In Proceedings of the SPIE 2416, pp. 144–155. Belling-
ham, WA: SPIE, 1995.
[8] A. Adams. The Camera. The Ansel Adams Photography Series. Boston: Little, Brown and Company, 1980.
[9] A. Adams. The Negative. The Ansel Adams Photography Series. Boston: Little, Brown and Company, 1981.
[10] A. Adams. The Print. The Ansel Adams Photography Series. Boston: Little, Brown and Company, 1983.
[11] E. H. Adelson and J. R. Bergen. “The Plenoptic Function and the Elements of Early Vision.” In
Computational Models of Visual Processing, pp. 3–20. Cambridge, MA: MIT Press, 1991.
[12] E. H. Adelson and J. Y. A. Wang. “The Plenoptic Camera.” IEEE Transactions on Pattern
Analysis and Machine Intelligence 14:2 (1992), 99–106.
[13] E. H. Adelson. “Lightness Perception and Lightness Illusions.” In The New Cognitive Neurosciences, edited by M. Gazzaniga, 2nd edition, pp. 339–351. Cambridge, MA: MIT Press, 2000.
[14] M. Aggarwal and N. Ahuja. “Split Aperture Imaging for High Dynamic Range.” International
Journal of Computer Vision 58:1 (2004), 7–17.
[15] M. Aggarwal, H. Hua, and N. Ahuja. “On Cosine-Fourth and Vignetting Effects in Real Lenses.”
In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV), 1, 1,
pp. 472–479, 2001.
[16] M. Aguilar and W. S. Stiles. “Saturation of the Rod Mechanism of the Retina at High Levels of
Stimulation.” Optica Acta 1:1 (1954), 59–65.
[17] W. E. Ahearn and O. Sahni. “The Dependence of the Spectral and Electrical Properties of AC
Plasma Panels on the Choice and Purity of the Gas Mixture.” SID International Symposium
Digest 7 (1978), 44–45.
[18] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford. “Captcha: Telling Humans and Computers
Apart Automatically.” In Advances in Cryptology - Proceedings of Eurocrypt 2003, Lecture
Notes in Computer Science 2656, pp. 294–311. Berlin: Springer, 2003.
[19] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford. “Telling Humans and Computers Apart
Automatically.” Communications of the ACM 47:2 (2004), 57–60.
[20] T. Ajito, T. Obi, M. Yamaguchi, and N. Ohyama. “Multiprimary Color Display for Liquid
Crystal Display Projectors using Diffraction Grating.” Optical Engineering 38:11 (1999), 1883–
1888.
[21] T. Ajito, T. Obi, M. Yamaguchi, and N. Ohyama. “Expanded Color Gamut Reproduced by Six-
Primary Projection Display.” In Proceedings of the SPIE 3954, edited by M. H. Wu, pp. 130–137.
Bellingham, WA: SPIE, 2000.
[22] N. Akahane, S. Sugawa, S. Adachi, K. Mori, T. Ishiuchi, and K. Mizobuchi. “A Sensitivity
and Linearity Improvement of a 100-dB Dynamic Range CMOS Image Sensor using a Lateral
Overflow Integration Capacitor.” IEEE Journal of Solid-State Circuits 41:4 (2006), 851–858.
[23] T. Akenine-Möller and E. Haines. Real-Time Rendering, 2nd edition. Natick, MA: A K Peters,
2002.
[24] A. O. Akyüz and E. Reinhard. “Noise Reduction in High Dynamic Range Imaging.” Journal of
Visual Communication and Image Representation 18:5 (2007), 366–376.
[25] A. O. Akyüz and E. Reinhard. “Perceptual Evaluation of Tone Reproduction Operators using
the Cornsweet-Craik-O’Brien Illusion.” ACM Transactions on Applied Perception 4:4 (2007),
20–1 – 20–29.
[26] A. O. Akyüz, R. Fleming, B. Riecke, E. Reinhard, and H. Bülthoff. “Do HDR Displays Support
LDR Content? A Psychophysical Investigation.” ACM Transactions on Graphics 26:3 (2007),
38–1 – 38–7.
[27] E. Allen. “Analytical Color Matching.” Journal of Paint Technology 39:509 (1967), 368–376.
[28] E. Allen. “Basic Equations used in Computer Color matching II. Tristimulus Match, Two-
Constant Theory.” Journal of the Optical Society of America 64:7 (1974), 991–993.
[29] D. H. Alman. “CIE Technical Committee 1-29, Industrial Color Difference Evaluation Progress
Report.” Colour Research and Application 18 (1993), 137–139.
[30] M. Alpern and J. Moeller. “The Red and Green Cone Visual Pigments of Deuteranomalous
Trichromacy.” Journal of Physiology 266 (1977), 647–675.
[31] M. Alpern and S. Torii. “The Luminosity Curve of the Deuteranomalous Fovea.” Journal of
General Physiology 52:5 (1968), 738–749.
[32] M. Alpern and S. Torii. “The Luminosity Curve of the Protanomalous Fovea.” Journal of
General Physiology 52:5 (1968), 717–737.
[33] M. Alpern and T. Wake. “Cone Pigment in Human Deutan Color Vision Defects.” Journal of
Physiology 266 (1977), 595–612.
[34] M. Alpern. “The Stiles-Crawford Effect of the Second Kind (SCII): A Review.” Perception 15:6
(1986), 785–799.
[35] B. Anderson and J. Winawer. “Image Segmentation and Lightness Perception.” Nature 434
(2005), 79–83.
[36] R. Anderson. “Matrix Description of Radiometric Quantities.” Applied Optics 30:7 (1991),
858–867.
[37] B. L. Anderson. “Perceptual Organization and White’s Illusion.” Perception 32 (2003), 269–284.
[38] A. Angelucci, J. B. Levitt, E. J. S. Walton, J.-M. Hupé, J. Bullier, and J. S. Lund. “Circuits for
Local and Global Signal Integration in Primary Visual Cortex.” Journal of Neuroscience 22:19
(2002), 8633–8646.
[39] A. Ångström. “The Albedo of Various Surfaces of Ground.” Geografiska Annaler 7 (1925), 323–342.
[40] R. A. Applegate and V. Lakshminarayanan. “Parametric Representation of Stiles-Crawford
Functions: Normal Variation of Peak Location and Directionality.” Journal of the Optical Soci-
ety of America A 10:7 (1993), 1611–1623.
[41] G. B. Arden. “The Importance of Measuring Contrast Sensitivity in Cases of Visual Distur-
bances.” British Journal of Ophthalmology 62 (1978), 198–209.
[42] L. Arend and A. Reeves. “Simultaneous Color Constancy.” Journal of the Optical Society of
America A 3:10 (1986), 1743–1751.
[43] P. Artal, E. Berrio, A. Guirao, and P. Piers. “Contribution of the Cornea and Internal Surfaces to
the Change of Ocular Aberrations with Age.” Journal of the Optical Society of America A 19:1
(2002), 137–143.
[44] M. Ashikhmin and J. Goral. “A Reality Check for Tone Mapping Operators.” ACM Transactions
on Applied Perception 3:4 (2006), 399–411.
[45] M. Ashikhmin and P. Shirley. “An Anisotropic Phong BRDF Model.” journal of graphics tools
5:2 (2002), 25–32.
[46] M. Ashikhmin, S. Premože, and P. Shirley. “A Microfacet-Based BRDF Generator.” In Proceedings of ACM SIGGRAPH 2000, Computer Graphics Proceedings, Annual Conference Series,
edited by K. Akeley, pp. 65–74. Reading, MA: Addison-Wesley, 2000.
[47] M. Ashikhmin. “A Tone Mapping Algorithm for High Contrast Images.” In Proceedings of
the 13th Eurographics Workshop on Rendering, pp. 145–155. Aire-la-Ville, Switzerland: Euro-
graphics Association, 2002.
[48] ASTM. “Standard Tables for Reference Solar Spectral Irradiances: Direct Normal and Hemi-
spherical on 37◦ Tilted Surface.” Technical Report G173-03, American Society for Testing and
Materials, West Conshohocken, PA, 2003.
[49] ASTM. “Standard Terminology of Appearance.” Technical Report E 284, American Society for
Testing and Materials, West Conshohocken, PA, 2006.
[50] D. A. Atchison and G. Smith. “Chromatic Dispersions of the Ocular Media of Human Eyes.”
Journal of the Optical Society of America A 22:1 (2005), 29–37.
[51] D. A. Atchison, G. Smith, and N. Efron. “The Effect of Pupil Size on Visual Acuity in Uncor-
rected and Corrected Myopia.” American Journal of Optometry and Physiological Optics 56:5
(1979), 315–323.
[52] J. J. Atick and N. A. Redlich. “What Does the Retina Know about Natural Scenes?” Neural
Computation 4 (1992), 196–210.
[53] H. Aubert. Physiologie der Netzhaut. Breslau, Germany: E Morgenstern, 1865.
[54] V. A. Babenko, L. G. Astafyeva, and V. N. Kuzmin. Electromagnetic Scattering in Disperse
Media: Inhomogeneous and Anisotropic Particles. Berlin: Springer-Verlag, 2003.
[55] R. J. Baddeley and P. J. B. Hancock. “A Statistical Analysis of Natural Images Matches Psy-
chophysically Derived Orientation Tuning Curves.” Proceedings of the Royal Society of London
B 246 (1991), 219–223.
[56] R. Bajcsy, S. W. Lee, and A. Leonardis. “Color Image Segmentation with Detection of High-
lights and Local Illumination Induced by Interreflection.” In Proceedings of the International
Conference on Pattern Recognition, pp. 785–790. Washington, DC: IEEE, 1990.
[57] R. Bajcsy, S. W. Lee, and A. Leonardis. “Detection of Diffuse and Specular Interface Reflections
and Inter-reflections by Color Image Segmentation.” International Journal of Computer Vision
17:3 (1996), 241–272.
[58] R. M. Balboa, C. W. Tyler, and N. M. Grzywacz. “Occlusions Contribute to Scaling in Natural
Images.” Vision Research 41:7 (2001), 955–964.
[59] G. V. G. Baranoski, P. Shirley, J. G. Rokne, T. Trondsen, and R. Bastos. “Simulating the Aurora
Borealis.” In Eighth Pacific Conference on Computer Graphics and Applications, pp. 2–14. Los
Alamitos, CA: IEEE Computer Society Press, 2000.
[60] H. B. Barlow, R. M. Hill, and W. R. Levick. “Retinal Ganglion Cells Responding Selectively
to Direction and Speed of Image Motion in the Rabbit.” Journal of Physiology (London) 173
(1964), 377–407.
[61] H. B. Barlow. “The Size of Ommatidia in Apposition Eyes.” Journal of Experimental Biology
29:4 (1952), 667–674.
[62] H. B. Barlow. “Summation and Inhibition in the Frog’s Retina.” Journal of Physiology (London)
119 (1953), 69–88.
[63] H. B. Barlow. “Dark and Light Adaptation: Psychophysics.” In Handbook of Sensory Physiology
VII/4, edited by D. Jameson and L. M. Hurvich, pp. 1–28. Berlin: Springer-Verlag, 1972.
[64] K. Barnard and B. Funt. “Investigations into Multi-Scale Retinex (MSR).” In Color Imaging:
Vision and Technology, edited by L. W. MacDonald and M. R. Luo, pp. 9–17. New York: John
Wiley and Sons, 1999.
[65] K. Barnard and B. Funt. “Camera Characterization for Color Research.” Color Research and
Application 27 (2002), 152–163.
[66] K. Barnard, L. Martin, A. Coath, and B. Funt. “A Comparison of Computational Color Constancy Algorithms — Part II: Experiments with Image Data.” IEEE Transactions on Image
Processing 11:9 (2002), 985–996.
[67] C. Barnes, J. Wei, and S. K. Shevell. “Chromatic Induction with Remote Chromatic Contrast
Varied in Magnitude, Spatial Frequency, and Chromaticity.” Vision Research 39 (1999), 3561–
3574.
[68] N. F. Barnes. “Color Characteristics of Artists’ Paints.” Journal of the Optical Society of America
29:5 (1939), 208–214.
[69] P. Barone, A. Batardiere, K. Knoblauch, and H. Kennedy. “Laminar Distribution of Neurons
in Extrastriate Areas Projecting to Visual Areas V1 and V4 Correlates with the Hierarchical
Rank and Indicates the Operation of a Distance Rule.” Journal of Neuroscience 20:9 (2000),
3263–3281.
[70] B. Barsky, L. Chu, and S. Klein. “Cylindrical Coordinate Representations for Modeling Surfaces
of the Cornea and Contact Lenses.” In Proceedings of the International Conference on Shape
Modeling and Applications (SMI-99), edited by B. Werner, pp. 98–115. Los Alamitos, CA: IEEE
Computer Society, 1999.
[71] A. Bartels and S. Zeki. “The Architecture of the Colour Centre in the Human Visual Brain: New
Results and a Review.” European Journal of Neuroscience 12:1 (2000), 172–193.
[72] C. Bartleson and E. Breneman. “Brightness Perception in Complex Fields.” Journal of the
Optical Society of America 57 (1967), 953–957.
[73] M. Bass, E. W. van Stryland, D. R. Williams, and W. L. Wolfe, editors. Handbook of Optics:
Fundamentals, Techniques and Design, Second edition. New York: McGraw-Hill, 1995.
[74] B. Bastani, W. Cressman, and B. Funt. “Calibrated Color Mapping Between LCD and CRT
Displays: A Case Study.” Colour Research and Application 30:6 (2005), 438–447.
[75] B. Baxter, H. Ravindra, and R. A. Normann. “Changes in Lesion Detectability Caused by Light
Adaptation in Retinal Photoreceptors.” Investigative Radiology 17 (1982), 394–401.
[76] B. E. Bayer. “Color Imaging Array.”, 1976. U.S. Patent 3,971,065.
[77] D. A. Baylor, B. J. Nunn, and J. L. Schnapf. “The Photocurrent, Noise and Spectral Sensitivity
of Rods of the Monkey Macaca fascicularis.” Journal of Physiology 357:1 (1984), 576–607.
[78] D. A. Baylor, B. J. Nunn, and J. L. Schnapf. “Spectral Sensitivity of Cones of the Monkey
Macaca fascicularis.” Journal of Physiology 390:1 (1987), 145–160.
[79] H. Becker, H. Vestweber, A. Gerhard, P. Stoessel, and R. Fortte. “Novel Host Materials for Ef-
ficient and Stable Phosphorescent OLED Devices.” In Proceedings of the International Display
Manufacturing Conference, pp. 329–330. San Jose, CA: Society for Information Display, 2005.
[80] R. E. Bedford and G. Wyszecki. “Axial Chromatic Aberration of the Human Eye.” Journal of
the Optical Society of America 47 (1957), 564–565.
[81] P. R. Bélanger. “Linear-Programming Approach to Color-Recipe Formulations.” Journal of the
Optical Society of America 64:11 (1974), 1541–1544.
[82] A. J. Bell and T. J. Sejnowski. “Edges Are the ’Independent Components’ of Natural Scenes.” In
Advances in Neural Information Processing Systems, pp. 831–837. Cambridge, MA: MIT Press,
1996.
[83] A. J. Bell and T. J. Sejnowski. “The Independent Components of Natural Scenes Are Edge
Filters.” Vision Research 37 (1997), 3327–3338.
[84] G. Berbecel. Digital Image Display: Algorithms and Implementation. Chichester: John Wiley
and Sons, 2003.
[85] T. T. J. M. Berendschot, J. van de Kraats, and D. van Norren. “Wavelength Dependence of
the Stiles-Crawford Effect Explained by Perception of Backscattered Light from the Choroid.”
Journal of the Optical Society of America A 18:7 (2001), 1445–1451.
[86] L. D. Bergman, B. E. Rogowitz, and L. A. Treinish. “A Rule-Based Tool for Assisting Col-
ormap Selection.” In Proceedings of IEEE Visualization, pp. 118–125. Los Alamitos, CA: IEEE
Computer Society, 1995.
[87] B. Berlin and P. Kay. Basic Color Terms: Their Universality and Evolution. Berkeley, CA:
University of California Press, 1969.
[88] B. J. Berne and R. Pecora. Dynamic Light Scattering (with Applications to Chemistry, Biology
and Physics). Mineola, NY: Dover Publications, 2000.
[89] R. S. Berns and M. J. Shiyu. “Colorimetric Characterization of a Desk-Top Drum Scanner using
a Spectral Model.” Journal of Electronic Imaging 4 (1995), 360–372.
[90] R. S. Berns, D. H. Alman, L. Reniff, G. D. Snyder, and M. R. Balonon-Rosen. “Visual Determi-
nation of Suprathreshold Color-Difference Tolerances using Probit Analysis.” Colour Research
and Application 16 (1991), 297–316.
[91] R. S. Berns, M. E. Gorzynski, and R. J. Motta. “CRT Colorimetry – Part II: Metrology.” Color
Research and Application 18:5 (1993), 315–325.
[92] R. S. Berns. “Methods for Characterizing CRT Displays.” Displays 16:4 (1996), 173–182.
[93] R. S. Berns. Billmeyer and Saltzman’s Principles of Color Technology, Third edition. New York:
John Wiley and Sons, 2000.
[94] I. Biederman and P. Kalocsai. “Neural and Psychological Analysis of Object and Face Recog-
nition.” In Face Recognition: From Theory to Applications, pp. 3–25. Berlin: Springer-Verlag,
1998.
[95] M. Bigas, E. Cabruja, J. Forest, and J. Salvi. “Review of CMOS Image Sensors.” Microelec-
tronics Journal 37:5 (2006), 433–451.
[96] J. Birnstock, J. Blässing, A. Hunze, M. Scheffel, M. Stößel, K. Heuser, G. Wittmann, J. Wörle,
and A. Winnacker. “Screen-Printed Passive Matrix Displays Based on Light Emitting Polymers.”
Applied Physics Letters 78:24 (2001), 3905–3907.
[97] J. Birnstock, J. Blässing, A. Hunze, M. Scheffel, M. Stößel, K. Heuser, J. Wörle, G. Wittmann,
and A. Winnacker. “Screen-Printed Passive Matrix Displays and Multicolor Devices.” In Pro-
ceedings of the SPIE 4464, pp. 68–75. Bellingham, WA: SPIE, 2002.
[98] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford: Oxford University Press,
1995.
[99] D. L. Bitzer and H. G. Slottow. “The Plasma Display Panel — A Digitally Addressable Display
with Inherent Memory.” In Proceedings of the AFIPS Conference 29, pp. 541–547. American
Federation of Information Processing Societies, 1966.
[100] H. R. Blackwell. “Luminance Difference Thresholds.” In Handbook of Sensory Physiology,
VII/4, edited by D. Jameson and L. M. Hurvich, pp. 78–101. Berlin: Springer-Verlag, 1972.
[101] C. Blakemore and F. Campbell. “On the Existence of Neurons in the Human Visual System
Electively Sensitive to Orientation and Size of Retinal Images.” Journal of Physiology 203
(1969), 237–260.
[102] B. Blakeslee and M. E. McCourt. “A Multi-Scale Spatial Filtering Account of the White Effect,
Simultaneous Brightness Contrast and Grating Induction.” Vision Research 39 (1999), 4361–
4377.
[103] N. Blanc. “CCD versus CMOS — Has CCD Imaging Come to an End?” In Photogrammetric
Week 01, edited by D. Fritsch and R. Spiller, pp. 131–137. Heidelberg: Wichmann Verlag, 2001.
[104] G. G. Blasdel and J. S. Lund. “Termination of Afferent Axons in Macaque Striate Cortex.”
Journal of Neuroscience 3:7 (1983), 1389–1413.
[105] J. F. Blinn. “Models of Light Reflection for Computer Synthesized Pictures.” Proc. SIGGRAPH
’77, Computer Graphics 11 (1977), 192–198.
[106] J. F. Blinn. “Return of the Jaggy.” IEEE Computer Graphics and Applications 9:2 (1989),
82–89.
[107] J. F. Blinn. “What We Need Around Here Is More Aliasing.” IEEE Computer Graphics and
Applications 9:1 (1989), 75–79.
[108] C. F. Bohren. “Colors of the Sea.” Weatherwise 35:5 (1982), 256–260.
[109] C. F. Bohren. “The Green Flash.” Weatherwise 35:6 (1982), 271–275.
[110] C. F. Bohren. “More About Colors of the Sea.” Weatherwise 36:6 (1983), 311–316.
[111] C. F. Bohren. “Scattering by Particles.” In Handbook of Optics: Fundamentals, Techniques
and Design, Volume 1, edited by M. Bass, E. W. van Stryland, D. R. Williams, and W. L. Wolfe,
Second edition. New York: McGraw-Hill, 1995.
[112] M. Born and E. Wolf. Principles of Optics, Seventh edition. Cambridge, UK: Cambridge
University Press, 1999.
[113] T. Bossomaier and A. W. Snyder. “Why Spatial Frequency Processing in the Visual Cortex?”
Vision Research 26 (1986), 1307–1309.
[114] P. Bouguer. Traité d’optique sur la gradation de la lumiere: Ouvrage posthume. Paris: M
l’Abbé de la Caille, 1760.
[115] D. Boussaoud, L. G. Ungerleider, and R. Desimone. “Pathways for Motion Analysis: Cortical
Connections of the Medial Superior Temporal and Fundus of the Superior Temporal Visual Areas
in the Macaque.” Journal of Comparative Neurology 296:3 (1990), 462–495.
[116] A. Bovik, editor. Handbook of Image and Video Processing. San Diego, CA: Academic Press,
2000.
[117] R. M. Boynton and D. N. Whitten. “Visual Adaptation in Monkey Cones: Recordings of Late
Receptor Potentials.” Science 170 (1970), 1423–1426.
[118] R. M. Boynton. Human Color Vision. New York: Holt, Rinehart and Winston, 1979.
[119] D. H. Brainard and W. T. Freeman. “Bayesian Color Constancy.” Journal of the Optical Society
of America A 14:7 (1997), 1393–1411.
[120] D. H. Brainard, W. A. Brunt, and J. M. Speigle. “Color Constancy in the Nearly Natural Image.
I. Asymmetric Matches.” Journal of the Optical Society of America A 14:9 (1997), 2091–2110.
[121] D. H. Brainard, A. Roorda, Y. Yamauchi, J. B. Calderone, A. Metha, M. Neitz, J. Neitz, D. R.
Williams, and G. H. Jacobs. “Functional Consequences of the Relative Numbers of L and M
Cones.” Journal of the Optical Society of America A 17:3 (2000), 607–614.
[122] D. H. Brainard, D. G. Pelli, and T. Robson. “Display Characterization.” In Encyclopedia of
Imaging Science and Technology, edited by J. Hornak, pp. 172–188. New York: John Wiley and
Sons, 2002.
[123] D. H. Brainard. “Color Constancy in the Nearly Natural Image. II. Achromatic Loci.” Journal
of the Optical Society of America A 15 (1998), 307–325.
[124] G. J. Braun and M. D. Fairchild. “Techniques for Gamut Surface Definition and Visualization.”
In Proceedings of the 5th IS&T/SID Color Imaging Conference, pp. 147–152. Springfield, VA:
IS&T, 1997.
[125] G. Brelstaff and F. Chessa. “Practical Application of Visual Illusions: Errare Humanum Est.” In
Proceedings of the 2nd Symposium on Applied Perception in Graphics and Visualization (APGV),
pp. 161–161. New York: ACM, 2005.
[126] E. Breneman. “Corresponding Chromaticities for Different States of Adaptation to Complex
Visual Fields.” Journal of the Optical Society of America, A 4 (1987), 1115–1129.
[127] P. Bressan, E. Mingolla, L. Spillmann, and T. Watanabe. “Neon Color Spreading: A Review.”
Perception 26:11 (1997), 1353–1366.
[128] H. Brettel and F. Viénot. “Web Design for the Colour-Blind User.” In Colour Imaging: Vision
and Technology, edited by L. W. MacDonald and M. R. Luo, pp. 55–71. Chichester, UK: John
Wiley and Sons, Ltd., 1999.
[129] H. Brettel, F. Viénot, and J. D. Mollon. “Computerized Simulation of Color Appearance for
Dichromats.” Journal of the Optical Society of America A 14:10 (1997), 2647–2655.
[130] D. Brewster. “A Note Explaining the Cause of an Optical Phenomenon Observed by the Rev.
W. Selwyn.” In Report of the Fourteenth Meeting of the British Association for the Advancement
of Science, edited by J. Murray. London, 1844.
[131] M. H. Brill and J. Larimer. “Avoiding On-Screen Metamerism in N-Primary Displays.” Journal
of the Society for Information Display 13:6 (2005), 509–516.
[132] M. H. Brill. “Image Segmentation by Object Color: A Unifying Framework and Connection
to Color Constancy.” Journal of the Optical Society of America A 7:10 (1986), 2041–2047.
[133] G. S. Brindley. “The Discrimination of After-Images.” Journal of Physiology 147:1 (1959),
194–203.
[134] R. W. Brislin. “The Ponzo Illusion: Additional Cues, Age, Orientation, and Culture.” Journal
of Cross-Cultural Psychology 5:2 (1974), 139–161.
[135] K. H. Britten, M. N. Shadlen, W. T. Newsome, and J. A. Movshon. “The Analysis of Visual
Motion: A Comparison of Neuronal and Psychophysical Performance.” Journal of Neuroscience
12:12 (1992), 4745–4765.
[136] A. Brockes. “Vergleich der Metamerie-Indizes bei Lichtartwechsel von Tageslicht zur
Glühlampe und zu verschiedenen Leuchtstofflampen.” Die Farbe 18:223.
[137] A. Brockes. “Vergleich von berechneten Metamerie-Indizes mit Abmusterungsergebnissen.”
Die Farbe 19:135 (1970), 1–10.
[138] K. Brodmann. Vergleichende Lokalisationslehre der Großhirnrinde. Leipzig, Germany: Barth
Verlag, 1909. Translated by L J Garey, Localisation in the Cerebral Cortex (Smith-Gordon,
London, 1994).
[139] J. Broerse, T. Vladusich, and R. P. O’Shea. “Colour at Edges and Colour Spreading in McCol-
lough Effects.” Vision Research 39 (1999), 1305–1320.
[140] N. Bruno, P. Bernardis, and J. Schirillo. “Lightness, Equivalent Backgrounds and Anchoring.”
Perception and Psychophysics 59:5 (1997), 643–654.
[141] G. Buchsbaum. “A Spatial Processor Model for Object Color Perception.” Journal of the
Franklin Institute 310 (1980), 1–26.
[142] Bureau International des Poids et Mesures. “Le Système International d’Unités (The Interna-
tional System of Units).”, 2006.
[143] J. H. Burroughes, D. D. C. Bradley, A. R. Brown, R. N. Marks, K. Mackay, R. H. Friend, P. L.
Burns, and A. B. Holmes. “Light-Emitting Diodes based on Conjugated Polymers.” Nature
347:6293 (1990), 539–541.
[144] G. J. Burton and I. R. Moorhead. “Color and Spatial Structure in Natural Scenes.” Applied
Optics 26:1 (1987), 157–170.
[145] A. Calabria and M. D. Fairchild. “Herding CATs: A Comparison of Linear Chromatic Adap-
tation Transforms for CIECAM97s.” In IS&T/SID 9th Color Imaging Conference, pp. 174–178.
Springfield, VA: Society for Imaging Science and Technology, 2001.
[146] A. J. Calabria and M. D. Fairchild. “Perceived Image Contrast and Observer Preference I: The
Effects of Lightness, Chroma, and Sharpness Manipulations on Contrast Perception.” Journal
of Imaging Science and Technology 47 (2003), 479–493.
[147] A. Calabria and M. D. Fairchild. “Perceived Image Contrast and Observer Preference II: Empir-
ical Modeling of Perceived Image Contrast and Observer Preference Data.” Journal of Imaging
Science and Technology 47 (2003), 494–508.
[148] E. M. Callaway and A. K. Wiser. “Contributions of Individual Layer 2-5 Spiny Neurons to
Local Circuits in Macaque Primary Visual Cortex.” Visual Neuroscience 13:5 (1996), 907–922.
[149] E. Camatini, editor. Optical and Acoustical Holography. New York: Plenum Press, 1972.
[150] F. W. Campbell and A. H. Gregory. “Effect of Pupil Size on Acuity.” Nature (London) 187:4743
(1960), 1121–1123.
[151] F. W. Campbell and R. W. Gubisch. “Optical Quality of the Human Eye.” Journal of Physiology
186 (1966), 558–578.
[152] F. W. Campbell and J. G. Robson. “Application of Fourier Analysis to the Visibility of Grat-
ings.” Journal of Physiology (London) 197 (1968), 551–566.
[153] V. C. Cardei and B. Funt. “Color Correcting Uncalibrated Digital Camera.” Journal of Imaging
Science and Technology 44:4 (2000), 288–294.
[154] B. S. Carlson. “Comparison of Modern CCD and CMOS Image Sensor Technologies and
Systems for Low Resolution Imaging.” In Proceedings of IEEE Sensors, 1, 1, pp. 171–176,
2002.
[155] J. E. Carnes and W. F. Kosonocky. “Noise Sources in Charge-Coupled Devices.” RCA Review
33 (1972), 327–343.
[177] B. Choubey, S. Aoyama, S. Otim, D. Joseph, and S. Collins. “An Electronic Calibration Scheme for Logarithmic CMOS Pixels.” IEEE Sensors Journal 6:4 (2006), 950–956.
[178] P. Choudhury and J. Tumblin. “The Trilateral Filter for High Contrast Images and Meshes.”
In EGRW ’03: Proceedings of the 14th Eurographics Workshop on Rendering, pp. 186–196.
Aire-la-Ville, Switzerland: Eurographics Association, 2003.
[179] R. E. Christoffersen. Basic Principles and Techniques of Molecular Quantum Mechanics. New
York: Springer-Verlag, 1989.
[180] C. Chubb, G. Sperling, and J. A. Solomon. “Texture Interactions Determine Perceived Con-
trast.” Proceedings of the National Academy of Sciences of the United States of America 86:23
(1989), 9631–9635.
[181] J. W. Chung, H. R. Guo, C. T. Wu, K. C. Wang, W. J. Hsieh, T. M. Wu, and C. T. Chung. “Long-
Operating Lifetime of Green Phosphorescence Top-Emitting Organic Light Emitting Devices.”
In Proceedings of the International Display Manufacturing Conference, pp. 278–280. San Jose,
CA: Society for Information Display, 2005.
[182] CIE. “CIE Proceedings 1951.” Technical Report Vol. 1, Sec. 4; Vol. 3, p. 37, Commission Internationale de l’Éclairage, Vienna, 1951.
[183] CIE. “Light as a True Visual Quantity: Principles of Measurement.” Technical Report Publ. No. 41 (TC-1.4), Commission Internationale de l’Éclairage, Vienna, 1978.
[184] CIE. “A Method for Assessing the Quality of Daylight Simulators for Colorimetry.” Technical Report CIE 51-1981, Commission Internationale de l’Éclairage, Vienna, 1981.
[185] CIE. “The Basis of Physical Photometry.” Technical Report Publ. No. 18.2 (TC-1.2), Commission Internationale de l’Éclairage, Vienna, 1983.
[186] CIE. “Colorimetry, Second Edition.” Technical Report CIE 15.2, Commission Internationale de l’Éclairage, Vienna, 1986.
[187] CIE. “International Lighting Vocabulary.” Technical Report 17.4, Commission Internationale de l’Éclairage, Vienna, 1987.
[188] CIE. “Special Metamerism Index: Change in Observer.” Technical Report CIE 80-1989, Commission Internationale de l’Éclairage, Vienna, 1989.
[189] CIE. “CIE 1988 2◦ Spectral Luminous Efficiency Function for Photopic Vision.” Technical Report Publ. No. 86, Commission Internationale de l’Éclairage, Vienna, 1990.
[190] CIE. “Method of Measuring and Specifying Colour Rendering Properties of Light Sources.” Technical Report CIE 13.3-1995, Commission Internationale de l’Éclairage, Vienna, 1995.
[191] CIE. “The CIE 1997 Interim Colour Appearance Model (Simple Version), CIECAM97s.” Technical Report CIE 131-1998, Commission Internationale de l’Éclairage, Vienna, 1998.
[192] CIE. “Virtual Metamers for Assessing the Quality of Simulators of CIE Illuminant D50.” Technical Report Supplement 1-1999 to CIE 51-1981, Commission Internationale de l’Éclairage, Vienna, 1999.
[193] CIE. “CIE Standard Illuminants for Colorimetry, Standard CIE S005/E-1998.” Technical Report Standard CIE S005/E-1998, Commission Internationale de l’Éclairage, Vienna, 2004. Published also as ISO 10526/CIE S 005/E-1999.
[194] CIE. “Colorimetry, Third Edition.” Technical Report CIE Publ. No. 15:2004, Commission Internationale de l’Éclairage, Vienna, 2004.
[195] CIE. “A Colour Appearance Model for Colour Management Systems: CIECAM02.” Technical Report CIE 159:2004, Commission Internationale de l’Éclairage, Vienna, 2004.
[196] F. R. Clapper and J. A. C. Yule. “The Effect of Multiple Internal Reflections on the Densities
of Halftone Prints on Paper.” Journal of the Optical Society of America 43 (1953), 600–603.
[197] F. R. Clapper and J. A. C. Yule. “Reproduction of Color with Halftone Images.” In Proceedings
of the Technical Association of Graphic Arts, pp. 1–12. Chicago, IL: Technical Assn. of the
Graphic Arts, 1955.
[198] F. J. J. Clark, R. McDonald, and B. Rigg. “Modifications to the JPC79 Colour-Difference
Formula.” Journal of the Society of Dyers and Colourists 100 (1984), 128–132, (Errata: 281–
282).
[199] S. Coe, W.-K. Woo, M. Bawendi, and V. Bulovic. “Electroluminescence from Single Mono-
layers of Nanocrystals in Molecular Organic Devices.” Nature 420:6917 (2002), 800–803.
[200] M. F. Cohen and J. R. Wallace. Radiosity and Realistic Image Synthesis. Cambridge, MA:
Academic Press, Inc., 1993.
[201] J. Cohen. “Dependency of the Spectral Reflectance Curves of the Munsell Color Chips.” Psy-
chonomic Science 1 (1964), 369–370.
[202] M. Colbert, E. Reinhard, and C. E. Hughes. “Painting in High Dynamic Range.” Journal of
Visual Communication and Image Representation 18:5 (2007), 387–396.
[203] M. Collet. “Solid-State Image Sensors.” Sensors and Actuators 10 (1986), 287–302.
[204] D. Comaniciu and P. Meer. “Mean Shift: A Robust Approach toward Feature Space Analysis.”
IEEE Transactions on Pattern Analysis and Machine Intelligence 24:5 (2002), 603–619.
[205] B. Comiskey, J. D. Albert, H. Yoshizawa, and J. Jacobson. “An Electrophoretic Ink for All-
Printed Reflective Electronic Displays.” Nature 394:6690 (1998), 253–255.
[206] M. Corbetta, F. M. Miezin, S. Dobmeyer, G. L. Shulman, and S. E. Petersen. “Selective and Di-
vided Attention during Visual Discriminations of Shape, Color and Speed: Functional Anatomy
by Positron Emission Tomography.” Journal of Neuroscience 11:8 (1991), 2383–2402.
[207] T. N. Cornsweet and D. Teller. “Relation of Increment Thresholds to Brightness and Lumi-
nance.” Journal of the Optical Society of America 55 (1965), 1303–1308.
[208] T. N. Cornsweet. Visual Perception. New York: Academic Press, 1970.
[209] N. P. Cottaris and R. L. de Valois. “Temporal Dynamics of Chromatic Tuning in Macaque
Primary Visual Cortex.” Nature 395:6705 (1998), 896–900.
[210] W. B. Cowan and N. Rowell. “On the Gun Independence and Phosphor Constancy of Color
Video Monitors.” Color Research and Application 11 (1986), Supplement 34–38.
[211] W. B. Cowan. “An Inexpensive Scheme for Calibration of a Colour Monitor in Terms of CIE
Standard Coordinates.” Proc. SIGGRAPH ’83 Computer Graphics 17:3 (1983), 315–321.
[212] B. H. Crawford. “The Scotopic Visibility Function.” Proceedings of the Physical Society of
London 62:5 (1949), 321–334.
[213] G. P. Crawford, editor. Flexible Flat Panel Displays. Chichester, UK: John Wiley and Sons,
2005.
[214] W. Crookes. “The Bakerian Lecture: On the Illumination of Lines of Molecular Pressure, and
the Trajectory of Molecules.” Philosophical Transactions of the Royal Society of London 170
(1879), 135–164.
[215] M. E. Crovella and M. S. Taqqu. “Estimating the Heavy Tail Index from Scaling Properties.”
Methodology and Computing in Applied Probability 1:1 (1999), 55–79.
[216] R. Cruz-Coke. Colour Blindness — An Evolutionary Approach. Springfield, IL: Charles C
Thomas, 1970.
[217] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S. Sirotti. “Improving Shadow Suppression
in Moving Object Detection with HSV Color Information.” In IEEE Intelligent Transportation
Systems, pp. 334–339. IEEE Press, 2001.
[218] J. A. Curcio and C. C. Petty. “The Near-Infrared Absorption Spectrum of Liquid Water.”
Journal of the Optical Society of America 41:5 (1951), 302–304.
[219] C. A. Curcio, K. R. Sloan, R. E. Kalina, and A. E. Hendrickson. “Human Photoreceptor
Topography.” Journal of Comparative Neurology 292:4 (1990), 497–523.
[220] C. A. Curcio, K. A. Allen, K. R. Sloan, C. L. Lerea, J. B. Hurley, I. B. Klock, and A. H. Milam.
“Distribution and Morphology of Human Cone Photoreceptors Stained with Anti-Blue Opsin.”
Journal of Comparative Neurology 312:4 (1991), 610–624.
[221] D. M. Dacey and B. B. Lee. “The ‘Blue-On’ Opponent Pathway in Primate Retina Originates
from a Distinct Bistratified Ganglion Cell Type.” Nature 367:6465 (1994), 731–735.
[222] D. M. Dacey and B. B. Lee. “Cone Inputs to the Receptive Field of Midget Ganglion Cells
in the Periphery of the Macaque Retina.” Investigative Ophthalmology and Visual Science,
Supplement 38 (1997), S708.
[223] D. M. Dacey and B. B. Lee. “Functional Architecture of Cone Signal Pathways in the Primate
Retina.” In Color Vision: From Genes to Perception, edited by K. R. Gegenfurtner and L. Sharpe,
pp. 181–202. Cambridge, UK: Cambridge University Press, 1999.
[224] D. M. Dacey and M. R. Petersen. “Dendritic Field Size and Morphology of Midget and Parasol
Ganglion Cells of the Human Retina.” Proceedings of the National Academy of Sciences of the
United States of America 89:20 (1992), 9666–9670.
[225] D. M. Dacey. “Morphology of a Small-Field Bistratified Ganglion Cell Type in the Macaque
and Human Retina.” Visual Neuroscience 10 (1993), 1081–1098.
[226] D. M. Dacey. “The Mosaic of Midget Ganglion Cells in the Human Retina.” Journal of
Neuroscience 13:12 (1993), 5334–5355.
[227] D. M. Dacey. “Parallel Pathways for Spectral Coding in Primate Retina.” Annual Review of
Neuroscience 23 (2000), 743–775.
[228] S. C. Dakin and P. J. Bex. “Natural Image Statistics Mediate Brightness Filling In.” Proceed-
ings of the Royal Society of London, B 270 (2003), 2341–2348.
[229] S. Daly. “The Visible Differences Predictor: An Algorithm for the Assessment of Image Fi-
delity.” In Digital Images and Human Vision, edited by A. Watson, pp. 179–206. Cambridge,
MA: MIT Press, 1993.
[230] A. Damasio, T. Yamada, H. Damasio, J. Corbett, and J. McKee. “Central Achromatopsia:
Behavioral, Anatomic, and Physiologic Aspects.” Neurology 30:10 (1980), 1064–1071.
[231] K. J. Dana, B. van Ginneken, S. K. Nayar, and J. J. Koenderink. “Reflectance and Texture of
Real World Surfaces.” ACM Transactions on Graphics 18:1 (1999), 1–34.
[232] K. J. Dana. “BRDF/BTF Measurement Device.” In Proceedings of the Eighth IEEE Interna-
tional Conference on Computer Vision (ICCV), pp. 460–466. Washington, DC: IEEE, 2001.
[233] J. L. Dannemiller. “Spectral Reflectance of Natural Objects: How Many Basis Functions Are
Necessary?” Journal of the Optical Society of America A 9:4 (1992), 507–515.
[234] J. Daugman. “High Confidence Visual Recognition of Persons by a Test of Statistical Inde-
pendence.” IEEE Transactions on Pattern Analysis and Machine Intelligence 15:11 (1993),
1148–1161.
[235] R. Davis and K. S. Gibson. Filters for the Reproduction of Sunlight and Daylight and the
Determination of Color Temperature. Bureau of Standards, Miscellaneous. Publication 114,
Washington, DC, 1931.
[236] H. Davson. Physiology of the Eye, Fifth edition. Pergamon Press, 1990.
[237] E. A. Day, L. Taplin, and R. S. Berns. “Colorimetric Characterization of a Computer-Controlled
Liquid Crystal Display.” Color Research and Application 29:5 (2004), 365–373.
[238] P. Debevec and J. Malik. “Recovering High Dynamic Range Radiance Maps from Pho-
tographs.” In Proceedings SIGGRAPH ’97, Computer Graphics Proceedings, Annual Confer-
ence Series, pp. 369–378. Reading, MA: Addison-Wesley, 1997.
[239] P. E. Debevec. “Rendering Synthetic Objects into Real Scenes: Bridging Traditional and
Image-Based Graphics with Illumination and High Dynamic Range Photography.” In Proceed-
ings of SIGGRAPH ’98, Computer Graphics Proceedings, Annual Conference Series, pp. 45–50.
Reading, MA: Addison-Wesley, 1998.
[240] P. E. Debevec. “A Tutorial on Image-Based Lighting.” IEEE Computer Graphics and Applica-
tions 22:2 (2002), 26–34.
[241] S. Decker, R. McGrath, K. Brehmer, and C. Sodini. “A 256 × 256 CMOS Imaging Array
with Wide Dynamic Range Pixels and Column-Parallel Digital Output.” In IEEE International
Solid-State Circuits Conference (ISSCC), pp. 176–177, 1998.
[242] C. DeCusatis, editor. Handbook of Applied Photometry. Philadelphia: American Institute of
Physics, 1997.
[243] M. F. Deering. “A Photon Accurate Model of the Human Eye.” ACM Transactions on Graphics
24:3 (2005), 649–658.
[244] P. Delahunt and D. Brainard. “Control of Chromatic Adaptation Signals from Separate Cone
Classes Interact.” Vision Research 40 (2000), 2885–2903.
[245] F. Delamare and B. Guineau. Colors: The Story of Dyes and Pigments. New York: Harry N
Abrams, 2000.
[246] E. Demichel. Le Procédé 26 (1924), 17–21.
[247] E. Demichel. Le Procédé 26 (1924), 26–27.
[248] E. J. Denton. “The Contributions of the Orientated Photosensitive and Other Molecules to
the Absorption of the Whole Retina.” Proceedings of the Royal Society of London, B 150:938
(1959), 78–94.
[249] G. Derra, H. Moench, E. Fischer, H. Giese, U. Hechtfischer, G. Heusler, A. Koerber, U. Nie-
mann, F.-C. Noertemann, P. Pekarski, J. Pollmann-Retsch, A. Ritz, and U. Weichmann. “UHP
Lamp Systems for Projection Applications.” Journal of Physics D: Applied Physics 38 (2005),
2995–3010.
[250] A. M. Derrington, J. Krauskopf, and P. Lennie. “Chromatic Mechanisms in Lateral Geniculate
Nucleus of Macaque.” Journal of Physiology (London) 357 (1984), 241–265.
[251] K. Devlin, A. Chalmers, and E. Reinhard. “Visual Self-Calibration and Correction for Ambient
Illumination.” ACM Transactions on Applied Perception 3:4 (2006), 429–452.
[252] D. S. Dewald, S. M. Penn, and M. Davis. “Sequential Color Recapture and Dynamic Filtering:
A Method of Scrolling Color.” SID Symposium Digest of Technical Papers 32:1 (2001), 1076–
1079.
[253] J. M. DiCarlo and B. A. Wandell. “Rendering High Dynamic Range Images.” In Proceedings
of the SPIE 3965 (Electronic Imaging 2000 Conference), pp. 392–401. Bellingham, WA: SPIE,
2000.
[254] J. Dillon, R. H. Wang, and S. J. Atherton. “Photochemical and Photophysical Studies on Human
Lens Constituents.” Photochemistry and Photobiology 52 (1990), 849–854.
[255] J. Dillon. “Photochemical Mechanisms in the Lens.” In The Ocular Lens, edited by H. Maisel,
pp. 349–366. New York: Marcel Dekker, 1985.
[256] R. W. Ditchburn. Eye Movements and Visual Perception. Oxford, UK: Clarendon Press, 1973.
[257] K. R. Dobkins. “Moving Colors in the Lime Light.” Neuron 25 (2000), 15–18.
[258] D. Doherty and G. Hewlett. “Pulse Width Modulation Control in DLP Projectors.” Texas
Instruments Technical Journal, pp. 115–121.
[259] D. W. Dong and J. J. Atick. “Statistics of Natural Time-Varying Images.” Network: Computa-
tion in Neural Systems 6:3 (1995), 345–358.
[260] R. W. Doty. “Nongeniculate Afferents to Striate Cortex in Macaques.” Journal of Comparative
Neurology 218:2 (1983), 159–173.
[261] R. F. Dougherty and A. R. Wade. Available online (http://www.vischeck.com/daltonize).
[262] J. E. Dowling. The Retina: An Approachable Part of the Brain. Cambridge, MA: Belknap
Press, 1987.
[263] F. Drago, W. L. Martens, K. Myszkowski, and H.-P. Seidel. “Perceptual Evaluation of Tone
Mapping Operators with Regard to Similarity and Preference.” Technical Report MPI-I-2002-
4-002, Max-Planck-Institut für Informatik, Saarbrücken, Germany, 2002.
[264] F. Drago, K. Myszkowski, T. Annen, and N. Chiba. “Adaptive Logarithmic Mapping for Dis-
playing High Contrast Scenes.” Computer Graphics Forum 22:3 (2003), 419–426.
[265] M. Drew and B. Funt. “Natural Metamers.” Computer Vision, Graphics and Image Processing
56:2 (1992), 139–151.
[266] M. S. Drew, J. Wei, and Z. N. Li. “Illumination-Invariant Image Retrieval and Video Segmen-
tation.” Pattern Recognition 32:8 (1999), 1369–1388.
[267] D. D. Dudley. Holography: A Survey. Washington, DC: Technology Utilization Office, Na-
tional Aeronautics and Space Administration, 1973.
[268] R. O. Duncan and G. M. Boynton. “Cortical Magnification within Human Primary Visual
Cortex Correlates with Acuity Thresholds.” Neuron 38:4 (2003), 659–671.
[269] A. Dür. “An Improved Normalization for the Ward Reflectance Model.” journal of graphics
tools 11:1 (2006), 51–59.
[270] F. Durand and J. Dorsey. “Fast Bilateral Filtering for the Display of High-Dynamic-Range
Images.” ACM Transactions on Graphics 21:3 (2002), 257–266.
[271] P. Dutré, P. Bekaert, and K. Bala. Advanced Global Illumination. Natick, MA: A K Peters,
2003.
[272] D. M. Eagleman. “Visual Illusions and Neurobiology.” Nature Reviews Neuroscience 2:12
(2001), 920–926.
[273] F. Ebner and M. D. Fairchild. “Constant Hue Surfaces in Color Space.” In Proceedings of the
SPIE 3300, pp. 107–117. Bellingham, WA: SPIE, 1998.
[274] F. Ebner and M. D. Fairchild. “Development and Testing of a Color Space (IPT) with Improved
Hue Uniformity.” In IS&T/SID Sixth Color Imaging Conference: Color Science, Systems and
Applications, pp. 8–13. Springfield, VA: Society for Imaging Science & Technology, 1998.
[275] M. Ebner. Color Constancy. Chichester, UK: John Wiley and Sons, Ltd., 2007.
[276] J. G. Eden. “Information Display Early in the 21st Century: Overview of Selected Emissive
Display Technologies.” Proceedings of the IEEE 94:3 (2006), 567–574.
[277] W. Ehrenstein. “Über Abwandlungen der L. Hermannschen Helligkeitserscheinung.” Zeitschrift
für Psychologie 150 (1941), 83–91.
[278] W. Ehrenstein. “Modifications of the Brightness Phenomenon of L. Hermann.” In The Percep-
tion of Illusory Contours, edited by S. Petry and G. E. Meyer. New York: Springer-Verlag, 1987.
Translated by A. Hogg from [277].
[279] A. Einstein. “Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?” Annalen der
Physik 18:13 (1905), 639–641.
[280] J. H. Elder. “Are Edges Incomplete?” International Journal of Computer Vision 34:2/3 (1999),
97–122.
[281] H. D. Ellis. “Introduction to Aspects of Face Processing: Ten Questions in Need of Answers.”
In Aspects of Face Processing, edited by H. Ellis, M. Jeeves, F. Newcombe, and A. Young,
pp. 3–13. Dordrecht, The Netherlands: Nijhoff, 1986.
[282] S. A. Engel, X. Zhang, and B. A. Wandell. “Color Tuning in Human Visual Cortex Measured
using Functional Magnetic Resonance Imaging.” Nature 388:6637 (1997), 68–71.
[283] P. G. Engeldrum. “Computing Color Gamuts of Ink-Jet Printing Systems.” Proceedings of the
Society for Information Display 27 (1986), 25–30.
[284] C. Enroth-Cugell and J. G. Robson. “The Contrast Sensitivity of Retinal Ganglion Cells of the
Cat.” Journal of Physiology (London) 187 (1966), 517–552.
[285] C. J. Erkelens, J. van der Steen, R. M. Steinman, and H. Collewijn. “Ocular Vergence under
Natural Conditions I: Continuous Changes of Target Distance along the Median Plane.” Pro-
ceedings of the Royal Society of London B 236 (1989), 417–440.
[286] C. J. Erkelens, R. M. Steinman, and H. Collewijn. “Ocular Vergence under Natural Conditions
II: Gaze Shifts between Real Targets Differing in Distance and Direction.” Proceedings of the
Royal Society of London B 236 (1989), 441–465.
[287] D. C. van Essen and J. L. Gallant. “Neural Mechanisms of Form and Motion Processing in the
Primate Visual System.” Neuron 13:1 (1994), 1–10.
[288] R. Evans. The Perception of Color. New York: John Wiley & Sons, 1974.
[289] G. L. Fain, H. R. Matthews, M. C. Cornwall, and Y. Koutalos. “Adaptation in Vertebrate
Photoreceptors.” Physiological Review 81 (2001), 117–151.
[290] M. D. Fairchild and G. M. Johnson. “On Contrast Sensitivity in an Image Difference Model.”
In IS&T PICS Conference, pp. 18–23. Springfield, VA: Society for Imaging Science and Tech-
nology, 2001.
[291] M. D. Fairchild and G. Johnson. “Meet iCAM: A Next-Generation Color Appearance Model.”
In Proceedings of the IS&T/SID 10th Color Imaging Conference, pp. 33–38. Springfield, VA:
Society for Imaging Science and Technology, 2002.
[292] M. D. Fairchild and G. M. Johnson. “The iCAM Framework for Image Appearance, Image
Differences, and Image Quality.” Journal of Electronic Imaging 13 (2004), 126–138.
[293] M. D. Fairchild and G. M. Johnson. “METACOW: A Public-Domain, High-Resolution, Fully-
Digital, Extended-Dynamic-Range, Spectral Test Target for Imaging System Analysis and Sim-
ulation.” In IS&T/SID 12th Color Imaging Conference, pp. 239–245. Springfield, VA: Society
for Imaging Science and Technology, 2004.
[294] M. D. Fairchild and G. M. Johnson. “On the Salience of Novel Stimuli: Adaptation and Image
Noise.” In IS&T/SID 13th Color Imaging Conference, pp. 333–338. Springfield, VA: Society
for Imaging Science and Technology, 2005.
[295] M. D. Fairchild and E. Pirrotta. “Predicting the Lightness of Chromatic Object Colors using
CIELAB.” Color Research and Application 16 (1991), 385–393.
[296] M. D. Fairchild and L. Reniff. “Time-Course of Chromatic Adaptation for Color-Appearance
Judgements.” Journal of the Optical Society of America 12 (1995), 824–833.
[297] M. D. Fairchild and D. R. Wyble. “Colorimetric Characterization of the Apple Studio Display
(Flat Panel LCD).” Technical report, Munsell Color Science Laboratory, Rochester, NY, 1998.
[298] M. D. Fairchild, E. Pirrotta, and T. G. Kim. “Successive-Ganzfeld Haploscopic Viewing Tech-
nique for Color-Appearance Research.” Color Research and Application 19 (1994), 214–221.
[318] G. D. Finlayson and S. Süsstrunk. “Color Ratios and Chromatic Adaptation.” In Proceedings
of IS&T CGIV, First European Conference on Color Graphics, Imaging and Vision, pp. 7–10.
Springfield, VA: Society for Imaging Science and Technology, 2002.
[319] G. D. Finlayson, S. D. Hordley, and P. M. Hubel. “Color by Correlation: A Simple, Unify-
ing Framework for Color Constancy.” IEEE Transactions on Patterns Analysis and Machine
Intelligence 23:11 (2001), 1209–1221.
[320] G. D. Finlayson, S. D. Hordley, and M. S. Drew. “Removing Shadows from Images using
Retinex.” In Proceedings IS&T Color Imaging Conference, pp. 73–79. Springfield, VA: Society
for Imaging Science and Technology, 2002.
[321] A. Fiorentini and A. M. Ercoles. “Vision of Oscillating Visual Fields.” Optica Acta 4 (1957),
370.
[322] D. Fitzpatrick, K. Itoh, and I. T. Diamond. “The Laminar Organization of the Lateral Geniculate
Body and the Striate Cortex in the Squirrel Monkey (Saimiri sciureus).” Journal of Neuroscience
3:4 (1983), 673–702.
[323] D. Fitzpatrick, W. M. Usrey, B. R. Schofield, and G. Einstein. “The Sublaminar Organization
of Corticogeniculate Neurons in Layer 6 of Macaque Striate Cortex.” Visual Neuroscience 11
(1994), 307–315.
[324] P. D. Floyd, D. Heald, B. Arbuckle, A. Lewis, M. Kothari, B. J. Gally, B. Cummings, B. R.
Natarajan, L. Palmateer, J. Bos, D. Chang, J. Chiang, D. Chu, L.-M. Wang, E. Pao, F. Su,
V. Huang, W.-J. Lin, W.-C. Tang, J.-J. Yeh, C.-C. Chan, F.-A. Shu, and Y.-D. Ju. “IMOD
Display Manufacturing.” SID Symposium Digest of Technical Papers 37:1 (2006), 1980–1983.
[325] J. M. Foley and M. E. McCourt. “Visual Grating Induction.” Journal of the Optical Society of
America A 2 (1985), 1220–1230.
[326] J. Foley, A. van Dam, S. Feiner, and J. Hughes. Computer Graphics Principles and Practice,
Second edition. Reading, MA: Addison-Wesley, 1990.
[327] A. Ford and A. Roberts. “Colour Space Conversions.” 1998. Available online (http://www.
poynton.com/PDFs/coloureq.pdf).
[328] S. Forrest, P. Burrows, and M. Thompson. “The Dawn of Organic Electronics.” IEEE Spectrum
37:8 (2000), 29–34.
[329] J. Forrester, A. Dick, P. McMenamin, and W. Lee. The Eye: Basic Sciences in Practice.
London: W. B. Saunders Company Ltd., 2001.
[330] E. R. Fossum. “CMOS Image Sensors: Electronic Camera-On-A-Chip.” IEEE Transactions
on Electron Devices 44:10 (1997), 1689–1698.
[331] D. H. Foster and S. M. C. Nascimento. “Relational Color Constancy from Invariant Cone-
Excitation Ratios.” Proceedings of the Royal Society: Biological Sciences 257:1349 (1994),
115–121.
[332] J. Frankle and J. McCann. “Method and Apparatus for Lightness Imaging.” 1983. U.S. Patent
#4,384,336.
[333] W. Fries and H. Distel. “Large Layer VI Neurons of Monkey Striate Cortex (Meynert Cells)
Project to the Superior Colliculus.” Proceedings of the Royal Society of London B, Biological
Sciences 219 (1983), 53–59.
[334] W. Fries. “Pontine Projection from Striate and Prestriate Visual Cortex in the Macaque Mon-
key: An Anterograde Study.” Visual Neuroscience 4 (1990), 205–216.
[335] K. Fritsche. Faults in Photography. London: Focal Press, 1968.
[336] G. A. Fry and M. Alpern. “The Effect of a Peripheral Glare Source upon the Apparent Bright-
ness of an Object.” Journal of the Optical Society of America 43:3 (1953), 189–195.
[337] G. D. Funka-Lea. “The Visual Recognition of Shadows by an Active Observer.” Ph.D. thesis,
University of Pennsylvania, 1994.
[338] B. V. Funt and G. Finlayson. “Color Constant Color Indexing.” IEEE Transactions on Pattern
Analysis and Machine Intelligence 17 (1995), 522–529.
[339] B. V. Funt, M. S. Drew, and J. Ho. “Color Constancy from Mutual Reflection.” International
Journal of Computer Vision 6 (1991), 5–24.
[340] B. Funt, F. Ciurea, and J. McCann. “Retinex in Matlab.” In Proceedings of the IS&T/SID Eighth
Color Imaging Conference: Color Science, Systems and Applications, pp. 112–121. Springfield,
VA: Society for Imaging Science and Technology, 2000.
[341] M. Furuta, Y. Nishikawa, T. Inoue, and S. Kawahito. “A High-Speed, High-Sensitivity Digital
CMOS Image Sensor with a Global Shutter and 12-Bit Column-Parallel Cyclic A/D Converters.”
IEEE Journal of Solid-State Circuits 42:4 (2007), 766–774.
[342] D. Gabor. “A New Microscopic Principle.” Nature 161 (1948), 777–778.
[343] D. Gabor. “Wavefront Reconstruction.” In Optical and Acoustical Holography, edited by
E. Camatini, pp. 15–21. New York: Plenum Press, 1972.
[344] E. R. Gaillard, L. Zheng, J. C. Merriam, and J. Dillon. “Age-Related Changes in the Absorption
Characteristics of the Primate Lens.” Investigative Ophthalmology and Visual Science 41:6
(2000), 1454–1459.
[345] L. Gall. “Computer Color Matching.” In Colour 73: Second Congress International Colour
Association, pp. 153–178. London: Adam Hilger, 1973.
[346] L. Gangloff, E. Minoux, K. B. K. Teo, P. Vincent, V. T. Semet, V. T. Binh, M. H. Yang, I. Y. Y.
Bu, R. G. Lacerda, G. Pirio, J. P. Schnell, D. Pribat, D. G. Hasko, G. A. J. Amaratunga, W. Milne,
and P. Legagneux. “Self-Aligned, Gated Arrays of Individual Nanotube and Nanowire Emitters.”
Nano Letters 4 (2004), 1575–1579.
[347] M. J. Gastinger, J. J. O’Brien, J. N. B. Larsen, and D. W. Marshak. “Histamine Immunoreactive
Axons in the Macaque Retina.” Investigative Ophthalmology and Visual Science 40:2 (1999),
487–495.
[348] H. Gates, R. Zehner, H. Doshi, and J. Au. “A5 Sized Electronic Paper Display for Document
Viewing.” SID Symposium Digest of Technical Papers 36:1 (2005), 1214–1217.
[349] F. Gatti, A. Acquaviva, L. Benini, and B. Ricco. “Low Power Control Techniques for TFT-
LCD Displays.” In Proceedings of the International Conference on Compilers, Architecture,
and Synthesis for Embedded Systems, pp. 218–224, 2002.
[350] I. Gauthier, M. J. Tarr, A. W. Anderson, P. Skudlarski, and J. C. Gore. “Activation of the
Middle Fusiform ’Face Area’ Increases with Expertise in Recognizing Novel Objects.” Nature
Neuroscience 2:6 (1999), 568–573.
[351] W. J. Geeraets and E. R. Berry. “Ocular Spectral Characteristics as Related to Hazards from
Lasers and Other Sources.” American Journal of Ophthalmology 66:1 (1968), 15–20.
[352] K. Gegenfurtner and D. C. Kiper. “Color Vision.” Annual Review of Neuroscience 26 (2003),
181–206.
[353] K. R. Gegenfurtner and L. Sharpe, editors. Color Vision: From Genes to Perception. Cam-
bridge, UK: Cambridge University Press, 1999.
[354] J. Geier, L. Sera, and L. Bernath. “Stopping the Hermann Grid Illusion by Simple Sine Distor-
tion.” Perception, ECVP 2004 supplement 33 (2004), 53.
[355] W. S. Geisler and M. S. Banks. “Visual Performance.” In Handbook of Optics: Fundamentals,
Techniques and Design, Volume 1, edited by M. Bass, E. W. van Stryland, D. R. Williams, and
W. L. Wolfe, Second edition. New York: McGraw-Hill, 1995.
[356] N. George, R. J. Dolan, G. R. Fink, G. C. Baylis, C. Russell, and J. Driver. “Contrast Polarity
and Face Recognition in the Human Fusiform Gyrus.” Nature Neuroscience 2:6 (1999), 574–
580.
[357] T. Gevers and H. Stokman. “Classifying Color Edges in Video into Shadow-Geometry, High-
light, or Material Transitions.” IEEE Transactions on Multimedia 5:2 (2003), 237–243.
[358] K. S. Gibson. “The Relative Visibility Function.” CIE Compte Rendu des Séances, Sixième
Session, Genève, 1924.
[359] J. E. Gibson and M. D. Fairchild. “Colorimetric Characterization of Three Computer Displays
(LCD and CRT).” Technical report, Munsell Color Science Laboratory, Rochester, NY, 2000.
[360] K. S. Gibson and E. P. T. Tyndall. “Visibility of Radiant Energy.” Scientific Papers of the
Bureau of Standards 19 (1923), 131–191.
[361] I. M. Gibson. “Visual Mechanisms in a Cone Monochromat.” Journal of Physiology 161
(1962), 10–11.
[362] A. Gilchrist and J. Cataliotti. “Anchoring of Surface Lightness with Multiple Illumination
Levels.” Investigative Ophthalmology and Visual Science 35:4 (1994), 2165–2165.
[363] A. Gilchrist, C. Kossyfidis, F. Bonato, T. Agostini, J. Cataliotti, X. Li, B. Spehar, V. Annan, and
E. Economou. “An Anchoring Theory of Lightness Perception.” Psychological Review 106:4
(1999), 795–834.
[364] A. L. Gilchrist. “Lightness Contrast and Failures of Constancy: A Common Explanation.”
Perception and Psychophysics 43:5 (1988), 415–424.
[365] A. L. Gilchrist. “The Importance of Errors in Perception.” In Colour Perception: Mind and the
Physical World, edited by R. M. D. Heyer, pp. 437–452. Oxford, UK: Oxford University Press,
2003.
[366] A. L. Gilchrist. “Lightness Perception: Seeing One Color through Another.” Current Biology
15:9 (2005), R330–R332.
[367] A. Glasser and M. C. W. Campbell. “Presbyopia and the Optical Changes in the Human Crys-
talline Lens with Age.” Vision Research 38 (1998), 209–229.
[368] J. Glasser. “Principles of Display Measurement and Calibration.” In Display Systems: Design
and Applications, edited by L. W. MacDonald and A. C. Lowe. Chichester: John Wiley and
Sons, 1997.
[369] A. S. Glassner. “How to Derive a Spectrum from an RGB Triplet.” IEEE Computer Graphics
and Applications 9:4 (1989), 95–99.
[370] A. S. Glassner. Principles of Digital Image Synthesis. San Francisco, CA: Morgan Kaufmann,
1995.
[371] A. Glassner. “Computer-Generated Solar Halos and Sun Dogs.” IEEE Computer Graphics and
Applications 16:2 (1996), 77–81.
[372] A. Glassner. “Solar Halos and Sun Dogs.” IEEE Computer Graphics and Applications 16:1
(1996), 83–87.
[373] J. W. von Goethe. Zur Farbenlehre. Tübingen, Germany: Freies Geistesleben, 1810.
[374] N. Goldberg. Camera Technology: The Dark Side of the Lens. Boston: Academic Press, Inc.,
1992.
[375] S. Gong, J. Kanicki, G. Xu, and J. Z. Z. Zhong. “A Novel Structure to Improve the Viewing An-
gle Characteristics of Twisted-Nematic Liquid Crystal Displays.” Japanese Journal of Applied
Physics 38 (1999), 4110–4116.
[376] R. C. Gonzalez and R. E. Woods. Digital Image Processing, Second edition. Upper Saddle
River, NJ: Prentice-Hall, 2002.
[377] J.-C. Gonzato and S. Marchand. “Photo-Realistic Simulation and Rendering of Halos.” In Win-
ter School of Computer Graphics (WSCG ’01) Proceedings. Plzen, Czech Republic: University
of West Bohemia, 2001.
[378] A. Gooch, S. C. Olsen, J. Tumblin, and B. Gooch. “Color2Gray: Salience-Preserving Color
Removal.” ACM Transactions on Graphics 24:3 (2005), 634–639.
[379] R. H. Good, Jr. and T. J. Nelson. Classical Theory of Electric and Magnetic Fields. New York:
Academic Press, 1971.
[380] C. M. Goral, K. E. Torrance, D. P. Greenberg, and B. Battaile. “Modeling the Interaction
of Light Between Diffuse Surfaces.” Computer Graphics (SIGGRAPH ’84 Proceedings) 18:3
(1984), 213–222.
[381] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. “The Lumigraph.” In Proceed-
ings SIGGRAPH ’96, Computer Graphics Proceedings, Annual Conference Series, pp. 43–54.
Reading, MA: Addison-Wesley, 1996.
[382] P. Gouras. “Identification of Cone Mechanisms in Monkey Ganglion Cells.” Journal of Physi-
ology (London) 199 (1968), 533–547.
[383] C. F. O. Graeff, G. B. Silva, F. Nüesch, and L. Zuppiroli. “Transport and Recombination in Or-
ganic Light-Emitting Diodes Studied by Electrically Detected Magnetic Resonance.” European
Physical Journal E 18 (2005), 21–28.
[384] J. Graham. “Some Topographical Connections of the Striate Cortex with Subcortical Structures
in Macaca Fascicularis.” Experimental Brain Research 47 (1982), 1–14.
[385] T. Granlund, L. A. A. Petterson, M. R. Anderson, and O. Inganäs. “Interference Phenomenon
Determines the Color in an Organic Light Emitting Diode.” Journal of Applied Physics 81:12
(1997), 8097–8104.
[386] H. Grassmann. “Zur Theorie der Farbenmischung.” Ann. Phys. Chem. 89 (1853), 69–84.
[387] R. Greenler. Rainbows, Halos, and Glories. Cambridge, UK: Cambridge University Press,
1980.
[388] R. L. Gregory and P. F. Heard. “Border Locking and the Café Wall Illusion.” Perception 8:4
(1979), 365–380.
[389] R. L. Gregory and P. F. Heard. “Visual Dissociations of Movement, Position, and Stereo Depth:
Some Phenomenal Phenomena.” Quarterly Journal of Experimental Psychology 35A (1983),
217–237.
[390] J. Greivenkamp. “Color Dependent Optical Filter for the Suppression of Aliasing Artifacts.”
Applied Optics 29:5 (1990), 676–684.
[391] L. D. Griffin. “Optimality of the Basic Color Categories for Classification.” Journal of the
Royal Society, Interface 3:6 (2006), 71–85.
[392] K. Grill-Spector and R. Malach. “The Human Visual Cortex.” Annual Review of Neuroscience
27 (2004), 647–677.
[393] K. Grill-Spector, T. Kushnir, S. Edelman, Y. Itzchak, and R. Malach. “Cue-Invariant Activation
in Object-Related Areas of the Human Occipital Lobe.” Neuron 21:1 (1998), 191–202.
[394] K. Grill-Spector, T. Kushnir, T. Hendler, S. Edelman, Y. Itzchak, and R. Malach. “A Sequence
of Object-Processing Stages Revealed by fMRI in the Human Occipital Lobe.” Human Brain
Mapping 6:4 (1998), 316–328.
[395] S. Grossberg and E. Mingolla. “Neural Dynamics of Form Perception: Boundary Completion,
Illusory Figures, and Neon Color Spreading.” Psychological Review 92 (1985), 173–211.
[396] S. Grossberg and D. Todorović. “Neural Dynamics of 1-D and 2-D Brightness Perception: A
Unified Model of Classical and Recent Phenomena.” Perception and Psychophysics 43 (1988),
241–277.
Bibliography 981
[397] F. Grum and R. J. Becherer. Optical Radiation Measurements. New York: Academic Press,
Inc., 1979.
[398] C. Gu and P. Yeh. “Extended Jones Matrix Method. II.” Journal of the Optical Society of
America A 10:5 (1993), 966–973.
[399] G. Gu, V. Bulović, P. E. Burrows, S. R. Forrest, and M. E. Thompson. “Transparent Organic
Light Emitting Devices.” Applied Physics Letters 68:19 (1996), 2606–2608.
[400] G. Gu, P. E. Burrows, S. Venkatesh, S. R. Forrest, and M. E. Thompson. “Vacuum-Deposited,
Nonpolymeric Flexible Organic Light-Emitting Devices.” Optics Letters 22:3 (1997), 172–174.
[401] G. Gu, G. Parthasarathy, P. E. Burrows, P. Tian, I. G. Hill, A. Kahn, and S. R. Forrest. “Trans-
parent Stacked Organic Light Emitting Devices. I. Design Principles and Transparent Compound
Electrodes.” Journal of Applied Physics 86:8 (1999), 4067–4075.
[402] R. W. Gubisch. “Optical Performance of the Eye.” Journal of Physiology 186 (1966), 558–
578.
[403] C. Gueymard, D. Myers, and K. Emery. “Proposed Reference Irradiance Spectra for Solar
Energy Systems Testing.” Solar Energy 73:6 (2002), 443–467.
[404] C. Gueymard. “The Sun’s Total and Spectral Irradiance for Solar Energy Applications and
Solar Radiation Models.” Solar Energy 76:4 (2004), 423–453.
[405] J. Guild. “The Colorimetric Properties of the Spectrum.” Philosophical Transactions of the
Royal Society of London A 230 (1931), 149–187.
[406] R. W. Guillery and M. Colonnier. “Synaptic Patterns in the Dorsal Lateral Geniculate Nu-
cleus of Cat and Monkey: A Brief Review.” Zeitschrift für Zellforschung und mikroskopische
Anatomie 103 (1970), 90–108.
[407] B. Gulyas, C. A. Heywood, D. A. Popplewell, P. E. Roland, and A. Cowey. “Visual Form
Discrimination from Color or Motion Cues: Functional Anatomy by Positron Emission Tomog-
raphy.” Proceedings of the National Academy of Sciences of the United States of America 91:21
(1994), 9965–9969.
[408] M. Gur, I. Kagan, and D. M. Snodderly. “Orientation and Direction Selectivity of Neurons in
V1 of Alert Monkeys: Functional Relationships and Laminar Distributions.” Cerebral Cortex
15:8 (2005), 1207–1221.
[409] C. Gutierrez and C. G. Cusick. “Area V1 in Macaque Monkeys Projects to Multiple Histo-
chemically Defined Subdivisions of the Inferior Pulvinar Complex.” Brain Research 765 (1997),
349–356.
[410] D. Gutierrez, F. J. Seron, A. Munoz, and O. Anson. “Chasing the Green Flash: A Global
Illumination Solution for Inhomogeneous Media.” In Proceedings of the Spring Conference on
Computer Graphics, pp. 95–103. New York: ACM Press, 2004.
[411] D. Gutierrez, A. Munoz, O. Anson, and F. J. Seron. “Non-Linear Volume Photon Mapping.”
In Rendering Techniques 2005: Eurographics Symposium on Rendering, edited by K. Bala and
P. Dutré, pp. 291–300. Aire-la-Ville, Switzerland: Eurographics Association, 2005.
[412] C. S. Haase and G. W. Meyer. “Modeling Pigmented Materials for Realistic Image Synthesis.”
ACM Transactions on Graphics 11:4 (1992), 305–335.
[413] J. Haber, M. Magnor, and H.-P. Seidel. “Physically-Based Simulation of Twilight Phenomena.”
ACM Transactions on Graphics 24:4 (2005), 1353–1373.
[414] R. A. Hall and D. P. Greenberg. “A Testbed for Realistic Image Synthesis.” IEEE Computer
Graphics and Applications 3:6 (1983), 10–20.
[415] R. A. Hall. Illumination and Color in Computer Generated Imagery. Monographs in Visual
Communication, New York: Springer-Verlag, 1989.
[416] R. A. Hall. “Comparing Spectral Color Computation Methods.” IEEE Computer Graphics and
Applications 19:4 (1999), 36–45.
[417] P. E. Hallett. “The Variations in Visual Threshold Measurement.” Journal of Physiology 202
(1969), 403–419.
[418] M. Halstead, B. Barsky, S. Klein, and R. Mandell. “Reconstructing Curved Surfaces from
Specular Reflection Patterns using Spline Surface Fitting of Normals.” In Proceedings of SIG-
GRAPH 96, Computer Graphics Proceedings, Annual Conference Series, edited by H. Rush-
meier, pp. 335–342. Reading, MA: Addison-Wesley, 1996.
[419] A. Hanazawa, H. Komatsu, and I. Murakami. “Neural Selectivity for Hue and Saturation of
Colour in the Primary Visual Cortex of the Monkey.” European Journal of Neuroscience 12:5
(2000), 1753–1763.
[420] P. J. B. Hancock, R. J. Baddeley, and L. S. Smith. “The Principal Components of Natural
Images.” Network 3 (1992), 61–70.
[421] J. A. Hanley and B. J. McNeil. “The Meaning and Use of the Area under the Receiver Operating
Characteristic (ROC) Curve.” Radiology 143:1 (1982), 29–36.
[422] J. H. Hannay. “The Clausius-Mossotti Equation: An Alternative Derivation.” European Journal
of Physics 4 (1983), 141–143.
[423] A. Hård and L. Sivik. “NCS — Natural Color System: A Swedish Standard for Color Nota-
tion.” Color Research and Application 6 (1981), 129–138.
[424] A. Hård, L. Sivik, and G. Tonnquist. “NCS, Natural Color System — From Concepts to
Research and Applications. Part I.” Color Research and Application 21 (1996), 180–205.
[425] A. Hård, L. Sivik, and G. Tonnquist. “NCS, Natural Color System — From Concepts to
Research and Applications. Part II.” Color Research and Application 21 (1996), 206–220.
[426] A. C. Hardy and F. L. Wurzburg, Jr. “Color Correction in Color Printing.” Journal of the
Optical Society of America 38:4 (1948), 300–307.
[427] P. Hariharan. Basics of Holography. Cambridge, UK: Cambridge University Press, 2002.
[428] H. F. Harmuth, R. N. Boules, and M. G. M. Hussain. Electromagnetic Signals: Reflection,
Focusing, Distortion, and Their Practical Applications. New York: Kluwer Academic / Plenum
Publishers, 1999.
[429] F. J. Harris. “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Trans-
form.” Proceedings of the IEEE 66:1 (1978), 51–84.
[430] H. K. Hartline. “The Response of Single Optic Nerve Fibers of the Vertebrate Eye to Illumina-
tion of the Retina.” American Journal of Physiology 121 (1938), 400–415.
[431] E. I. Haskal, M. Buechel, J. F. Dijksman, P. C. Duineveld, E. A. Meulenkamp, C. A. H. A.
Mutsaers, A. Sempel, P. Snijder, S. I. E. Vulto, P. van de Weijer, and S. H. P. M. de Winter.
“Ink jet Printing of Passive Matrix Polymer Light-Emitting Displays.” SID Symposium Digest
of Technical Papers 33:1 (2002), 776–779.
[432] U. Hasson, M. Harel, I. Levy, and R. Malach. “Large-Scale Mirror-Symmetry Organization of
Human Occipito-Temporal Object Areas.” Neuron 37:6 (2003), 1027–1041.
[433] J. H. van Hateren and A. van der Schaaf. “Independent Component Filters of Natural Images
Compared with Simple Cells in Primary Visual Cortex.” Proceedings of the Royal Society of
London B 265 (1998), 359–366.
[434] J. H. van Hateren. “A Cellular and Molecular Model of Response Kinetics and Adaptation in
Primate Cones and Horizontal Cells.” Journal of Vision 5 (2005), 331–347.
[435] J. H. van Hateren. “Encoding of High Dynamic Range Video with a Model of Human Cones.”
ACM Transactions on Graphics 25:4 (2006), 1380–1399.
[458] S. H. C. Hendry and R. C. Reid. “The Koniocellular Pathway in Primate Vision.” Annual
Review of Neuroscience 23 (2000), 127–153.
[459] S. H. Hendry and T. Yoshioka. “A Neurochemically Distinct Third Channel in the Macaque
Dorsal Lateral Geniculate Nucleus.” Science 264 (1994), 575–577.
[460] L. Henyey and J. Greenstein. “Diffuse Radiation in the Galaxy.” Astrophysical Journal 93
(1941), 70–83.
[461] E. Hering. “Zur Lehre vom Lichtsinne. IV. Über die sogenannte Intensität der Lichtempfindung
und über die Empfindung des Schwarzen.” Sitzungsberichte / Akademie der Wissenschaften in
Wien, Mathematisch-Naturwissenschaftliche Klasse Abteilung III, Anatomie und Physiology des
Menschen und der Tiere sowie theoretische Medizin 69 (1874), 85–104.
[462] E. Hering. Outlines of a Theory of the Light Sense (Translation from German: Zur Lehre vom
Lichtsinne, 1878). Cambridge, MA: Harvard University Press, 1920.
[463] L. Hermann. “Eine Erscheinung simultanen Contrastes.” Pflügers Archiv für die gesamte
Physiologie des Menschen und Tiere 3 (1870), 13–15.
[464] E. W. Herold. “History and Development of the Color Picture Tube.” Proceedings of the
Society for Information Display 15:4 (1974), 141–149.
[465] R. D. Hersch, F. Collaud, and P. Emmel. “Reproducing Color Images with Embedded Metallic
Patterns.” ACM Transactions on Graphics 22:3 (2003), 427–434.
[466] M. Herzberger. Modern Geometrical Optics. New York: Interscience, 1958.
[467] P. G. Herzog. “Analytical Color Gamut Representations.” Journal of Imaging Science and
Technology 40 (1996), 516–521.
[468] P. G. Herzog. “Further Developments of the Analytical Color Gamut Representations.” In
Proceedings of the SPIE 3300, pp. 118–128. Bellingham, WA: SPIE, 1998.
[469] M. Hess and M. Wiegner. “COP: A Data Library of Optical Properties of Hexagonal Ice
Crystals.” Applied Optics 33 (1994), 7740–7749.
[470] S. Hesselgren. Hesselgrens färgatlas med kortfattad färglära. Stockholm, Sweden: T Palmer,
AB, 1952.
[471] D. Hideaki, H. Y. K. Yukio, and S. Masataka. “Image Data Processing Apparatus for Process-
ing Combined Image Signals in order to Extend Dynamic Range.” Japanese Patent 8223491,
1996.
[472] B. M. Hill. “A Simple General Approach to Inference about the Tail of a Distribution.” The
Annals of Statistics 3:5 (1975), 1163–1174.
[473] J. W. Hittorf. “Über die Electricitätsleitung der Gase. Erste Mitteilungen.” Annalen der Physik
und Chemie 136 (1869), 1–31, 197–234.
[474] J. Ho, B. V. Funt, and M. S. Drew. “Separating a Color Signal into Illumination and Surface
Reflectance Components: Theory and Applications.” IEEE Transactions on Pattern Analysis
and Machine Intelligence 12:10 (1990), 966–977.
[475] B. Hoefflinger, editor. High-Dynamic-Range (HDR) Vision: Microelectronics, Image Process-
ing, Computer Graphics. Springer Series in Advanced Microelectronics, Berlin: Springer, 2007.
[476] H. Hofer, B. Singer, and D. R. Williams. “Different Sensations from Cones with the Same
Photopigment.” Journal of Vision 5 (2005), 444–454.
[477] A. Hohmann and C. von der Malsburg. “McCollough Effect and Eye Optics.” Perception 7
(1978), 551–555.
[478] G. Hollemann, B. Braun, P. Heist, J. Symanowski, U. Krause, J. Kränert, and C. Deter. “High-
Power Laser Projection Displays.” In Proceedings of the SPIE 4294, edited by M. H. Wu,
pp. 36–46. Bellingham, WA: SPIE, 2001.
[479] N. Holonyak Jr. and S. F. Bevacqua. “Coherent (Visible) Light Emission from Ga(As1−xPx)
Junctions.” Applied Physics Letters 1 (1962), 82–83.
[480] G. Hong, M. R. Luo, and P. A. Rhodes. “A Study of Digital Camera Colorimetric Characteri-
zation Based on Polynomial Modeling.” Color Research and Application 26:1 (2001), 76–84.
[481] Q. Hong, T. X. Wu, X. Zhu, R. Lu, and S.-T. Wu. “Extraordinarily High-Contrast and Wide-
View Liquid-Crystal Displays.” Applied Physics Letters 86 (2005), 121107–1–121107–3.
[482] F. M. Honrubia and J. H. Elliott. “Efferent Innervation of the Retina I: Morphologic Study of
the Human Retina.” Archives of Ophthalmology 80 (1968), 98–103.
[483] D. C. Hood and D. G. Birch. “A Quantitative Measure of the Electrical Activity of Human Rod
Photoreceptors using Electroretinography.” Visual Neuroscience 5 (1990), 379–387.
[484] D. C. Hood and M. A. Finkelstein. “Comparison of Changes in Sensitivity and Sensation:
Implications for the Response-Intensity Function of the Human Photopic System.” Journal of
Experimental Psychology: Human Perception and Performance 5:3 (1979), 391–405.
[485] D. C. Hood, M. A. Finkelstein, and E. Buckingham. “Psychophysical Tests of Models of the
Response Function.” Vision Research 19:4 (1979), 401–406.
[486] S. J. Hook. “ASTER Spectral Library,” 1999. Available online (http://speclib.jpl.nasa.gov).
[487] S. D. Hordley and G. D. Finlayson. “Reevaluation of Color Constancy Algorithm Perfor-
mance.” Journal of the Optical Society of America A 23:5 (2006), 1008–1020.
[488] B. K. P. Horn. Robot Vision. Cambridge, MA: MIT Press, 1986.
[489] L. J. Hornbeck. “From Cathode Rays to Digital Micromirrors: A History of Electronic Projec-
tion Display Technology.” Texas Instruments Technical Journal 15:3 (1998), 7–46.
[490] J. C. Horton and L. C. Sincich. “How Specific is V1 Input to V2 Thin Stripes?” Society for
Neuroscience Abstracts 34 (2004), 18.1.
[491] G. D. Horwitz, E. J. Chichilnisky, and T. D. Albright. “Spatial Opponency and Color Tuning
Dynamics in Macaque V1.” Society for Neuroscience Abstracts 34 (2004), 370.9.
[492] C. Hou, M. W. Pettet, V. Sampath, T. R. Candy, and A. M. Norcia. “Development of the Spatial
Organization and Dynamics of Lateral Interactions in the Human Visual System.” Journal of
Neuroscience 23:25 (2003), 8630–8640.
[493] D. H. Hubel and T. N. Wiesel. “Receptive Fields, Binocular Interaction and Functional Archi-
tecture in the Cat’s Visual Cortex.” Journal of Physiology 160 (1962), 106–154.
[494] D. H. Hubel and T. N. Wiesel. “Ferrier Lecture: Functional Architecture of Macaque Monkey
Visual Cortex.” Proceedings of the Royal Society of London B 198:1130 (1977), 1–59.
[495] P. M. Hubel, J. Holm, and G. Finlayson. “Illuminant Estimation and Color Correction.” In
Color Imaging, pp. 73–95. New York: Wiley, 1999.
[496] D. H. Hubel. Eye, Brain, and Vision, Reprint edition (see also http://neuro.med.harvard.edu/
site/dh/). New York: W. H. Freeman and Company, 1995.
[497] A. J. Hughes. “Controlled Illumination for Birefringent Colour LCDs.” Displays 8:3 (1987),
139–141.
[498] A. C. Huk, D. Ress, and D. J. Heeger. “Neuronal Basis of the Motion Aftereffect Reconsid-
ered.” Neuron 32:1 (2001), 161–172.
[499] H. C. van de Hulst. Light Scattering by Small Particles. Mineola, NY: Dover Publications,
1981.
[500] P.-C. Hung and R. S. Berns. “Determination of Constant Hue Loci for a CRT Gamut and
Their Predictions using Color Appearance Spaces.” Color Research and Application 20 (1995),
285–295.
[501] P.-C. Hung. “Colorimetric Calibration for Scanners and Media.” In Proceedings of the SPIE
1498, edited by W. Chang and J. R. Milch, pp. 164–174. Bellingham, WA: SPIE, 1991.
[502] P.-C. Hung. “Colorimetric Calibration in Electronic Imaging Devices using a Look-Up Table
Model and Interpolations.” Journal of Electronic Imaging 2 (1993), 53–61.
[503] P.-C. Hung. “Color Theory and its Application to Digital Still Cameras.” In Image Sensors and
Signal Processing for Digital Still Cameras, edited by J. Nakamura, pp. 205–221. Boca Raton,
FL: Taylor and Francis, 2006.
[504] R. W. G. Hunt. “Light and Dark Adaptation and the Perception of Color.” Journal of the
Optical Society of America 42 (1952), 190–199.
[505] R. W. G. Hunt. “The Strange Journey from Retina to Brain.” Journal of the Royal Television
Society 11 (1967), 220–229.
[506] R. W. G. Hunt. “Hue Shifts in Unrelated and Related Colors.” Color Research and Application
14 (1989), 235–239.
[507] R. W. G. Hunt. “Why is Black and White So Important in Colour?” In Colour Imaging: Vision
and Technology, edited by L. W. MacDonald and M. R. Luo, pp. 3–15. Chichester, UK: John
Wiley and Sons, Ltd., 1999.
[508] R. W. G. Hunt. “Saturation: Superfluous or Superior?” In IS&T/SID 9th Color Imaging
Conference, pp. 1–5. Springfield, VA: Society for Imaging Science and Technology, 2001.
[509] R. W. G. Hunt. The Reproduction of Colour, Sixth edition. Chichester, UK: John Wiley and
Sons Ltd., 2004.
[510] J.-M. Hupé, A. C. James, B. R. Payne, S. G. Lomber, P. Girard, and J. Bullier. “Cortical
Feedback Improves Discrimination between Figure and Ground.” Nature 394:6695 (1998), 784–
787.
[511] J.-M. Hupé, A. C. James, P. Girard, and J. Bullier. “Response Modulations by Static Texture
Surround in Area V1 of the Macaque Monkey Do Not Depend on Feedback Connections from
V2.” Journal of Neurophysiology 85:1 (2001), 146–163.
[512] B. S. Hur and M. G. Kang. “High Definition Color Interpolation Scheme for Progressive Scan
CCD Image Sensor.” IEEE Transactions on Consumer Electronics 47:1 (2001), 179–186.
[513] A. Hurlbert. “Formal Connections between Lightness Algorithms.” Journal of the Optical
Society of America A 3 (1986), 1684–1693.
[514] J. Hurri, A. Hyvärinen, and E. Oja. “Wavelets and Natural Image Statistics.” In Proceedings of
the 10th Scandinavian Conference on Image Analysis, pp. 13–18. International Association for
Pattern Recognition, 1997.
[515] L. M. Hurvich and D. Jameson. “Some Quantitative Aspects of Opponent-Colors Theory. IV.
A Psychological Color Specification System.” Journal of the Optical Society of America 46:6
(1956), 416–421.
[516] L. M. Hurvich and D. Jameson. “The Opponent Process Theory of Color Vision.” Psychologi-
cal Review 64 (1957), 384–404.
[517] L. M. Hurvich. Color Vision. Sunderland, MA: Sinauer Associates, 1981.
[518] J. B. Hutchings. “Colour and Appearance in Nature, Part 1.” Colour Research and Application
11 (1986), 107–111.
[519] J. B. Hutchings. “Color in Anthropology and Folklore.” In Color for Science, Art and Technol-
ogy, edited by K. Nassau. Amsterdam: Elsevier Science B.V., 1998.
[520] J. B. Hutchings. “Color in Plants, Animals and Man.” In Color for Science, Art and Technology,
edited by K. Nassau. Amsterdam: Elsevier Science B.V., 1998.
[521] J. Hynecek. “CDS Noise Reduction of Partially Reset Charge-Detection Nodes.” IEEE Trans-
actions on Circuits and Systems — I: Fundamental Theory and Applications 49:3 (2002), 276–
280.
[522] A. Hyvärinen. “Survey on Independent Component Analysis.” Neural Computing Surveys 2
(1999), 94–128.
[523] ICC. “Image Technology Colour Management - Architecture, Profile Format, and Data Struc-
ture.” Technical Report ICC.1:2004-10, International Color Consortium, 2004. Available online
(http://www.color.org).
[524] IEC. “Part 2-1: Colour Management - Default RGB Colour Space - sRGB.” Technical Report
61966, International Electrotechnical Commission, Geneva, Switzerland, 1999.
[525] M. Ikebe and K. Saito. “A Wide-Dynamic-Range Compression Image Sensor with Negative-
Feedback Resetting.” IEEE Sensors Journal 7:5 (2007), 897–904.
[526] E. Ikeda. “Image Data Processing Apparatus for Processing Combined Image Signals in order
to Extend Dynamic Range.” United States Patent 5801773, 2003.
[527] Q. M. T. Inc. “Interferometric Modulator (IMOD) Technology Overview.” White paper, 2007.
[528] M. Inui. “Fast Algorithm for Computing Color Gamuts.” Color Research and Application 18
(1993), 341–348.
[529] P. Irawan, J. A. Ferwerda, and S. R. Marschner. “Perceptually Based Tone Mapping of High
Dynamic Range Image Streams.” In Eurographics Symposium on Rendering, pp. 231–242.
Aire-la-Ville, Switzerland: Eurographics Association, 2005.
[530] S. Ishida, Y. Yamashita, T. Matsuishi, M. Ohshima, T. Ohshima, K. Kato, and H. Maeda.
“Photosensitive Seizures Provoked while Viewing ‘Pocket Monster,’ a Made-for-Television
Animation Program in Japan.” Epilepsia 39 (1998), 1340–1344.
[531] S. Ishihara. Tests for Colour-Blindness. Tokyo: Hongo Harukicho, 1917.
[532] M. F. Iskander. Electromagnetic Fields and Waves. Upper Saddle River, NJ: Prentice Hall,
1992.
[533] ISO. “Graphic technology - Spectral measurement and colorimetric computation for graphic
arts images.” Technical Report ISO 13655:1996, International Organization for Standardization,
1996.
[534] ISO. “Photography – Digital Still Cameras – Determination of Exposure Index, ISO Speed
Ratings, Standard Output Sensitivity, and Recommended Exposure Index.” Technical Report
ISO 12232:2006, International Organization for Standardization, 2006.
[535] S. Itoh and M. Tanaka. “Current Status of Field-Emission Displays.” Proceedings of the IEEE
90:4 (2002), 514–520.
[536] ITU-R. “Parameter Values for HDTV Standards for Production and International Programme
Exchange.” Technical Report BT.709-2, International Telecommunication Union, Geneva,
Switzerland, 1995.
[537] A. V. Ivashchenko. Dichroic Dyes for Liquid Crystal Displays. Boca Raton, Florida: CRC
Press, 1994.
[538] H. Ives. “The Transformation of Color-Mixture Equations from One System to Another.”
Journal of the Franklin Institute 16 (1915), 673–701.
[539] H. E. Ives. “The Resolution of Mixed Colors by Differential Visual Activity.” Philosophical
Magazine 35 (1918), 413–421.
[540] J. Iwao. “Errors in Color Calculations Due to Fluorescence when using the Neugebauer Equa-
tions.” In Proceedings of the Technical Association of the Graphical Arts, pp. 254–266. Sewick-
ley, PA: Technical Association of the Graphic Arts, 1973.
[541] D. Jackèl and B. Walter. “Modeling and Rendering of the Atmosphere using Mie Scattering.”
Computer Graphics Forum 16:4 (1997), 201–210.
[542] D. Jackèl and B. Walter. “Simulation and Visualization of Halos.” In Proceedings of
ANIGRAPH ’97, 1997.
[543] A. K. Jain. Fundamentals of Digital Image Processing. Upper Saddle River, NJ: Prentice-Hall,
1989.
[544] D. Jameson and L. M. Hurvich. “Color Adaptation: Sensitivity, Contrast and After-Images.”
In Handbook of Sensory Physiology, VII/4, edited by D. Jameson and L. M. Hurvich, VII/4,
pp. 568–581. Berlin: Springer-Verlag, 1972.
[545] T. H. Jamieson. “Thin-Lens Theory of Zoom Systems.” Optica Acta 17:8 (1970), 565–584.
[546] J. Janesick, T. Elliott, S. Collins, M. Blouke, and J. Freeman. “Scientific Charge-Coupled
Devices.” Optical Engineering 26:8 (1987), 692–715.
[547] J.-S. Jang and B. Javidi. “Improvement of Viewing Angle in Integral Imaging by Use of Moving
Lenslet Arrays with Low Fill Factor.” Applied Optics 42:11 (2003), 1996–2002.
[548] H. W. Jensen, J. Legakis, and J. Dorsey. “Rendering of Wet Materials.” In Proceedings of
the 10th Eurographics Symposium on Rendering, edited by D. Lischinski and G. W. Larson,
pp. 273–281. Vienna: Springer-Verlag, 1999.
[549] H. W. Jensen. Realistic Image Synthesis using Photon Mapping. Natick, MA: A K Peters,
2001.
[550] E. Jin, X. Feng, and J. Newell. “The Development of a Color Visual Difference Model
(CVDM).” In IS&T PICS Conference, pp. 154–158. Springfield, VA: Society for Imaging Sci-
ence and Technology, 1998.
[551] L. Jing and K. Urahama. “Image Recoloring by Eigenvector Mapping.” In Proceedings of the
International Workshop on Advanced Image Technology, 2006.
[552] D. J. Jobson, Z. Rahman, and G. A. Woodell. “Retinex Image Processing: Improved Fidelity
to Direct Visual Observation.” In Proceedings of the IS&T Fourth Color Imaging Conference:
Color Science, Systems, and Applications, pp. 124–125. Springfield, VA: Society for Imaging
Science and Technology, 1995.
[553] D. J. Jobson, Z. Rahman, and G. A. Woodell. “A Multi-Scale Retinex for Bridging the Gap
between Color Images and Human Observation of Scenes.” IEEE Transactions on Image Pro-
cessing 6:7 (1997), 965–976.
[554] T. Johansson. Färg. Stockholm, Sweden: Lindfors Bokförlag, AB, 1937.
[555] G. M. Johnson and M. D. Fairchild. “A Top Down Description of S-CIELAB and CIEDE2000.”
Color Research and Application 28 (2003), 425–435.
[556] G. M. Johnson and M. D. Fairchild. “The Effect of Opponent Noise on Image Quality.” In SPIE
Proceedings 5668 (Electronic Imaging Conference), pp. 82–89. Bellingham, WA: SPIE, 2005.
[557] P. D. Jones and D. H. Holding. “Extremely Long-Term Persistence of the McCollough Effect.”
Journal of Experimental Psychology 1:4 (1975), 323–327.
[558] D. B. Judd and G. Wyszecki. Color in Business, Science and Industry, Third edition. New York:
Wiley Interscience, 1975.
[559] D. B. Judd, D. L. MacAdam, and G. Wyszecki. “Spectral Distribution of Typical Light as
a Function of Correlated Color Temperature.” Journal of the Optical Society of America 54:8
(1964), 1031–1040.
[560] D. B. Judd. “Chromaticity Sensibility to Stimulus Differences.” Journal of the Optical Society
of America 22 (1932), 72–108.
[581] N. Katoh and K. Nakabayashi. “Applying Mixed Adaptation to Various Chromatic Adapta-
tion Transformation (CAT) Models.” In IS&T PICS Conference, pp. 299–305. Springfield, VA:
Society for Imaging Science and Technology, 2001.
[582] J. Kautz and M. McCool. “Interactive Rendering with Arbitrary BRDFs using Separable Ap-
proximations.” In Proceedings of the 10th Eurographics Symposium on Rendering, edited by
D. Lischinski and G. W. Larson, pp. 247–260. Vienna: Springer-Verlag, 1999.
[583] R. Kawakami, J. Takamatsu, and K. Ikeuchi. “Color Constancy from Blackbody Illumination.”
Journal of the Optical Society of America A. (to appear), 2007.
[584] P. Kay and C. K. McDaniel. “The Linguistic Significance of the Meanings of Basic Color
Terms.” Language 54:3 (1978), 610–646.
[585] E. A. Khan and E. Reinhard. “Evaluation of Color Spaces for Edge Classification in Outdoor
Scenes.” In IEEE International Conference on Image Processing, pp. 952–955. Washington,
DC: IEEE Press, 2005.
[586] E. A. Khan, A. O. Akyüz, and E. Reinhard. “Ghost Removal in High Dynamic Range Images.”
In IEEE International Conference on Image Processing, pp. 2005–2008. Washington, DC: IEEE
Press, 2006.
[587] E. A. Khan, E. Reinhard, R. Fleming, and H. Bülthoff. “Image-Based Material Editing.” ACM
Transactions on Graphics 25:3 (2006), 654–663.
[588] J.-H. Kim and J. P. Allebach. “Color Filters for CRT-Based Rear Projection Television.” IEEE
Transactions on Consumer Electronics 42:4 (1996), 1050–1054.
[589] H. Kim, J.-Y. Kim, S. H. Hwang, I.-C. Park, and C.-M. Kyung. “Digital Signal Processor with
Efficient RGB Interpolation and Histogram Accumulation.” IEEE Transactions on Consumer
Electronics 44:4 (1998), 1389–1395.
[590] R. Kimmel. “Demosaicing: Image Reconstruction from Color CCD Samples.” IEEE Transac-
tions on Image Processing 8:9 (1999), 1221–1228.
[591] G. Kindlmann, E. Reinhard, and S. Creem. “Face-Based Luminance Matching for Perceptual
Colormap Generation.” In Proceedings of IEEE Visualization, pp. 299–306. Washington, DC:
IEEE, 2002.
[592] D. C. Kiper, J. B. Levitt, and K. R. Gegenfurtner. “Chromatic Signals in Extrastriate Areas
V2 and V3.” In Color Vision: From Genes to Perception, edited by K. R. Gegenfurtner and
L. Sharpe, pp. 249–268. Cambridge, UK: Cambridge University Press, 1999.
[593] A. Kitaoka, J. Gyoba, H. Kawabata, and K. Sakurai. “Two Competing Mechanisms Underlying
Neon Color Spreading, Visual Phantoms and Grating Induction.” Vision Research 41:18 (2001),
2347–2354.
[594] J. Kleinschmidt and J. E. Dowling. “Intracellular Recordings from Gecko Photoreceptors dur-
ing Light and Dark Adaptation.” Journal of General Physiology 66:5 (1975), 617–648.
[595] G. J. Klinker, S. A. Shafer, and T. Kanade. “Image Segmentation and Reflectance Analy-
sis through Color.” In Proceedings of the SPIE 937 (Application of Artificial Intelligence VI),
pp. 229–244. Bellingham, WA: SPIE, 1988.
[596] G. J. Klinker, S. A. Shafer, and T. Kanade. “A Physical Approach to Color Image Understand-
ing.” International Journal of Computer Vision 4:1 (1990), 7–38.
[597] K. Klug, N. Tiv, Y. Tsukamoto, P. Sterling, and S. Schein. “Blue Cones Contact OFF-Midget
Bipolar Cells.” Society for Neuroscience Abstracts 18 (1992), 838.
[598] K. Klug, Y. Tsukamoto, P. Sterling, and S. Schein. “Blue Cone OFF-Midget Ganglion Cells in
Macaque.” Investigative Ophthalmology and Visual Science, Supplement 34 (1993), 986.
[599] R. Knight, S. L. Buck, G. A. Fowler, and A. Nguyen. “Rods Affect S-Cone Discrimination on
the Farnsworth-Munsell 100-Hue Test.” Vision Research 38:21 (1998), 3477–3481.
[600] W. E. Kock. Lasers and Holography, Second edition. New York: Dover Publications, Inc,
1981.
[601] J. J. Koenderink, A. van Doorn, and M. Stavridi. “Bidirectional Reflection Distribution Func-
tion Expressed in Terms of Surface Scattering Modes.” In European Conference on Computer
Vision, pp. 28–39. London: Springer-Verlag, 1996.
[602] K. Koffka. Principles of Gestalt Psychology. New York: Harcourt, Brace, and World, 1935.
[603] A. J. F. Kok. “Ray Tracing and Radiosity Algorithms for Photorealistic Image Synthesis.”
Ph.D. thesis, Delft University of Technology, 1994.
[604] H. Kolb and L. Dekorver. “Midget Ganglion Cells of the Parafovea of the Human Retina: A
Study by Electron Microscopy and Serial Section Reconstructions.” Journal of Comparative
Neurology 303:4 (1991), 617–636.
[605] H. Kolb and E. V. Famiglietti. “Rod and Cone Pathways in the Inner Plexiform Layer of the
Cat Retina.” Science 186 (1974), 47–49.
[606] H. Kolb, K. A. Linberg, and S. K. Fisher. “Neurons of the Human Retina: A Golgi Study.”
Journal of Comparative Neurology 318:2 (1992), 147–187.
[607] H. Kolb. “The Inner Plexiform Layer in the Retina of the Cat: Electron Microscopic Observa-
tions.” Journal of Neurocytology 8 (1979), 295–329.
[608] H. Kolb. “Anatomical Pathways for Color Vision in the Human Retina.” Visual Neuroscience
7 (1991), 61–74.
[609] H. Kolb. “How the Retina Works.” American Scientist 91 (2003), 28–35.
[610] A. König. “Über den Menschlichen Sehpurpur und seine Bedeutung für das Sehen.” Sitzungs-
berichte der Akademie der Wissenschaften, Berlin, pp. 577–598.
[611] A. König. Gesammelte Abhandlungen zur Physiologischen Optik. Leipzig: Barth, 1903.
[612] N. Kouyama and D. W. Marshak. “Bipolar Cells Specific for Blue Cones in the Macaque
Retina.” Journal of Neuroscience 12:4 (1992), 1233–1252.
[613] T. Koyama. “Optics in Digital Still Cameras.” In Image Sensors and Signal Processing for
Digital Still Cameras, edited by J. Nakamura, pp. 21–51. Boca Raton, FL: Taylor and Francis,
2006.
[614] J. Krauskopf, Q. Zaidi, and M. B. Mandler. “Mechanisms of Simultaneous Color Induction.”
Journal of the Optical Society of America A 3:10 (1986), 1752–1757.
[615] J. Krauskopf. “Light Distribution in Human Retinal Images.” Journal of the Optical Society of
America 52:9 (1962), 1046–1050.
[616] J. Krauskopf. “Effect of Retinal Stabilization on the Appearance of Heterochromatic Targets.”
Journal of the Optical Society of America 53 (1963), 741–744.
[617] Y. A. Kravtsov and Y. I. Orlov. Geometrical Optics of Inhomogeneous Media. Springer Series
on Wave Phenomena, Berlin: Springer-Verlag, 1990.
[618] G. Krawczyk, R. Mantiuk, K. Myszkowski, and H.-P. Seidel. “Lightness Perception Inspired
Tone Mapping.” In APGV ’04: Proceedings of the 1st Symposium on Applied Perception in
Graphics and Visualization, pp. 172–172. New York: ACM Press, 2004.
[619] G. Krawczyk, K. Myszkowski, and H.-P. Seidel. “Lightness Perception in Tone Reproduction
for High Dynamic Range Images.” Computer Graphics Forum 24:3 (2005), 635–645.
[620] G. Krawczyk, K. Myszkowski, and H.-P. Seidel. “Computational Model of Lightness Per-
ception in High Dynamic Range Imaging.” In Proceedings of SPIE 6057 (Human Vision and
Electronic Imaging). Bellingham, WA: SPIE, 2006.
[621] W. Kress and M. Stevens. “Derivation of 3-Dimensional Gamut Descriptors for Graphic Arts
Output Devices.” In Proceedings of the Technical Association of Graphical Arts, pp. 199–214.
Sewickley, PA: Technical Assocation of the Graphics Arts, 1994.
[622] J. von Kries. “Die Gesichtsempfindungen.” In Handbuch der Physiologie des Menschen, edited
by W. Nagel, pp. 109–282. Braunschweig, Germany: Vieweg, 1905.
[623] J. von Kries. “Chromatic Adaptation.” In Sources of Color Vision, edited by D. L. MacAdam.
Cambridge, MA: MIT Press, 1970. Originally published in Festschrift der Albrecht-Ludwig-
Universitat, 1902.
[624] J. Kruger and P. Gouras. “Spectral Selectivity of Cells and Its Dependence on Slit Length in
Monkey Visual Cortex.” Journal of Neurophysiology 43 (1979), 1055–1069.
[625] J. Kuang, H. Yamaguchi, G. M. Johnson, and M. D. Fairchild. “Testing HDR Image Render-
ing Algorithms.” In Proceedings of IS&T/SID 12th Color Imaging Conference, pp. 315–320.
Springfield, VA: Society for Imaging Science and Technology, 2004.
[626] J. Kuang, G. M. Johnson, and M. D. Fairchild. “Image Preference Scaling for HDR Image
Rendering.” In IS&T/SID 13th Color Imaging Conference, pp. 8–13. Springfield, VA: Society
for Imaging Science and Technology, 2005.
[627] J. Kuang, H. Yamaguchi, C. Liu, G. M. Johnson, and M. D. Fairchild. “Evaluating HDR
Rendering Algorithms.” ACM Transactions on Applied Perception 4:2 (2007), 9–1 – 9–27.
[628] P. Kubelka and F. Munk. “Ein Beitrag zur Optik der Farbanstriche.” Zeitschrift für technische
Physik 12 (1931), 593–601.
[629] R. G. Kuehni. Computer Colorant Formulation. Lexington, MA: Lexington Books, 1975.
[630] R. G. Kuehni. Color: An Introduction to Practice and Principles. New York: John Wiley &
Sons, 1997.
[631] R. G. Kuehni. Color Space and Its Divisions: Color Order from Antiquity to the Present. New
York: Wiley-Interscience, 2003.
[632] S. W. Kuffler. “Discharge Patterns and Functional Organization of Mammalian Retina.” Jour-
nal of Neurophysiology 16 (1953), 37–68.
[633] T. Kuno, H. Sugiura, and N. Matoba. “New Interpolation Method using Discriminated Color
Correction for Digital Still Cameras.” IEEE Transactions on Consumer Electronics 45:1 (1999),
259–267.
[634] C. J. Kuo and M. H. Tsai, editors. Three-Dimensional Holographic Imaging. New York: John
Wiley and Sons, Inc., 2002.
[635] E. A. Lachica and V. A. Casagrande. “Direct W-Like Geniculate Projections to the
Cytochrome Oxidase (CO) Blobs in Primate Visual Cortex: Axon Morphology.” Journal of
Comparative Neurology 319:1 (1992), 141–158.
[636] E. A. Lachica, P. D. Beck, and V. A. Casagrande. “Parallel Pathways in Macaque Monkey Stri-
ate Cortex: Anatomically Defined Columns in Layer III.” Proceedings of the National Academy
of Sciences of the United States of America 89:8 (1992), 3566–3570.
[637] E. P. F. Lafortune, S.-C. Foo, K. E. Torrance, and D. P. Greenberg. “Non-Linear Approximation
of Reflectance Functions.” In Proceedings of SIGGRAPH’97, Computer Graphics Proceedings,
Annual Conference Series, pp. 117–126. Reading, MA: Addison-Wesley, 1997.
[638] R. Lakowski. “Colorimetric and Photometric Data for the 10th Edition of the Ishihara Plates.”
The British Journal of Physiological Optics 22:4 (1965), 195–207.
[639] P. Lalonde and A. Fournier. “Filtered Local Shading in the Wavelet Domain.” In Proceedings
of the 8th Eurographics Symposium on Rendering, pp. 163–174. Vienna: Springer-Verlag, 1997.
[663] T. Lee, J. Zaumseil, Z. Bao, J. W. P. Hsu, and J. A. Rogers. “Organic Light-Emitting Diodes
Formed by Soft Contact Lamination.” Proceedings of the National Academy of Sciences of the
United States of America 101:2 (2004), 429–433.
[664] H.-C. Lee. “Method for Computing the Scene-Illuminant Chromaticity from Specular High-
light.” Journal of the Optical Society of America A 3:10 (1986), 1694–1699.
[665] H.-C. Lee. “Internet Color Imaging.” In Proceedings of the SPIE 3080, pp. 122–135. Belling-
ham, WA: SPIE, 2000.
[666] H.-C. Lee. Introduction to Color Imaging Science. Cambridge, UK: Cambridge University
Press, 2005.
[667] A. Lefohn, R. Caruso, E. Reinhard, B. Budge, and P. Shirley. “An Ocularist’s Approach to
Human Iris Synthesis.” IEEE Computer Graphics and Applications 23:6 (2003), 70–75.
[668] J. Lekner and M. C. Dorf. “Why Some Things Are Darker when Wet.” Applied Optics 27:7
(1988), 1278–1280.
[669] P. Lennie, J. Krauskopf, and G. Sclar. “Chromatic Mechanisms in Striate Cortex of Macaque.”
Journal of Neuroscience 10:2 (1990), 649–669.
[670] P. Lennie, J. Pokorny, and V. C. Smith. “Luminance.” Journal of the Optical Society of America
A 10:6 (1993), 1283–1293.
[671] H. Lensch, J. Kautz, M. Goessele, W. Heidrich, and H. Seidel. “Image-Based Reconstruction
of Spatially Varying Materials.” In Proceedings of the 12th Eurographics Workshop on Ren-
dering, edited by S. J. Gortler and K. Myszkowski, pp. 104–115. Aire-la-Ville, Switzerland:
Eurographics Association, 2001.
[672] S. Lerman, J. M. Megaw, and M. N. Moran. “Further Studies on the Effects of UV Radiation
on the Human Lens.” Ophthalmic Research 17:6 (1985), 354–361.
[673] A. G. Leventhal, R. W. Rodieck, and B. Dreher. “Retinal Ganglion Cell Classes in the Old
World Monkey: Morphology and Central Projections.” Science 213 (1981), 1139–1142.
[674] A. G. Leventhal, K. G. Thompson, D. Liu, Y. Zhou, and S. J. Ault. “Concomitant Sensitivity
to Orientation, Direction, and Color of Cells in Layers 2, 3, and 4 of Monkey Striate Cortex.”
Journal of Neuroscience 15:3 (1995), 1808–1818.
[675] H. W. Leverenz. An Introduction to Luminescence of Solids. Mineola, NY: Dover Publications,
1968.
[676] M. Levoy and P. Hanrahan. “Light Field Rendering.” In Proceedings SIGGRAPH ’96, Com-
puter Graphics Proceedings, Annual Conference Series, pp. 31–42. Reading, MA: Addison-
Wesley, 1996.
[677] I. Levy, U. Hasson, G. Avidan, T. Hendler, and R. Malach. “Center-Periphery Organization of
Human Object Areas.” Nature Neuroscience 4:5 (2001), 533–539.
[678] P. A. Lewis. “Colorants: Organic and Inorganic Pigments.” In Color for Science, Art and
Technology, edited by K. Nassau. Amsterdam: Elsevier Science B.V., 1998.
[679] X. Li and A. Gilchrist. “Relative Area and Relative Luminance Combine to Anchor Surface
Lightness Values.” Perception and Psychophysics 61 (1999), 771–785.
[680] M. Li and J.-M. Lavest. “Some Aspects of Zoom Lens Camera Calibration.” IEEE Transactions
on Pattern Analysis and Machine Intelligence 18:11 (1996), 1105–1110.
[681] Y. Li, L. Sharan, and E. H. Adelson. “Compressing and Companding High Dynamic Range
Images with Subband Architectures.” ACM Transactions on Graphics 24:3 (2005), 836–844.
[682] Z. Li-Tolt, R. L. Fink, and Z. Yaniv. “Electron Emission from Patterned Diamond Flat Cath-
odes.” Journal of Vacuum Science and Technology B 16:3 (1998), 1197–1198.
[683] A. Liberles. Introduction to Molecular-Orbital Theory. New York: Holt, Rinehart and Winston,
Inc., 1966.
[684] A. Lien. “A Detailed Derivation of Extended Jones Matrix Representation for Twisted Nematic
Liquid Crystal Displays.” Liquid Crystals 22:2 (1997), 171–175.
[685] B. J. Lindbloom, 2005. Available online (http://www.brucelindbloom.com).
[686] D. Lischinski, Z. Farbman, M. Uyttendaele, and R. Szeliski. “Interactive Local Adjustment of
Tonal Values.” ACM Transactions on Graphics 25:3 (2006), 646–653.
[687] D. Litwiller. “CCD vs. CMOS: Facts and Fiction.” Photonics Spectra 35:1 (2001), 151–154.
[688] D. Litwiller. “CCD vs. CMOS: Maturing Technologies, Maturing Markets.” Photonics Spectra
39:8 (2005), 54–58.
[689] C. Liu and M. D. Fairchild. “Measuring the Relationship between Perceived Image Contrast
and Surround Illumination.” In IS&T/SID 12th Color Imaging Conference, pp. 282–288. Spring-
field, VA: Society for Imaging Science and Technology, 2004.
[690] Y. Liu, J. Shigley, E. Fritsch, and S. Hemphill. “Abnormal Hue-Angle Change of the Gemstone
Tanzanite between CIE Illuminants D65 and A in CIELAB Color Space.” Color Research and
Application 20 (1995), 245–250.
[691] C. Liu, P. R. Jonas, and C. P. R. Saunders. “Pyramidal Ice Crystal Scattering Phase Functions
and Concentric Halos.” Annales Geophysicae 14 (1996), 1192–1197.
[692] M. S. Livingstone and D. H. Hubel. “Anatomy and Physiology of a Color System in the Primate
Visual Cortex.” Journal of Neuroscience 4:1 (1984), 309–356.
[693] M. S. Livingstone and D. H. Hubel. “Connections between Layer 4B of Area 17 and the Thick
Cytochrome Oxidase Stripes of Area 18 in the Squirrel Monkey.” Journal of Neuroscience 7:11
(1987), 3371–3377.
[694] M. S. Livingstone and D. H. Hubel. “Segregation of Form, Color, Movement, and Depth:
Anatomy, Physiology, and Perception.” Science 240:4853 (1988), 740–749.
[695] M. S. Livingstone. Vision and Art: The Biology of Seeing. New York: Harry N Abrams, 2002.
[696] M. M. Loève. Probability Theory. Princeton: Van Nostrand, 1955.
[697] B. London and J. Upton. Photography, Sixth edition. London: Longman, 1998.
[698] A. V. Loughren. “Recommendations of the National Television System Committee for a Color
Television Signal.” Journal of the SMPTE 60 (1953), 321–326, 596.
[699] O. Lowenstein and I. E. Loewenfeld. “The Pupil.” In The Eye, Volume 3, edited by H. Davson,
pp. 255–337. New York: Academic Press, 1969.
[700] J. Lu and D. M. Healey. “Contrast Enhancement via Multiscale Gradient Transformation.”
In Proceedings of the 16th IEEE International Conference on Image Processing, Volume II,
pp. 482–486. Washington, DC: IEEE, 1992.
[701] R. Lu, J. J. Koenderink, and A. M. L. Kappers. “Optical Properties (Bidirectional Reflectance
Distribution Functions) of Velvet.” Applied Optics 37:25 (1998), 5974–5984.
[702] R. Lu, J. J. Koenderink, and A. M. L. Kappers. “Specularities on Surfaces with Tangential
Hairs or Grooves.” Computer Vision and Image Understanding 78 (1999), 320–335.
[703] B. D. Lucas and T. Kanade. “An Iterative Image Registration Technique with an Application to
Stereo Vision.” In Proceedings of the 1981 DARPA Image Understanding Workshop, pp. 121–
130, 1981.
[704] M. P. Lucassen and J. Walraven. “Evaluation of a Simple Method for Color Monitor Recali-
bration.” Color Research and Application 15:6 (1990), 321–326.
[725] M. Mahy. “Calculation of Color Gamuts Based on the Neugebauer Model.” Color Research
and Application 22 (1997), 365–374.
[726] R. Malach, R. B. Tootell, and D. Malonek. “Relationship between Orientation Domains,
Cytochrome Oxidase Stripes, and Intrinsic Horizontal Connections in Squirrel Monkey Area V2.”
Cerebral Cortex 4:2 (1994), 151–165.
[727] R. Malladi and J. A. Sethian. “Image Processing via Level Set Curvature Flow.” Proceedings
of the National Academy of Science 92:15 (1995), 7046–7050.
[728] L. T. Maloney and B. A. Wandell. “Color Constancy: A Method for Recovering Surface
Spectral Reflectance.” Journal of the Optical Society of America A 3:1 (1986), 29–33.
[729] L. T. Maloney. “Evaluation of Linear Models of Surface Spectral Reflectance with Small
Number of Parameters.” Journal of the Optical Society of America A 3 (1986), 1673–1683.
[730] É. L. Malus. “Sur la mesure du pouvoir réfringent des corps opaques.” Journal de l’École
Polytechnique 15:8 (1809), 219–228.
[731] S. Mann and R. W. Picard. “On Being ‘Undigital’ with Digital Cameras: Extending Dy-
namic Range by Combining Differently Exposed Pictures.” In IS&T’s 48th Annual Conference,
pp. 422–428. Springfield, VA: Society for Imaging Science and Technology, 1995.
[732] R. J. W. Mansfield. “Primate Photopigments and Cone Mechanisms.” In The Visual System,
edited by A. Fein and J. S. Levine, pp. 89–106. New York: Alan R Liss, 1985.
[733] R. Mantiuk, A. Efremov, K. Myszkowski, and H.-P. Seidel. “Backward Compatible High
Dynamic Range MPEG Video Compression.” ACM Transactions on Graphics 25:3 (2006),
713–723.
[734] E. W. Marchand. “Gradient Index Lenses.” In Progress in Optics XI, edited by E. Wolf,
pp. 305–337. Amsterdam: North Holland, 1973.
[735] R. F. Marcinik and K. M. Revak. “Effects of New Cool White Fluorescent Lamps on Viewing
and Measuring Color.” Textile Chemist and Colorist 30:1 (1998), 14–16.
[736] D. Marcuse. Light Transmission Optics, Second edition. New York: Van Nostrand Reinhold
Company, 1982.
[737] D. Marini and A. Rizzi. “A Computational Approach to Color Adaptation Effects.” Image and
Vision Computing 18:13 (2000), 1005–1014.
[738] D. Marini, A. Rizzi, and C. Carati. “Color Constancy Effects Measurement of the Retinex The-
ory.” In Proceedings of the IS&T/SPIE Conference on Electronic Imaging, pp. 23–29. Spring-
field, VA: Society for Imaging Science and Technology, 1999.
[739] K. Z. Markov. “Elementary Micromechanisms of Heterogeneous Media.” In Heterogeneous
Media: Modelling and Simulation, edited by K. Z. Markov and L. Preziosi, pp. 1–162. Boston:
Birkhauser, 1999.
[740] T. J. Marks, J. G. C. Veinot, J. Cui, H. Yan, A. Wang, N. L. Edlerman, J. Ni, Q. Huang, P. Lee,
and N. R. Armstrong. “Progress in High Work Function TCO OLED Anode Alternatives and
OLED Nanopixelation.” Synthetic Metals 127 (2002), 29–35.
[741] G. H. Markstein. Nonsteady Flame Propagation. Oxford, UK: Pergamon, 1964.
[742] D. Marr. Vision, A Computational Investigation into the Human Representation and Processing
of Visual Information. San Fransisco: W. H. Freeman and Company, 1982.
[743] J. S. Marsh and D. A. Plant. “Instrumental Colour Match Prediction using Organic Pigments.”
Journal of the Oil and Colour Chemists Association 47 (1964), 554–575.
[744] S. R. Marschner, S. H. Westin, E. P. F. Lafortune, K. E. Torrance, and D. P. Greenberg. “Image-
Based BRDF Measurement Including Human Skin.” In Proceedings of the 10th Eurographics
Workshop on Rendering, pp. 139–152. Aire-la-Ville, Switzerland: Eurographics Association,
1999.
[765] R. McDonald. “Acceptability and Perceptibility Decisions using the CMC Colour Difference
Formula.” Textile Chemist and Colorist 20:6 (1988), 31–31.
[766] D. McIntyre. Colour Blindness: Causes and Effects. Austin, TX: Dalton Publishing, 2002.
[767] K. McLaren. The Colour Science of Dyes and Pigments, Second edition. Bristol, UK: Adam
Hilger Ltd., 1986.
[768] D. McQuarrie. Quantum Chemistry. Mill Valley, CA: University Science Books, 1983.
[769] H. McTavish. “A Demonstration of Photosynthetic State Transitions in Nature.” Photosynthesis
Research 17 (1988), 247–247.
[770] C. A. Mead. “Operation of Tunnel-Emission Devices.” Journal of Applied Physics 32:4 (1961),
646–652.
[771] J. A. Medeiros. Cone Shape and Color Vision: Unification of Structure and Perception.
Blountsville, AL: Fifth Estate Publishers, 2006.
[772] J. Mellerio. “Yellowing of the Human Lens: Nuclear and Cortical Contributions.” Vision
Research 27:9 (1987), 1581–1587.
[773] C. E. Metz. “Basic Principles of ROC Analysis.” Seminars in Nuclear Medicine 8 (1978),
283–298.
[774] L. Meylan and S. Süsstrunk. “Color Image Enhancement using a Retinex-Based Adaptive Fil-
ter.” In Proceedings of the 2nd IS&T European Conference on Color in Graphics, Image, and
Vision (CGIV 2004), pp. 359–363. Springfield, VA: Society for Imaging Science and Technol-
ogy, 2004.
[775] L. Meylan and S. Süsstrunk. “High Dynamic Range Image Rendering with a Retinex-Based
Adaptive Filter.” IEEE Transactions on Image Processing 15:9 (2006), 2820–2830.
[776] L. Meylan, S. Daly, and S. Süsstrunk. “The Reproduction of Specular Highlights on High
Dynamic Range Displays.” In Proceedings of the 14th IS&T/SID Color Imaging Conference.
Springfield, VA: Society for Imaging Science and Technology, 2006.
[777] L. Meylan, D. Alleysson, and S. Süsstrunk. “A Model of Retinal Local Adaptation for the Tone
Mapping of CFA Images.” Journal of the Optical Society of America A.
[778] L. Meylan, S. Daly, and S. Süsstrunk. “Tone Mapping for High Dynamic Range Displays.” In
Proceedings of SPIE 6492 (Human Vision and Electronic Imaging XII). Bellingham, WA: SPIE,
2007.
[779] G. Mie. “Beiträge zur Optik trüber Medien, speziell kolloidaler Metallösungen.” Annalen der
Physik 25:4 (1908), 377–445.
[780] N. J. Miller, P. Y. Ngai, and D. D. Miller. “The Application of Computer Graphics in Lighting
Design.” Journal of the IES 14 (1984), 6–26.
[781] J. Millman. “Electronic Energy Bands in Metallic Lithium.” Physical Review 47:4 (1935),
286–290.
[782] W. I. Milne and K. B. K. Teo. “Growth and Characterization of Carbon Nanotubes for Field
Emission Display Applications.” Journal of the Society for Information Display 12 (2004),
289–292.
[783] M. G. J. Minnaert. Light and Color in the Outdoors. New York: Springer-Verlag, 1993.
[784] T. Mitsunaga and S. K. Nayar. “Radiometric Self Calibration.” In International Conference on
Computer Vision and Pattern Recognition, pp. 374–380. Washington, DC: IEEE Press, 1999.
[785] J. D. Mollon and J. K. Bowmaker. “The Spatial Arrangement of Cones in the Primate Retina.”
Nature 360:6405 (1992), 677–679.
[786] J. D. Mollon and G. Estévez. “Tyndall’s Paradox of Hue Discrimination.” Journal of the
Optical Society of America A 5 (1988), 151–159.
[787] J. D. Mollon. “‘Tho’ She Kneel’d in that Place where They Grew...’, The Uses and Origins of
Primate Colour Vision.” Journal of Experimental Biology 146 (1989), 21–38.
[788] F. M. De Monasterio and P. Gouras. “Functional Properties of Ganglion Cells of the Rhesus
Monkey Retina.” Journal of Physiology (London) 251 (1975), 167–195.
[789] P. Moon and D. E. Spencer. “On the Stiles-Crawford effect.” Journal of the Optical Society of
America 34:6 (1944), 319–329.
[790] G. Mori and J. Malik. “Recognizing Objects in Adversarial Clutter: Breaking a Visual
Captcha.” In Proceedings of CVPR, pp. 134–141. Los Alamitos, CA: IEEE Computer Soci-
ety, 2003.
[791] N. Moroney and I. Tastl. “A Comparison of Retinex and iCAM for Scene Rendering.” Journal
of Electronic Imaging 13:1 (2004), 139–145.
[792] N. Moroney. “Chroma Scaling and Crispening.” In IS&T/SID 9th Color Imaging Conference,
pp. 97–101. Springfield, VA: Society for Imaging Science and Technology, 2001.
[793] N. Moroney. “A Hypothesis Regarding the Poor Blue Constancy of CIELAB.” Color Research
and Application 28 (2003), 371–378.
[794] J. Morovic and M. R. Luo. “Gamut Mapping Algorithms Based on Psychophysical Experi-
ment.” In Proceedings of the 5th IS&T/SID Color Imaging Conference, pp. 44–49. Springfield,
VA: Society for Imaging Science and Technology, 1997.
[795] J. Morovic and M. R. Luo. “The Fundamentals of Gamut Mapping: A Survey.” Journal of
Imaging Science and Technology 45:3 (2001), 283–290.
[796] J. Morovic and P. L. Sun. “The Influence of Image Gamuts on Cross-Media Colour Image
Reproduction.” In Proceedings of the 8th IS&T/SID Color Imaging Conference, pp. 324–329.
Springfield, VA: Society for Imaging Science and Technology, 2000.
[797] J. Morovic. “To Develop a Universal Gamut Mapping Algorithm.” Ph.D. thesis, University of
Derby, Derby, UK, 1998.
[798] J. Morovic. “Gamut Mapping.” In Digital Color Imaging Handbook, edited by G. Sharma,
Chapter 10, pp. 639–685. Boca Raton, FL: CRC Press, 2003.
[799] B. Moulden and F. A. A. Kingdom. “White’s Effect: A Dual Mechanism.” Vision Research 29
(1989), 1245–1236.
[800] K. Moutoussis and S. Zeki. “Responses of Spectrally Selective Cells in Macaque Area V2 to
Wavelengths and Colors.” Journal of Neurophysiology 87:4 (2002), 2104–2112.
[801] K. T. Mullen and F. A. A. Kingdom. “Losses in Peripheral Colour Sensitivity Predicted from
“Hit and Miss” Post-Receptoral Cone Connections.” Vision Research 36:13 (1996), 1995–2000.
[802] C. D. Müller, A. Falcou, N. Reckefuss, M. Rojahn, V. Wiederhirn, P. Rudati, H. Frohne,
O. Nuyken, H. Becker, and K. Meerholtz. “Multi-Colour Organic Light Emitting Displays by
Solution Processing.” Nature 421 (2003), 829–833.
[803] H. Müller. “Über die entoptische Wahrnehmung der Netzhautgefässe, insbesondere als Be-
weismittel für die Lichtperzeption durch die nach hinten gelegenen Netzhautelemente.” Ver-
handlungen der Physikalisch Medizinische Gesellschaft in Würzburg 5 (1854), 411–411.
[804] J. C. Mullikin, L. J. van Vliet, H. Netten, F. R. Boddeke, G. van der Feltz, and I. T. Young.
“Methods for CCD Camera Characterization.” In Image Acquisition and Scientific Imaging
Systems, SPIE vol. 2173, edited by H. C. Titus and A. Waks, pp. 73–84, 1994.
[805] A. H. Munsell. A Color Notation, First edition. Baltimore, MD: Munsell Color Company,
1905.
[806] A. H. Munsell. Atlas of the Munsell Color System. Malden, MA: Wadsworth-Howland and
Company, 1915.
[807] A. H. Munsell. Munsell Book for Color. Baltimore, MD: Munsell Color Company, 1929.
[808] A. H. Munsell. A Color Notation: An Illustrated System Defining All Colors and Their Rela-
tions by Measured Scales of Hue, Value, and Chroma, Ninth edition. Baltimore, MD: Munsell
Color Company, 1941.
[809] Y. Muramatsu, S. Kurosawa, M. Furumiya, H. Ohkubo, and Y. Nakashiba. “A Signal-
Processing CMOS Image Sensor using a Simple Analog Operation.” In IEEE International
Solid-State Circuits Conference (ISSCC), pp. 98–99, 2001.
[810] S. Musa. “Active-Matrix Liquid-Crystal Displays.” Scientific American November.
[811] K. I. Naka and W. A. H. Rushton. “S-Potentials from Luminosity Units in the Retina of Fish
(Cyprinidae).” Journal of Physiology (London) 185:3 (1966), 587–599.
[812] J. Nakamura. “Basics of Image Sensors.” In Image Sensors and Signal Processing for Digital
Still Cameras, edited by J. Nakamura, pp. 53–93. Boca Raton, FL: Taylor and Francis, 2006.
[813] J. Nakamura, editor. Image Sensors and Signal Processing for Digital Still Cameras. Boca
Raton, FL: Taylor and Francis, 2006.
[814] K. Nassau. “Fundamentals of Color Science.” In Color for Science, Art and Technology, edited
by K. Nassau. Amsterdam: Elsevier Science B.V., 1998.
[815] K. Nassau. The Physics and Chemistry of Color: The Fifteen Causes of Color. New York:
John Wiley and Sons, 2001.
[816] S. K. Nayar and V. Branzoi. “Adaptive Dynamic Range Imaging: Optical Control of Pixel
Exposures over Space and Time.” In International Conference on Computer Vision, Volume 2,
pp. 1168–1175. Washington, DC: IEEE Press, 2003.
[817] S. K. Nayar and T. Mitsunaga. “High Dynamic Range Imaging: Spatially Varying Pixel Ex-
posures.” In International Conference on Computer Vision and Pattern Recognition, Volume 1,
pp. 472–479. Washington, DC: IEEE Press, 2000.
[818] S. K. Nayar, G. Krishnan, M. D. Grossberg, and R. Raskar. “Fast Separation of Direct and
Global Components of a Scene using High Frequency Illumination.” ACM Transactions on
Graphics 25:3 (2006), 935–944.
[819] Y. Nayatani, K. Takahama, and H. Sobagaki. “Formulation of a Nonlinear Model of Chromatic
Adaptation.” Color Research and Application 6 (1981), 161–171.
[820] Y. Nayatani, K. Takahama, H. Sobagaki, and J. Hirono. “On Exponents of a Nonlinear Model
of Chromatic Adaptation.” Color Research and Application 7 (1982), 34–45.
[821] Y. Nayatani, T. Mori, K. Hashimoto, and H. Sobagaki. “Comparison of Color Appearance
Models.” Color Research and Application 15 (1990), 272–284.
[822] Y. Nayatani. “Proposal of an Opponent Colors System Based on Color Appearance and Color
Vision Studies.” Color Research and Application 29 (2004), 135–150.
[823] M. Neitz, J. Neitz, and G. H. Jacobs. “Spectral Tuning of Pigments Underlying Red-Green
Color Vision.” Science 252 (1991), 971–974.
[824] J. Neitz, J. Carroll, Y. Yamauchi, M. Neitz, and D. R. Williams. “Color Perception Is Mediated
by a Plastic Neural Mechanism that Is Adjustable in Adults.” Neuron 35:4 (2002), 783–792.
[825] R. Nelson and H. Kolb. “A17: A Broad-Field Amacrine Cell of the Rod System in the Retina
of the Cat.” Journal of Neurophysiology 54 (1985), 592–614.
[826] R. Nelson. “Visual Responses of Ganglion Cells,” 2006. Webvision article. Available online
(http://webvision.med.utah.edu).
[827] F. L. van Nes and M. A. Bouman. “Spatial Modulation Transfer in the Human Eye.” Journal
of the Optical Society of America 57 (1967), 401–406.
[848] Y. Ohno. “Spectral Design Considerations for White LED Color Rendering.” Optical Engi-
neering 44:11 (2005), 111302–1 – 111302–9.
[849] N. Ohta and A. R. Robertson. Colorimetry: Fundamentals and Applications. Chichester, UK:
John Wiley and Sons, 2005.
[850] R. Oi and K. Aizawa. “Wide Dynamic Range Imaging by Sensitivity Adjustable CMOS Image
Sensor.” In Proceedings of the IEEE International Conference on Image Processing (ICIP),
pp. 583–586, 2003.
[851] H. W. Ok, S. D. Lee, W. H. Choe, D. S. Park, and C. Y. Kim. “Color Processing for
Multi-Primary Display Devices.” In IEEE International Conference on Image Processing 2005,
pp. 980–983. Washington, DC: IEEE, 2005.
[852] K. Okinaka, A. Saitoh, T. Kawai, A. Senoo, and K. Ueno. “High Performable Green-Emitting
OLEDs.” SID Symposium Digest of Technical Papers 35:1 (2004), 686–689.
[853] B. A. Olshausen and D. J. Field. “Emergence of Simple-Cell Receptive Field Properties by
Learning a Sparse Code for Natural Images.” Nature 381 (1996), 607–609.
[854] B. A. Olshausen and D. J. Field. “How Close Are We to Understanding V1?” Neural Compu-
tation 17:8 (2005), 1665–1699.
[855] W. O’Mara. Liquid Crystal Flat Panel Displays: Manufacturing Science and Technology. New
York: Van Nostrand Reinhold, 1993.
[856] A. V. Oppenheim, R. Schafer, and T. Stockham. “Nonlinear Filtering of Multiplied and Con-
volved Signals.” Proceedings of the IEEE 56:8 (1968), 1264–1291.
[857] A. V. Oppenheim and R. Schafer. Digital Signal Processing. Englewood Cliffs, NJ: Prentice
Hall, 1975.
[858] A. V. Oppenheim, R. Schafer, and J. R. Buck. Discrete-Time Signal Processing, Second
edition. Englewood Cliffs, NJ: Prentice Hall, 1999.
[859] M. Oren and S. K. Nayar. “Generalization of Lambert’s Reflectance Model.” In Proceed-
ings of SIGGRAPH ’94, Computer Graphics Proceedings, Annual Conference Series, edited by
A. Glassner, pp. 239–246. New York: ACM Press, 1994.
[860] OSA. The Science of Color. Washington: Optical Society of America, 1963.
[861] D. Osorio, D. L. Ruderman, and T. W. Cronin. “Estimation of Errors in Luminance Signals
Encoded by Primate Retina Resulting from Sampling of Natural Images with Red and Green
Cones.” Journal of the Optical Society of America A 15:1 (1998), 16–22.
[862] G. Østerberg. “Topography of the Layer of Rods and Cones in the Human Retina.” Acta
Ophthalmologica, Supplement 6 (1935), 1–103.
[863] V. Ostromoukhov, C. Donohue, and P. Jodoin. “Fast Hierarchical Importance Sampling with
Blue Noise Properties.” ACM Transactions on Graphics 23:3 (2004), 488–495.
[864] V. Ostromoukhov. “Chromaticity Gamut Enhancement by Heptatone Multi-Color Printing.” In
Proceedings IS&T/SPIE 1909, 1993 International Symposium on Electronic Imaging: Science
& Technology, pp. 139–151. Bellingham, WA: SPIE, 1993.
[865] C. A. Padgham and J. E. Saunders. The Perception of Light and Color. New York: Academic
Press, 1975.
[866] R. H. Page, K. I. Schaffers, P. A. Waide, J. B. Tassano, S. A. Payne, W. F. Krupke, and W. K.
Bischel. “Upconversion-Pumped Luminescence Efficiency of Rare-Earth-Doped Hosts Sen-
sitized with Trivalent Ytterbium.” Journal of the Optical Society of America B 15:3 (1998),
996–1008.
[867] G. Palmer. Theory of Colors and Vision. London: Leacroft, 1777. Reprinted in David L.
MacAdam (ed.), Selected Papers on Colorimetry — Fundamentals, SPIE Optical Engineering
Press, 1993.
[868] S. Palmer. Vision Science: Photons to Phenomenology. Cambridge MA: MIT Press, 1999.
[869] M. A. Paradiso and K. Nakayama. “Brightness Perception and Filling-In.” Vision Research 31
(1991), 1221–1236.
[870] D. A. Pardo, G. E. Jabbour, and N. Peyghambarian. “Application of Screen Printing in the
Fabrication of Organic Light Emitting Diodes.” Advanced Materials 12:17 (2000), 1249–1252.
[871] S. Paris and F. Durand. “A Fast Approximation of the Bilateral Filter using a Signal Process-
ing Approach.” In Computer Vision – ECCV 2006, Lecture Notes in Computer Science. Berlin:
Springer-Verlag, 2006.
[872] S. H. Park and E. D. Montag. “Evaluating Tone Mapping Algorithms for Rendering Non-
Pictorial (Scientific) High-Dynamic-Range Images.” Journal of Visual Communication and Im-
age Representation 18:5 (2007), 415–428.
[873] S. C. Park and R. R. Shannon. “Zoom Lens Design using Lens Modules.” Optical Engineering
35:6 (1996), 1668–1676.
[874] R. H. Park and E. I. Stearns. “Spectrophotometric Formulation.” Journal of the Optical Society
of America 34:2 (1944), 112–113.
[875] S. Parker, W. Martin, P.-P. Sloan, P. Shirley, B. Smits, and C. Hansen. “Interactive Ray Tracing.”
In Symposium on Interactive 3D Computer Graphics, pp. 119–126. New York: ACM Press,
1999.
[876] S. Parker, M. Parker, Y. Livnat, P.-P. Sloan, C. Hansen, and P. Shirley. “Interactive Ray Tracing
for Volume Visualization.” IEEE Transactions on Visualization and Computer Graphics 5:3
(1999), 238–250.
[877] A. C. Parr. “A National Measurement System for Radiometry, Photometry, and Pyrometry
based upon Absolute Detectors, (Version 1.0) [Online].” Originally published as: NIST Techni-
cal Note 1421, September 1996. Available online (http://physics.nist.gov/Pubs/TN1421/contents.
html).
[878] C. A. Párraga, G. Brelstaff, T. Troscianko, and I. R. Moorhead. “Color and Luminance Infor-
mation in Natural Scenes.” Journal of the Optical Society of America A 15:3 (1998), 563–569.
[879] C. A. Párraga, T. Troscianko, and D. J. Tolhurst. “The Human Visual System Is Optimised
for Processing the Spatial Information in Natural Visual Images.” Current Biology 10:1 (2000),
35–38.
[880] K. A. Parulski. “Color Filter Arrays and Processing Alternatives for One-Chip Cameras.” IEEE
Transactions on Electron Devices 32:8 (1985), 1381–1389.
[881] E. Parzen. “On Estimation of a Probability Density and Mode.” Annals of Mathematical
Statistics 33 (1962), 1065–1076.
[882] D. Pascale. “A Review of RGB Color Spaces.” Technical report, The BabelColor Company,
2003. Available online (http://www.babelcolor.com).
[883] S. N. Pattanaik and H. Yee. “Adaptive Gain Control for High Dynamic Range Image Display.”
In Proceedings of Spring Conference in Computer Graphics (SCCG2002), pp. 24–27. New York:
ACM Press, 2002.
[884] S. N. Pattanaik, J. A. Ferwerda, M. D. Fairchild, and D. P. Greenberg. “A Multiscale Model
of Adaptation and Spatial Vision for Realistic Image Display.” In Proceedings SIGGRAPH
’98, Computer Graphics Proceedings, Annual Conference Series, pp. 287–298. Reading, MA:
Addison-Wesley, 1998.
[885] S. N. Pattanaik, J. Tumblin, H. Yee, and D. P. Greenberg. “Time-Dependent Visual Adaptation
for Fast Realistic Display.” In Proceedings SIGGRAPH 2000, Computer Graphics Proceedings,
Annual Conference Series, pp. 47–54. Reading, MA: Addison-Wesley, 2000.
[886] W. Pauli. “Über den Zusammenhang des Abschlusses von Elektronengruppen im Atom mit der
Komplexstructure der Spektren.” Zeitschrift für Physik 31 (1925), 765–783.
[887] A. L. Pearlman, J. Birch, and J. C. Meadows. “Cerebral Color Blindness: An Acquired Defect
in Hue Discrimination.” Annals of Neurology 5:3 (1979), 253–261.
[888] E. Peli. “Contrast in Complex Images.” Journal of the Optical Society of America A 7:10 (1990),
2032–2040.
[889] D. J. Perkel, J. Bullier, and H. Kennedy. “Topography of the Afferent Connectivity of Area 17
in the Macaque Monkey: A Double Labelling Study.” Journal of Comparative Neurology 253:3
(1986), 374–402.
[890] P. Perona and J. Malik. “Scale-Space and Edge Detection using Anisotropic Diffusion.” IEEE
Transactions on Pattern Analysis and Machine Intelligence 12:7 (1990), 629–639.
[891] V. H. Perry, R. Oehler, and A. Cowey. “Retinal Ganglion Cells that Project to the Dorsal Lateral
Geniculate Nucleus in the Macaque Monkey.” Neuroscience 12:4 (1984), 1101–1123.
[892] B. B. Petersen and D. M. Dacey. “Morphology of Retinal Ganglion Cells with Intra-Retinal
Axon Collaterals.” Visual Neuroscience 15:2 (1998), 377–387.
[893] M. Petrou and P. Bosdogianni. Image Processing: The Fundamentals. Chichester, UK: John
Wiley and Sons, Ltd., 1999.
[894] M. Pfeiffer, S. R. Forrest, X. Zhou, and K. Leo. “A Low Drive Voltage, Transparent, Metal-Free
n-i-p Electrophosphorescent Light Emitting Diode.” Organic Electronics 4 (2003), 21–26.
[895] M. Pharr and G. Humphreys. Physically Based Rendering: From Theory to Implementation.
San Francisco, CA: Morgan Kaufmann Publishers, 2004.
[896] M. L. F. Phillips, M. P. Hehlen, K. Nguyen, J. M. Sheldon, and N. J. Cockroft. “Upconversion
Phosphors: Recent Advances and New Applications.” In Physics and Chemistry of Lumines-
cent Materials. Proceedings of the Eighth International Symposium (Electrochemical Society
Proceedings Vol 99-40), pp. 123–129. Pennington, NJ: The Electrochemical Society, 2000.
[897] B. T. Phong. “Illumination for Computer Generated Pictures.” Communications of the ACM
18:6 (1975), 311–317.
[898] B. Pinna, G. Brelstaff, and L. Spillmann. “Surface Color from Boundaries: A New ‘Watercolor’
Illusion.” Vision Research 41 (2001), 2669–2676.
[899] M. H. Pirenne. “Dark-Adaptation and Night Vision.” In The Eye, Volume 2, edited by
H. Davson, pp. 93–122. New York: Academic Press, 1962.
[900] M. H. Pirenne. “Liminal Brightness Increments.” In The Eye, Volume 2, edited by H. Davson,
pp. 159–174. New York: Academic Press, 1962.
[901] M. H. Pirenne. Vision and the Eye, Second edition. London: Associated Book Publishers,
1967.
[902] F. H. G. Pitt and E. W. H. Selwyn. “Colour of Outdoor Photographic Objects.” The Photo-
graphic Journal 78 (1938), 115–121.
[903] F. H. G. Pitt. “Characteristics of Dichromatic Vision with an Appendix on Anomalous Trichro-
matic Vision.” In Great Britain Medical Report Series, 200. London: His Majesty’s Stationery
Office, 1935.
[904] D. G. Pitts. “The Ocular Effects of Ultraviolet Radiation.” American Journal of Optometry and
Physiological Optics 55 (1978), 19–35.
[905] G. N. Plass, T. J. Humphreys, and G. W. Kattawar. “Color of the Ocean.” Applied Optics 17:9
(1978), 1432–1446.
[906] W. T. Plummer. “Photographic Shutters: Better Pictures with a Reconsideration of Shutter
Efficiency.” Applied Optics 16:7 (1977), 1914–1917.
1006 Bibliography
[907] I. Pobboravsky. “Methods of Computing Ink Amounts to Produce a Scale of Neutrals for
Photomechanical Reproduction.” In Proceedings of the Technical Association of the Graphic
Arts, pp. 10–33. Sewickley, PA: Technical Association of the Graphic Arts, 1966.
[908] A. B. Poirson and B. A. Wandell. “The Appearance of Colored Patterns: Pattern-Color Sepa-
rability.” Journal of the Optical Society of America A 10:12 (1993), 2458–2471.
[909] A. Poirson and B. Wandell. “Pattern-Color Separable Pathways Predict Sensitivity to Simple
Colored Patterns.” Vision Research 35:2 (1996), 239–254.
[910] J. Pokorny, V. C. Smith, G. Verriest, and A. J. L. G. Pinkers. Congenital and Acquired Color
Vision Defects. New York: Grune and Stratton, 1979.
[911] J. Pokorny, V. C. Smith, and M. Wesner. “Variability in Cone Populations and Implications.”
In From Pigments to Perception, edited by A. Valberg and B. B. Lee, pp. 23–34. New York:
Plenum, 1991.
[912] F. Pollak. “Masking for Halftone.” Journal of Photographic Science 3 (1955), 180–188.
[913] F. Pollak. “The Relationship between the Densities and Dot Size of Halftone Multicolor Im-
ages.” Journal of Photographic Science 3 (1955), 112–116.
[914] F. Pollak. “New Thoughts on Halftone Color Masking.” Penrose Annual 50 (1956), 106–110.
[915] S. L. Polyak. The Vertebrate Visual System. Chicago, IL: University of Chicago Press, 1957.
[916] M. Ponzo. “Rapports entre quelques illusions visuelles de contraste angulaire et l’appréciation
de grandeur des astres à l’horizon.” Archives Italiennes de Biologie 58 (1913), 327–329.
[917] R. M. Pope and E. S. Fry. “Absorption Spectrum (380–700 nm) of Pure Water. II. Integrating
Cavity Measurements.” Applied Optics 36:33 (1997), 8710–8723.
[918] D. Pope. “The Elusive Blue Laser.” The Industrial Physicist 3:3 (1997), 16.
[919] J. Portilla and E. P. Simoncelli. “A Parametric Texture Model based on Joint Statistics of
Complex Wavelet Coefficients.” International Journal of Computer Vision 40:1 (2000), 49–71.
Available online (http://www.cns.nyu.edu/~eero/publications.html).
[920] D. L. Post and C. S. Calhoun. “An Evaluation of Methods for Producing Desired Colors on
CRT Monitors.” Color Research and Application 14:4 (1989), 172–186.
[921] P. Poulin and A. Fournier. “A Model for Anisotropic Reflection.” Computer Graphics (Pro-
ceedings of SIGGRAPH ’90) 24:4 (1990), 273–282.
[922] C. Poynton. “Color FAQ.” 2002. Available online (http://www.poynton.com/ColorFAQ.html).
[923] C. Poynton. Digital Video and HDTV: Algorithms and Interfaces. San Francisco: Morgan
Kaufmann Publishers, 2003.
[924] O. Prache. “Active Matrix Molecular OLED Microdisplays.” Displays 22 (2001), 49–56.
[925] A. J. Preetham, P. Shirley, and B. Smits. “A Practical Analytic Model for Daylight.” In
Proceedings of SIGGRAPH ’99, Computer Graphics Proceedings, Annual Conference Series,
pp. 91–100. Reading, MA: Addison-Wesley Longman, 1999.
[926] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C:
The Art of Scientific Computing, Second edition. Cambridge, UK: Cambridge University Press,
1992.
[927] I. G. Priest. “A Proposed Scale for Use in Specifying the Chromaticity of Incandescent Illu-
minants and Various Phases of Daylight.” Journal of the Optical Society of America 23 (1933),
141–141.
[928] E. N. Pugh Jr. and T. D. Lamb. “Phototransduction in Vertebrate Rods and Cones: Molecular
Mechanisms of Amplification, Recovery and Light Adaptation.” In Handbook of Biological
Physics, Volume 3, edited by D. G. Stavenga et al., pp. 183–254. Amsterdam: Elsevier, 2000.
[949] E. Reinhard, M. Ashikhmin, B. Gooch, and P. Shirley. “Color Transfer between Images.” IEEE
Computer Graphics and Applications 21 (2001), 34–41.
[950] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda. “Photographic Tone Reproduction for
Digital Images.” ACM Transactions on Graphics 21:3 (2002), 267–276.
[951] E. Reinhard, A. O. Akyüz, M. Colbert, C. E. Hughes, and M. O’Connor. “Real-Time Color
Blending of Rendered and Captured Video.” In Proceedings of the Interservice/Industry Train-
ing, Simulation and Education Conference. Arlington, VA: National Defense Industrial Associ-
ation, 2004.
[952] E. Reinhard, G. Ward, S. Pattanaik, and P. Debevec. High Dynamic Range Imaging: Acquisi-
tion, Display and Image-Based Lighting. San Francisco: Morgan Kaufmann Publishers, 2005.
[953] E. Reinhard, P. Debevec, G. Ward, K. Myszkowski, H. Seetzen, D. Hess, G. McTaggart, and
H. Zargarpour. “High Dynamic Range Imaging: Theory and Practice.” SIGGRAPH 2006
Course, 2006.
[954] E. Reinhard. “Parameter Estimation for Photographic Tone Reproduction.” journal of graphics
tools 7:1 (2002), 45–51.
[955] A. G. Rempel, M. Trentacoste, H. Seetzen, H. D. Young, W. Heidrich, L. Whitehead, and
G. Ward. “Ldr2Hdr: On-the-fly Reverse Tone Mapping of Legacy Video and Photographs.”
ACM Transactions on Graphics 26:3 (2007), 39–1 – 39–6.
[956] E. Renner. Pinhole Photography: Rediscovering a Historic Technique, Third edition. Focal
Press, 2004.
[957] M. Rezak and L. A. Benevento. “A Comparison of the Organization of the Projections of
the Dorsal Lateral Geniculate Nucleus, the Inferior Pulvinar and Adjacent Lateral Pulvinar to
Primary Visual Cortex (Area 17) in the Macaque Monkey.” Brain Research 167 (1979), 19–40.
[958] P. Rheingans. “Color, Change, and Control for Quantitative Data Display.” In Proceedings of
IEEE Visualization, pp. 252–258. Los Alamitos, CA: IEEE Press, 1999.
[959] P. Rheingans. “Task-Based Color Scale Design.” In SPIE Proceedings (Applied Image and
Pattern Recognition), pp. 33–43. Bellingham, WA: SPIE, 1999.
[960] D. C. Rich. “Light Sources and Illuminants: How to Standardize Retail Lighting.” Textile
Chemist and Colorist 30:1 (1998), 8–12.
[961] D. W. Rickman and N. C. Brecha. “Morphologies of Somatostatin-Immunoreactive Neurons
in the Rabbit Retina.” In Neurobiology of the Inner Retina, H31, edited by R. Weiler and N. N.
Osborne, pp. 461–468. Berlin: Springer-Verlag, 1989.
[962] L. A. Riggs. “Visual Acuity.” In Vision and Visual Perception, edited by C. H. Graham. New
York: John Wiley and Sons, Inc., 1965.
[963] K. Riley, D. S. Ebert, M. Kraus, J. Tessendorf, and C. Hansen. “Efficient Rendering of Atmo-
spheric Phenomena.” In Eurographics Symposium on Rendering, edited by H. W. Jensen and
A. Keller, pp. 375–386. Aire-la-Ville, Switzerland: Eurographics Association, 2004.
[964] D. L. Ringach, R. M. Shapley, and M. J. Hawken. “Orientation Selectivity in Macaque V1:
Diversity and Laminar Dependence.” Journal of Neuroscience 22:13 (2002), 5639–5651.
[965] A. Rizzi, C. Gatta, and D. Marini. “From Retinex to Automatic Color Equalization: Issues in
Developing a New Algorithm for Unsupervised Color Equalization.” SPIE Journal of Electronic
Imaging 13:1 (2004), 75–84.
[966] P. K. Robertson and J. F. O’Callaghan. “The Generation of Color Sequences for Univariate and
Bivariate Mapping.” IEEE Computer Graphics and Applications 6:2 (1986), 24–32.
[967] M. Robertson, S. Borman, and R. Stevenson. “Estimation-Theoretic Approach to Dynamic
Range Enhancement using Multiple Exposures.” Journal of Electronic Imaging 12:2 (2003),
219–228.
[989] R. Saito and H. Kotera. “Extraction of Image Gamut Surface and Calculations of Its Volume.”
In Proceedings of the 8th IS&T/SID Color Imaging Conference, pp. 330–334. Springfield, VA:
Society for Imaging Science and Technology, 2000.
[990] E. Salvador, A. Cavallaro, and T. Ebrahimi. “Shadow Identification and Classification using In-
variant Color Models.” In International Conference on Acoustics, Speech, and Signal Processing
(ICASSP), Volume 3, pp. 1545–1548. Washington, DC: IEEE Press, 2001.
[991] C. D. Salzman, C. M. Murasugi, K. H. Britten, and W. T. Newsome. “Microstimulation in
Visual Area MT: Effects on Direction Discrimination Performance.” Journal of Neuroscience
12:6 (1992), 2331–2355.
[992] F. Samadzadegan, M. Hahn, M. Sarpulaki, and N. Mostofi. “Geometric and Radiometric Evalu-
ation of the Potential of a High Resolution CMOS-Camera.” In Proceedings of the International
Society for Photogrammetry and Remote Sensing (ISPRS), XXXV, B3, pp. 488–493.
Istanbul, Turkey, 2004.
[993] J. H. Sandell and P. H. Schiller. “Effect of Cooling Area 18 on Striate Cortex Cells in the
Squirrel Monkey.” Journal of Neurophysiology 48:1 (1982), 38–48.
[994] G. Sapiro. “Color and Illuminant Voting.” IEEE Transactions on Pattern Analysis and Machine
Intelligence 21 (1999), 1210–1215.
[995] M. Sasaki, M. Mase, S. Kawahito, and Y. Tadokoro. “A Wide-Dynamic-Range CMOS Im-
age Sensor based on Multiple Short Exposure-Time Readout with Multiple-Resolution Column
Parallel ADC.” IEEE Sensors Journal 7:1 (2007), 151–158.
[996] K. Sassen, N. C. Knight, Y. Takano, and A. J. Heymsfield. “Effects of Ice-Crystal Structure
on Halo Formation: Cirrus Cloud Experimental and Ray-Tracing Modeling Studies.” Applied
Optics 33:21 (1994), 4590–4601.
[997] K. Sassen. “Halos in Cirrus Clouds: Why Are Classic Displays So Rare?” Applied Optics
44:27 (2005), 5684–5687.
[998] Y. Sato, M. D. Wheeler, and K. Ikeuchi. “Object Shape and Reflectance Modeling from Ob-
servation.” In Proceedings of SIGGRAPH’97, Computer Graphics Proceedings, Annual Con-
ference Series, pp. 379–387. Reading, MA: Addison-Wesley, 1997.
[999] Y. Sato, K. Amemiya, and M. Uchidoi. “Recent Progress in Device Performance and Picture
Quality of Color Plasma Displays.” Journal of the Society for Information Display 10 (2002),
17–23.
[1000] K. Sato. “Image-Processing Algorithms.” In Image Sensors and Signal Processing for Digital
Still Cameras, edited by J. Nakamura, pp. 223–253. Boca Raton, FL: Taylor and Francis, 2006.
[1001] A. van der Schaaf. “Natural Image Statistics and Visual Processing.” Ph.D. thesis, Rijksuni-
versiteit Groningen, The Netherlands, 1998. Available online (http://www.ub.rug.nl/eldoc/dis/
science/a.v.d.schaaf/).
[1002] J. Schanda. “Current CIE Work to Achieve Physiologically-Correct Color Metrics.” In Color
Vision: Perspectives from Different Disciplines, edited by W. Backhaus, R. Kliegl, and J. S.
Werner, pp. 307–318. Berlin: Walter de Gruyter, 1998.
[1003] R. Schettini, G. Ciocca, and I. Gagliardi. “Content-Based Color Image Retrieval with
Relevance Feedback.” In Proceedings of the International Conference on Image Processing,
pp. 27AS2.8–27AS2.8. Los Alamitos, CA: IEEE Computer Society, 1999.
[1004] H. M. Schey. Div, Grad, Curl, and All That. New York: W. W. Norton and Company, 1973.
[1005] C. Schlick. “A Customizable Reflectance Model for Everyday Rendering.” In Fourth Euro-
graphics Workshop on Rendering, pp. 73–84. Aire-la-Ville, Switzerland: Eurographics Associ-
ation, 1993.
[1006] C. Schlick. “Fast Alternatives to Perlin’s Bias and Gain Functions.” In Graphics Gems IV,
pp. 401–403. San Diego: Academic Press, 1994.
[1007] C. Schlick. “Quantization Techniques for the Visualization of High Dynamic Range Pictures.”
In Photorealistic Rendering Techniques, edited by P. Shirley, G. Sakas, and S. Müller, pp. 7–20.
Berlin: Springer-Verlag, 1994.
[1008] M. Schmolesky. “The Primary Visual Cortex.” 2006. Available online (http://webvision.med.
utah.edu/VisualCortex.html).
[1009] J. L. Schnapf, T. W. Kraft, and D. A. Baylor. “Spectral Sensitivity of Human Cone Photore-
ceptors.” Nature 325:6103 (1987), 439–441.
[1010] J. L. Schnapf, T. W. Kraft, B. J. Nunn, and D. A. Baylor. “Spectral Sensitivity of Primate
Photoreceptors.” Visual Neuroscience 1 (1988), 255–261.
[1011] M. Schrauf, B. Lingelbach, E. Lingelbach, and E. R. Wist. “The Hermann Grid and the
Scintillation Effect.” Perception 24, supplement (1995), 88–89.
[1012] M. Schrauf, B. Lingelbach, and E. R. Wist. “The Scintillating Grid Illusion.” Vision Research
37:8 (1997), 1033–1038.
[1013] P. Schröder and W. Sweldens. “Spherical Wavelets: Efficiently Representing Functions on
the Sphere.” In Proceedings of SIGGRAPH ’95, Computer Graphics Proceedings, Annual Con-
ference Series, pp. 161–172. Reading, MA: Addison-Wesley, 1995.
[1014] E. F. Schubert and J. K. Kim. “Solid-State Light Sources Getting Smart.” Science 308 (2005),
1274–1278.
[1015] M. Schultze. “Zur Anatomie und Physiologie der Retina.” Archiv für mikroskopische
Anatomie und Entwicklungsmechanik 2 (1866), 165–286.
[1016] A. J. Schwab. Field Theory Concepts: Electromagnetic Fields, Maxwell’s Equations, Grad,
Curl, Div, etc. Berlin: Springer-Verlag, 1988.
[1017] E. Schwartz, R. B. Tootell, M. S. Silverman, E. Switkes, and R. L. de Valois. “On the Math-
ematical Structure of the Visuotopic Mapping of Macaque Striate Cortex.” Science 227 (1985),
1065–1066.
[1018] E. A. Schwartz. “Voltage Noise Observed in Rods of the Turtle Retina.” Journal of Physiology
272 (1977), 217–246.
[1019] S. H. Schwartz. Visual Perception: A Clinical Orientation. New York: McGraw-Hill, 2004.
[1020] H. Seetzen, L. A. Whitehead, and G. Ward. “A High Dynamic Range Display using Low
and High Resolution Modulators.” SID Symposium Digest of Technical Papers 34:1 (2003),
1450–1453.
[1021] H. Seetzen, W. Heidrich, W. Stuerzlinger, G. Ward, L. Whitehead, M. Trentacoste, A. Ghosh,
and A. Vorozcovs. “High Dynamic Range Display Systems.” ACM Transactions on Graphics
23:3 (2004), 760–768.
[1022] H. Seetzen, H. Li, G. Ward, L. Whitehead, and W. Heidrich. “Guidelines for Contrast, Bright-
ness, and Amplitude Resolution of Displays.” In Society for Information Display (SID) Digest,
pp. 1229–1233, 2006.
[1023] H. Seetzen, S. Makki, H. Ip, T. Wan, V. Kwong, G. Ward, W. Heidrich, and L. Whitehead.
“Self-Calibrating Wide Color Gamut High Dynamic Range Display.” In Human Vision and
Electronic Imaging XII, 6492, edited by B. E. Rogowitz, T. N. Pappas, and S. J. Daly,
p. 64920Z. SPIE, 2007.
[1024] R. G. Seippel. Optoelectronics for Technicians and Engineering. Englewood Cliffs, NJ:
Prentice Hall, 1989.
[1067] Society of Dyers and Colourists. Colour Index. Bradford, UK: Society of Dyers and
Colourists, 1971. Available online (http://www.colour-index.org).
[1068] F. M. Sogandares and E. S. Fry. “Absorption Spectrum (340–640 nm) of Pure Water. I.
Photothermal Measurements.” Applied Optics 36:33 (1997), 8699–8709.
[1069] J. A. Solomon, G. Sperling, and C. Chubb. “The Lateral Inhibition of Perceived Contrast Is
Indifferent to On-Center/Off-Center Segregation, but Specific to Orientation.” Vision Research
33:18 (1993), 2671–2683.
[1070] X. Song, G. M. Johnson, and M. D. Fairchild. “Minimizing the Perception of Chromatic Noise
in Digital Images.” In IS&T/SID 12th Color Imaging Conference, pp. 340–346. Springfield, VA:
Society for Imaging Science and Technology, 2004.
[1071] M. Sonka, V. Hlavac, and R. Boyle. Image Processing, Analysis, and Machine Vision, Second
edition. Pacific Grove, CA: PWS Publishing, 1999.
[1072] K. E. Spaulding, E. J. Giorgianni, and G. Woolfe. “Reference Input/Output Medium Metric
RGB Color Encoding (RIMM-ROMM RGB).” In PICS2000: Image Processing, Image Quality,
Image Capture, Systems Conference, pp. 155–163. Springfield, VA: Society for Imaging Science
and Technology, 2000.
[1073] K. E. Spaulding, G. J. Woolfe, and R. L. Joshi. “Extending the Color Gamut and Dynamic
Range of an sRGB Image using a Residual Image.” Color Research and Application 28:4 (2003),
251–266.
[1074] N. Speranskaya. “Determination of Spectrum Color Co-ordinates for Twenty-Seven Normal
Observers.” Optics and Spectroscopy 7 (1959), 424–428.
[1075] L. Spillmann and J. S. Werner, editors. Visual Perception: The Neurological Foundations.
San Diego: Academic Press, 1990.
[1076] L. Spillmann. “The Hermann Grid Illusion: A Tool for Studying Human Receptive Field
Organization.” Perception 23 (1994), 691–708.
[1077] B. Stabell and U. Stabell. “Chromatic Rod-Cone Interaction during Dark Adaptation.” Journal
of the Optical Society of America A 15:11 (1998), 2809–2815.
[1078] B. Stabell and U. Stabell. “Effects of Rod Activity on Color Perception with Light Adapta-
tion.” Journal of the Optical Society of America A 19:7 (2002), 1248–1258.
[1079] J. Stam. “Stable Fluids.” In Proceedings SIGGRAPH ’99, Computer Graphics Proceedings,
Annual Conference Series, pp. 121–128. Reading, MA: Addison-Wesley Longman, 1999.
[1080] J. Starck, E. Pantin, and F. Murtagh. “Deconvolution in Astronomy: A Review.” Publications
of the Astronomical Society of the Pacific 114 (2002), 1051–1069.
[1081] O. N. Stavroudis. “Orthomic Systems of Rays in Inhomogeneous Isotropic Media.” Applied
Optics 10:2 (1971), 260–263.
[1082] O. N. Stavroudis. The Optics of Rays, Wavefronts and Caustics. New York: Academic Press,
1972.
[1083] D. A. Steigerwald, J. C. Bhat, D. Collins, R. M. Fletcher, M. O. Holcomb, M. J. Ludowise,
P. S. Martin, and S. L. Rudaz. “Illumination with Solid State Lighting Technology.” IEEE
Journal on Selected Topics in Quantum Electronics 8:2 (2002), 310–320.
[1084] J. Steinhoff and D. Underhill. “Modifications of the Euler Equations for “Vorticity Confine-
ment”: Application to the Computation of Interacting Vortex Rings.” Physics of Fluids 6:8
(1994), 2738–2744.
[1085] J. C. Stevens and S. S. Stevens. “Brightness Functions: Effects of Adaptation.” Journal of
the Optical Society of America 53 (1963), 375–385.
[1086] S. S. Stevens. “To Honor Fechner and Repeal His Law.” Science 133:3446 (1961), 80–86.
[1087] W. Stiles and J. Burch. “N.P.L. Colour-Matching Investigation: Final Report.” Optica Acta 6
(1959), 1–26.
[1088] W. S. Stiles and B. H. Crawford. “The Luminous Efficiency of Rays Entering the Eye Pupil
at Different Points.” Proceedings of the Royal Society of London, B 112:778 (1933), 428–450.
[1089] W. S. Stiles. “The Luminous Sensitivity of Monochromatic Rays Entering the Eye Pupil at
Different Points and a New Colour Effect.” Proceedings of the Royal Society of London, B
123:830 (1937), 64–118.
[1090] A. Stimson. Photometry and Radiometry for Engineers. New York: John Wiley and Sons,
Inc., 1974.
[1091] A. Stockman and L. T. Sharpe. “The Spectral Sensitivities of the Middle- and Long-
Wavelength-Sensitive Cones Derived from Measurements on Observers of Known Genotype.”
Vision Research 40:13 (2000), 1711–1737.
[1092] A. Stockman, D. I. MacLeod, and N. E. Johnson. “Spectral Sensitivities of the Human Cones.”
Journal of the Optical Society of America A 10:12 (1993), 2491–2521.
[1093] E. J. Stollnitz, V. Ostromoukhov, and D. H. Salesin. “Reproducing Color Images using Cus-
tom Inks.” In Proceedings SIGGRAPH ’98, Computer Graphics Proceedings, Annual Confer-
ence Series, pp. 267–274. Reading, MA: Addison-Wesley, 1998.
[1094] R. I. Stolyarevskaya. “The 25th Meeting of CIE and Measurements on LED Optical Parame-
ters.” Measurement Techniques 47:3 (2004), 316–320.
[1095] M. C. Stone, W. B. Cowan, and J. C. Beatty. “Color Gamut Mapping and the Printing of
Digital Color Images.” ACM Transactions on Graphics 7:4 (1988), 249–292.
[1096] M. C. Stone. “Color Balancing Experimental Projection Displays.” In Proceedings of the 9th
Color Imaging Conference: Color Science and Engineering: Systems, Technologies, Applica-
tions, pp. 342–347. Springfield, VA: Society for Imaging Science and Technology, 2001.
[1097] M. C. Stone. A Field Guide to Digital Color. Natick, MA: A K Peters, 2003.
[1098] M. Störring, E. Granum, and H. J. Andersen. “Estimation of the Illuminant Colour using
Highlights from Human Skin.” In International Conference on Color in Graphics and Image
Processing, pp. 45–50. Paris: Lavoisier, 2000.
[1099] A. Streitwieser Jr, C. H. Heathcock, and E. M. Kosower. Introduction to Organic Chemistry,
Fourth edition. New York: Macmillan, 1992.
[1100] C. F. Stromeyer. “Form-Color After Effects in Human Vision.” In Handbook of Sensory
Physiology: Perception, VIII, edited by R. Held, H. W. Leibowitz, and H. L. Teuber, pp. 97–142.
New York: Springer-Verlag, 1978.
[1101] S. Subramanian and I. Biederman. “Does Contrast Reversal Affect Object Identification?”
Investigative Ophthalmology and Visual Science 38:998.
[1102] K. Suffern. Ray Tracing from the Ground Up. Wellesley, MA: A K Peters, 2007.
[1103] A. Sugimoto, H. Ochi, S. Fujimura, A. Yoshida, T. Miyadera, and M. Tsuchida. “Flexible
OLED Displays using Plastic Substrates.” IEEE Journal of Selected Topics in Quantum Elec-
tronics 10:1 (2004), 107–114.
[1104] H. Sugiura, K. Sakawa, and J. Fujimo. “False Color Signal Reduction Method for Single-Chip
Color Video Cameras.” IEEE Transactions on Consumer Electronics 40:2 (1994), 100–106.
[1105] W. E. Sumpner. “The Diffusion of Light.” Proceedings of the Physical Society of London 12
(1892), 10–29.
[1106] Y. Sun, D. Fracchia, T. W. Calvert, and M. S. Drew. “Deriving Spectra from Colors and
Rendering Light Interference.” IEEE Computer Graphics and Applications 19:4 (1999), 61–67.
[1107] Y. Sun. “Rendering Biological Iridescences with RGB-Based Renderers.” ACM Transactions
on Graphics 25:1 (2006), 100–129.
[1108] W. Suzuki, K. S. Saleem, and K. Tanaka. “Divergent Backward Projections from the An-
terior Part of the Inferotemporal Cortex (Area TE) in the Macaque.” Journal of Comparative
Neurology 422:2 (2000), 206–228.
[1109] H. Suzuki. Electronic Absorption Spectra and Geometry of Organic Molecules: An Applica-
tion of Molecular Orbital Theory. New York: Academic Press, 1967.
[1110] M. J. Swain and D. H. Ballard. “Color Indexing.” International Journal of Computer Vision
7:1 (1991), 11–32.
[1111] Swedish Standards Institute. Colour Notation System, Second edition. Stockholm: Swedish
Standards Institute, 1990.
[1112] K. Takatoh, M. Hasegawa, M. Koden, N. Itoh, R. Hasegawa, and M. Sakamoto. Alignment
Technologies and Applications of Liquid Crystal Devices. London: Taylor and Francis, 2005.
[1113] H. Takechi, H. Onoe, H. Shizuno, E. Yoshikawa, N. Sadato, H. Tsukada, and Y. Watanabe.
“Mapping of Cortical Areas Involved in Color Vision in Non-Human Primates.” Neuroscience
Letters 230:1 (1997), 17–20.
[1114] A. A. Talin, K. A. Dean, and J. E. Jaskie. “Field Emission Displays: A Critical Review.”
Solid-State Electronics 45 (2001), 963–976.
[1115] E.-V. Talvala, A. Adams, M. Horowitz, and M. Levoy. “Veiling Glare in High Dynamic Range
Imaging.” ACM Transactions on Graphics 26:3 (2007), 37–1 – 37–9.
[1116] N. Tamura, N. Tsumura, and Y. Miyake. “Masking Model for Accurate Colorimetric Charac-
terization of LCD.” In Proceedings of the IS&T/SID Tenth Color Imaging Conference, pp. 312–
316. Springfield, VA: Society of Imaging Science and Technology, 2002.
[1117] C. W. Tang and S. A. VanSlyke. “Organic Electroluminescent Diode.” Applied Physics Letters
51:12 (1987), 913–915.
[1118] J. Tanida and K. Yamada. “TOMBO: Thin Observation by Bound Optics.” Applied Optics
40:11 (2001), 1806–1813.
[1119] J. Tanida, R. Shogenji, Y. Kitamura, M. Miyamoto, and S. Miyatake. “Color Imaging with
an Integrated Compound Imaging System.” Optics Express 11:18 (2003), 2109–2117.
[1120] L. E. Tannas, editor. Flat-Panel Displays and CRTs. New York: Van Nostrand Reinhold
Company, 1985.
[1121] A. H. Taylor and E. B. Rosa. “Theory, Construction, and Use of the Photometric Integrating
Sphere.” Scientific Papers of the Bureau of Standards 18 (1922), 280–325.
[1122] H. Terstiege. “Chromatic Adaptation: A State-of-the-Art Report.” Journal of Color and
Appearance 1 (1972), 19–23.
[1123] A. J. P. Theuwissen. “CCD or CMOS Image Sensors for Consumer Digital Still Photography.”
In IEEE International Symposium on VLSI Technology, Systems, and Applications, pp. 168–171,
2001.
[1124] W. B. Thompson, P. Shirley, and J. Ferwerda. “A Spatial Post-Processing Algorithm for
Images of Night Scenes.” journal of graphics tools 7:1 (2002), 1–12.
[1125] M. G. A. Thomson. “Higher-Order Structure in Natural Scenes.” Journal of the Optical
Society of America A 16:7 (1999), 1549–1553.
[1126] M. G. A. Thomson. “Visual Coding and the Phase Structure of Natural Scenes.” Network:
Computation in Neural Systems 10:2 (1999), 123–132.
1018 Bibliography
[1148] T. Tsuji, S. Kawami, S. Miyaguchi, T. Naijo, T. Yuki, S. Matsuo, and H. Miyazaki. “Red-
Phosphorescent OLEDs Employing bis-(8-Quinolinolato)-Phenolato-Aluminum (III) Com-
plexes as Emission-Layer Hosts.” SID Symposium Digest of Technical Papers 35:1 (2004),
900–903.
[1149] M. Tsukada and Y. Ohta. “An Approach to Color Constancy from Mutual Reflection.” In
Proceedings of the 3rd International Conference on Computer Vision, pp. 385–389. Washington,
DC: IEEE Press, 1990.
[1150] T. Tsutsui. “A Light-Emitting Sandwich Filling.” Nature 420:6917 (2002), 752–755.
[1151] H. F. J. M. van Tuijl. “A New Visual Illusion: Neonlike Color Spreading and Complementary
Color Induction between Subjective Contours.” Acta Psychologica 39 (1975), 441–445.
[1152] J. Tumblin and H. Rushmeier. “Tone Reproduction for Computer Generated Images.” IEEE
Computer Graphics and Applications 13:6 (1993), 42–48.
[1153] J. Tumblin and G. Turk. “LCIS: A Boundary Hierarchy for Detail-Preserving Contrast Re-
duction.” In Proceedings SIGGRAPH ’99, Computer Graphics Proceedings, Annual Conference
Series, edited by A. Rockwood, pp. 83–90. Reading, MA: Addison-Wesley Longman, 1999.
[1154] K. Turkowski. “Transformations of Surface Normal Vectors with Applications to Three Di-
mensional Computer Graphics.” Technical Report 22, Apple Computer, Inc., 1990.
[1155] S. R. Turns. An Introduction to Combustion: Concepts and Applications, Second edition.
New York: McGraw-Hill, 1996.
[1156] S. A. Twomey, C. F. Bohren, and J. L. Mergenthaler. “Reflectance and Albedo Differences
between Wet and Dry Surfaces.” Applied Optics 25:3 (1986), 431–437.
[1157] C. W. Tyler. “Analysis of Human Receptor Density.” In Basic and Clinical Applications of
Vision Science, edited by V. Lakshminarayanan, pp. 63–71. Norwood, MA: Kluwer Academic
Publishers, 1997.
[1158] E. P. T. Tyndall. “Chromaticity Sensibility to Wavelength Difference as a Function of Purity.”
Journal of the Optical Society of America 23:1 (1933), 15–22.
[1159] H. Uchiike and T. Hirakawa. “Color Plasma Displays.” Proceedings of the IEEE 90:4 (2002),
533–539.
[1160] H. Uchiike, K. Miura, N. Nakayama, and T. Shinoda. “Secondary Electron Emission Charac-
teristics of Dielectric Materials in AC-Operated Plasma Display Panels.” IEEE Transactions on
Electron Devices ED-23 (1976), 1211–1217.
[1161] Z. Ulanowski. “Ice Analog Halos.” Applied Optics 44:27 (2005), 5754–5758.
[1162] R. Ulbricht. “Die Bestimmung der mittleren räumlichen Lichtintensität durch nur eine
Messung.” Elektrotechnische Zeitschrift 21 (1900), 595–597.
[1163] I. Underwood. “A Review of Microdisplay Technologies.” In The Annual Conference of
the UK and Ireland Chapter of the Society for Information Display. San Jose, CA: Society for
Information Display, 2000.
[1164] L. G. Ungerleider and R. Desimone. “Cortical Connections of Visual Area MT in the
Macaque.” Journal of Comparative Neurology 248:2 (1986), 190–222.
[1165] L. G. Ungerleider and R. Desimone. “Projections to the Superior Temporal Sulcus from the
Central and Peripheral Field Representations of V1 and V2.” Journal of Comparative Neurology
248:2 (1986), 147–163.
[1166] United States Committee on Extension to the Standard Atmosphere. U.S. Standard Atmo-
sphere, 1976. Washington, D.C.: National Oceanic and Atmospheric Administration, National
Aeronautics and Space Administration, United States Air Force, 1976.
[1167] M. Uyttendaele, A. Eden, and R. Szeliski. “Eliminating Ghosting and Exposure Artifacts
in Image Mosaics.” In IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, Volume 2, p. 509. Washington, DC: IEEE Press, 2001.
[1168] K. M. Vaeth. “OLED-display technology.” Information Display 19:6 (2003), 12–17.
[1169] J. M. Valeton and D. van Norren. “Light-Adaptation of Primate Cones: An Analysis Based
on Extracellular Data.” Vision Research 23:12 (1983), 1539–1547.
[1170] R. L. de Valois and K. K. de Valois. “Neural Coding of Color.” In Handbook of Perception V,
edited by E. C. Carterette and M. P. Friedman, pp. 117–166. New York: Academic Press, 1975.
[1171] R. L. de Valois and K. K. de Valois. “A Multi-Stage Color Model.” Vision Research 33 (1993),
1053–1065.
[1172] R. L. de Valois, I. Abramov, and G. H. Jacobs. “Analysis of Response Patterns of LGN Cells.”
Journal of the Optical Society of America 56:7 (1966), 966–977.
[1173] R. L. de Valois, D. M. Snodderly Jr, E. W. Yund, and N. K. Hepler. “Responses of Macaque
Lateral Geniculate Cells to Luminance and Color Figures.” Sensory Processes 1:3 (1977), 244–
259.
[1174] R. L. de Valois, K. K. de Valois, E. Switkes, and L. E. Mahon. “Hue Scaling of Isoluminant
and Cone-Specific Lights.” Vision Research 37:7 (1997), 885–897.
[1175] R. L. de Valois, N. P. Cottaris, S. D. Elfar, L. E. Mahon, and J. A. Wilson. “Some Transfor-
mations of Color Information from Lateral Geniculate Nucleus to Striate Cortex.” Proceedings
of the National Academy of Sciences of the United States of America 97:9 (2000), 4997–5002.
[1176] R. L. de Valois, K. K. de Valois, and L. E. Mahon. “Contribution of S Opponent Cells to
Color Appearance.” Proceedings of the National Academy of Sciences of the United States of
America 97:1 (2000), 512–517.
[1177] D. I. Vaney. “The Mosaic of Amacrine Cells in the Mammalian Retina.” Progress in Retinal
and Eye Research 9 (1990), 49–100.
[1178] R. VanRullen and T. Dong. “Attention and Scintillation.” Vision Research 43:21 (2003),
2191–2196.
[1179] D. Varin. “Fenomeni di contrasto e diffusione cromatica nell’organizzazione spaziale del
campo percettivo.” Rivista di Psicologia 65 (1971), 101–128.
[1180] R. G. Vautin and B. M. Dow. “Color Cell Groups in Foveal Striate Cortex of the Behaving
Macaque.” Journal of Neurophysiology 54:2 (1985), 273–292.
[1181] T. R. Vidyasagar, J. J. Kulikowski, D. M. Lipnicki, and B. Dreher. “Convergence of Parvocel-
lular and Magnocellular Information Channels in the Primary Visual Cortex of the Macaque.”
European Journal of Neuroscience 16:5 (2002), 945–956.
[1182] J. A. S. Viggiano. “Modeling the Color of Multi-Colored Halftones.” In Proceedings of the
Technical Association of the Graphic Arts, pp. 44–62. Sewickley, PA: Technical Association of
the Graphic Arts, 1990.
[1183] B. Vohnsen. “Photoreceptor Waveguides and Effective Retinal Image Quality.” Journal of the
Optical Society of America A 24:3 (2007), 597–607.
[1184] E. Völkl, L. F. Allard, and D. C. Joy, editors. Introduction to Electron Holography. New
York: Kluwer Academic / Plenum Publishers, 1999.
[1185] J. J. Vos, J. Walraven, and A. van Meeteren. “Light Profiles of the Foveal Image of a Point
Source.” Vision Research 16:2 (1976), 215–219.
[1186] J. Vos. “Disability Glare—A State of the Art Report.” CIE Journal 3:2 (1984), 39–53.
[1187] M. J. Vrhel and H. J. Trussell. “Color Device Calibration: A Mathematical Formulation.”
IEEE Transactions on Image Processing 8:12 (1999), 1796–1806.
1020 Bibliography
[1188] H. L. de Vries. “The Quantum Character of Light and Its Bearing upon the Threshold of
Vision, the Differential Sensitivity and Acuity of the Eye.” Physica 10:7 (1943), 553–564.
[1189] T. Wachtler, T. J. Sejnowski, and T. D. Albright. “Representation of Color Stimuli in Awake
Macaque Primary Visual Cortex.” Neuron 37:4 (2003), 681–691.
[1190] H. G. Wagner, E. F. MacNichol, and M. L. Wolbarsht. “Three Response Properties of Single
Ganglion Cells in the Goldfish Retina.” Journal of General Physiology 43 (1960), 45–62.
[1191] G. Wald and D. R. Griffin. “The Change in Refractive Power of the Human Eye in Dim and
Bright Light.” Journal of the Optical Society of America 37 (1946), 321–336.
[1192] I. Wald and P. Slusallek. “State of the Art in Interactive Ray Tracing.” In Eurographics STAR -
State of the Art Reports, pp. 21–42. Aire-la-Ville, Switzerland: Eurographics Association, 2001.
[1193] I. Wald, P. Slusallek, and C. Benthin. “Interactive Distributed Ray Tracing of Highly Complex
Models.” In Proceedings of the 12th Eurographics Workshop on Rendering, edited by S. J.
Gortler and K. Myszkowski, pp. 274–285. Aire-la-Ville, Switzerland: Eurographics Association,
2001.
[1194] G. Wald. “Human Vision and the Spectrum.” Science 101 (1945), 653–658.
[1195] M. M. Waldrop. “Brilliant Displays.” Scientific American 297:3 (2007), 94–97.
[1196] H. Wallach. “Brightness Constancy and the Nature of Achromatic Colors.” Journal of Exper-
imental Psychology 38 (1948), 310–324.
[1197] G. Walls. “The Filling-In Process.” American Journal of Optometry 31 (1954), 329–340.
[1198] E. Walowit, C. J. McCarthy, and R. S. Berns. “Spectrophotometric Color Matching Based on
Two-Constant Kubelka-Munk Theory.” Color Research and Application 13:6 (1988), 358–362.
[1199] J. W. T. Walsh. Photometry, Third edition. New York: Dover Publications, Inc., 1965.
[1200] B. Walter. “Notes on the Ward BRDF.” Technical Report PCG-05-06, Cornell Program of
Computer Graphics, Ithaca, NY, 2005.
[1201] B. A. Wandell and J. E. Farrell. “Water into Wine: Converting Scanner RGB to Tristimulus XYZ.” In Proceedings of SPIE 1909, pp. 92–101. Bellingham, WA: SPIE, 1993.
[1202] B. A. Wandell, H. A. Baseler, A. B. Poirson, G. M. Boynton, and S. A. Engel. “Computational
Neuroimaging: Color Tuning in Two Human Cortical Areas Measured using fMRI.” In Color
Vision: From Genes to Perception, edited by K. R. Gegenfurtner and L. Sharpe, pp. 269–282.
Cambridge, UK: Cambridge University Press, 1999.
[1203] B. A. Wandell. “Color Rendering of Color Camera Data.” Color Research and Application
11 (1986), S30–S33.
[1204] B. A. Wandell. “The Synthesis and Analysis of Color Images.” IEEE Transactions on Pattern Analysis and Machine Intelligence 9:1 (1987), 2–13.
[1205] B. A. Wandell. Foundations of Vision. Sunderland, Massachusetts: Sinauer Associates, Inc., 1995.
[1206] J. Y. A. Wang and E. H. Adelson. “Representing Moving Images with Layers.” IEEE Trans-
actions on Image Processing 3:5 (1994), 625–638.
[1207] L. Wang, L.-Y. Wei, K. Zhou, B. Guo, and H.-Y. Shum. “High Dynamic Range Image Hallu-
cination.” In Rendering Techniques ’07 (Proceedings of the Eurographics Symposium on Ren-
dering), 2007.
[1208] G. Ward and E. Eydelberg-Vileshin. “Picture Perfect RGB Rendering using Spectral Prefilter-
ing and Sharp Color Primaries.” In Thirteenth Eurographics Workshop on Rendering, edited by
P. Debevec and S. Gibson, pp. 123–130. Aire-la-Ville, Switzerland: Eurographics Association,
2002.
[1209] G. Ward and M. Simmons. “Subband Encoding of High Dynamic Range Imagery.” In APGV
’04: Proceedings of the 1st Symposium on Applied Perception in Graphics and Visualization,
pp. 83–90. New York: ACM Press, 2004.
[1210] G. Ward and M. Simmons. “JPEG-HDR: A Backwards Compatible, High Dynamic Range
Extension to JPEG.” In Proceedings of the Thirteenth Color Imaging Conference, pp. 283–290.
Springfield, VA: Society for Imaging Science and Technology, 2005.
[1211] G. Ward, H. Rushmeier, and C. Piatko. “A Visibility Matching Tone Reproduction Operator
for High Dynamic Range Scenes.” IEEE Transactions on Visualization and Computer Graphics
3:4 (1997), 291–306.
[1212] G. Ward. “Measuring and Modeling Anisotropic Reflection.” Proc. SIGGRAPH ’92, Com-
puter Graphics 26:2 (1992), 265–272.
[1213] G. Ward. “A Contrast-Based Scalefactor for Luminance Display.” In Graphics Gems IV,
edited by P. Heckbert, pp. 415–421. Boston: Academic Press, 1994.
[1214] G. Ward. “Fast, Robust Image Registration for Compositing High Dynamic Range Pho-
tographs from Hand-Held Exposures.” journal of graphics tools 8:2 (2003), 17–30.
[1215] C. Ware. “Color Sequences for Univariate Maps: Theory, Experiments, and Principles.” IEEE
Computer Graphics and Applications 8:5 (1988), 41–49.
[1216] C. Ware. Information Visualization: Perception for Design. San Francisco: Morgan Kaufmann, 2000.
[1217] H. Wässle and B. B. Boycott. “Functional Architecture of the Mammalian Retina.” Physio-
logical Reviews 71:2 (1991), 447–480.
[1218] H. Wässle, U. Grünert, M.-H. Chun, and B. B. Boycott. “The Rod Pathway of the Macaque
Monkey Retina: Identification of AII-Amacrine Cells with Antibodies against Calretinin.” Jour-
nal of Comparative Neurology 361:3 (1995), 537–551.
[1219] A. Watson and A. Ahumada. “A Standard Model for Foveal Detection of Spatial Contrast.”
Journal of Vision 5:9 (2005), 717–740.
[1220] A. Watson and J. Solomon. “A Model of Visual Contrast Gain Control and Pattern Masking.” Journal of the Optical Society of America A 14 (1997), 2378–2390.
[1221] A. Watson. “The Spatial Standard Observer: A Human Vision Model for Display Inspection.” SID Symposium Digest of Technical Papers 37 (2006), 1312–1315.
[1222] R. A. Weale. “Human Lenticular Fluorescence and Transmissivity, and Their Effects on Vi-
sion.” Experimental Eye Research 41:4 (1985), 457–473.
[1223] R. A. Weale. “Age and Transmittance of the Human Crystalline Lens.” Journal of Physiology
395 (1988), 577–587.
[1224] M. S. Weaver, V. Adamovich, M. Hack, R. Kwong, and J. J. Brown. “High Efficiency and
Long Lifetime Phosphorescent OLEDs.” In Proceedings of the International Conference on
Electroluminescence of Molecular Materials and Related Phenomena, pp. 10–35. Washington,
DC: IEEE, 2003.
[1225] E. H. Weber. “De pulsu, resorptione, auditu et tactu [On Pulse, Resorption, Hearing, and Touch].” In Annotationes Anatomicae et Physiologicae. Lipsiae: C. F. Köhler, 1834.
[1226] E. H. Weber. “On the Sense of Touch and Common Sensibility.” In E H Weber On the Tactile
Senses, edited by H. E. Ross and D. J. Murray, Second edition. Hove, UK: Erlbaum, 1996.
[1227] M. A. Webster and J. D. Mollon. “Color Constancy Influenced by Contrast Adaptation.”
Nature 373:6516 (1995), 694–698.
[1228] M. Webster. “Light Adaptation, Contrast Adaptation, and Human Colour Vision.” In Colour
Perception: Mind and the Physical World, edited by R. Mausfeld and D. Heyer, pp. 67–110.
Oxford, UK: Oxford University Press, 2003.
[1229] M. Webster. “Pattern-Selective Adaptation in Color and Form Perception.” In The Visual
Neurosciences, Volume 2, edited by L. Chalupa and J. Werner, pp. 936–947. Cambridge, MA:
MIT Press, 2004.
[1230] A. R. Weeks. Fundamentals of Image Processing. New York: Wiley-IEEE Press, 1996.
[1231] B. Weiss. “Fast Median and Bilateral Filtering.” ACM Transactions on Graphics 25:3 (2006),
519–526.
[1232] E. W. Weisstein. “Full Width at Half Maximum.” In MathWorld — A Wolfram Web Resource.
Wolfram Research, 1999. Available online (http://mathworld.wolfram.com/).
[1233] E. W. Weisstein. “Mie Scattering.” In MathWorld — A Wolfram Web Resource. Wolfram
Research, 1999. Available online (http://mathworld.wolfram.com/).
[1234] S.-W. Wen, M.-T. Lee, and C. H. Chen. “Recent Development of Blue Fluorescent OLED
Materials and Devices.” IEEE/OSA Journal of Display Technology 1:1 (2005), 90–99.
[1235] J. Weng, P. Cohen, and M. Herniou. “Camera Calibration with Distortion Models and Accuracy Evaluation.” IEEE Transactions on Pattern Analysis and Machine Intelligence 14:10 (1992), 965–980.
[1236] S. Wesolkowski, S. Tominaga, and R. D. Dony. “Shading and Highlight Invariant Color Image
Segmentation using the MPC Algorithm.” In Proceedings of the SPIE (Color Imaging: Device-
Independent Color, Color Hard Copy and Graphic Arts VI), pp. 229–240. Bellingham, WA:
SPIE, 2001.
[1237] R. S. West, H. Konijn, W. Sillevis-Smitt, S. Kuppens, N. Pfeffer, Y. Martynov, Y. Takaaki,
S. Eberle, G. Harbers, T. W. Tan, and C. E. Chan. “High Brightness Direct LED Backlight for
LCD-TV.” SID Symposium Digest of Technical Papers 34:1 (2003), 1262–1265.
[1238] G. Westheimer. “Visual Acuity.” Annual Review of Psychology 16 (1965), 359–380.
[1239] G. Westheimer. “Visual Acuity.” In Adler’s Physiology of the Eye, edited by R. A. Moses and
W. M. Hart. St. Louis: The C V Mosby Company, 1987.
[1240] S. Westin, J. Arvo, and K. Torrance. “Predicting Reflectance Functions from Complex Surfaces.” Proc. SIGGRAPH ’92, Computer Graphics 26:2 (1992), 255–264.
[1241] M. White. “A New Effect on Perceived Lightness.” Perception 8:4 (1979), 413–416.
[1242] T. Whitted. “An Improved Illumination Model for Shaded Display.” Communications of the
ACM 23:6 (1980), 343–349.
[1243] H. Widdel, D. L. Grossman, D. L. Post, and J. Walraven, editors. Color in Electronic Displays.
Dordrecht, Netherlands: Kluwer Academic / Plenum Publishers, 1992.
[1244] T. N. Wiesel and D. H. Hubel. “Spatial and Chromatic Interactions in the Lateral Geniculate
Body of the Rhesus Monkey.” Journal of Neurophysiology 29 (1966), 1115–1156.
[1245] D. R. Williams and A. Roorda. “The Trichromatic Cone Mosaic in the Human Eye.” In Color
Vision: From Genes to Perception, edited by K. R. Gegenfurtner and L. T. Sharpe, pp. 113–122.
Cambridge, UK: Cambridge University Press, 1999.
[1246] T. L. Williams. The Optical Transfer Function of Imaging Systems. Bristol, UK: Institute of
Physics Publishing, 1999.
[1247] S. J. Williamson and H. Z. Cummins. Light and Color in Nature and Art. New York: John
Wiley and Sons, 1983.
[1248] E. N. Wilmer and W. D. Wright. “Colour Sensitivity of the Fovea Centralis.” Nature 156
(1945), 119–121.
[1249] K. Witt. “Modified CIELAB Formula Tested using a Textile Pass/Fail Data Set.” Color Research and Application 19 (1994), 273–285.
[1286] M. Yukie and E. Iwai. “Direct Projection from the Dorsal Lateral Geniculate Nucleus to
the Prestriate Cortex in Macaque Monkeys.” Journal of Comparative Neurology 201:1 (1981),
81–97.
[1287] J. A. C. Yule. Principles of Color Reproduction. New York: Wiley, 1967.
[1288] Q. Zaidi, B. Spehar, and M. Shy. “Induced Effects of Background and Foregrounds in Three-
Dimensional Configurations: The Role of T-Junctions.” Perception 26 (1997), 395–408.
[1289] S. A. Zeki. “A Century of Cerebral Achromatopsia.” Brain 113:6 (1990), 1721–1777.
[1290] S. A. Zeki. A Vision of the Brain. Oxford, UK: Blackwell Publishers, 1993.
[1291] S. A. Zeki. Inner Vision: An Exploration of Art and the Brain. Oxford, UK: Oxford University
Press, 2000.
[1292] X. M. Zhang and B. Wandell. “A Spatial Extension to CIELAB for Digital Color Image
Reproductions.” Proceedings of the SID Symposium Digest 27 (1996), 731–734.
[1293] R. Zhang, P. Tsai, J. Cryer, and M. Shah. “Shape from Shading: A Survey.” IEEE Transactions
on Pattern Analysis and Machine Intelligence 21:8 (1999), 690–706.
[1294] X. Zhu and S.-T. Wu. “Overview on Transflective Liquid Crystal Displays.” In The 16th
Annual Meeting of the IEEE Lasers and Electro-Optical Society, Volume 2, pp. 953–954. Wash-
ington, DC: IEEE Press, 2003.
[1295] C. Ziegaus and E. W. Lang. “Statistics of Natural and Urban Images.” In Proceedings of the
7th International Conference on Artificial Neural Networks, Lecture Notes in Computer Science
1327, pp. 219–224. Berlin: Springer-Verlag, 1997.
[1296] C. Ziegaus and E. W. Lang. “Statistical Invariances in Artificial, Natural and Urban Images.”
Zeitschrift für Naturforschung A 53a:12 (1998), 1009–1021.
[1297] S. Zigman. “Vision Enhancement using a Short Wavelength Light Absorbing Filter.” Optom-
etry and Vision Science 67:2 (1990), 100–104.
[1298] M. D’Zmura and G. Iverson. “Color Constancy. I. Basic Theory of Two-Stage Linear Recov-
ery of Spectral Description for Lights and Surfaces.” Journal of the Optical Society of America
A 10:10 (1993), 2148–2165.
[1299] M. D’Zmura and G. Iverson. “Color Constancy. II. Results for Two-Stage Linear Recovery
of Spectral Description for Lights and Surfaces.” Journal of the Optical Society of America A
10:10 (1993), 2166–2180.
[1300] M. D’Zmura and P. Lennie. “Mechanisms of Color Constancy.” Journal of the Optical Society
of America A 3:10 (1986), 1662–1673.
[1301] M. D’Zmura, G. Iverson, and B. Singer. “Probabilistic Color Constancy.” In Geometric Rep-
resentations of Perceptual Phenomena, pp. 187–202. Mahwah, NJ: Lawrence Erlbaum, 1995.
[1302] J. A. Zuclich, R. D. Glickman, and A. R. Menendez. “In Situ Measurements of Lens Fluo-
rescence and Its Interference With Visual Function.” Investigative Ophthalmology and Visual
Science 33:2 (1992), 410–415.