May 06 Course 15 Notes Computational Photo Full
Organizers
Ramesh Raskar
Senior Research Scientist
MERL - Mitsubishi Electric Research Labs
201 Broadway
Cambridge, MA 02139
Email: raskar@merl.com
http://www.merl.com/people/raskar/
Jack Tumblin
Assistant Professor
Computer Science Dept., Northwestern University
1890 Maple Avenue, Evanston IL 60201,
Email: jet@cs.northwestern.edu
http://www.cs.northwestern.edu/~jet
http://www.merl.com/people/raskar/photo/
i
Contents
• Cover
o Course Description
o Prerequisites
o Course Schedule
o Lecturer Biographies
o Course Syllabus
• Course Slides
• Papers
• Code
• Bibliography
Acknowledgements
We would like to thank Amit Agrawal for his help on several aspects of this course and for
providing the pseudo-code included in these notes.
We would also like to thank the following for providing us with their slides and videos, which have
been used extensively throughout this course.
ii
Course Description
Digital photography is evolving rapidly with advances in electronic sensing, processing
and storage. The emerging field of computational photography attempts to exploit the
cheaper and faster computing to overcome the physical limitations of a camera, such as
dynamic range, resolution or depth of field, and extend the possible range of
applications. These computational techniques range from modifying imaging parameters
during capture to reconstructing images from the captured samples with modern methods.
Many ideas in computational photography are still relatively new to digital artists and
programmers, although they are familiar with photography and image manipulation
techniques. A larger problem is that this multi-disciplinary field, which combines ideas from
computational methods and modern digital photography, involves a steep learning curve.
For example, photographers are not always familiar with the advanced algorithms now
emerging to capture high-dynamic-range images, while image processing researchers often
have difficulty understanding the capture and noise issues in digital cameras. These
topics, however, can be easily learned without extensive background. The goal of this
course is to present both aspects in a compact form.
Participants will learn about image capture and manipulation methods for generating
compelling pictures for computer graphics and for extracting scene properties for
computer vision, illustrated with several examples. We hope to provide enough
fundamentals to satisfy the technical specialist without intimidating the curious graphics
researcher interested in photography.
Prerequisites
A basic understanding of camera operation and image processing is required.
Familiarity with concepts of linear systems, convolution, and machine vision will be
useful.
iii
Course Schedule
A.1 Introduction (Raskar, 5 mins)
A.2 Concepts in Computational Photography (Tumblin, 15 mins)
A.3 Understanding Film-like Photography (Tumblin, 10 mins)
A.4 Image Processing Tools (Raskar, 10 mins)
Q&A (5 minutes)
Topics not covered: film cameras, advanced optics, traditional image processing, image based
rendering (IBR) and novel view synthesis, lighting and illumination hardware technologies,
camera assisted projector display systems, geometric calibration and photometric calibration
techniques, compression, storage, photo-editing software packages, file formats and standards.
iv
Lecturer Biographies
Marc Levoy
Marc Levoy is an Associate Professor of Computer Science and Electrical
Engineering at Stanford University. He received his PhD in Computer
Science from the University of North Carolina in 1989. In the 1970's Levoy
worked on computer animation, developing an early computer-assisted
cartoon animation system. In the 1980's Levoy worked on volume rendering,
a family of techniques for displaying sampled three-dimensional functions,
such as CT and MR data. In the 1990's he worked on technology and
algorithms for 3D scanning. This led to the Digital Michelangelo Project, in
which he and a team of researchers spent a year in Italy digitizing the statues
of Michelangelo using laser rangefinders. His current interests include light
field sensing and display, computational imaging, and digital photography.
Levoy received the NSF Presidential Young Investigator Award in 1991 and
the SIGGRAPH Computer Graphics Achievement Award in 1996 for his work
in volume rendering.
http://graphics.stanford.edu/~levoy/
Shree Nayar
Shree K. Nayar received his PhD degree in Electrical and Computer
Engineering from the Robotics Institute at Carnegie Mellon University in 1990.
He heads the Columbia Automated Vision Environment (CAVE), which is
dedicated to the development of advanced computer vision systems. His
research is focused on three areas: the creation of novel vision sensors, the
design of physics based models for vision, and the development of algorithms
for scene understanding. His work is motivated by applications in the fields of
digital imaging, computer graphics, and robotics. Professor Nayar has
received best paper awards at ICCV 1990, ICPR 1994, CVPR 1994, ICCV
1995, CVPR 2000 and CVPR 2004. He is the recipient of the David and
Lucile Packard Fellowship (1992), the National Young Investigator Award
(1993), the NTT Distinguished Scientific Achievement Award (1994), and the
Keck Foundation Award for Excellence in Teaching (1995). He has published
over 100 scientific papers and has several patents on inventions related to
vision and robotics.
http://www.cs.columbia.edu/~nayar/
v
Ramesh Raskar
Ramesh Raskar is a Senior Research Scientist at MERL. His research
interests include projector-based graphics, computational photography and
non-photorealistic rendering. He has published several articles on imaging
and photography including multi-flash photography for depth edge detection,
image fusion, gradient-domain imaging and projector-camera systems. His
papers have appeared in SIGGRAPH, EuroGraphics, IEEE Visualization,
CVPR and many other graphics and vision conferences. He was a course
organizer at Siggraph 2002 through 2005. He was the panel organizer at the
Symposium on Computational Photography and Video in Cambridge, MA in
May 2005 and taught a graduate level class on Computational Photography at
Northeastern University, Fall 2005. He is a member of the ACM and IEEE.
http://www.merl.com/people/raskar/
Jack Tumblin
Jack Tumblin is an assistant professor of computer science at Northwestern
University. His interests include novel photographic sensors and lighting
devices to assist museum curators in historical preservation, computer
graphics and visual appearance, and image-based modeling and rendering.
During his doctoral studies at Georgia Tech and post-doc at Cornell, he
investigated tone-mapping methods to depict high-contrast scenes. His MS in
Electrical Engineering (December 1990) and BSEE (1978), also from Georgia
Tech, bracketed his work as co-founder of IVEX Corp., (>45 people as of
1990) where his flight simulator design work was granted 5 US Patents. He
was co-organizer of a course on Computational Photography at Siggraph
2005. He is an Associate Editor of ACM Transactions on Graphics.
http://www.cs.northwestern.edu/~jet
vi
Copyright Information
vii
Course 15: Computational Photography
Course WebPage
http://www.merl.com/people/raskar/photo
Organizers
Ramesh Raskar, Mitsubishi Electric Research Labs
Jack Tumblin, Northwestern University
Course Evaluation: http://www.siggraph.org/courses_evaluation
1
Speaker: Jack Tumblin
Jack Tumblin is an Assistant Professor of Computer Science at Northwestern University. His interests include novel photographic sensors to assist museum curators in historical preservation, computer graphics and visual appearance, and image-based modeling and rendering. During his doctoral studies at Georgia Tech and post-doc at Cornell, he investigated tone-mapping methods to depict high-contrast scenes. His MS in Electrical Engineering (December 1990) and BSEE (1978), also from Georgia Tech, bracketed his work as co-founder of IVEX Corp. (>45 people as of 1990), where his flight simulator design work was granted 5 US Patents. He is an Associate Editor of ACM Transactions on Graphics, was a member of the SIGGRAPH Papers Committee (2003, 2004), and in 2001 was a Guest Editor of IEEE Computer Graphics and Applications.
http://www.cs.northwestern.edu/~jet
Opportunities – Unlocking Photography
• How to expand camera capabilities
• Digital photography that goes beyond film-like photography
– Opportunities
• Computing corrects for lens, sensor and lighting limitations
• Computing merges results from multiple images
• Computing reconstructs from coded image samples
• Cameras benefit from computerized light sources
– Think beyond post-capture image processing
• Computation well before image processing and editing
– Learn how to build your own camera-toys
Traditional Photography:
Lens → Detector → Picture
Computational Photography: Optics, Sensors and Computations
Generalized Optics → Generalized Sensor → Generalized Computations → Picture
Novel Cameras: Generalized Optics (4D Ray Bender, Up to 4D Ray Sampler), Generalized Sensor, Ray Reconstruction → 4D Light Field → Display
Novel Illumination: Light Sources, 4D Incident Lighting → Scene: 8D Ray Modulator
Display: Recreate 4D Lightfield
3
A Teaser: Dual Photography
The 4D transport matrix: contribution of each projector pixel to each camera pixel
Projector, Photocell, Camera, Scene
Sen et al, Siggraph 2005
Dual photography from diffuse reflections
A Brief History of Images:
1558 – Camera Obscura, Gemma Frisius
4
1568
1837
1970 – Silicon Image Detector
                     Old        New
• People and Time    ~Cheap     Precious
• Each photo         Precious   Free
• Lighting           Critical   Automated*
• External Sensors   No         Yes
• ‘Stills / Video’   Disjoint   Merged
5
Course: Computational Photography
A2:
What is the Ideal
Photographic Signal?
Jack Tumblin
Northwestern University
What is Photography?
Safe answer:
A wholly new,
expressive medium
(ca. 1830s)
1
What is Photography?
• A ‘bucket’ word: a neat container for messy notions
(e.g. aviation, music, comprehension)
‘Film-Like’ Photography
Film Camera designs still dominate:
– ‘Instantaneous’ light measurement…
– Of focal plane image behind a lens.
– Reproduce those amounts of light;
– EXACT MATCH!
Implied:
“What we see is ≅
focal-plane intensities.”
well, no…
2
Why we like Photography
PHYSICAL: 3D Scene (light sources, BRDFs, shapes, positions, movements, …; eyepoint position, movement, projection, …)
→ Light & Optics → Image I(x,y,λ,t) → Exposure Control, tone map → Tangible Record RGB(x,y,tn), editable, storable as Film or Pixels → Display → Vision →
PERCEIVED: Scene (light sources, BRDFs, shapes, positions, movements, …; eyepoint position, movement, projection, …)
• COMPARE:
– Digital Camera result.
– Digitized (Scanned) Film result.
3
Goals for a New Photography
PHYSICAL: 3D Scene (light sources, BRDFs, shapes, positions, movements, …; eyepoint position, movement, projection, …)
→ Light & Optics → Sensor(s) → Computing → Visual Stimulus → Vision →
PERCEIVED or UNDERSTOOD: 3?D Scene (light sources, BRDFs, shapes, positions, movements, …; eyepoint position, movement, projection, …)
Tangible Record: estimates we can capture, edit, store, display
Meaning…
4
Missing: More Revealing Sets of Rays
“Multiple-Center-of-Projection Images,” Rademacher, P., Bishop, G., SIGGRAPH '98
5
Photographic Signal: Pixels Rays
• Core ideas are ancient, simple, seem obvious:
– Lighting: ray sources
– Optics: ray bending/folding devices
– Sensor: measure light
– Processing: assess it
– Display: reproduce it
• Ancient Greeks:
‘eye rays’ wipe the world
to feel its contents…
http://www.mlahanas.de/Greeks/Optics.htm
http://www.mlahanas.de/Greeks/Optics.htm
Rays
Display
“Scene”
Rays
6
Review: How many Rays are there?
(Gortler et al. ‘96) (Levoy et al. SIGG’96)
7
4-D Illumination Field
Same Idea: Measure all the incoming light rays
8
Is a 4-D Light Source Required?
Is A 4D Camera Required?
e.g. MIT Dynamic Light Field Camera 2002
Is this the whole answer?
• Multiple dynamic
Virtual Viewpoints
• Efficient Bandwidth usage:
‘send only what you see’
• Yang, et al 2002
• 64 tightly packed commodity
CMOS webcams
• 30 Hz, Scaleable, Real-time:
9
Or do Ray Changes Convey Appearance?
5 ray sets → explicit geometric occlusion boundaries
10
Mild Viewing & Lighting Changes: Are these Enough?
Convincing visual appearance: is accurate depth really necessary?
A few good 2-D images may be enough…
11
Future Photography
Novel Illuminators: Generalized 4D Incident Lighting
General Optics: Generalized 4D Ray Benders
Generalized Sensor: 4D Ray Sampler
Processing: Ray Reconstructor
Novel Displays: Generalized Display (viewed 4D light field)
12
Course : Computational Photography
Jack Tumblin
Northwestern University
‘Film-Like’ Photography
Implied:
“What we see is ≅
focal-plane intensities.”
well, no…
1
‘Film-Like Photography’: Ray Model
Position (x,y), Angle (θ,ϕ)
‘Center of Projection’ (P or P2 Origin)
3
Film-Like Photography
• Lighting: Ray Sources (external)
2
Film-like Optics: ‘Thin Lens Law’
http://www.nationmaster.com/encyclopedia/Lens-(optics)
http://www.nationmaster.com/encyclopedia/Lens-(optics)
3
Ray BUNDLES approximate Rays
• Ray Bundles:
Finite, measurable power from combined rays
– A finite span of SOLID ANGLE, and
– A finite span of SURFACE AREA
EXAMPLE:
• Power from 1 point on a spherical lamp?
0+ = 60 Watts / (∞ points)
– BUT has a finite, measurable ratio (flux/area):
(60 Watts / 30 cm² area) = 2 W/cm²
4
Ray BUNDLES approximate Rays
• Rays are doubly infinitesimal!
A ‘ray’ Leaves 0+ area in 0+ directions
EXAMPLE:
• Power from a spherical lamp in just 1 direction?
0+ = 60 Watts / (∞ directions)
– BUT has a finite ratio (flux/solid angle):
(60 Watts / 4π steradians) ≈ 4.77 W/sr
5
Ray ‘Bundles’
• Smaller lens
– Gathers a narrower ray
bundle:
– Less light: dimmer image
– Deeper depth-of-focus
6
Lens Flaws: Geometric Aberration
• Aberrations:
Real lenses don’t converge rays perfectly
• Spherical: edge rays ≠ center rays
• Coma: diagonal rays focus deeper at edge
http://www.nationmaster.com/encyclopedia/Lens-(optics)
Radial Distortion
(e.g. ‘Barrel’ and ‘pin-cushion’)
straight lines curve around the image center
7
Lens Flaws: Chromatic Aberration
• Dispersion: wavelength-dependent refractive index
– (enables prism to spread white light beam into rainbow)
• Modifies ray-bending and lens focal length: f(λ)
8
Lens Flaws: Intensity Aberrations
Image ‘Vignette’: bright at center, dark at edges.
Polarization
9
Film-like Color Sensing
← Equiluminant Curve defines ‘luminance’ vs. wavelength
http://www.yorku.ca/eye/photopik.htm
www.vaytek.com/specDVC.htm
10
Color Sensing
• Estimate RGB at ‘G’ cells from neighboring values
http://www.cooldictionary.com/words/Bayer-filter.wikipedia
11
Conclusions
• Film-like photography methods limit digital
photography to film-like results or less.
Useful links:
12
Image Tools
Gradient at x (Forward Difference):
G(x) = I(x+1) − I(x)
For n intensity values, about n gradients.
1D Integration (cumulative sum):
I(x) = I(x−1) + G(x)
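A minimal numerical sketch of these two operations (forward difference, then reconstruction by cumulative sum), using NumPy; the array values and function names are illustrative, not taken from the course pseudo-code.

    import numpy as np

    def forward_difference(I):
        # G(x) = I(x+1) - I(x); one fewer sample than the input signal.
        return I[1:] - I[:-1]

    def integrate_1d(G, I0=0.0):
        # Recover I from its gradients by a cumulative sum: I(x) = I(x-1) + G(x).
        return I0 + np.concatenate(([0.0], np.cumsum(G)))

    I = np.array([3.0, 5.0, 4.0, 7.0])
    G = forward_difference(I)            # [ 2., -1.,  3.]
    print(integrate_1d(G, I0=I[0]))      # [3., 5., 4., 7.]  (original signal recovered)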
1
Intensity Gradient Vectors in Images
Reconstruction from Gradients:
Given G(x,y) = (Gx, Gy), how do we compute I(x,y) for the image?
For n² image pixels, there are 2n² gradients!
Grad X, Grad Y → 2D Integration
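Recovering I from (Gx, Gy) is a least-squares problem (a Poisson equation, laplacian(I) = div(G)). The sketch below solves it with a plain Jacobi iteration under simple replicated-edge boundary handling; this is only an illustration (real implementations use multigrid or FFT solvers), and the function name is mine, not the course pseudo-code.

    import numpy as np

    def poisson_reconstruct(Gx, Gy, n_iter=2000):
        # Gx, Gy: HxW forward-difference gradient fields. Returns I up to an additive constant.
        H, W = Gx.shape
        # Divergence of the gradient field via backward differences.
        div = np.zeros((H, W))
        div[:, 1:] += Gx[:, 1:] - Gx[:, :-1]
        div[:, 0]  += Gx[:, 0]
        div[1:, :] += Gy[1:, :] - Gy[:-1, :]
        div[0, :]  += Gy[0, :]
        I = np.zeros((H, W))
        for _ in range(n_iter):
            P = np.pad(I, 1, mode='edge')        # replicate edges (approximate Neumann boundary)
            # Jacobi update: average of the 4 neighbors minus div/4.
            I = (P[:-2, 1:-1] + P[2:, 1:-1] + P[1:-1, :-2] + P[1:-1, 2:] - div) / 4.0
        return I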
2
Intensity Gradient Manipulation: A Common Pipeline
Reconstruction from Gradients: Euler-Lagrange Equation
Application: Compressing Dynamic Range (Fattal et al., Siggraph 2002)
• Maintain local detail at the cost of global range
• Fine details consist only of smaller-magnitude gradients
3
Gradient Compression in 1D
Gradient Domain Method:
• Attenuate large (log) gradient magnitudes
• Re-integrate the gradients and invert the log
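A rough sketch of this pipeline in the spirit of Fattal et al. (the attenuation rule below is an illustrative stand-in, not their exact function): take the log, scale down large gradient magnitudes, re-integrate with a Poisson solver such as the poisson_reconstruct sketched earlier, and exponentiate.

    import numpy as np

    def compress_dynamic_range(img, alpha=0.8, eps=1e-4):
        # Gradient-domain compression sketch: large log-gradients are attenuated
        # more than small ones, then the field is re-integrated and exponentiated.
        L = np.log(img + eps)
        Gx = np.zeros_like(L); Gy = np.zeros_like(L)
        Gx[:, :-1] = L[:, 1:] - L[:, :-1]              # forward differences of log luminance
        Gy[:-1, :] = L[1:, :] - L[:-1, :]
        mag = np.sqrt(Gx**2 + Gy**2) + eps
        scale = (mag / mag.mean()) ** (alpha - 1.0)    # <1 for large gradients, >1 for tiny ones
        L_new = poisson_reconstruct(Gx * scale, Gy * scale)   # re-integrate (solver sketched above)
        out = np.exp(L_new)                            # invert the log
        return (out - out.min()) / (out.max() - out.min())   # normalize for display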
4
Agrawala et al, Digital Photomontage, Siggraph 2004
5
Source images → Brush strokes → Computed labeling → Composite
(set of actual originals → perceived photomontage)
Image objective: 0 if red, ∞ otherwise
Cut: set of edges whose removal makes a graph disconnected
Wij (Si,j): similarity between pixel i and pixel j
Cost of a cut: sum of the weights Wij of its edges
6
Problem with min cuts → Normalized cuts in a graph (Ncut = balanced cut)
Graph Cuts for Segmentation and Mosaicing: Cut ~ String on a height field
Bilateral and Trilateral Filter
7
‘Unilateral’ Filter forces a tradeoff:
– Broad window: better detail removal
-- OR --
– Narrow window: better large structure
But we want BOTH…
Bilateral Filter [Tomasi & Manduchi 1998]:
A 2-D filter window: weights vary with intensity
c: distance from input (domain of input)
s: difference from input (range of input)
Further analysis: [Black99] [Elad02] [Durand & Dorsey02], …
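A compact brute-force sketch of the bilateral filter for a grayscale float image; the sigma values and window radius are illustrative defaults, not values from the papers cited above.

    import numpy as np

    def bilateral_filter(img, sigma_c=3.0, sigma_s=0.1, radius=6):
        # Weights combine closeness in the domain (c) with similarity in the range (s).
        H, W = img.shape
        out = np.zeros_like(img, dtype=np.float64)
        ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        domain_w = np.exp(-(xs**2 + ys**2) / (2 * sigma_c**2))      # fixed spatial kernel
        pad = np.pad(img, radius, mode='reflect')
        for y in range(H):
            for x in range(W):
                patch = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
                range_w = np.exp(-(patch - img[y, x])**2 / (2 * sigma_s**2))
                w = domain_w * range_w                              # intensity-dependent weights
                out[y, x] = (w * patch).sum() / w.sum()
        return out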
8
Bilateral Filter: 3 Difficulties
Bilateral → Trilateral Filter
Key Ideas:
• Tilt the filter window according to bilaterally-smoothed gradients
9
Application: Tone Mapping
• Filter removes details.
In → log10 → Detail-Removing Filter → Base → γ (<1: compression) → Out
                                     → Details → w (<1: detail strength) → Out
More Trilateral Results:
Comparable to Gradient Attenuation [Fattal et al 2002]
Similar to LCIS [Tumblin & Turk ‘99], [Bertozzi ’03]
Simple, Robust
10
Course 15: Computational Photography
Image Fusion and Reconstruction
B1: Reconstruction
• Epsilon Photography
– Vary time, view
– Vary focus, exposure, polarization, illumination
– Better than any one photo
Time-Lapse
• Duchamp – Nude Descending a Staircase
Freeman et al 2003
Agrawala et al, Digital Photomontage, Siggraph 2004
1
Computer Vision Techniques
• Photometric Stereo
– Varying light source positions
– Estimate surface normal from shading
– Diffuse objects: minimum 3 lights (see the sketch after this list)
• Depth from Defocus
– Varying focus
• Defogging
– Varying time and polarization
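Below is a minimal least-squares sketch of the photometric stereo item above, assuming a Lambertian surface, at least three calibrated light directions, and no shadow or highlight handling; the function name and interface are mine.

    import numpy as np

    def photometric_stereo(images, light_dirs):
        # images: K grayscale HxW images, one per light; light_dirs: K x 3 unit light directions.
        # Solves L g = I per pixel, where g = albedo * surface normal (Lambertian model).
        K = len(images)
        H, W = images[0].shape
        I = np.stack([im.reshape(-1) for im in images])     # K x (H*W) stacked intensities
        L = np.asarray(light_dirs, dtype=np.float64)        # K x 3
        g, *_ = np.linalg.lstsq(L, I, rcond=None)           # 3 x (H*W) least-squares solution
        albedo = np.linalg.norm(g, axis=0)
        normals = (g / np.maximum(albedo, 1e-8)).T.reshape(H, W, 3)
        return normals, albedo.reshape(H, W)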
Varying Focus: Depth from Defocus (Nayar, Watanabe and Noguchi, 95)
Figure: lens and image detectors at near and far focus; scene points P and Q; object distance o, image distance i, focal length f, aperture.
Thin lens law: 1/o + 1/i = 1/f
Previous Work: Pentland 87, Subbarao 88, Nayar 89.
Real Time Defocus Depth Camera (Movies) (Nayar, Watanabe, Noguchi 95)
Clear Day from Foggy Days (Shree Nayar, Srinivasa Narasimhan 00)
Two Different Foggy Conditions → Clear Day Image
Time: 3 PM; Deweathering
2
Varying Polarization
Yoav Y. Schechner, Nir Karpel 2005
• Schechner, Narasimhan, Nayar
• Instant dehazing of images using polarization
Best polarization state
Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging
3
Our Method
Film-Style Camera: Dynamic Range Limits. Problem: Map Scene to Display
Domain of Human Vision: from ~10⁻⁶ to ~10⁺⁸ cd/m²
(starlight, moonlight, office light, daylight, flashbulb)
Under-Exposure: highlight details captured, shadow details lost
Over-Exposure: highlight details lost, shadow details captured
Range of Typical Displays: from ~1 to ~100 cd/m² (8-bit pixel values 0 to 255)
4
High dynamic range capture (HDR), Debevec ’97 (see www.HDRshop.com)
• overcomes one of photography’s key limitations
– negative film = 250:1 (8 stops)
– paper prints = 50:1
– [Debevec97] = 250,000:1 (18 stops)
– hot topic at recent SIGGRAPHs
Plot: pixel value Z vs. log radiance logLi for images j = 0…6; response curve f(logL).
STEP 1:
– number the images ‘i’
– pick fixed spots (xj, yj) that sample the scene’s radiance values logLi well
© 2004 Marc Levoy
5
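A minimal sketch of merging an exposure stack into a (log) radiance map, assuming the pixel values have already been linearized; Debevec and Malik additionally recover the response curve f, which this sketch skips. The hat-shaped weighting is a common illustrative choice.

    import numpy as np

    def merge_exposures(images, exposure_times):
        # images: list of linear images scaled to [0,1]; exposure_times: matching shutter times.
        # Each pixel is a weighted average of (log pixel - log exposure), trusting mid-range
        # values most and near-saturated / near-black values least.
        acc = np.zeros_like(images[0], dtype=np.float64)
        wsum = np.zeros_like(acc)
        for img, t in zip(images, exposure_times):
            w = 1.0 - np.abs(2.0 * img - 1.0)        # hat weight: 1 at 0.5, 0 at 0 and 1
            acc += w * (np.log(img + 1e-6) - np.log(t))
            wsum += w
        return acc / np.maximum(wsum, 1e-6)          # log radiance, up to a constant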
HDR Direct Sensing?
– Self-resetting pixels (e.g. sMaL / Cypress Semi)
– Gradient detectors (CVPR 2005, Tumblin, Raskar et al)
HDR From Multiple Measurements:
Ginosar et al 92, Burt & Kolczynski 93, Madden 93, Tsai 94, Saito 95, Mann & Picard 95, Debevec & Malik 97, Mitsunaga & Nayar 99, Robertson et al. 99, Kang et al. 03
MANY ways to make multiple exposure measurements → Computed Image
Sequential Exposure Change: Assorted-Pixel Camera Prototype (Courtesy: Sony Kihara Research Lab)
Another Approach: Locally Adjusted Sensor Sensitivity (Computational Pixels)
6
Sensor: LCD Adaptive Light Attenuator
light → attenuator element Tt+1 → detector element It → Controller (feedback)
LCD Light Attenuator limits image intensity reaching the 8-bit sensor
Output: Unprotected 8-bit Sensor vs. Attenuator-Protected 8-bit Sensor
Improving FILM-LIKE Camera Performance
• Vary Focus Point-by-Point
High depth-of-field
Single-Axis Multi-Parameter Camera (SAMP)
Levoy et al., SIGG2005
Idea:
• adjacent views use different focus settings
• for each pixel, select sharpest view [Haeberli90]
SAMP: Cameras + Beamsplitters; place MANY (8) cameras at the same virtual location, ‘co-located’ lenses, fore & back focal planes (zF, zB)
Strongly desired in microscopy, too: see
http://www.micrographia.com/articlz/artmicgr/mscspec/mscs0100.htm
7
Focus Adjustment: Sum of Bundles
Long-range synthetic aperture photography
Improving FILM-LIKE Camera Performance: a tiled camera array
8
Same exposure in all cameras (1/60), or same exposures overlapped 50% (1/60, 1/120, 1/30, …)
Say you want 120 frame per second (fps) video.
• You could get one camera that runs at 120 fps
• Or… get 4 cameras running at 30 fps.
• Multiple measurements:
– Multi-camera, multi-sensor, multi-optics, multi-lighting
9
Computational Cameras:
Convergence of Optics and Software
Shree K. Nayar
Computer Science
Columbia University
http://www.cs.columbia.edu/CAVE/
Support:
NSF, ONR, Packard Foundation
T. C. Chang Endowed Chair
Traditional Camera
Detector
Lens
Pixels
c Shree Nayar, Columbia University
1
Computational Cameras
Detector
Computations New Optics
Pixels
Vision
c Shree Nayar, Columbia University
Wide Angle Imaging
Examples: Disney 55, McCutchen 91, Nalwa 96, Swaminathan & Nayar 99, Cutler et al. 02
Examples: Rees 70, Charles 87, Nayar 88, Yagi 90, Hong 91, Yamazawa 95, Bogner 95, Nalwa 96, Nayar 97, Chahl & Srinivasan 97
c Shree Nayar, Columbia University
2
What’s the Mirror’s Shape ?
camera
(with Simon Baker, ICCV 98)
scene
lens
z
mirror z(r) Complete Class of Mirrors
(z − c/2)² − r²(k/2 − 1) = (c²/4)·((k − 2)/k)      (k ≥ 2)
(z − c/2)² + r²(1 + c²/(2k)) = (2k + c²)/4         (k > 0)
c Shree Nayar, Columbia University
OneShot 360 by RemoteReality
(Nayar 97)
4 Megapixel (2000 x 2000)
360 degree still camera
c Shree Nayar, Columbia University
3
c Shree Nayar, Columbia University
4
Omnidirectional Periscope Wide Area Surveillance
Vehicle Navigation
Perimeter Monitoring
Commercial
Security
Virtual Tours
c Shree Nayar, Columbia University Near Vehicle Awareness (Courtesy : RemoteReality Inc.)
Radial Stereoscopic Imaging
(with Sujit Kuthirummal, SIGGRAPH 06)
c Shree Nayar, Columbia University
5
c Shree Nayar, Columbia University
c Shree Nayar, Columbia University
6
Mosaicing
c Shree Nayar, Columbia University
……Redundant Measurements
c Shree Nayar, Columbia University
7
Generalized Mosaicing
( Schechner and Nayar, ICCV 2001 )
Field of View
camera Dynamic Range
Spectrum
spatially varying
Depth of Field
optical filter
Polarization
c Shree Nayar, Columbia University
Exposure Spectrum Polarization Focus
focus
focal
c Shree Nayar, Columbia University
8
High Dynamic Range Mosaicing
Attenuation
c Shree Nayar, Columbia University
High Dynamic Range Mosaic
88 - 18,794
c Shree Nayar, Columbia University
9
Multispectral Mosaicing
Spectral
λ
x
c Shree Nayar, Columbia University
Multispectral Mosaic
400 500 600 λ 700 400 500 600 λ 700 400 500 600 λ 700
c Shree Nayar, Columbia University
10
High Dynamic Range Imaging: Assorted Pixels
scene
Spatially Varying
Exposures (SVE)
optical mask
pixels
image detector
( with Tomoo Mitsunaga, CVPR 2000)
c Shree Nayar, Columbia University
c Shree Nayar, Columbia University
11
Sony Cybershot Sony Cybershot with Assorted Pixels
c Shree Nayar, Columbia University
Motion Blur
Object Motion
Camera Motion
c Shree Nayar, Columbia University
12
Fundamental Trade‐Off in Imaging
3M 75K
2048x1536 320x240
Debluring Approach: Hybrid Imaging
y
Low‐Res. Camera
Motion Analysis
Same Time Period
PSF Estimation
High‐Res. Camera
Deconvolution
c Shree Nayar, Columbia University
13
Hybrid Imaging System: Prototype
Primary Detector
(2048x1536)
Secondary Detector
(360x240)
Resolution Ratio of 1 : 36
c Shree Nayar, Columbia University
Example: Blurred High Resolution Image
f = 633mm, Exp. Time 1 Sec (> ‐9 stops)
c Shree Nayar, Columbia University
14
Example: PSF Estimation from Motion
Estimated PSF
90 0.06
Low Resolution Video
Y (Pixels)
10
0.001
10 130
X (Pixels)
c Shree Nayar, Columbia University
Example: PSF Estimation from Motion
Blurred Image Deblurred image
Tripod image (Ground Truth)
c Shree Nayar, Columbia University
15
Super‐Resolution using Jitter Video
Conventional Video Jitter Video
Ti Ti
m m
e e
Space Space
(with Moshe Ben‐Ezra and Assaf Zomet, CVPR 2004)
c Shree Nayar, Columbia University
Jitter Camera
Lens Detector
Micro‐Actuator
Jitter is Instantaneous and Synchronized
c Shree Nayar, Columbia University
16
Jitter Camera
Lens Detector
Micro‐Actuator
Jitter is Instantaneous and Synchronized
c Shree Nayar, Columbia University
Computer Controlled Computer Controlled
Y Micro‐Actuator X Micro‐Actuator
Lens
c Shree Nayar, Columbia University
Board Camera
17
Super‐Resolution using Jitter Video
1 (out of 4) Jitter Camera Image Super‐Resolution
c Shree Nayar, Columbia University
18
Imaging Through Micro‐Mirrors
(with Vlad Branzoi and Terry Boult, 2004)
detector
scene xi
black
viewpoint
surface
ni
oi nb
mirror array
Geometry (Ray Orientation): oi(t) = G(xi, ni(t))
Photometry (Ray Attenuation): ai(t), determined by the mirror orientations ni(t) and nb
c Shree Nayar, Columbia University
Digital Micromirror Device (DMD)
(by Texas Instruments)
o o
‐10 10
14 um
DMD for Imaging:
(Malbet et al. 95, Kearney et al. 98, Castracane et al. 99, Christensen et al. 02)
c Shree Nayar, Columbia University
19
Programmable Imaging System
Imaging Lens DMD Electronics
Modulation: Examples
Scene DMD Image Camera Image
* =
* =
* =
c Shree Nayar, Columbia University
20
Adaptive Dynamic Range Imaging (ADR)
Normal (Constant Exposure) Video
Pixel‐wise Dynamic
Range Control
(Nayar & Branzoi 03)
(Christensen et al. 02)
ADR Video DMD Control Video
c Shree Nayar, Columbia University
Camera with a Lens
Scene
Aperture
Lens
Image Detector
c Shree Nayar, Columbia University
21
Lensless Camera with Volumetric Aperture
Scene
Volumetric
Aperture
Image Detector
(with Assaf Zomet, 2005)
c Shree Nayar, Columbia University
Single Aperture Layer
u = tan(α )
S (u, v )
Scene
Single Layer
Aperture
α
Image Detector
f
22
Initial Implementation: LCD Attenuator
Camera without Lens LCD Aperture
LCD
Controller
c Shree Nayar, Columbia University
Panning without Moving Parts
LCD Attenuator
Image Detector
Captured Video
c Shree Nayar, Columbia University
23
Multiple Aperture Layers
u = tan(α )
S (u, v )
Scene
Multi‐Layered
Aperture
α
j=1
j=2
.
. Image Detector
Pixel Brightness:
I(x, y) = ∫∫ S(u, v) ∏_{j=1}^{N} T_j(x − f_j·u, y − f_j·v) du dv
Conventional View Desired View
c Shree Nayar, Columbia University
24
Split Field of View using Multiple Layers
Fov 2
Fov 1
Fov 3
Attenuating Layers
Pinholes
Split Field of View
Lens
Camera
Lensless
Camera
c Shree Nayar, Columbia University
25
Computational Cameras
Detector
Computations New Optics
Pixels
Vision
c Shree Nayar, Columbia University
Programmable Imaging
Detector
Computations New Optics
Pixels
Vision Programmable Controller
c Shree Nayar, Columbia University
26
Light field
photography and videography
Marc Levoy
Time = 1
List of projects
Time = 2
High performance imaging
using large camera arrays
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville Talvala, Emilio Antunez,
Adam Barth, Andrew Adams, Mark Horowitz, Marc Levoy
Time = 3
Stanford multi-camera array
• synchronized timing
• continuous streaming
• flexible arrangement
Time = 4
Ways to use large camera arrays
Time = 5
Intermediate camera spacing:
synthetic aperture photography
Time = 6
Example using 45 cameras
[Vaish CVPR 2004]
•Reference:
–Vaish, V., Wilburn, B., Joshi, N., Levoy, M., Using
Plane + Parallax for Calibrating Dense Camera
Arrays,
Proc. CVPR 2004.
Time = 7
•At left is a single view, at right is a sum of all views,
appropriately shifted.
•For the movie, see 2nd bullet of “Slides and videos” on
http://graphics.stanford.edu/projects/array/
Time = 8
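A sketch of the shift-and-sum idea behind these synthetic aperture results: translate each camera's image by an amount proportional to its position in the array and average, so that objects on the chosen focal plane align while everything else blurs. The integer per-camera shift model is a simplification of the calibrated plane + parallax warps used in the actual system.

    import numpy as np

    def synthetic_aperture(images, cam_positions, depth_scale):
        # images: list of HxW arrays; cam_positions: (dx, dy) offsets of each camera in the array.
        # depth_scale selects the synthetic focal plane (pixels of shift per unit camera offset).
        acc = np.zeros_like(images[0], dtype=np.float64)
        for img, (dx, dy) in zip(images, cam_positions):
            sx = int(round(dx * depth_scale))
            sy = int(round(dy * depth_scale))
            # np.roll wraps at the borders; a real implementation would crop or pad instead.
            acc += np.roll(np.roll(img, sy, axis=0), sx, axis=1)
        return acc / len(images)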
Video
•Video available at
http://graphics.stanford.edu/papers/CameraArray/
Time = 9
Tiled camera array
Can we match the image quality of a cinema camera?
Time = 10
Tiled panoramic image
(before geometric or color calibration)
Time = 11
Tiled panoramic image
(after calibration and blending)
Time = 12
Tiled camera array
Can we match the image quality of a cinema camera?
Time = 13
same exposure
in all cameras
individually
metered
checkerboard
of exposures
Time = 14
High-performance photography
as multi-dimensional sampling
• spatial resolution
• field of view
• frame rate
• dynamic range
• bits of precision
• depth of field
• focus setting
• color sensitivity
Time = 15
Spacetime aperture shaping
Time = 16
• center of aperture: few cameras, long exposure →
high depth of field, low noise,
but action is blurred
• periphery of aperture: many cameras, short exposure →
freezes action, low noise,
but low depth of field
Time = 17
Time = 18
Time = 19
Light field photography using a
handheld plenoptic camera
Ren Ng, Marc Levoy, Mathieu Brédif,
Gene Duval, Mark Horowitz and Pat Hanrahan
Time = 20
Conventional versus light field camera
Time = 21
Conventional versus light field camera
uv-plane st-plane
Time = 22
Conventional versus light field camera
st-plane uv-plane
Time = 23
Prototype camera
Time = 24
Mechanical design
Time = 25
Time = 26
Prior work
• integral photography
– microlens array + film
– application is autostereoscopic effect
• [Adelson 1992]
– proposed this camera
– built an optical bench prototype using relay lenses
– application was stereo vision, not photography
•Reference:
–Adelson, E.H., Wang, J.Y.A., Single Lens Stereo
with a Plenoptic Camera ,
–IEEE Transactions on Pattern Analysis and
Machine Intelligence (PAMI),
Vol. 14, No. 2, 1992, pp. 99-106.
Time = 27
Digitally stopping-down
Time = 28
Digital refocusing
Time = 29
A digital refocusing theorem
– or –
Time = 30
Example of digital refocusing
Time = 31
Refocusing portraits
Time = 32
Action photography
Time = 33
Extending the depth of field
Time = 34
Macrophotography
Time = 35
Digitally moving the observer
Time = 36
Example of moving the observer
Time = 37
Moving backward and forward
Time = 38
Implications
Time = 39
Dual Photography
Time = 40
Helmholtz reciprocity
light
camera
scene
Time = 41
Helmholtz reciprocity
camera
light
scene
Time = 42
Measuring transport along a set of paths
projector photocell
scene
Time = 43
Reversing the paths
camera
point light
scene
Time = 44
Forming a dual photograph
projector photocell
scene
Time = 45
Forming a dual photograph
“dual” camera
“dual” light
image of
scene
scene
Time = 46
Physical demonstration
Time = 47
Related imaging methods
• time-of-flight scanner
– if they return reflectance as well as range
– but their light source and sensor are typically coaxial
Time = 48
The 4D transport matrix
projector camera
scene
Time = 49
The 4D transport matrix
projector camera
P C
pq x 1 mn x 1
mn x pq
scene
Time = 50
The 4D transport matrix
mn x pq
C = T P
mn x 1 pq x 1
Time = 51
The 4D transport matrix
mn x pq
1
0
C = T 0
0
0
mn x 1 pq x 1
Time = 52
The 4D transport matrix
mn x pq
0
1
C = T 0
0
0
mn x 1 pq x 1
Time = 53
The 4D transport matrix
mn x pq
0
0
C = T 1
0
0
mn x 1 pq x 1
Time = 54
The 4D transport matrix
mn x pq
C = T P
mn x 1 pq x 1
Time = 55
The 4D transport matrix
mn x pq
C = T P
mn x 1 pq x 1
C’ = TT P’
pq x 1 mn x 1
Time = 56
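A toy numerical sketch of the transport-matrix relations on these slides: a primal photo is C = T P and the dual photo is C' = T^T P'. The sizes and the random transport matrix are purely illustrative, just to show the linear algebra and the reciprocity it encodes.

    import numpy as np

    pq, mn = 16, 9                       # projector pixels, camera pixels (toy sizes)
    T = np.random.rand(mn, pq) * 0.1     # transport matrix, flattened to 2D (mn x pq)

    P = np.zeros(pq); P[5] = 1.0         # light a single projector pixel
    C = T @ P                            # primal photo: what the camera sees (mn x 1)

    P_dual = np.zeros(mn); P_dual[2] = 1.0    # "light" a single camera pixel instead
    C_dual = T.T @ P_dual                     # dual photo: what the projector would see (pq x 1)

    # Helmholtz reciprocity: the same transport entry is measured either way.
    assert np.isclose(C[2], C_dual[5])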
Example
Video available at
http://graphics.stanford.edu/papers/dual_photography/
Time = 57
Properties of the transport matrix
Time = 58
Dual photography
from diffuse reflections
Time = 59
The relighting problem
Paul Debevec’s
Light Stage 3
Time = 60
The 6D transport matrix
Time = 61
The 6D transport matrix
Time = 62
The advantage of dual photography
Time = 63
Measuring the 6D transport matrix
mirror array
projector
scene
Time = 64
Relighting with complex illumination
camera array
projector
pq x mn x uv
C’ = TT P’
scene pq x 1 mn x uv x 1
Time = 65
Running time
•5 minutes
–using a video-rate camera
–and (effectively) measuring 1M x 1M transport
entries
–for scenes having average amounts of diffuse
interreflection
–everything depends on the density of the T matrix
Time = 66
Can we measure an 8D transport matrix?
projector array camera array
scene
Time = 67
C = T P
mn x 1 mn x pq pq x 1
Time = 68
Computational Imaging
in the Sciences (and Medicine)
Marc Levoy
Due to copyright restrictions, some images have been removed from this version of the slides.
To see presentation with these images intact, go to:
http://graphics.stanford.edu/courses/cs448a-06-winter/
and look under the heading “Lectures for SIGGRAPH 2006 course on Computational Photography”.
Time = 1
Some examples
• medical imaging
– rebinning inspiration for light field rendering
✓ – transmission tomography
– reflection tomography (for ultrasound)
• geophysics
✓ – borehole tomography
– seismic reflection surveying
• applied physics
✓ – diffuse optical tomography
✓ – diffraction tomography      (✓ = covered in this lecture)
time-of-flight or wave-based
– inverse scattering ©2006 Marc Levoy
Time = 2
• biology
✓ – confocal microscopy (applicable at macro scale too)
✓ – deconvolution microscopy (related to tomography)
• astronomy
✓ – coded-aperture imaging
– interferometric imaging
• airborne sensing
– multi-perspective panoramas
✓ – synthetic aperture radar
Time = 3
• optics
✓ – holography
– wavefront coding
Time = 4
Confocal scanning microscopy
light source
pinhole
•typical reference:
–Corle, T.R., Kino, G.S., Confocal Scanning
Optical Microscopy and Related Imaging
Systems,
Academic Press, 1996.
•if you introduce a pinhole
–only one point on the focal plane will be
illuminated
Time = 5
Confocal scanning microscopy
r
light source
pinhole
pinhole
photocell © 2004 Marc Levoy
Time = 6
Confocal scanning microscopy
light source
pinhole
pinhole
photocell © 2004 Marc Levoy
Time = 7
Confocal scanning microscopy
light source
pinhole
pinhole
photocell © 2004 Marc Levoy
Time = 8
Image removed due to
copyright restrictions
[UMIC SUNY/Stonybrook]
Time = 9
Synthetic confocal scanning
[Levoy 2004]
light source
→ 5 beams
→ 0 or 1 beams
•our goal
–is to approximate this effect at the large
scale
•we can understand the photometry of this setup
–using a simplified counting argument
•Reference:
–Levoy, M., Chen, B., Vaish, V., Horowitz,
M., McDowall, I., Bolas, M., Synthetic
aperture confocal imaging,
Proc. SIGGRAPH 2004.
Time = 10
Synthetic confocal scanning
light source
→ 5 beams
→ 0 or 1 beams
•5:0 or 5:1
–if we had 5 cameras as well as 5 projectors,
then the ratio would be 25:0 or 25:1
Time = 11
Synthetic confocal scanning
→ 5 beams
→ 0 or 1 beams
dof
• works with any number of projectors ≥ 2
• discrimination degrades if point to left of
• no discrimination for points to left of
• slow!
• poor light efficiency ©2006 Marc Levoy
•depth of field
–a microscoper would call it the axial
resolution
–to make the depth of field shallower,
spread out the projectors, i.e. a larger
synthetic aperture
Time = 12
Synthetic coded-aperture
confocal imaging
•Reference:
–Wilson, T., Juskaitis, R., Neil, M.A.A.,
Kozubek, M., Confocal microscopy by
aperture correlation,
Optics Letters, Vol. 21, No. 23, December 1,
1996.
Time = 13
Synthetic coded-aperture
confocal imaging
Time = 14
Synthetic coded-aperture
confocal imaging
Time = 15
Synthetic coded-aperture
confocal imaging
Time = 16
Synthetic coded-aperture
confocal imaging
100 trials
→ 2 beams × 50/100 trials = 1
→ ~1 beam × 50/100 trials = 0.5
Time = 17
Synthetic coded-aperture
confocal imaging
100 trials
→ 2 beams × 50/100 trials = 1
→ ~1 beam × 50/100 trials = 0.5
floodlit
→ 2 beams
→ 2 beams
trials – ¼ × floodlit
→ 1 – ¼ ( 2 ) = 0.5
→ 0.5 – ¼ ( 2 ) = 0
Time = 18
Synthetic coded-aperture
confocal imaging
100 trials
→ 2 beams × 50/100 trials = 1
→ ~1 beam × 50/100 trials = 0.5
floodlit
→ 2 beams
→ 2 beams
trials – ¼ × floodlit
→ 1 – ¼ ( 2 ) = 0.5
• works with relatively few trials (~16) → 0.5 – ¼ ( 2 ) = 0
• 50% light efficiency
• works with any number of projectors ≥ 2
• discrimination degrades if point vignetted for some projectors
• no discrimination for points to left of
• needs patterns in which illumination of tiles are uncorrelated
©2006 Marc Levoy
Time = 19
Example pattern
Time = 20
What are good patterns?
pattern one trial 16 trials
Time = 21
Patterns with less aliasing
multi-phase
sinusoids?
[Neil 1997]
•reference:
–Neil, M.A.A., Juskaitis, R., Wilson, T.,
Method of obtaining optical sectioning by
using structured light in a conventional
microscope,
Optics Letters, Vol. 22, No. 24, December
15, 1997.
Time = 22
Implementation
using an array of mirrors
Time = 23
Implementation using an
array of mirrors
Time = 24
Confocal imaging in scattering media
• small tank
– too short for attenuation
– lit by internal reflections
Time = 25
Experiments in a large water tank
Time = 26
Experiments in a large water tank
•titanium dioxide
–the stuff in white paint
Time = 27
Experiments in a large water tank
Time = 28
Seeing through turbid water
Time = 29
Other patterns
sparse grid
staggered grid
swept stripe
© 2004 Marc Levoy
Time = 30
Other patterns
Time = 31
Stripe-based illumination
Time = 32
Application to
underwater exploration
[Ballard/IFE 2004]
Image removed due to
copyright restrictions
Time = 33
Shaped illumination in a
computer vision algorithm
Time = 34
Shaped illumination in a
computer vision algorithm
Time = 35
Confocal imaging versus
triangulation rangefinding
• triangulation
– line sweep of W pixels, or log(W) time sequence of stripes, W ≈ 1024
– projector and camera lines of sight must be unoccluded, so requires S scans, 10 ≤ S ≤ 100
– one projector and camera
– S log(W) ≈ 100-1000
– 30º
• confocal
– point scan over W² pixels, or time sequence of T trials, T ≈ 32-64
– works if some fraction of aperture is unoccluded, but gets noisier; max aperture ≈ 90°, so 6-12 sweeps?
– multiple projectors and cameras
– 6 T = 200-800, no moving parts
– 90º
© 2006 Marc Levoy
Time = 36
The Fourier projection-slice theorem
(a.k.a. the central section theorem) [Bracewell 1956]
P(t), G(ω)
Image removed due to
copyright restrictions
(from Kak)
•References:
–Bracewell, R. N. Strip Integration in Radio
Astronomy,
Australian Journal of Physics, Vol. 9, 1956,
No. 2, pp. 198-217.
–Kak, A, Slaney, M., Principles of
Computerized Tomographic Imaging,
IEEE Press, 1988.
Time = 37
Reconstruction of g(x,y)
from its projections
P(t)
G(ω)
Image removed due to
copyright restrictions
Gθ(ω) = G(u, v) |θ
Pθ(t) = ∫_{−∞}^{+∞} Pθ(t, s) ds      (from Kak)
Time = 38
The need for filtering before
(or after) backprojection
(u,v frequency plane: 1/ω hot spot at the origin; |ω| correction filter)
Time = 39
F⁻¹{ |ω| } = Hilbert transform of (∂/∂t) P(t)
= −1/(πt) * (∂/∂t) P(t)
= F⁻¹{ lim_{ε→0} |ω| e^{−ε|ω|} }
[Images removed due to copyright restrictions] (from Bracewell)
• a sum of slices would create a 1/ω hot spot at the origin (~2nd derivative)
• correct by multiplying each slice by |ω|, or
• convolve P(t) by F⁻¹{ |ω| } before backprojecting
• this is called filtered backprojection
•Reference:
–Bracewell, R.N., The Fourier Transform
and its Applications,
2nd ed., McGraw-Hill, 1985.
Time = 40
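A compact sketch of filtered backprojection for parallel-beam projections: ramp-filter each projection in the frequency domain (the |ω| filter above), then smear the filtered projections back across the image. Interpolation and windowing of |ω| are kept to the bare minimum; the interface is illustrative.

    import numpy as np

    def filtered_backprojection(sinogram, thetas_deg):
        # sinogram: (num_angles, num_detectors); thetas_deg: projection angles in degrees.
        # Returns a reconstruction on a num_detectors x num_detectors grid.
        n_ang, n_det = sinogram.shape
        ramp = np.abs(np.fft.fftfreq(n_det))                        # |omega| ramp filter
        filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

        recon = np.zeros((n_det, n_det))
        c = (n_det - 1) / 2.0
        ys, xs = np.mgrid[0:n_det, 0:n_det] - c                     # image-plane coordinates
        for p, th in zip(filtered, np.deg2rad(thetas_deg)):
            t = xs * np.cos(th) + ys * np.sin(th) + c               # detector coordinate per pixel
            idx = np.clip(np.round(t).astype(int), 0, n_det - 1)
            recon += p[idx]                                         # backproject (nearest neighbor)
        return recon * np.pi / n_ang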
Summing filtered
backprojections
(from Kak)
Time = 41
Example of reconstruction
by filtered backprojection
X-ray sinogram → filtered sinogram → reconstruction
(from Herman)
©2006 Marc Levoy
•Reference:
–Herman, G.T., Image Reconstruction from
Projections,
Academic Press, 1980.
Time = 42
More examples
CT scan
of head
Time = 43
Shape from light fields
using filtered backprojection
backprojection occupancy
•Reference:
–M. Levoy, unpublished.
•but also see:
–Brady, D.J., Stack, R., Feller, S., Cull, E.,
Fernandez, L., Kammeyer, D., and Brady,
R., Information Flow in Streaming 3D
Video,
Proc. SPIE, Vol. CR76-13, 2000.
Time = 44
Relation to Radon Transform
P(r, θ)
• Radon transform
P(r, θ) = ∫_{−∞}^{+∞} g(r cos θ − s sin θ, r sin θ + s cos θ) ds
Time = 45
= [ B H1 D1 P ](r, θ), where
D1 is partial differentiation wrt the 1st variable,
H1 is the Hilbert transform wrt the 1st variable, and
B is backprojection
• Radon transform
P(r, θ) = ∫_{−∞}^{+∞} g(r cos θ − s sin θ, r sin θ + s cos θ) ds
Time = 46
Higher dimensions
• Radon transform in Rⁿ
– P(r, ξ), where ξ is a unit vector in Rⁿ, r is a scalar,
and P contains integrals over (n−1)-D hyperplanes
[Image removed due to copyright restrictions]
– in 2D: each point (in P) is the integral along the line (in g)
perpendicular to a ray connecting that point and the origin
– in 3D: each point is the integral across a plane
normal to a ray connecting that point and the origin ©2006 Marc Levoy
•Reference:
–Deans, S., The Radon Transform and Some
of its Applications,
Krieger, 1983.
Time = 47
Frequency domain volume rendering
[Totsuka and Levoy, SIGGRAPH 1993]
volume rendering
•Reference:
–Totsuka, T. and Levoy, M., Frequency Domain
Volume Rendering,
Proc. SIGGRAPH 1993.
•Example depth cueing
–sinusoidal falloff = multiplication of volume by large
sinusoid
–= convolution of spectrum by F(sinusoid) =
convolution by spike = shifting the 3D spectrum
before extracting slice!
•Example directional shading
–we can’t compute |N.L| or max(N.L,0)
–Lambertian under hemispherical shading = ½ + ½
N.L, which smoothly maps N.L to 0..1
–N.L = first derivative of volume in direction x of
pole = directional first moment (x f(x)) of spectrum,
which again can be computed while extracting a slice
Time = 48
Other issues in tomography
Time = 49
Limited-angle projections
(from Olson)
•Reference:
–Olson, T., Jaffe, J.S., An explanation of the
effects of squashing in limited angle
tomography,
IEEE Trans. Medical Imaging, Vol. 9, No.
3., September 1990.
Time = 50
Reconstruction using Algebraic
Reconstruction Technique (ART)
pi = Σ_{j=1}^{N} wij fj ,   i = 1, 2, …, M
Time = 51
Image removed due to
copyright restrictions
f^(k) = f^(k−1) − [ (f^(k−1) · wi − pi) / (wi · wi) ] · wi
Image removed due to
copyright restrictions
Procedure
• make an initial guess, e.g. assign zeros to all cells
• project onto p1 by increasing cells along ray 1 until Σ = p1
• project onto p2 by modifying cells along ray 2 until Σ = p2, etc.
• to reduce noise, reduce by α Δf (k ) for α < 1
•Formula is derived in Kak, chapter 7, p. 278
Time = 52
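A small sketch of the ART update above (the Kaczmarz method): cycle through the rays and project the current estimate onto each hyperplane wi · f = pi, with a relaxation factor alpha < 1 playing the noise-reduction role mentioned in the procedure. The 2x2 test volume is synthetic.

    import numpy as np

    def art(W, p, n_sweeps=200, alpha=0.9):
        # W: M x N ray-weight matrix, p: M measured ray sums. Returns an estimate of f.
        M, N = W.shape
        f = np.zeros(N)                                   # initial guess: all zeros
        for _ in range(n_sweeps):
            for i in range(M):
                wi = W[i]
                denom = wi @ wi
                if denom > 0:
                    f -= alpha * ((f @ wi - p[i]) / denom) * wi   # project onto hyperplane i
            f = np.maximum(f, 0)                          # optional positivity constraint
        return f

    # Tiny synthetic test: a 2x2 "volume" measured by its row and column sums.
    f_true = np.array([1.0, 2.0, 3.0, 4.0])
    W = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]], float)
    print(art(W, W @ f_true))   # converges toward [1, 2, 3, 4], the minimum-norm consistent solution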
• linear system, but big, sparse, and noisy
• ART is solution by method of projections [Kaczmarz 1937]
• to increase angle between successive hyperplanes, jump by 90°
• SART modifies all cells using f (k-1), then increments k
• overdetermined if M > N, underdetermined if missing rays
• optional additional constraints:
• f > 0 everywhere (positivity)
• f = 0 outside a certain area
Procedure
• make an initial guess, e.g. assign zeros to all cells
• project onto p1 by increasing cells along ray 1 until Σ = p1
• project onto p2 by modifying cells along ray 2 until Σ = p2, etc.
• to reduce noise, reduce by α Δf (k ) for α < 1
•SIRT = Simultaneous Iterative Reconstruction Technique
•SART = Simultaneous ART
Time = 53
• linear system, but big, sparse, and noisy
• ART is solution by method of projections [Kaczmarz 1937]
• to increase angle between successive hyperplanes, jump by 90°
• SART modifies all cells using f (k-1), then increments k
• overdetermined if M > N, underdetermined if missing rays
• optional additional constraints:
• f > 0 everywhere (positivity)
• f = 0 outside a certain area
(Olson)
•Reference:
–Olson, T., A stabilized inversion for limited
angle tomography. Manuscript.
–35 degrees missing
Time = 54
Image removed due to
copyright restrictions
(Olson)
•Nonlinear constraints
–f = 0 outside of circle (oval?)
Time = 55
Shape from light fields
using iterative relaxation
•Reference:
–M. Levoy, unpublished
•but also see:
–DeBonet, J., Viola, P., Poxels:
responsibility weighted 3D volume
reconstruction,
Proc. ICCV 1999.
Time = 56
Borehole tomography
(from Reynolds)
•Reference:
–Reynolds, J.M., An Introduction to Applied
and Environmental Geophysics,
Wiley, 1997.
Time = 57
Applications
Time = 58
From microscope light fields
to volumes
• 4D light field → digital refocusing →
3D focal stack → deconvolution microscopy →
3D volume data
•Reference:
–http://www.api.com/lifescience/DeltaVision
RT.html
Time = 59
3D deconvolution
[McNally 1999]
•Reference:
–McNally, J.G., Karpova, T., Cooper, J.,
Conchello, J.A., Three-Dimensional Imaging
by Deconvolution Microscopy,
Methods, Vol. 19, 1999.
Time = 60
Example: 15μ hollow fluorescent bead
conventional microscope
* =
* =
•Images from:
– Levoy, M., Ng, R., Adams, A., Footer, M.,
Horowitz, M., Light field microscopy,
ACM Transactions on Graphics (Proc.
SIGGRAPH), 25(3), 2006.
Time = 61
Silkworm mouth
(collection of B.M. Levoy)
Time = 62
Legs of unknown insect
(collection of B.M. Levoy)
Time = 63
Tomography and 3D deconvolution: how different are they?
Fourier domain:
• deconvolution
– 4D LF → refocusing → F → 3D spectrum → ÷ F{PSF} → F⁻¹ → volume
• tomography
Spatial domain:
• deconvolution
– 4D LF → refocusing → 3D stack → inverse filter → volume
• tomography
– 4D LF → backprojection → backprojection filter → volume
© 2006 Marc Levoy
•Full proof appears in [Levoy 2006] (previously cited)
Time = 64
For finite apertures,
they are still the same
• deconvolution
– nonblind iterative deconvolution with
positivity constraint on 3D reconstruction
• limited-angle tomography
– Simultaneous Algebraic Reconstruction
Technique (SART) with same constraint
Time = 65
Their artifacts are also the same
[Delaney 1998]
[McNally 1999]
© 2006 Marc Levoy
•References:
–Delaney, A.H., Bresler, Y., Globally
Convergent Edge-Preserving Regularized
Reconstruction: An Application to Limited-
Angle Tomography,
IEEE Transactions on Image Processing,
Vol. 7, No. 2, February 1998.
–others are previously cited
Time = 66
Diffraction tomography
limit as λ → 0 (relative to
object size) is Fourier
(from Kak) projection-slice theorem
Time = 67
[Devaney 2005]
limit as λ → 0 (relative to
object size) is Fourier
(from Kak) projection-slice theorem
Time = 68
Inversion by
filtered backpropagation
backprojection
backpropagation
Image removed due to
copyright restrictions
(Jebali)
•Reference:
–Jebali, A., Numerical Reconstruction of
semi-transparent objects in Optical
Diffraction Tomography,
Diploma Project, Ecole Polytechnique,
Lausanne, 2002.
Time = 69
Diffuse optical tomography
(Arridge)
•References:
–Arridge, S.R., Methods for the Inverse
Problem in Optical Tomography,
Proc. Waves and Imaging Through Complex
Media, Kluwer, 307-329, 2001.
–Schweiger, M., Gibson, A., Arridge, S.R.,
“Computational Aspects of Diffuse Optical
Tomography,”
IEEE Computing, Vol. 5, No. 6, Nov./Dec.,
2003. (for image)
–Jensen, H.W., Marschner, S., Levoy, M.,
Hanrahan, P., A Practical Model for
Subsurface Light Transport,
Proc. SIGGRAPH 2001.
Time = 70
The optical diffusion equation
Time = 71
Diffuse optical tomography
•acquisition
–81 source positions, 81 detector positions
–for each source position, measure light at
all detector positions
–use time-of-flight measurement to estimate
initial guess for absorption, to reduce cross-
talk between absoprtion and scattering
Time = 72
Coded aperture imaging
(from Zand)
(source assumed infinitely distant)
•Reference:
–Zand, J., Coded aperture imaging in high
energy astronomy,
http://lheawww.gsfc.nasa.gov/docs/cai/coded
_intr.html
Time = 73
Reconstruction by
matrix inversion
d = C s,   s = C⁻¹ d
[d1]   [C1 C2 C3 C4] [s1]
[d2] = [C4 C1 C2 C3] [s2]
[d3]   [C3 C4 C1 C2] [s3]
[d4]   [C2 C3 C4 C1] [s4]
(d: detector, C: 0/1 mask, s: source)
• ill-conditioned unless the auto-correlation of the mask is a delta function
Time = 74
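A toy sketch of this circulant system: build C from cyclic shifts of a 0/1 mask, form d = C s, and recover s by inverting C. The particular mask below keeps C invertible; a mask whose autocorrelation is far from a delta function would make the inversion amplify noise. All names and sizes are illustrative.

    import numpy as np

    mask = np.array([1, 0, 1, 1, 0, 1, 0], dtype=float)   # 0/1 aperture pattern (toy length 7)
    n = len(mask)
    C = np.array([np.roll(mask, k) for k in range(n)])    # circulant matrix: row k = mask shifted by k

    s = np.random.rand(n)                                 # unknown source distribution
    d = C @ s                                             # detector measurements, d = C s

    s_hat = np.linalg.solve(C, d)                         # reconstruction, s = C^-1 d
    print(np.allclose(s_hat, s))                          # True (noise-free, invertible mask)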
Reconstruction
by backprojection
(from Zand)
•Another example:
–Carlisle, P., Coding aperture imaging
("Mmm, gumballs..."),
http://paulcarlisle.net/old/codedaperture.htm
l
–cross correlation is just convolution (of
detected image by mask) without first
reversing detected image in x and y
–conversion of blacks to -1’s in “decoding
matrix” just serves to avoid normalization of
resulting reconstruction
–performing this on an image of gumballs,
rather than a 3D gumball scene, is equivalent
to assuming the gumballs cover the sky at
infinity, i.e. they are an angular function
•assumes non-occluding source
–otherwise it’s the voxel coloring problem
Time = 75
Interesting techniques
I didn’t have time to cover
• reflection tomography
• synthetic aperture radar & sonar
• holography
• wavefront coding
Time = 76
Computational Illumination
Ramesh Raskar
Computational
Mitsubishi Electric Research Labs
Illumination
Course WebPage :
http://www.merl.com/people/raskar/photo/course/
Ramesh Raskar, Computational Illumination Ramesh Raskar, CompPhoto Class Northeastern, Fall 2005
• Spatial Modulation
– Synthetic Aperture Illumination
• Temporal Modulation
– TV remote, Motion Tracking, Sony ID-cam, RFIG
• Exploiting (uncontrolled) natural lighting condition
– Day/Night Fusion
Display
Recreate 4D Lightfield    Scene: 8D Ray Modulator
Ramesh Raskar, Computational Illumination Ramesh Raskar, Computational Illumination
• Spatial Modulation
– Synthetic Aperture Illumination
• Temporal Modulation
– TV remote, Motion Tracking, Sony ID-cam, RFIG
• General lighting condition
– Day/Night
Ramesh Raskar, Computational Illumination
1
Capturing HDR Image: Varying Exposure time, Varying Flash brightness, Varying both
Flash-Exposure Sampling (Flash Brightness vs. Exposure Time)
Flash-Exposure HDR: Varying both
Ramesh Raskar, Computational Illumination
No-flash
Ramesh Raskar, CompPhoto Class Northeastern, Fall 2005
Flash
Ramesh Raskar, CompPhoto Class Northeastern, Fall 2005
Elmar Eisemann and Frédo Durand, Flash Photography Enhancement via Intrinsic Relighting
Georg Petschnigg, Maneesh Agrawala, Hugues Hoppe, Richard Szeliski, Michael Cohen, Kentaro Toyama, Digital Photography with Flash and No-Flash Image Pairs
Use the no-flash image to relight the flash image
Computational Illumination
• Presence or Absence
– Flash/No-flash
• Light position
– Multi-flash for depth edges
– Programmable dome (image re-lighting and matting)
• Light color/wavelength
2
Mitsubishi Electric Research Labs MultiFlash NPR Camera Raskar, Tan, Feris, Yu, Turk
3
4
Depth Discontinuities
Canny Intensity Edge Detection (Sigma = 1, 5, 9) vs. Depth Edges
5
Photo
Our Method
Canny Our Method
Photo Result
Our Method
6
Imaging Geometry
Shadow lies along the epipolar ray; the shadow and the epipole are on opposite sides of the depth edge.
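A rough sketch of the depth-edge logic described here, under simplifying assumptions: one grayscale image per flash, epipolar directions approximated by the image axes (valid for flashes placed just left/right/above/below the lens), and a plain threshold on the intensity drop of each ratio image along that direction. This is an illustration of the idea, not the authors' exact detector.

    import numpy as np

    def depth_edges(flash_images, directions, thresh=0.1):
        # flash_images: list of HxW images, one per flash position.
        # directions: per-image step (dy, dx) along the epipolar ray, away from that flash's epipole.
        max_img = np.maximum.reduce(flash_images) + 1e-6      # shadow-free composite
        edges = np.zeros(flash_images[0].shape, dtype=bool)
        for img, (dy, dx) in zip(flash_images, directions):
            ratio = img / max_img                             # ~1 where lit, ~0 inside the shadow
            ahead = np.roll(ratio, shift=(-dy, -dx), axis=(0, 1))
            edges |= (ratio - ahead) > thresh                 # sharp drop into shadow marks a depth edge
        return edges

    # usage sketch: left/right/top/bottom flashes, directions pointing away from each epipole
    # edges = depth_edges([img_left, img_right, img_top, img_bottom],
    #                     directions=[(0, 1), (0, -1), (1, 0), (-1, 0)])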
7
8
Photomontage
courtesy of A Agrawala
courtesy of P. Debevec
Ramesh Raskar, CompPhoto Class Northeastern, Fall 2005
• Spatial Modulation
– Synthetic Aperture Illumination
• Temporal Modulation
– TV remote, Motion Tracking, Sony ID-cam, RFIG
• General lighting condition
– Day/Night
Camera array for synthetic aperture: width of aperture 6’, number of cameras 45, spacing between cameras 5”, camera’s field of view 4.5°
Ramesh Raskar, Computational Illumination    © 2004 Marc Levoy
9
Synthetic aperture photography using an array of mirrors
The scene
• technologies
– array of projectors
– array of microprojectors
– single projector + array of mirrors
10
Underwater confocal imaging with and without SAP
6-D Methods and beyond...
Relighting with 4D Incident Light Fields, Vincent Masselus, Pieter Peers, Philip Dutre and Yves D. Willems, SIGG2003
• Spatial Modulation
– Synthetic Aperture Illumination
• Temporal Modulation
– TV remote, Motion Tracking, Sony ID-cam, RFIG
• General lighting condition
– Day/Night
• Motion Capture Cameras
– Visualeyez™ VZ4000 Tracking System
– PhaseSpace motion digitizer
Ramesh Raskar, Computational Illumination
Demodulating Cameras
11
Mitsubishi Electric Research Labs R F I G Lamps Raskar, Beardsley, vanBaar, Wang, Dietz, Lee, Leigh, Willwacher
R F I G Lamps: Interacting with a Self-describing World via Photosensing Wireless Tags and Projectors
Ramesh Raskar, Paul Beardsley, Jeroen van Baar, Yao Wang, Paul Dietz, Johnny Lee, Darren Leigh, Thomas Willwacher
Radio Frequency Identification Tags (RFID): no batteries, antenna, small size, memory, micro controller
Applications: Warehousing, Routing, Livestock tracking, Baggage handling, Currency
Conventional Passive RFID: READER
✓ Id: easy to get a list of books in RF range
12
Mitsubishi Electric Research Labs R F I G Lamps Raskar, Beardsley, vanBaar, Wang, Dietz, Lee, Leigh, Willwacher
Conventional RF tag: Memory, Micro Controller, RF Data, READER
Photo-sensing RF tag: Photosensor + Light + Memory + Micro Controller + RF Data; Projector + Computer
Compatible with RFID size and power needs; directional transfer, AR with image overlay
b.
b. Projector
Projector beams
beams aa time-varying
time-varying
a.
a. Photosensing
Photosensing RFID
RFID tags
tags pattern
pattern unique
unique for
for each
each (x,y)
(x,y) pixel
pixel
are
are queried
queried via
via RF
RF which
which is
is decoded
decoded byby tags
tags
c.
c. Tags
Tags respond
respond via
via RF,
RF, with
with date
date
d.
d. Multiple
Multiple users
users can
can and
and precise
precise (x,y)
(x,y) pixel
pixel location.
location.
simultaneously
simultaneously work
work from
from aa distance
distance Projector
Projector beams
beams ‘O’‘O’ or
or ‘X’
‘X’ at
at that
that
without
without RF
RF collision
collision location
location for
for visual
visual feedback
feedback
Prototype Tag
RFID
RF tag +
(Radio Frequency Identification) photosensor
RFIG
13
The handheld projector beams binary coded stripe patterns (MSB, MSB-1, ..., LSB) and each tag decodes the temporal code it observes; for example, the bit sequence 0 1 1 0 0 decodes to X = 12.
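Decoding the temporal code at a tag is a bit-by-bit accumulate. A minimal sketch, assuming plain binary (not Gray) coding and MSB-first projection order, with an illustrative function name:

def decode_stripe_bits(bits):
    """Decode the temporal code a tag photosensor observes while the
    projector beams binary-coded stripe patterns, MSB first.

    bits -- sequence of 0/1 readings, one per projected pattern.
    Returns the stripe (x) index, e.g. decode_stripe_bits([0, 1, 1, 0, 0]) == 12."""
    x = 0
    for b in bits:
        x = (x << 1) | int(b)    # shift in one bit per projected pattern
    return x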
Mitsubishi Electric Research Labs: RFIG Lamps (Raskar, Beardsley, van Baar, Wang, Dietz, Lee, Leigh, Willwacher)

Computational Illumination:
• Spatial Modulation - Synthetic Aperture Illumination
• Temporal Modulation - TV remote, Motion Tracking, Sony ID-cam, RFIG
• Natural lighting condition - Day/Night Fusion
(Day image: reflections on buildings and reflections in building windows)
Pixel Blending vs. Our Method: integration of blended gradients. The gradient fields of the daytime image and the night image are blended, and the result is integrated using a full multigrid solver.
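As a rough illustration of this gradient-domain step, the sketch below blends two gradient fields under a binary mask and integrates the result; it uses plain Jacobi iterations rather than the full multigrid solver named on the slide, and the image and mask names are placeholders.

import numpy as np

def gradients(img):
    """Forward-difference gradients of a 2D float image."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy[:-1, :] = img[1:, :] - img[:-1, :]
    return gx, gy

def divergence(gx, gy):
    """Backward-difference divergence, matching the forward-difference gradient."""
    div = np.zeros_like(gx)
    div[:, 0] += gx[:, 0]
    div[:, 1:] += gx[:, 1:] - gx[:, :-1]
    div[0, :] += gy[0, :]
    div[1:, :] += gy[1:, :] - gy[:-1, :]
    return div

def fuse_day_night(day, night, mask, iters=2000):
    """Take night gradients where the mask (e.g. moving objects) is set and
    day gradients elsewhere, then integrate by solving the Poisson equation
    lap(I) = div(G); boundary pixels keep the day-image values."""
    day = np.asarray(day, dtype=float)
    night = np.asarray(night, dtype=float)
    gxd, gyd = gradients(day)
    gxn, gyn = gradients(night)
    gx = np.where(mask, gxn, gxd)
    gy = np.where(mask, gyn, gyd)
    b = divergence(gx, gy)
    out = day.copy()
    for _ in range(iters):          # Jacobi iterations on lap(I) = b
        out[1:-1, 1:-1] = 0.25 * (out[:-2, 1:-1] + out[2:, 1:-1]
                                  + out[1:-1, :-2] + out[1:-1, 2:]
                                  - b[1:-1, 1:-1])
    return out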
– Video from fixed cameras: combine day-time and night-time images
• Night videos have low contrast and areas with no detail
• Improve low-quality InfraRed video using high-quality visible video
• The same camera during the day can capture the static information
• Fill in dark areas, enhance changes in intensity
• Dark areas of the night video are replaced to provide context
• Output style: better context; moving object (from night) + static scene (from day)
– Current demo: fusion of night video and a daytime image
(Diagram: a modified surveillance camera combines night-time video with a day-time photograph to produce enhanced night video (or photo) with context, and turns the original video frame into an easy-to-understand non-photorealistic (NPR) image or video.)
Video Enhancement: Overview of Process
Input: the original night-time traffic-camera video (320x240) and a day-time image obtained by averaging 5 seconds of day video. A mask frame (shown for the frame above) encodes the pixels with intensity change.
Output: enhanced video. Note: the exit ramp, lane dividers and buildings are not visible in the original night video, but are clearly seen here.
Algorithm: the gradient fields of frame N, frame N-1 and the daytime image are combined and integrated to produce the final result.
D.2: Smart Optics, Modern
Sensors and Future Cameras
Ramesh Raskar
Mitsubishi Electric Research Labs
Jack Tumblin
Northwestern University
Course WebPage :
http://www.merl.com/people/raskar/photo
(Block diagram: Computational Photography - novel illumination (light sources), novel cameras (generalized optics, generalized sensor, processing), a display that recreates the 4D light field, and the scene as an 8D ray modulator.)
Future Directions
• Smart Lighting
– Light stages, Domes, Light waving, Towards 8D
• Smart Optics
– Handheld Light field camera, Programmable imaging/aperture
• Smart Sensors
– HDR Cameras, Gradient Sensing, Line-scan Cameras, Demodulators
• Speculations
Wavefront Coding: 10X Depth of Field
• In-focus: small 'circle of confusion'
• Out-of-focus: LARGE 'circle of confusion'
• A coma-like distortion makes the circle move as focus changes, which allows us to de-convolve and sharpen out-of-focus items
http://www.cdm-optics.com/site/extended_dof.php
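The "de-convolve to sharpen" step can be illustrated with a plain Wiener filter applied to an image blurred by a roughly depth-invariant PSF. This is only a generic sketch, not CDM Optics' actual cubic-phase processing; the psf and snr arguments are assumed inputs.

import numpy as np

def kernel_to_otf(psf, shape):
    """Zero-pad a small PSF to the image size with its center at pixel (0, 0),
    then return its 2D FFT (the optical transfer function)."""
    padded = np.zeros(shape)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def wiener_deconvolve(blurred, psf, snr=100.0):
    """Sharpen an image captured through a (roughly depth-invariant) PSF
    with a basic Wiener filter; snr is an assumed signal-to-noise ratio."""
    H = kernel_to_otf(psf, blurred.shape)
    B = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)   # regularized inverse filter
    return np.real(np.fft.ifft2(W * B))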
Conventional versus light field camera: the two-plane (uv-plane, st-plane) ray parameterization; prototype camera.
Digital refocusing; example of digital refocusing.
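Digital refocusing amounts to a shift-and-add over the captured views. The sketch below assumes the light field has already been resampled into sub-aperture images and uses integer pixel shifts for simplicity; the array layout and the alpha parameterization are illustrative, not the prototype camera's actual resampling.

import numpy as np

def refocus(lightfield, alpha):
    """Refocus by shift-and-add.
    lightfield -- array of shape (U, V, H, W): one sub-aperture image per
                  (u, v) position on the lens aperture.
    alpha      -- refocus parameter; 0 keeps the captured focus, larger
                  magnitudes focus nearer or farther."""
    U, V, H, W = lightfield.shape
    out = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            # shift each view in proportion to its aperture offset, then average
            du = int(round(alpha * (u - U // 2)))
            dv = int(round(alpha * (v - V // 2)))
            out += np.roll(lightfield[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)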
Programmable Imaging
(Diagram: scene -> new optics -> detector pixels -> computations -> vision; a programmable controller links the computations back to the optics.)
Imaging Through Micro-Mirrors (Nayar, Branzoi and Boult, 2004)
(Diagram: a scene point xi is imaged from a single viewpoint via a mirror array onto the detector; mirror normal ni, black surface with normal nb.)
Geometry: ray orientation - the direction oi of the imaged ray is a function of the mirror position xi and its orientation ni(t). Photometry: ray attenuation - the attenuation ai is a function of the mirror's temporal modulation during the exposure (mirror normal ni versus the black-surface direction nb).
Digital Micromirror Device (DMD) (by Texas Instruments): mirror tilt between -10° and +10°, 14 µm mirrors.
DMD for Imaging: (Malbet et al. 95, Kearney et al. 98, Castracane et al. 99, Christensen et al. 02)
Programmable Imaging System: imaging lens, DMD, electronics.
Modulation: Examples
Scene * DMD Image = Camera Image (the captured image is the pixel-wise product of the scene and the pattern displayed on the DMD).
Optical Intra-Pixel Feature Detection: a convolution filter g with negative entries is split into two non-negative kernels, f*g = f*g+ − f*g−, each of which can be applied optically by the DMD pixels within a CCD pixel. For the Laplacian, g = [1 4 1; 4 -20 4; 1 4 1] is split into g+ = [1 4 1; 4 0 4; 1 4 1] and g− = [0 0 0; 0 20 0; 0 0 0].
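A small sketch of the same kernel split, using the values shown on the slide. The DMD is only emulated here by two digital convolutions; the check simply confirms that differencing the two non-negative passes reproduces the signed filter.

import numpy as np
from scipy.signal import convolve2d

# Laplacian-like kernel with a negative center, split into two non-negative
# kernels that an attenuation-only device (like a DMD) can apply:
# f*g = f*g_plus - f*g_minus
g       = np.array([[1, 4, 1], [4, -20, 4], [1, 4, 1]], dtype=float)
g_plus  = np.array([[1, 4, 1], [4,   0, 4], [1, 4, 1]], dtype=float)
g_minus = np.array([[0, 0, 0], [0,  20, 0], [0, 0, 0]], dtype=float)

def optical_laplacian(scene):
    """Emulate intra-pixel optical feature detection: two captures with
    non-negative masks, differenced electronically afterwards."""
    pos = convolve2d(scene, g_plus,  mode="same")
    neg = convolve2d(scene, g_minus, mode="same")
    assert np.allclose(pos - neg, convolve2d(scene, g, mode="same"))
    return pos - neg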
Optical Edge Detection
Scene Video Edge Video
• Smart Optics
– Handheld Light field camera,
– Programmable imaging/aperture
• Smart Sensors
– HDR Cameras,
– Gradient Sensing,
– Line-scan Cameras,
– Demodulators
11
Future Directions
• Smart Lighting
– Light stages, Domes, Light waving, Towards 8D
• Smart Optics
– Handheld Light field camera, Programmable imaging/aperture
• Smart Sensors
– HDR Cameras, Gradient Sensing, Line-scan Cameras, Demodulators
• Speculations
12
Pixim
http://www.cybergrain.com/tech/hdr/
Fuji's SuperCCD S3 Pro camera has a chip with high and low sensitivity sensors
per pixel location to increase dynamic range
13
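A dual-sensitivity sensor like the one described above can be merged into a single wider-dynamic-range value per pixel. The sketch below is a generic merge; the gain ratio, saturation level and function name are assumptions, not Fuji's actual processing.

import numpy as np

def merge_dual_sensitivity(high_sens, low_sens, gain_ratio=16.0, sat_level=0.95):
    """Combine the two readouts of a dual-sensitivity pixel site.
    Both inputs are normalized to [0, 1]; gain_ratio is the assumed
    sensitivity ratio between the two photodiodes."""
    # Where the high-sensitivity reading saturates, fall back to the
    # low-sensitivity reading scaled back to the same radiometric units.
    return np.where(high_sens < sat_level, high_sens, low_sens * gain_ratio)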
Gradient Camera
14
Original image: an intensity ramp plus a low-contrast logo, with intensity values ranging from 0 to 1800. Intensity camera image: an 8-bit camera covers only a 1:1000 range; the problem is saturation at high-intensity regions. Gradient camera: the corresponding gradient-camera result.
Camera Pipeline
On-board Hardware Software
Detail Preserving
16
Quantization
The intensity histogram spans roughly 1 to 10^5, while the gradient histogram spans -10^5 to 10^5 but is concentrated near zero, so the gradient camera requires fewer bits. Comparing the original image, uniform 3-bit quantization, and log-uniform 3-bit quantization of intensities versus gradients: in the image reconstructed from quantized gradients, the error is pushed to high-gradient pixel positions, where it is visually imperceptible.
Line Scan Camera: PhotoFinish 2000 Hz
18
3D Cameras
• Time of flight
– ZCam (Shuttered Light Pulse)
• Phase Decoding of modulated illumination
– Canesta (Phase comparison)
– Phase difference = depth
– Magnitude = reflectance
• Structured Light
– Binary coded light and triangulation
Resolution: 1 cm for 2-7 meters
Graphics can be inserted behind and between characters.
Canesta: Modulated Emitter - phase ~ distance, amplitude ~ reflectance
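The phase-comparison idea maps directly to code. The generic continuous-wave time-of-flight sketch below assumes a four-sample (0°, 90°, 180°, 270°) correlation readout and a 30 MHz modulation frequency; these are illustrative choices, not Canesta's actual specifications.

import numpy as np

C = 3.0e8  # speed of light, m/s

def tof_depth(a0, a1, a2, a3, f_mod=30e6):
    """Decode four correlation samples of the modulated illumination into
    phase (proportional to distance) and amplitude (proportional to
    reflectance), then convert phase to depth."""
    phase = np.arctan2(a3 - a1, a0 - a2)            # radians
    phase = np.mod(phase, 2 * np.pi)                # wrap to [0, 2*pi)
    amplitude = 0.5 * np.hypot(a3 - a1, a0 - a2)    # ~ reflectance
    depth = C * phase / (4 * np.pi * f_mod)         # round trip folded in
    return depth, amplitude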
Demodulating Cameras
Coded Exposure Photography: figure results - input image; an approximate cutout of the blurred image containing the taxi (vignetting on left edge). Exact alignment of the cutout with the taxi extent is not required.
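To see why a coded (fluttered) exposure helps, the 1D sketch below deconvolves a motion blur whose kernel is the shutter's open/closed chop sequence stretched over the motion path; a broadband code keeps the kernel's spectrum free of the deep zeros that make a plain box blur non-invertible. The code array, SNR value and function name are illustrative, not the published flutter sequence.

import numpy as np

def deblur_coded_exposure(blurred, code, snr=200.0):
    """1D regularized inverse filtering of a coded-exposure motion blur.
    blurred -- 1D signal blurred along the motion direction.
    code    -- binary open/closed chop sequence (illustrative)."""
    n = len(blurred)
    kernel = np.zeros(n)
    kernel[:len(code)] = np.asarray(code, dtype=float) / np.sum(code)
    K = np.fft.fft(kernel)
    B = np.fft.fft(blurred)
    W = np.conj(K) / (np.abs(K) ** 2 + 1.0 / snr)   # regularized inverse
    return np.real(np.fft.ifft(W * B))

# usage sketch: an all-ones (box) code has deep spectral nulls, while a
# pseudo-random code such as [1,0,1,1,0,0,1,1,1,0,1] does not.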
Novel Sensors
• Gradient sensing
• HDR Camera, Log sensing
• Line-scan Camera
• Demodulating
• Motion Capture
• Fluttered Shutter
• 3D
Fantasy Configurations
• 'Cloth-cam' / 'Wallpaper-cam': elements for 4D light emission and 4D capture in the surface of a cloth…
26
Dream of A New Photography
                          Old           New
• People and Time         ~Cheap        Precious
• Each photo              Precious      Free
• Lighting                Critical      Automated?
• External Sensors        No            Yes
• 'Stills / Video'        Disjoint      Merged
• Exposure Settings       Pre-select    Post-Process
• Exposure Time           Pre-select    Post-Process
• Resolution/noise        Pre-select    Post-Process
• 'HDR' range             Pre-select    Post-Process
(Block diagram: Computational Photography - novel illumination (light sources, modulators, 4D incident lighting), novel cameras (generalized optics as a 4D ray bender, a ray sampler capturing up to 4D, a generalized sensor, and ray reconstruction processing yielding a 4D light field), a display that recreates the 4D light field, and the scene as an 8D ray modulator.)
Computational Photography: film-like photography with bits, the computational camera, and smart light.
Acknowledgements
• MERL, Northwestern Graphics Group
• Amit Agrawal
• Shree Nayar
• Marc Levoy
• Jianbo Shi
• Ankit Mohan, Holger Winnemoller
• Image Credits
– Ren Ng, Vaibhav Vaish, William Bennet
– Fredo Durand, Aseem Agarwala
– Morgan McGuire, Paul Debevec
– And more
28
Non-photorealistic Camera:
Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging
Ramesh Raskar∗ Kar-Han Tan Rogerio Feris Jingyi Yu† Matthew Turk
Mitsubishi Electric Research Labs (MERL) UC Santa Barbara MIT UC Santa Barbara
Figure 1: (a) A photo of a car engine (b) Stylized rendering highlighting boundaries between geometric shapes. Notice the four spark plugs
and the dip-stick which are now clearly visible (c) Photo of a flower plant (d) Texture de-emphasized rendering.
Abstract

We present a non-photorealistic rendering approach to capture and convey shape features of real-world scenes. We use a camera with multiple flashes that are strategically positioned to cast shadows along depth discontinuities in the scene. The projective-geometric relationship of the camera-flash setup is then exploited to detect depth discontinuities and distinguish them from intensity edges due to material discontinuities.
We introduce depiction methods that utilize the detected edge features to generate stylized static and animated images. We can highlight the detected features, suppress unnecessary details or combine features from multiple images. The resulting images more clearly convey the 3D structure of the imaged scenes.
We take a very different approach to capturing geometric features of a scene than traditional approaches that require reconstructing a 3D model. This results in a method that is both surprisingly simple and computationally efficient. The entire hardware/software setup can conceivably be packaged into a self-contained device no larger than existing digital cameras.
Keywords: non-photorealistic rendering, image enhancement, depth edges

∗ e-mail: [raskar,tan]@merl.com, [rferis,turk]@cs.ucsb.edu
† e-mail: jingyi@graphics.csail.mit.edu

1 Introduction

Our goal is to create stylized images that facilitate viewer comprehension of the shape contours of the objects depicted. Non-photorealistic rendering (NPR) techniques aim to outline the shapes of objects, highlight the moving parts to illustrate action, and reduce visual clutter such as shadows and texture details [Gooch and Gooch 2001]. The result is useful for imaging low contrast and geometrically complex scenes such as mechanical parts (Figure 1), plants or the internals of a patient (in endoscopy).
When a rich 3D model of the scene is available, rendering subsets of view-dependent contours is a relatively well-understood task in NPR [Saito and Takahashi 1990]. Extending this approach to real scenes by first creating 3D scene models, however, remains difficult. In this paper, we show that it is possible to bypass geometry acquisition, and directly create stylized renderings from images. In the place of expensive, elaborate equipment for geometry acquisition, we propose using a camera with a simple extension: multiple strategically positioned flashes. Instead of having to estimate the full 3D coordinates of points in the scene, and then look for depth discontinuities, our technique reduces the general 3D problem of depth edge recovery to one of intensity step edge detection.
Exploiting the imaging geometry for rendering results in a simple and inexpensive solution for creating stylized images from real scenes. We believe that our camera will be a useful tool for professional artists and photographers, and we expect that it will also enable the average user to easily create stylized imagery.

1.1 Overview

Our approach is based on taking successive photos of a scene, each with a different light source close to and around the camera's center of projection. We use the location of the shadows abutting depth discontinuities as a robust cue to create a depth edge map in both static and dynamic scenes.
Contributions Our main contribution is a set of techniques for detecting and rendering shape contours of scenes with low-contrast or high geometric complexity. Our technical contributions include the following.
• A robust edge classification scheme to distinguish depth edges from texture edges
• A collection of rendering and reconstruction techniques for creating images highlighting shape boundaries from 2D data without creating 3D representations, using qualitative depths
• An image re-synthesis scheme that allows abstraction of textured regions while preserving geometric features
• A technique to detect depth edges in dynamic scenes
Figure 2: Traditional image enhancement by improving (Left) brightness and (Right) contrast. Low contrast depth edges remain difficult to perceive.

We introduce the concept of a self-contained stylized imaging device, a 'non-photorealistic camera', which can directly generate images highlighting contours of geometric shapes in a scene. It contains a traditional camera and embedded flashes, and can be readily and inexpensively built. We attempt to address two important issues in NPR [Gooch and Gooch 2001] [Strothotte and Schlechtweg 2002], detecting shape contours that should be enhanced and identifying features that should be suppressed. We propose a new approach to take image-stylization beyond the processing of a photograph, to actively changing how the photographs are taken.
The output images or videos can be rendered in many ways, e.g., technical illustration, line art or cartoon-like style. We highlight depth discontinuities, suppress material and illumination transitions, and create renderings with large, smoothly colored regions outlined with salient contours [Durand 2002]. We describe several applications: imaging complex mechanical parts, improving images for endoscopes, anatomical drawings and highlighting changes in a scene. Our approach shares the disadvantages of NPR: relevant details may be lost as an image is simplified, so tunable abstraction is needed (Section 3.3), and the usefulness of the output is often difficult to quantify.

1.2 Related Work

NPR from images, rather than 3D geometric models, has recently received a great deal of attention. The majority of the available techniques for image stylization involve processing a single image as the input, applying morphological operations, image segmentation, edge detection and color assignment. Some of them aim for stylized depiction [DeCarlo and Santella 2002] [Hertzmann 1998] while others enhance legibility. Interactive techniques for stylized rendering such as rotoscoping have been used as well [Waking Life 2001; Avenue Amy 2002], but we aim to automate tasks where meticulous manual operation was previously required. Our work belongs to an emerging class of techniques to create an enhanced image from multiple images, where the images are captured from the same viewpoint but under different conditions, such as under different illumination, focus or exposure [Cohen et al. 2003; Akers et al. 2003; Raskar et al. 2004].
Aerial imagery techniques find shadow evidence by thresholding a single intensity image, assuming flat ground and uniform albedo to infer building heights [Huertas and Nevatia 1988; Irvin and McKeown 1989; Lin and Nevatia 1998]. Some techniques improve shadow capture with novel shadow extraction techniques to compute new shadow mattes [Chuang et al. 2003] or remove them to improve scene segmentation [Toyama et al. 1999]. Some other techniques remove shadows without explicitly detecting them, such as using intrinsic images [Weiss 2001].
Stereo techniques including passive and active illumination are generally designed to compute depth values or surface orientation rather than to detect depth edges. Depth discontinuities present difficulties for traditional stereo: it fails due to half-occlusions, i.e., occlusion of scene points in only one of the two views, which confuse the matching process [Geiger et al. 1992]. Few techniques try to model the discontinuities and occlusions directly [Birchfield 1999; Kang et al. 2001; Scharstein and Szeliski 2002]. Active illumination methods, which generally give better results, have been used for depth extraction, shape from shading, shape-time stereo and photometric stereo but are unfortunately unstable around depth discontinuities [Sato et al. 2001]. An interesting technique has been presented to perform logical operations on detected intensity edges, captured under widely varying illumination, to preserve shape boundaries [Shirai and Tsuji 1972] but it is limited to uniform albedo scenes. Using photometric stereo, it is possible to analyze the intensity statistics to detect high curvature regions at occluding contours or folds [Huggins et al. 2001]. But the techniques assume that the surface is locally smooth, which fails for a flat foreground object like a leaf or piece of paper, or view-independent edges such as the corner of a cube. They detect regions near occluding contours but not the contours themselves.
Techniques for shape from shadow (or darkness) build a continuous representation (shadowgram) from a moving light source from which continuous depth estimates are possible [Raviv et al. 1989; Langer et al. 1995; Daum and Dudek 1998]. However, it involves a difficult problem of estimating continuous heights and requires accurate detection of start and end of shadows. Good reviews of shadow-based shape analysis methods are available in [Yang 1996] [Kriegman and Belhumeur 2001] [Savarese et al. 2001].
A common limitation of existing active illumination methods is that the light sources need to surround the object, in order to create significant shading and shadow variation from (estimated or known 3D) light positions. This necessitates a fixed lighting rig, which limits the application of these techniques to industrial settings, and they are impossible to build into a self-contained camera.
We believe our proposed method for extracting depth edges is complementary with many existing methods for computing depth and 3D surface shape, as depth edges often violate smoothness assumptions inherent in many techniques. If the locations of depth discontinuities can be reliably detected and supplied as input, we believe that the performance of many 3D surface reconstruction algorithms can be significantly enhanced.
To find depth edges, we avoid the dependence on solving a correspondence problem or analyzing pixel intensity statistics with moving lights, and we do not attempt to estimate any continuous value. In our search, we have not seen a photometric or other type of stereo method successfully applied to complex scenes where the normals change rapidly, such as a potted plant, or a scene with high depth complexity or low intensity changes, such as a car engine or bone.

1.3 Outline

Our method for creating a stylized image of a static scene consists of the following.
Capture a series of images of the scene under shifted light positions
Process these images to automatically detect depth edges
Identify the subset of intensity edges that are illumination and texture edges
Compute qualitative depth relationships
Enhance or simplify detected features for rendering
Insert processed scene appearance for stylization

We use the term depth edges to refer to the C0 discontinuities in a depth map. Depth edges correspond to internal or external occluding contours (or silhouettes) or boundaries of physical objects. The depth edges recovered are signed: in the local neighborhood,

(Figure: imaging geometry. A point Pk on the object is shadowed by flash 1 but not by flashes 2 and 3 in image Ik; e1, e2, e3 (ek) are the flash epipoles in the camera image.)
2.2 Material Edges

In addition to depth edges, we also need to consider illumination and material edges in the image. Illumination edges are boundaries between lit and shadowed regions due to ambient light source(s), rather than the flashes attached to our camera. Since the individual images Ik are free of ambient illumination, they are free of ambient illumination edges. In general, since material edges are independent of illumination direction, they can be easily classified by a process of elimination. Material edges are intensity edges of Imax minus the depth edges.
This edge classification scheme works well and involves a minimal number of parameters for tuning. The only parameters we need are those for intensity edge detection of the ratio images and the Imax image, to detect depth and material edges, respectively.

2.3 Issues

The technique we presented to detect depth edges is surprisingly robust and reliable. We discuss the few conditions in which the basic algorithm fails: a false negative when a negative transition at a

Tradeoff in choosing the baseline A larger baseline distance between the camera and the flash is better to cast a wider detectable shadow in the image, but a smaller baseline is needed to avoid separation of the shadow from the associated depth edge. The width of the abutting shadow in the image is d = f B (z2 − z1) / (z1 z2), where f is the focal length, B is the baseline in mm, and z1, z2 are the depths, in mm, to the shadowing and shadowed edge. (See Figure 6.) Shadow detachment occurs when the width, T, of the object is smaller than (z2 − z1) B / z2. So a smaller baseline, B, will allow narrower objects (smaller T) without shadow separation. Fortunately, with rapid miniaturization and sophistication of digital cameras, we can choose a small baseline while increasing the pixel resolution (proportional to f), so that the product fB remains constant, allowing depth detection of narrow objects.
When camera resolutions are limited, we can exploit a hierarchical baseline method to overcome this tradeoff. We can detect small depth discontinuities (with larger baselines) without creating shadow separation at narrow objects (using narrow baselines). In practice, we found two different baselines were sufficient. We, however, now have to deal with spurious edges due to shadow separation in the image with the larger baseline flash FL. The image with the smaller baseline flash, FS, may miss small depth discontinuities. How can we combine the information in those two images? There are essentially four cases we need to consider at depth edges (Figure 7): (a) FS creates an undetectable narrow shadow, FL creates a detectable shadow; (b) FS creates a detectable small width shadow and FL creates a larger width shadow; (c) FS creates a detectable shadow but FL creates a detached shadow that overlaps with the FS shadow; and (d) same as (c) but the shadows of FS and FL do not overlap.
Our strategy is based on simply taking the minimum composite of the two images. In the first three cases, this conveniently increases the effective width of the abutting shadow without creating any artifacts, and hence can be treated using the basic algorithm without modifications. For the fourth case, a non-shadow region separates the two shadows in the min composite, so that the shadow in FL appears spurious.
Our solution is as follows. We compute the depth edges using FS and FL (Figure 7). We then traverse the epipolar ray. If the depth edge appears in FS (at D1) but not in FL, we traverse the epipolar ray in FL until the next detected depth edge. If for this depth edge in FL there is no corresponding depth edge in FS, we mark this edge as a spurious edge.
The solution using the min-composite, however, will fail to detect minute depth discontinuities where even FL does not create a detectable shadow. It will also fail for very thin objects where even FS creates a detached shadow.

Figure 7: (Left) Minimum composite of image with flash FS and FL. (Right) Plot of intensity along a scanline due to FS, FL and min(IS, IL).

Specularities Specular highlights that appear at a pixel in one image but not others can create spurious transitions in the ratio images, as seen in Figure 8. Although methods exist to detect specularities in a single image [Tan et al. 2003], detecting them reliably in textured regions is difficult.
Our method is based on the observation that specular spots shift according to the shifting of light sources that created them. We need to consider three cases of how specular spots in different light positions appear in each image: (i) shiny spots remain distinct (e.g., on a highly specular surface with a medium curvature), (ii) some spots overlap, and (iii) spots overlap completely (e.g., on a somewhat specular, fronto-parallel planar surface). Case (iii) does not cause spurious gradients in ratio images.
We note that although specularities overlap in the input images, the boundaries (intensity edges) around specularities in general do not overlap. The main idea is to exploit the gradient variation in the n images at a given pixel (x, y). If (x, y) is in a specular region, in cases (i) and (ii) the gradient due to the specularity boundary will be high in only one or a minority of the n images under different lighting. The median of the n gradients at that pixel will remove this outlier(s). Our method is motivated by the intrinsic image approach by [Weiss 2001], where the author removes shadows in outdoor scenes by noting that shadow boundaries are not static. We reconstruct the image by using the median of gradients of the input images as follows.
• Compute the intensity gradients, Gk(x, y) = ∇Ik(x, y)
• Find the median of gradients, G(x, y) = mediank(Gk(x, y))
• Reconstruct the image I' which minimizes |∇I' − G|
Image reconstruction from gradient fields, an approximate invertibility problem, is still a very active research area. In R2, a modified gradient vector field G may not be integrable. We use one of the direct methods recently proposed [Elder 1999] [Fattal et al. 2002]. The least square estimate of the original intensity function, I', so that G ≈ ∇I', can be obtained by solving the Poisson differential equation ∇2 I' = div G, involving a Laplace and a divergence operator. We use the standard full multigrid method [Press et al. 1992] to solve the Laplace equation. We pad the images to square images of size the nearest power of two before applying the integration, and then crop the result image back to the original size [Raskar et al. 2004]. We use a similar gradient domain technique to simplify several rendering tasks as described later.
The resultant intrinsic image intensity, I'(x, y), is used as the denominator for computing the ratio image, instead of the max composite, Imax(x, y). In specular regions, the ratio Ik(x, y)/I'(x, y) now is larger than 1.0. This is clamped to 1.0 so that the negative transitions in the ratio image do not lie in specular parts.

Figure 8: Specularities and lack of background. First column: Imax and corresponding result showing artifacts. Second column: For the yellow line marked on the dumbbell (x=101:135); top plot, Ileft (red) with Imax (light blue); bottom plot, ratio Rleft. Note the spurious negative transition in Rleft, at the arrow, which gets falsely identified as a depth edge. Third column: top plot, gradient of Ileft (red), Iright (green), Itop (blue) and the median of these gradients (black); bottom plot, reconstructed intrinsic image (black) compared with Imax (light blue). Fourth column: top, intrinsic image; bottom, resulting depth edge map. Fifth column: top, scene without a background to cast shadow; bottom, edges of I0/Imax in white, plus detected depth edges in red.
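A compact sketch of the median-of-gradients reconstruction described above, using plain Jacobi iterations in place of the full multigrid Poisson solver the paper uses; the image list and iteration count are placeholders.

import numpy as np

def intrinsic_from_median_gradients(images, iters=2000):
    """Per-pixel median of the gradients of n differently lit images,
    followed by reconstruction of an image whose gradients match the
    median field (Poisson equation lap(I) = div G)."""
    stack = np.stack(images).astype(float)                 # (n, H, W)
    gx = np.zeros_like(stack)
    gy = np.zeros_like(stack)
    gx[:, :, :-1] = stack[:, :, 1:] - stack[:, :, :-1]
    gy[:, :-1, :] = stack[:, 1:, :] - stack[:, :-1, :]
    gx = np.median(gx, axis=0)
    gy = np.median(gy, axis=0)

    div = np.zeros_like(gx)                                # divergence of (gx, gy)
    div[:, 0] += gx[:, 0]
    div[:, 1:] += gx[:, 1:] - gx[:, :-1]
    div[0, :] += gy[0, :]
    div[1:, :] += gy[1:, :] - gy[:-1, :]

    out = stack.max(axis=0)                                # I_max as initial guess
    for _ in range(iters):                                 # Jacobi: lap(I) = div G
        out[1:-1, 1:-1] = 0.25 * (out[:-2, 1:-1] + out[2:, 1:-1]
                                  + out[1:-1, :-2] + out[1:-1, 2:]
                                  - div[1:-1, 1:-1])
    return out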
Figure 9: (a) An edge rendering with over-under style. (b) Rendering edges with width influenced by orientation. (c) and (d) Normal
Interpolation for toon rendering exploiting over-under mattes.
Lack of Background Thus far we assumed that depth edges edge orientation. Since the edge orientation in 3D is approximately
casting shadows on a background are within a finite distance. What the same as the orientation of its projection in image plane, the
if the background is significantly far away or not present? This turns thickness is simply proportional to the dot product of the image
out to be a simple situation to solve because in these cases only the space normal with a desired light direction (Figure 9(b)).
outermost depth edge, the edge shared by foreground and distant Color variation We can indicate color of original object by ren-
background, is missed in our method. This can be easily detected dering the edges in color. From signed edges, we pick up a fore-
with a foreground-background estimation technique. In Imax image ground color along the normal at a fixed pixel distance, without
the foreground pixels are lit by at least one of the flashes but in the crossing another depth or intensity edge. The foreground colored
ambient image, I0 , neither the foreground nor the background is lit edges can also be superimposed onto a segmented source image as
by any flash. Hence, the ratio of I0 /Imax , is near 1 in background seen in Figure 10(c).
and close to zero in interior of the foreground. Figure 8 shows
intensity edges of this ratio image combined with internal depth
edges. 3.2 Color Assignment
Since there is no 3D model of the scene, rendering non-edge pixels
requires different ways of processing captured 2D images.
3 Image Synthesis Normal interpolation For smooth objects, the depth edge cor-
responds to the occluding contour where the surface normal is per-
Contour-based comprehensible depiction is well explored for 3D
pendicular to the viewing direction. Hence the normals at depth
input models [DeCarlo et al. 2003] but not for photographs. In the
edges lie in the plane of the image and we can predict normals at
absence of a full 3D representation of the scene, we exploit the
other pixels. We solve this sparse interpolation problem by solving
following 2D cues to develop novel rendering algorithms.
a 2D Poisson differential equation. Our method is inspired by the
(a) The sign of the depth edge, Lumo [Johnston 2002] where the over-under mattes are manually
(b) Relative depth difference based on shadow width, created. In our case, signed depth edges allow normal interpolation
(c) Color near the signed edges, and while maintaining normal discontinuity at depth edges.
(d) Normal of a smooth surface at the occluding contour Image attenuation We accentuate the contrast at shape bound-
aries using an image attenuation maps (Figure 10(a)) as follows.
We aim to automate tasks for stylized rendering where meticulous Depth edges are in white on a black background. We convolve with
manual operation was originally required, such as image editing or a filter that is the gradient of an edge enhancement filter. Our filter
rotoscoping [Waking Life 2001]. is a Gaussian minus an impulse function. When we perform a 2D
integration on the convolved image, we get a sharp transition at the
3.1 Rendering Edges depth edge.
Depicting Change Some static illustrations demonstrate action
We create a vectorized polyline representation of the depth edges e.g., changing oil in a car, by making moving parts in the fore-
by linking the depth edge pixels into a contour. The polyline is ground brighter. Foreground detection via intensity-based schemes,
smoothed and allows us to stylize the width and color of the con- however, is difficult when the colors are similar and texture is lack-
tour maintaining spatial coherency. While traversing the marked ing, e.g., detecting hand gesture in front of other skin colored parts
depth edge pixels to create a contour, at T-junctions, unlike tradi- (Figure 11). We take two separate sets of multi-flash shots, with-
tional methods that choose the next edge pixel based on orientation out and with the hand in front of the face to capture the reference
similarity, we use the information from the shadows to resolve the and changed scene. We note that any change in a scene is bounded
connected component. Two edge pixel are connected only if they by new depth edges introduced. Without explicitly detecting fore-
are connected in the intensity edges of all the n ratio images. ground, we highlight interiors of regions that contribute to new
Signed edges At the negative transition along the epipolar ray depth edges.
in the ratio image, Rk, the side of edge with higher intensity is the We create a gradient field where pixels marked as depth edges
foreground and lower intensity (corresponding to shadowed region) in changed scene but not in reference, are assigned a unit magni-
is background. This qualitative depth relationship can be used to tude gradient. The orientation matches the image space normal to
clearly indicate foreground-background separation at each edge. the depth edge. The gradient at other pixels is zero. The recon-
We emulate the over-under style used by artists in mattes. The structed image from 2D integration is a pseudo-depth map – least
foreground side is white while the background side is black. Both squared error solution via solving Poisson equation. We threshold
are rendered by displacing depth contour along the normal (Figure this map at 1.0 to get the foreground mask which is brightened.
9(a)). Note, the shadow width along the epipolar ray is proportional to
Light direction We use a commonly known method to convey the ratio of depth values on two sides of the edge. Hence instead
light direction by modifying the width of edges depending on the of a unit magnitude gradient, we could assign a value proportional
Figure 10: Color assignment. (a) Attenuation Map (b) Attenuated Image (c) Colored edges on de-emphasized texture
This ratio image is free of unwanted transitions and the same epipo-
lar ray traversal method can be applied to localize the depth edges.
Figure 13 shows the algorithm in action. We tested the algorithm
with synthetic sequences to investigate the set of conditions under
which the algorithm is able to correctly localize the depth edges
and also experimented with this algorithm in real dynamic scenes.
An example frame from a dynamic sequence is shown in Figure 14.
A full stylized example with human subjects can be seen in the ac-
companying video. While we are very encouraged by the simplicity
of the algorithm as well as the results we were able to achieve with
it, the simplifying assumptions made about the monotonicity and
magnitude of motion are still fairly restrictive. For thin objects or
objects with high frequency texture, large motions between succes-
Figure 13: Depth edge detection for dynamic scenes. (Top) Three sive frames creates spurious edges. We plan to continue our in-
frames from multi-flash sequence of a toy example showing a red vestigation in this area and designing algorithms that require fewer
square with a green triangle texture moving from left to right. We assumptions and work under a wider range of conditions.
are interested in detecting the depth edge in frame m. A single
scan line shown in blue is used for the plots. (Middle) The three
scan lines plots. The position of the correct depth edge position
is indicated with a vertical blue line. (Bottom) Plot of minimum
composite and ratio images computed using the static and dynamic
algorithms. The motion induced unwanted edges in the static ratio
image but not in the dynamic ratio image. The correct depth edge
can then be detected from the ratio image using the same traversal
procedure as before.
6 Results
We show a variety of examples of real scenes, from millimeter scale
objects to room sized environments.
8 Conclusion H UERTAS , A., AND N EVATIA , R. 1988. Detecting buildings in aerial images. Com-
puter Vision, Graphics and Image Processing 41, 2, 131–152.
We have presented a simple yet effective method to convey shape H UGGINS , P., C HEN , H., B ELHUMEUR , P., AND Z UCKER , S. 2001. Finding Folds:
boundaries by rendering new images and videos of real world On the Appearance and Identification of Occlusion . In IEEE CVPR, vol. 2, 718–
725.
scenes. We exploit the epipolar relationship between light sources
and cast shadows to extract geometric features from multiple im- I RVIN , R., AND M C K EOWN , D. 1989. Methods for exploiting the relationship be-
ages of a scene. By making use of image space discontinuity rather tween buildings and their shadows in aerial imagery. IEEE Transactions on Sys-
tems, Man and Cybernetics 19, 6, 1564–1575.
than relying on 3D scene reconstruction, our method can robustly
capture the underlying primitives for rendering in different styles. J OHNSTON , S. F. 2002. Lumo: Illumination for cel animation. In Proceedings of
We have presented basic prototypes, related feature capturing NPAR, ACM Press, 45–52.
and rendering algorithms, and demonstrated applications in tech- K ANG , S. B., S ZELISKI , R., AND C HAI , J. 2001. Handling occlusions in dense
nical illustration and video processing. Finally, since a depth edge multi-view stereo. In IEEE CVPR, vol. 1, 102–110.
is such a basic primitive, we have suggested ways in which this K RIEGMAN , D., AND B ELHUMEUR , P. 2001. What Shadows Reveal About Object
information can be used in applications beyond NPR. Structure . Journal of the Optical Society of America, 1804–1813.
Minor modification to camera hardware enables this method to L ANGER , M., D UDEK , G., AND Z UCKER , S. 1995. Space Occupancy using Multiple
be implemented in a self-contained device no larger than existing Shadow Images. International Conference on Intelligent Robots and Systems, 390–
digital cameras. We have proposed one possible approach to lever- 396.
aging the increasing sophistication of digital cameras to easily pro- L IN , C., AND N EVATIA , R. 1998. Building detection and description from a single
duce useful and interesting stylized images. intensity image. Computer Vision and Image Understanding: CVIU 72, 2, 101–
121.
Acknowledgements We thank the anonymous reviewers for PAPADEMETRIS , X., AND B ELHUMEUR , P. N. 1996. Estimation of motion boundary
location and optical flow using dynamic programming. In Proc. Int. Conf. on Image
useful comments and guidance. We thank Adrian Ilie, Hongcheng
Processing.
Wang, Rebecca Xiong, Paul Beardsley, Darren Leigh, Paul Dietz,
Bill Yerazunis and Joe Marks for stimulating discussions, James P RESS , W. H., T EUKOLSKY, S., V ETTERLING , W. T., AND F LANNERY, B. P. 1992.
Numerical Recipes in C: The Art of Scientific Computing . Pearson Education.
Kobler (MEEI), Takashi Kan and Keiichi Shiotani for providing
motivating applications, Narendra Ahuja and Beckman Institute R ASKAR , R., I LIE , A., AND Y U , J. 2004. Image Fusion for Context Enhancement
Computer Vision and Robotics Lab for suggestions and support, and Video Surrealism. In Proceedings of NPAR.
and many members of MERL for help in reviewing the paper. R AVIV, D., PAO , Y., AND L OPARO , K. A. 1989. Reconstruction of Three-dimensional
Surfaces from Two-dimensional Binary Images. In IEEE Transactions on Robotics
and Automation, vol. 5(5), 701–710.
References S AITO , T., AND TAKAHASHI , T. 1990. Comprehensible Rendering of 3-D Shapes.
A KERS , D., L OSASSO , F., K LINGNER , J., AGRAWALA , M., R ICK , J., AND H AN - In ACM SIGGRAPH, 197–206.
RAHAN , P. 2003. Conveying Shape and Features with Image-Based Relighting. In S ATO , I., S ATO , Y., AND I KEUCHI , K. 2001. Stability issues in recovering illumina-
IEEE Visualization. tion distribution from brightness in shadows. IEEE Conf. on CVPR, 400–407.
AVENUE A MY, 2002. Curious Pictures. S AVARESE , S., RUSHMEIER , H., B ERNARDINI , F., AND P ERONA , P. 2001. Shadow
Carving. In ICCV.
B IRCHFIELD , S. 1999. Depth and Motion Discontinuities. PhD thesis, Stanford
University. S CHARSTEIN , D., AND S ZELISKI , R. 2002. A taxonomy and evaluation of dense
two-frame stereo correspondence algorithms. In International Journal of Computer
C HUANG , Y.-Y., G OLDMAN , D. B., C URLESS , B., S ALESIN , D. H., AND S ZELISKI ,
Vision, vol. 47(1), 7–42.
R. 2003. Shadow matting and compositing. ACM Trans. Graph. 22, 3, 494–500.
S HIRAI , Y., AND T SUJI , S. 1972. Extraction of the Line Drawing of 3-Dimensional
C OHEN , M. F., C OLBURN , A., AND D RUCKER , S. 2003. Image stacks. Tech. Rep.
Objects by Sequential Illumination from Several Directions. Pattern Recognition
MSR-TR-2003-40, Microsoft Research.
4, 4, 345–351.
DAUM , M., AND D UDEK , G. 1998. On 3-D Surface Reconstruction using Shape from
S TROTHOTTE , T., AND S CHLECHTWEG , S. 2002. NonPhotorealistic Computer
Shadows. In CVPR, 461–468.
Graphics: Modeling, Rendering and Animation. Morgan Kaufmann, San Fran-
D E C ARLO , D., AND S ANTELLA , A. 2002. Stylization and Abstraction of Pho- cisco.
tographs. In Proc. Siggraph 02, ACM Press.
TAN , P., L IN , S., Q UAN , L., AND S HUM , H.-Y. 2003. Highlight Removal by
D E C ARLO , D., F INKELSTEIN , A., RUSINKIEWICZ , S., AND S ANTELLA , A. 2003. Illumination-Constrained Inpainting . In Ninth IEEE International Conference on
Suggestive contours for conveying shape. ACM Trans. Graph. 22, 3, 848–855. Computer Vision.
D URAND , F. 2002. An Invitation to Discuss Computer Depiction. In Proceedings of TAN , K., KOBLER , J., D IETZ , P., F ERIS , R., AND R ASKAR , R. 2004. Shape-
NPAR 2002. Enhanced Surgical Visualizations and Medical Illustrations with Multi-Flash Imag-
E LDER , J. 1999. Are Edges Incomplete? . International Journal of Computer Vision ing. In MERL TR/38.
34, 2/3, 97–122. T OYAMA , K., K RUMM , J., B RUMITT, B., AND M EYERS , B. 1999. Wallflower:
FATTAL , R., L ISCHINSKI , D., AND W ERMAN , M. 2002. Gradient Domain High Principles and Practice of Background Maintenance. In ICCV, 255–261.
Dynamic Range Compression. In Proceedings of SIGGRAPH 2002, ACM SIG- WAKING L IFE, 2001. Waking Life, the movie.
GRAPH, 249–256. W EISS , Y. 2001. Deriving intrinsic images from image sequences. In Proceedings of
F ERIS , R., T URK , M., R ASKAR , R., TAN , K., AND O HASHI , G. 2004. Exploit- ICCV, vol. 2, 68–75.
ing Depth Discontinuities for Vision-based Fingerspelling Recognition. In IEEE YANG , D. K.-M. 1996. Shape from Darkness Under Error. PhD thesis, Columbia
Workshop on Real-time Vision for Human-Computer Interaction (in conjunction University.
with CVPR’04).
Eurographics Symposium on Rendering (2005)
Kavita Bala, Philip Dutré (Editors)
Computed Lighting for Digital Photography
Ankit Mohan, Jack Tumblin, Bobby Bodenheimer, Cindy Grimm, Reynold Bailey
Abstract
We apply simplified image-based lighting methods to reduce the equipment, cost, time, and specialized skills
required for high-quality photographic lighting of desktop-sized static objects such as museum artifacts. We place
the object and a computer-steered moving-head spotlight inside a simple foam-core enclosure, and use a camera
to quickly record low-resolution photos as the light scans the box interior. Optimization guided by interactive user
sketching selects a small set of frames whose weighted sum best matches the target image. The system then repeats
the lighting used in each of these frames, and constructs a high resolution result from re-photographed basis
images. Unlike previous image-based relighting efforts, our method requires only one light source, yet can achieve
high resolution light positioning to avoid multiple sharp shadows. A reduced version uses only a hand-held light,
and may be suitable for battery-powered, field photography equipment that fits in a backpack.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional
Graphics and Realism I.4.1 [Image Processing and Computer Vision]: Digitization and Image Capture I.3.3 [Com-
puter Graphics]: Picture/Image Generation
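The weighted-sum selection described in the abstract can be approximated with non-negative least squares over the low-resolution basis ("aiming") photos. The sketch below is only a rough stand-in for the paper's sketch-guided optimization; the greedy top-k light selection, function name and parameters are assumptions.

import numpy as np
from scipy.optimize import nnls

def choose_lights(basis, target, k=4):
    """Find non-negative weights on the low-resolution basis photos so
    their weighted sum best matches the target image, then keep the k
    strongest lights for re-photographing at full resolution."""
    A = np.stack([b.ravel() for b in basis], axis=1)   # pixels x n_lights
    w, _ = nnls(A, target.ravel())                     # non-negative least squares
    keep = np.argsort(w)[-k:]                          # indices of the strongest lights
    preview = (A[:, keep] @ w[keep]).reshape(target.shape)
    return keep, w[keep], preview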
c The Eurographics Association 2005.
Mohan, Tumblin, Bodenheimer, Grimm, Bailey / Computed Lighting for Digital Photography
Figure 1: Light placement for obtaining high quality photographs can be extremely tedious and time consuming (left). Our system uses a simple setup with a steerable spotlight and an uncalibrated enclosure (center) to obtain results comparable to professional lighting even when used by novice users (right).
image-based lighting ideas to reduce the required equip- the human face [DHT∗ 00], museum artifacts measured by a
ment to a single light source and single camera; we replace rotating-arm light stage [HCD01], an ingenious but exten-
trial-and-error light repositioning with optimization and on- sive system by Debevec et al. [DWT∗ 02] for real-time video
screen painting; and we reduce the need for high dynamic playback and measurement of light fields, a dome of elec-
range photography, thus reducing the capture time. The re- tronic flashes for real time image relighting [MGW01], a
sult is a novel and inexpensive system that a novice can use free form light stage to enable portable gathering of light-
to intuitively describe and obtain the desired lighting for a field data with some calibration [MDA02], and full 4D inci-
photograph. dent light measurements by Masselus et al. [MPDW03]. In
all of these cases, data-gathering required either customized
equipment or collection times much longer than would be
2. Related Work practical for photographic lighting.
Lighting has long been recognized as a hard problem in com- Three recent systems also offered novel sketch guided re-
puter graphics and many papers have explored optimization lighting from basis images. Akers et al. [ALK∗ 03] used a
for light placement and other parameters [SDS∗ 93, KPC93, robotic light-positioning gantry to gather precisely lit im-
PF95, CSF99, SL01]. Some of these systems used painting ages, and like us, provided a painting interface to guide re-
interfaces to specify desired lighting in a 3D scene [SDS∗ 93, lighting. But unlike us they used spatially varrying weights
PF95, PRJ97], and we use a similar approach to make light- that could produce physically impossible lighting. Digital
ing for photography more intuitive. The system by Shacked Photomontage [ADA∗ 04] used sketch guided graph-cut seg-
et al. [SL01] was even able to provide fully automatic mentation coupled with gradient domain fusion to seam-
lighting by applying image quality metrics. Marschner et lessly merge several photographs. They demonstrated merg-
al. [MG97] used inverse rendering techniques to estimate ing differenlty lit photographs to create novel illumination
and alter the directional distribution of incident light in a conditions. Though their interaction scheme worked well
photograph. However, all these systems require 3D informa- for a small number of images (∼10), it may be impracti-
tion unavailable in our photographic application. cal for the hundreds of images required for complete control
Several commercial photographic products have also used over lighting directions. Also, their system does nothing to
lighting enclosures similar to ours, but they achieve very help the user with light placement, and may produce phys-
soft lighting with limited user controls. Moreover, they do ically unrealizable results. Anrys and Dutre [AD04] used a
not help users solve light placement problems. These sys- Debevec-style light stage with around 40 fixed, low pow-
tems include diffusive tents [Pho], photo-boxes [MK ] and ered light sources and a painting interface to guide light-
translucent back-lit platforms with an array of individually ing. Their optimization only found light intensities, and light
dimmed light sources [Ast]. placement was still left up to the user. Also, their point
light sources could cause multiple shadows and highlights
Image-based methods have also been used to permit arbi- which might be undesirable for archival purposes. The data
trary relighting of well-measured objects. Most methods, in- capture time was high since they captured high-dynamic-
cluding ours, perform relighting using a weighted sum of range (HDR) photos for every light location.
differently lit basis images, done first by [NSD94]. How-
ever, prior efforts used more elaborate and expensive equip- Unlike previous attempts, our system does not require users
ment because their goals were different from ours. These in- to decide on correct or complete light source placement. This
clude measurement of a 4D slice of the reflectance field of is possible because our capture process is significantly dif-
camera on the light source and used camera calibration tech- target, the user guides the optimization, iteratively improv-
niques to estimate lighting directions with reasonable accu- ing the target until she gets the desired output. Figure 5(b)
racy. While these methods try to meet the ambitious goal of shows how simple approximate sketching on the target im-
incident light field capture, they would tax anyone’s patience age can give an interesting sidelighting effect. Figure 5(c)
to record more than just a few items. We present a faster and shows how the highlight can bring out the underlying tex-
simpler variant that serves our purposes better. ture in a surface.
In the method of Section 4, we required repeatable light Figure 5(d) shows lighting for a highly specular object. Good
source positioning. However, if we record all of our ‘aim- lighting for such smooth, highly reflective objects is always
ing images’ at the final output resolution, and if we either difficult, as the light source itself is visible in the reflec-
ignore over-exposed specular highlights or record high dy- tion. Our system produces results similar to the target image
namic range images when needed, then repeatability is not without large, objectionable saturated regions. In future sys-
needed. This allows us to use a hand-held light source in- tems we may hide the enclosure seams by constructing wide
stead. As shown in Figure 4, we use a small 250W hand-held smooth rounded corners resembling a photographer’s ‘cyc’.
light intended for television news cameras, attached to a dif-
Figure 5(f) shows results from the handheld method of Sec-
fuse reflector (foam core again), and limit the beam width
tion 5. The data gathering time was under 3 minutes, and
with barn-doors to form a well-defined area light source.
the results are comparable to the moving-head light method.
To gather all photos, we hold the light outstretched and While the handheld method is not practical for photograph-
“dance” (see video). We sample the hemisphere of lighting ing a large collection of objects, it can be an invaluable tool
directions by a polar-coordinate scan in φ -major order as the for well-lit photography in the field.
camera takes sequential photographs. A Nikon D70 camera,
takes a steady stream of photos at about 3 frames per second
7. Discussion and Future Work
using autoexposure for each frame. The user stands facing
the object, and holds the light at arms’ length while moving The ability to have large area light sources is crucial for pho-
the lamp in an arc that passes directly over the object. The tographing highly specular objects. Light source size also
user moves the lamp from one side of the table to the other, affects the sharpness of shadows and highlights. Our system
scanning by π radians in θ axis with constant φ , and the nat- has a unique advantage in that larger area light sources can
ural alignment of their shoulders helps aim the light’s cen- be simulated by combining pictures illuminated with over-
terline directly at the object. After each pass over the object lapping light sources. We could extend our optimization to
with the light, the user steps sideways to change the φ angle penalize each distinct light source cluster, thus preventing
for the next scan, and makes enough of these passes to cover disjoint highlights. The softness of the light can also be con-
0 ≤ φ < π radians. In practice the user can be more careless trolled by varying the beam width between a point-source
with the light, as long as the hemisphere of light directions and a large area source as it quickly sweeps over the hemi-
is well-sampled and the images are not over-exposed. After sphere of lighting directions. More advanced moving-head
the image capture dance is complete, we downsample all im- spotlights usually provide controllable spot sizes suitable for
ages to construct aiming photos, and proceed with the sketch this purpose.
guided lighting design as before.
Even though our system is aimed primarily at non-
We find this process is quite simple and pleasing, and in professional photographers, a few simple additions can make
under three minutes we can gather around 150 high-quality it a flexible tool for a creative expert to experiment with dif-
aiming/basis photos. An experienced user might not need to ferent lighting designs more easily. For example, the user
scan the whole hemisphere, but can quickly illuminate just might specify a simple weighting mask to set the impor-
from the useful and interesting lighting directions. tance of different image regions and influence the optimiza-
tion process. While weighting masks would make the sys-
tem more flexible, they would complicate the target sketch-
6. Results
ing process. We do not know yet if the results would warrant
Images in Figure 5 show results from our sketch guided the increase in complexity. Also, tools to directly tweak the
lighting system. Both the moving-head light and the hand- light position and size on a virtual hemisphere around the
held methods are equally successful at creating arbitrary object might also aid expert users.
cleanly-lit images of desktop-sized objects. The data sets
There are several possible ways of dealing with the ambient
gathered by either method is sufficiently dense to allow easy
light in the reflective enclosure. Underexposing all images
lighting design. Additionally, our system yields reasonable
using exposure compensation on the camera, using a larger
results even when presented with unrealistic targets or highly
enclosure or one made of materials with special reflective
reflective objects.
properties would greatly minimize the ambient component.
Figure 5(a), demonstrates a user interaction sequence with Finally, it might also be possible to explicitly subtract the
the system. Starting from a uniform grayscale image as the ambient term from the basis images.
This paper takes the problem of good lighting for desktop photography and finds a simple and
practical solution using image-based relighting techniques. More sophisticated image-based
measurements might also be achievable while maintaining the simplicity and elegance of the
system. For example, we could estimate the incoming light direction by calibrating the ad-hoc
enclosure setup with a light-probe, or by using dimensionality reduction [WMTG05] for the
hand-held case. Combined with surface normals, such calibration might suffice for image-based
estimates of BRDF.
8. Acknowledgements
We would like to thank the computer graphics group at Northwestern University for their
support and feedback and the anonymous reviewers for their helpful comments and suggestions.
Thanks to Jingjing Meng and Xing Hu for help in early stages of the project, Kelli Johnson and
Nathan Matsuda for help with the construction of the enclosure, Vincent Masselus and Amy Gooch
for help with Figure 2, and Holger Winnemöller for help with the video. This research was
funded in part by NSF grants 0238062 and 0237621.
References
[AD04] ANRYS F., DUTRÉ P.: Image based lighting design. In The 4th IASTED International
Conference on Visualization, Imaging, and Image Processing (2004).
[ADA*04] AGARWALA A., DONTCHEVA M., AGRAWALA M., DRUCKER S., COLBURN A., CURLESS B.,
SALESIN D., COHEN M.: Interactive digital photomontage. In SIGGRAPH (2004), vol. 23,
pp. 294–302.
[ALK*03] AKERS D., LOSASSO F., KLINGNER J., AGRAWALA M., RICK J., HANRAHAN P.: Conveying
shape and features with image-based relighting. In IEEE Visualization (2003).
[Ast] ASTRON SYSTEMS: Orbiculight lighting system. http://www.astronsys.com.
[CSF99] COSTA A. C., SOUSA A. A., FERREIRA F. N.: Lighting design: a goal based approach
using optimisation. In Proceedings of Eurographics Workshop on Rendering (1999), vol. 10,
pp. 317–328.
[DHT*00] DEBEVEC P., HAWKINS T., TCHOU C., DUIKER H.-P., SAROKIN W., SAGAR M.: Acquiring the
reflectance field of a human face. In SIGGRAPH (2000), pp. 145–156.
[DM97] DEBEVEC P. E., MALIK J.: Recovering high dynamic range radiance maps from photographs.
In SIGGRAPH (1997), vol. 31, pp. 369–378.
[DWT*02] DEBEVEC P., WENGER A., TCHOU C., GARDNER A., WAESE J., HAWKINS T.: A lighting
reproduction approach to live-action compositing. In SIGGRAPH (2002), pp. 547–556.
[HCD01] HAWKINS T., COHEN J., DEBEVEC P.: A photometric approach to digitizing cultural
artifacts. In Proceedings of conference on Virtual reality, archeology, and cultural heritage
(2001), pp. 333–342.
[KPC93] KAWAI J. K., PAINTER J. S., COHEN M. F.: Radioptimization: goal based rendering. In
SIGGRAPH (1993), pp. 147–154.
[MDA02] MASSELUS V., DUTRÉ P., ANRYS F.: The free-form light stage. In Proceedings of the
13th Eurographics workshop on Rendering (2002), pp. 247–256.
[MG97] MARSCHNER S. R., GREENBERG D. P.: Inverse lighting for photography. In Proceedings of
IS&T/SID Fifth Color Imaging Conference (1997), pp. 262–265.
[MGW01] MALZBENDER T., GELB D., WOLTERS H.: Polynomial texture maps. In SIGGRAPH (2001), ACM
Press, pp. 519–528.
[MK] MK DIGITAL: Photo light box. http://www.mkdigitaldirect.com.
[MPDW03] MASSELUS V., PEERS P., DUTRÉ P., WILLEMS Y. D.: Relighting with 4d incident light
fields. In SIGGRAPH (2003), vol. 22, pp. 613–620.
[NSD94] NIMEROFF J. S., SIMONCELLI E., DORSEY J.: Efficient re-rendering of naturally
illuminated environments. In Proceedings of the Fifth Eurographics Workshop on Rendering
(1994), pp. 359–373.
[NW99] NOCEDAL J., WRIGHT S. J.: Numerical Optimization. Springer-Verlag New York, 1999.
[PF95] POULIN P., FOURNIER A.: Painting surface characteristics. In Proceedings of
Eurographics Workshop on Rendering (1995), pp. 160–169.
[Pho] PHOTEK: Digital lighthouse. http://www.photekusa.com.
[PRJ97] POULIN P., RATIB K., JACQUES M.: Sketching shadows and highlights to position lights.
In Proceedings of Conference on Computer Graphics International (1997), pp. 56–63.
[PWH04] PANG W.-M., WONG T.-T., HENG P.-A.: Estimating light vectors in real time. IEEE
Computer Graphics and Applications 24, 3 (2004), 36–43.
[SDS*93] SCHOENEMAN C., DORSEY J., SMITS B., ARVO J., GREENBERG D.: Painting with light. In
SIGGRAPH (1993), pp. 143–146.
[SL01] SHACKED R., LISCHINSKI D.: Automatic lighting design using a perceptual quality
metric. Computer Graphics Forum 20, 3 (2001), 215–226.
[WMTG05] WINNEMÖLLER H., MOHAN A., TUMBLIN J., GOOCH B.: Light waving: Estimating light
positions from photographs alone. Computer Graphics Forum 24, 3 (2005), to appear.
Fig. 5. (a) Sequence showing successive sketching/optimization iterations to get the desired
lighting. The first result uses a constant grayscale target, while the others use previous
results as starting points for the target image. (b) Strategic placement of highlights in the
target results in an interesting side-lit image. (c) Positioning of highlights reveals
underlying texture in the surface. (d) Lighting a highly specular object by forcing the
background to be dark. (e) Target results in an image suggesting illumination from the right.
(f) Data captured by the handheld method. Image on the left uses a smooth grayscale gradient
as the target image.
1 Multi-Sampled Imaging
Currently, vision algorithms rely on images with 8 bits of brightness or color at
each pixel. Images of such quality are simply inadequate for many real-world
applications. Significant advances in imaging can be made by exploring the fun-
damental trade-offs that exist between various dimensions of imaging (see Figure
1). The relative importances of these dimensions clearly depend on the applica-
tion at hand. In any practical scenario, however, we are given a finite number
of pixels (residing on one or more detectors) to sample the imaging dimensions.
Therefore, it is beneficial to view imaging as the judicious assignment of resources
(pixels) to the dimensions of imaging that are relevant to the application.
Different pixel assignments can be viewed as different types of samplings of
the imaging dimensions. In all cases, however, more than one dimension is simul-
taneously sampled. In the simplest case of a gray-scale image, image brightness
and image space are sampled, simultaneously. More interesting examples result
from using image detectors made of an assortment of pixels, as shown in Figure 2.
[Fig. 1. The dimensions of imaging: brightness, spectrum, space, time, polarization, depth.]
Figure 2(a) shows the popular Bayer mosaic [Bay76] of red, green and blue
spectral filters placed adjacent to pixels on a detector. Since multiple color mea-
surements cannot be captured simultaneously at a pixel, the pixels are assigned
to specific colors to trade-off spatial resolution for spectral resolution. Over the
last three decades various color mosaics have been suggested, each one resulting
in a different trade-off (see [Dil77], [Dil78], [MOS83], [Par85], [KM85]).
Historically, multi-sampled imaging has only been used in the form of color
mosaics. Only recently has the approach been used to explore other imaging
dimensions. Figure 2(b) shows the mosaic of neutral density filters with different
transmittances used in [NM00] to enhance an image detector’s dynamic range.
In this case, spatial resolution is traded-off for brightness resolution (dynamic
range). In [SN01], spatially varying transmittance and spectral filters were used
with regular wide FOV mosaicing to yield high dynamic range and multi-spectral
mosaics. Figure 2(c) shows how space, dynamic range and color can be sampled
simultaneously by using a mosaic of filters with different spectral responses and
transmittances. This type of multi-sampling is novel and, as we shall show,
results in high dynamic range color images. Another example of assorted pixels
was proposed in [BE00], where a mosaic of polarization filters with different
orientations is used to estimate the polarization parameters of light reflected by
scene points. This idea can be used in conjunction with a spectral mosaic, as
shown in Figure 2(d), to achieve simultaneous capture of polarization and color.
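As a concrete aside, once a pixel's neighborhood supplies intensity measurements behind linear
polarizers at three orientations (say 0°, 45° and 90°), the linear polarization parameters
follow from the standard Stokes relations. The MATLAB sketch below illustrates this
calculation with made-up numbers; it is the textbook computation, not necessarily the
estimator used in [BE00].
% Intensities observed through linear polarizers at 0, 45 and 90 degrees
% (synthetic stand-ins for values gathered from a polarization mosaic).
I0 = 0.7; I45 = 0.55; I90 = 0.3;
S0 = I0 + I90;                       % total intensity
S1 = I0 - I90;                       % Stokes parameter s1
S2 = 2*I45 - S0;                     % Stokes parameter s2
dolp = sqrt(S1^2 + S2^2) / S0;       % degree of linear polarization
aolp = 0.5 * atan2(S2, S1);          % angle of linear polarization (radians)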
Multi-sampled imaging can be exploited in many other ways. Figure 2(e)
shows how temporal sampling can be used with exposure sampling. This ex-
ample is related to the idea of sequential exposure change proposed in [MP95],
[DM97] and [MN99] to enhance dynamic range. However, it is different in that
the exposure is varied as a periodic function of time, enabling the generation of
high dynamic range, high framerate video. The closest implementation appears
to be the one described in [GHZ92] where the electronic gain of the camera is
varied periodically to achieve the same effect. A more sophisticated implementa-
tion may sample space, time, exposure and spectrum, simultaneously, as shown
in Figure 2(f).
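For intuition, the sketch below shows the generic per-pixel merge that combines two frames
captured at different exposure times into one high dynamic range estimate, assuming a linear
radiometric response. It is in the spirit of [MP95] and [DM97], not the specific
reconstruction used with the temporal pattern of Figure 2(e); the file names and exposure
times are illustrative assumptions.
% Two frames of the same (static) scene at exposure times t1 and t2, assuming an
% 8-bit linear sensor; the frame file names are hypothetical.
f1 = double(imread('frame_short.png'))/255;  t1 = 1/500;
f2 = double(imread('frame_long.png'))/255;   t2 = 1/30;
% Hat weighting: trust mid-range pixels, distrust near-dark and near-saturated ones.
w  = @(z) max(1e-3, 1 - abs(2*z - 1));
E1 = f1/t1;  E2 = f2/t2;                     % per-frame radiance estimates
hdr = (w(f1).*E1 + w(f2).*E2) ./ (w(f1) + w(f2));   % weighted per-pixel merge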
Fig. 2. A few examples of multi-sampled imaging using assorted pixels. (a) A color mosaic.
Such mosaics are widely used in solid-state color cameras. (b) An exposure mosaic. (c) A
mosaic that includes different colors and exposures. (d) A mosaic using color and polarization
filters. (e), (f) Multi-sampling can also involve varying exposure and/or color over space and
time.
The above examples illustrate that multi-sampling provides a general framework for designing
imaging systems that extract information that is most pertinent
to the application. Though our focus is on the visible light spectrum,
multi-sampling is, in principle, applicable to any form of electromagnetic radia-
tion. Therefore, the pixel assortments and reconstruction methods we describe in
this paper are also relevant to other imaging modalities such as X-ray, magnetic
resonance (MR) and infra-red (IR). Furthermore, the examples we discuss are
two-dimensional but the methods we propose are directly applicable to higher-
dimensional imaging problems such as ones found in tomography and microscopy.
We are interested in a general method that can interpolate not just color
mosaic images but any type of multi-sampled data. For this, we propose the use
of a structural model where each reconstructed value is a polynomial function
of the image brightnesses measured within a local neighborhood. The size of the
neighborhood and the degree of the polynomial vary with the type of multi-
sampled data being processed. It turns out that the model of Wober and Soini
[WS95] is a special instance of our model as it is a first-order polynomial applied
to the specific case of color mosaic images. As we shall see, our polynomial
model produces excellent results for a variety of multi-sampled images. Since it
uses polynomials, our method is very efficient and can be easily implemented
in hardware. In short, it is simple enough to be incorporated into any imaging
device (digital still or video camera, for instance).
Fig. 3. Some of the 50 high quality images (2000 x 2000 pixels, 12 bits per color
channel) used to train the local structural models described in Sections 4, 5 and 6.
Some of the 50 high quality images we have used in our experiments are
shown in Figure 3. Each of these is a 2000 x 2000 color (red, green, blue) image
with 12 bits of information in each color channel. These images were captured
using a film camera and scanned using a 12-bit slide scanner. Though the total …
… and blue ones. Specifically, the spatial resolutions of green, red and blue are
50%, 25% and 25%, respectively. Note that the entire mosaic is obtained by
repeating the 2 × 2 pattern shown on the right in Figure 4. Therefore, given a
neighborhood size, all neighborhoods in a Bayer mosaic must have one of four
possible sampling patterns. If the neighborhood is of size 3 × 3, the resulting
patterns are p1, p2, p3 and p4 shown in Figure 4.
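To make the repeating pattern concrete, the following MATLAB sketch downgrades a full-color
image to a Bayer-type SVC image by keeping one color sample per pixel, using the 2 × 2 tile of
Figure 4 ([R G; G B]); the input file name is an illustrative assumption.
% Downgrade an RGB image to a Bayer-type SVC image (one color sample per pixel).
% The 2x2 tile follows Figure 4: [R G; G B]. The file name is illustrative.
rgb = double(imread('train.png'));           % H x W x 3
[H,W,C] = size(rgb);
[cc,rr] = meshgrid(1:W,1:H);
oddR = mod(rr,2)==1; oddC = mod(cc,2)==1;
chan = ones(H,W)*2;                          % default: green
chan(oddR & oddC)   = 1;                     % red at (odd row, odd column)
chan(~oddR & ~oddC) = 3;                     % blue at (even row, even column)
svc = zeros(H,W);
for c = 1:3
    tmp = rgb(:,:,c);
    svc(chan==c) = tmp(chan==c);
end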
Fig. 4. Spatially varying color (SVC) pattern on a Bayer mosaic. Given a neighborhood
size, all possible sampling patterns in the mosaic must be one of four types. In the case
of a 3 × 3 neighborhood, these patterns are p1, p2, p3 and p4.
Sp (i, j) is the neighborhood of pixel (i, j), Np is the order of the polynomial
and Cp are the polynomial coefficients for the pattern p. The coefficient indices
(a, b, c, d) are equal to (x − i, y − j, k − i, l − j).
The product Mp (x, y) Mp (k, l) explicitly represents the correlation between
different pixels in the neighborhood. For efficiency, we have not used these cross-
terms in our implementations. We found that very good results are obtained
Fig. 5. The measured data Mp in the neighborhood Sp (i, j) around pixel (i, j) are
related to the high quality color values Hp (i + 0.5, j + 0.5, λ) via a polynomial with
coefficients Cp .
Fig. 6. The mapping function in (3) can be expressed as a linear system using matrices
and vectors. For a given pattern p and color λ, Ap is the measurement matrix, Cp (λ)
is the coefficient vector and Hp (λ) is the reconstruction vector.
even when each desired value is expressed as just a sum of polynomial functions
of the individual pixel measurements:
H_p(i+0.5,\, j+0.5,\, \lambda) \;=\; \sum_{(x,y)\in S_p(i,j)} \; \sum_{n=0}^{N_p} C_p(a, b, \lambda, n)\, M_p^{\,n}(x, y). \qquad (3)
The mapping function (3), for each color λ and each local pattern type p,
can be conveniently rewritten using matrices and vectors, as shown in Figure 6:
For a given pattern type p and color λ, Ap is the measurement matrix. The
rows of Ap correspond to the different neighborhoods in the image that have the
pattern p. Each row includes all the relevant powers (up to Np ) of the measured
data Mp within the neighborhood. The vector Cp (λ) includes the coefficients
of the polynomial mapping function and the vector Hp (λ) includes the desired
high quality values at the off-grid neighborhood centers. The estimation of the
model parameters Cp can then be posed as a least squares problem:
C_p(\lambda) \;=\; \left(A_p^{T} A_p\right)^{-1} A_p^{T} H_p(\lambda). \qquad (5)
When the signal-to-noise characteristics of the image detector are known, (5) can
be rewritten using weighted least squares to achieve greater accuracy [Aut01].
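As a concrete illustration of (5), the MATLAB sketch below fits the coefficient vector Cp for
a single pattern and color from synthetic stand-in data; the sizes and variable names are
assumptions for illustration, not the paper's actual training pipeline.
% Minimal sketch of (5): fit the polynomial coefficients Cp for one pattern p and
% one color lambda by least squares. Synthetic stand-in data, not the real training set.
numNbr = 1000; u = 6; v = 6; Np = 2;           % neighborhood size and polynomial order
M = rand(numNbr, u*v);                         % each row: measured data Mp in one neighborhood
Ap = [ones(numNbr,1), M, M.^2];                % powers 0..Np of every measurement (no cross-terms)
Cp_true = randn(size(Ap,2), 1);                % hypothetical ground-truth coefficients
Hp = Ap*Cp_true + 0.01*randn(numNbr,1);        % noisy "high quality" target values
Cp = (Ap'*Ap) \ (Ap'*Hp);                      % normal equations, exactly as in (5)
% In practice Cp = Ap \ Hp is numerically preferable; with known sensor noise,
% a diagonal weight matrix W gives Cp = (Ap'*W*Ap) \ (Ap'*W*Hp).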
4.4 Experiments
A total of 30 high quality training images (see Figure 3) were used to compute
the structural model for SVC image interpolation. Each image is downgraded
to obtain a corresponding Bayer-type SVC image. For each of the four sampling
patterns in the Bayer mosaic, and for each of the three colors, the appropriate
image neighborhoods were used to compute the measurement matrix Ap and the
reconstruction vector Hp (λ). While computing these, one additional step was
taken; each measurement is normalized by the energy within its neighborhood
to make the structural model insensitive to changes in illumination intensity
and camera gain. The resulting Ap and Hp (λ) are used to find the coefficient
vector Cp (λ) using linear regression (see (5)). In our implementation, we used
the parameter values P = 4 (Bayer), Np = 2, u = v = 6 and Λ = 3 (R, G, B),
to get a total of 876 coefficients.
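For reference, this count follows from the parameter values above, assuming one constant
(bias) term per pattern and color:
$$P \cdot \Lambda \cdot (u\,v\,N_p + 1) = 4 \cdot 3 \cdot (6 \cdot 6 \cdot 2 + 1) = 12 \cdot 73 = 876.$$
The same per-pattern, per-color count of 73 is consistent with the 3504 coefficients reported
later for the SVEC model (16 · 3 · 73 = 3504).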
The above structural model was used to interpolate 20 test SVC images that
are different from the ones used for training. In Figure 7(a), a high quality (8-
bits per color channel) image is shown. Figure 7(b) shows the corresponding
(downgraded) SVC image. This is really a single channel 8-bit image and its
pixels are shown in color only to illustrate the Bayer pattern. Figure 7(c) shows
a color image computed from the SVC image using bi-cubic interpolation. As is
usually done, the three channels are interpolated separately using their respective
data in the SVC image. The magnified image region clearly shows that bi-cubic
interpolation results in a loss of high frequencies; the edges of the tree branches
and the squirrels are severely blurred. Figure 7(d) shows the result of applying
structural interpolation. Note that the result is of high quality with minimal loss
of details.
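As a reference point, the per-channel bi-cubic baseline used for Figure 7(c) can be sketched
in a few lines of MATLAB for the red channel, whose samples lie on a regular subgrid under the
assumed [R G; G B] tile; green (a quincunx) and blue are handled analogously, e.g. with
griddata. The input file name is illustrative.
% Bi-cubic interpolation of the red channel of a single-channel Bayer SVC image.
svc = double(imread('svc.png')); [H,W] = size(svc);
[Xs,Ys] = meshgrid(1:2:W, 1:2:H);            % red sample locations (odd rows/columns)
[Xq,Yq] = meshgrid(1:W, 1:H);                % full-resolution query grid
R = interp2(Xs, Ys, svc(1:2:end,1:2:end), Xq, Yq, 'cubic');
R(isnan(R)) = 0;                             % pixels outside the sample grid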
Fig. 7. (a) Original (high quality) color image with 8-bits per color channel. (b) SVC
image obtained by downgrading the original image. The pixels in this image are shown
in color only to illustrate the Bayer mosaic. Color image computed from the SVC
image using (c) bi-cubic interpolation and (d) structural interpolation. (e) Histograms
of luminance error (averaged over 20 test images). The RMS error is 6.12 gray levels
for bi-cubic interpolation and 3.27 gray levels for structural interpolation.
Fig. 8. The dynamic range of an image detector can be improved by assigning different
exposures to pixels. In this special case of 4 exposures, any 6 × 6 neighborhood in the
image must belong to one of four possible sampling patterns shown as p1 . . . p4.
5.2 Experiments
The SVE structural model was trained using 12-bit gray-scale versions of 6 of
the images shown in Figure 3 and their corresponding 8-bit SVE images. Each
SVE image was obtained by applying the exposure pattern shown in Figure 8
(with e4 = 4e3 = 16e2 = 64e1) to the original image, followed by a downgrade
from 12 bits to 8 bits. The structural model was tested using 6 test images, one
of which is shown in Figure 9. Figure 9(a) shows the original 12-bit image, Figure
9(b) shows the downgraded 8-bit SVE image, Figure 9(c) shows a 12-bit image
obtained by bi-cubic interpolation of the SVE image, and Figure 9(d) shows the
12-bit image obtained by structural interpolation. The magnified images shown
on the right are histogram equalized to bring out the details (in the clouds and
walls) that are lost during bi-cubic interpolation but extracted by structural
interpolation. Figure 9(e) compares the error histograms (computed using all
6 test images) for the two cases. The RMS errors were found to be 33.4 and
25.5 gray levels (in a 12-bit range) for bi-cubic and structural interpolations,
respectively. Note that even though a very small number (6) of images were
used for training, our method outperforms bi-cubic interpolation.
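The downgrade used to create the SVE test images can be pictured with the following MATLAB
sketch, which applies a 2 × 2 exposure tile to a 12-bit image and requantizes to 8 bits. Only
the ratios e4 = 4e3 = 16e2 = 64e1 come from the text; the spatial arrangement of e1 … e4
within the tile and the file name are assumptions.
% Simulate a spatially varying exposure (SVE) image from a 12-bit gray-scale image.
img12 = double(imread('gray12.png'));                 % assumed values in [0, 4095]
[H,W] = size(img12);
e = [1 4 16 64] / 64;                                 % relative exposures e1..e4
tile = [e(4) e(3); e(2) e(1)];                        % assumed 2x2 arrangement
expo = repmat(tile, ceil(H/2), ceil(W/2));
expo = expo(1:H, 1:W);
sve8 = uint8(min(round(img12 .* expo * (255/4095)), 255));   % 8-bit SVE image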
Fig. 9. (a) Original 12-bit gray scale image. (b) 8-bit SVE image. (c) 12-bit (high
dynamic range) image computed using bi-cubic interpolation. (d) 12-bit (high dynamic
range) image computed using structural models. (e) Error histograms for the two cases
(averaged over 6 test images). The RMS errors for the 6 images are 33.4 and 25.5 gray
levels (on a 12-bit scale) for bi-cubic and structural interpolation, respectively. The
magnified image regions on the right are histogram equalized.
Fig. 10. A combined mosaic of 3 spectral and 4 neutral density filters used to simulta-
neously sample space, color and exposures using a single image detector. The captured
8-bit SVEC image can be used to compute a 12-bit (per channel) color image by
structural interpolation. For this mosaic, for any neighborhood size, there are only 16
distinct sampling patterns. For a 4 × 4 neighborhood size, the patterns are p1 . . . p16.
Color and exposure filters can be used to construct an SVEC sensor in many
ways. All possible mosaics that can be constructed from Λ colors and E exposures
are derived in [Aut01]. Here, we will focus on the mosaic shown in Figure 10,
where 3 colors and 4 exposures are used. For any given neighborhood size, it is
shown in [Aut01] that only 16 different sampling patterns exist (see Figure 10).
6.2 Experiments
The SVEC structural model was trained using 6 of the images in Figure 3. In
this case, the 12-bit color images in the training set were downgraded to 8-bit
SVEC images. The original and SVEC images were used to compute the 3504
coefficients of the model. The model was then used to map 10 different test
SVEC images to 12-bit color images. One of these results is shown in Figure
11. The original 12-bit image shown in Figure 11(a) was downgraded to obtain
the 8-bit SVEC image shown in Figure 11(b). This image has a single channel
and is shown here in color only to illustrate the effect of simultaneous color and exposure
sampling. Figures 11(c) and (d) show the results of applying bi-cubic and structural
interpolation, respectively. It is evident from the magnified images on the right that
structural interpolation yields greater spectral and spatial resolution. The two interpolation
techniques are compared in Figure 11(e), which shows error histograms computed using all 10
test images. The RMS luminance errors were found to be 118 gray-levels and 80 gray-levels (on
a 12-bit scale) for bi-cubic and structural interpolations, respectively.
Fig. 11. (a) Original 12-bit color image. (b) 8-bit SVEC image. 12-bit color images
reconstructed using (c) bi-cubic interpolation and (d) structural interpolation. (e) Luminance
error histogram computed using 10 test images. RMS luminance errors were found to be 118 and
80 (on a 12-bit scale) for bi-cubic and structural interpolation, respectively. The magnified
images on the right are histogram equalized.
Acknowledgements: This research was supported in part by an ONR/ARPA
MURI Grant (N00014-95-1-0601) and an NSF ITR Award (IIS-00-85864).
References
[Aut01] Authors. Resolution Enhancement Along Multiple Imaging Dimensions Using
Structural Models. Technical Report, Authors’ Institution, 2001.
[Bay76] B. E. Bayer. Color imaging array. U.S. Patent 3,971,065, July 1976.
[BE00] M. Ben-Ezra. Segmentation with Invisible Keying Signal. In CVPR, 2000.
[BK00] S. Baker and T. Kanade. Limits on Super-Resolution and How to Break
Them. In Proc. of CVPR, pages II:372–379, 2000.
[Bra65] R. N. Bracewell. The Fourier Transform and Its Applications. McGraw Hill,
1965.
[Bra94] D. Brainard. Bayesian method for reconstructing color images from trichro-
matic samples. In Proc. of IS&T 47th Annual Meeting, pages 375–380,
Rochester, New York, 1994.
[Dil77] P. L. P. Dillon. Color imaging array. U.S. Patent 4,047,203, September 1977.
[Dil78] P. L. P. Dillon. Fabrication and performance of color filter arrays for solid-
state imagers. In IEEE Trans. on Electron Devices, volume ED-25, pages
97–101, 1978.
[DM97] P. Debevec and J. Malik. Recovering High Dynamic Range Radiance Maps
from Photographs. Proc. of ACM SIGGRAPH 1997, pages 369–378, 1997.
[FP99] W. Freeman and E. Pasztor. Learning Low-Level Vision. In Proc. of ICCV,
pages 1182–1189, 1999.
[GHZ92] R. Ginosar, O. Hilsenrath, and Y. Zeevi. Wide dynamic range camera. U.S.
Patent 5,144,442, September 1992.
[KM85] K. Knop and R. Morf. A new class of mosaic color encoding patterns for
single-chip cameras. In IEEE Trans. on Electron Devices, volume ED-32,
1985.
[MN99] T. Mitsunaga and S. K. Nayar. Radiometric Self Calibration. In Proc. of
CVPR, pages I:374–380, 1999.
[MOS83] D. Manabe, T. Ohta, and Y. Shimidzu. Color filter array for IC image sensor.
In Proc. of IEEE Custom Integrated Circuits Conference, pages 451–455, 1983.
[MP95] S. Mann and R. Picard. Being ‘Undigital’ with Digital Cameras: Extending
Dynamic Range by Combining Differently Exposed Pictures. Proc. of IS&T's
48th Annual Conf., pages 442–448, May 1995.
[NM00] S. K. Nayar and T. Mitsunaga. High Dynamic Range Imaging: Spatially
Varying Pixel Exposures. In Proc. of CVPR, pages I:472–479, 2000.
[Par85] K. A. Parulski. Color filters and processing alternatives for one-chip cameras.
In IEEE Trans. on Electron Devices, volume ED-32, 1985.
[SN01] Y. Y. Schechner and S. K. Nayar. Generalized mosaicing. In ICCV, 2001.
[WS95] M. A. Wober and R. Soini. Method and apparatus for recovering image data
through the use of a color test pattern. U.S. Patent 5,475,769, Dec. 1995.
Code A: Matlab Code for Poisson Image Reconstruction from Image Gradients
% Read Input Gray Image
imgstr = 'test.png'; disp(sprintf('Reading Image %s',imgstr));
img = imread(imgstr); [H,W,C] = size(img); img = double(img);
% Find gradients (forward differences)
gx = zeros(H,W); gy = zeros(H,W); j = 1:H-1; k = 1:W-1;
gx(j,k) = (img(j,k+1) - img(j,k)); gy(j,k) = (img(j+1,k) - img(j,k));
% Laplacian (divergence of the gradient field)
gxx = zeros(H,W); gyy = zeros(H,W);
gyy(j+1,k) = gy(j+1,k) - gy(j,k); gxx(j,k+1) = gx(j,k+1) - gx(j,k);
f = gxx + gyy; clear j k gxx gyy
% Dirichlet boundary: keep the original intensities on the image border.
% (The boundary handling and sine-transform solve below were missing in the
% notes; they follow the standard DST-based Poisson solver.)
boundary_image = img; boundary_image(2:end-1,2:end-1) = 0;
j = 2:H-1; k = 2:W-1; f_bp = zeros(H,W);   % boundary contribution to the Laplacian
f_bp(j,k) = -4*boundary_image(j,k) + boundary_image(j,k+1) + boundary_image(j,k-1) ...
    + boundary_image(j-1,k) + boundary_image(j+1,k);
f1 = f - f_bp; f2 = f1(2:end-1,2:end-1); clear f_bp f1
f2sin = dst(dst(f2)')';                    % forward 2D sine transform
[x,y] = meshgrid(1:W-2,1:H-2);             % eigenvalues of the discrete Laplacian
denom = (2*cos(pi*x/(W-1)) - 2) + (2*cos(pi*y/(H-1)) - 2);
%divide
f3 = f2sin./denom; clear f2sin x y
img_tt = idst(idst(f3)')';                 % inverse 2D sine transform
% put solution in inner points; outer points obtained from boundary image
img_direct = boundary_image;
img_direct(2:end-1,2:end-1) = img_tt;
img_rec = img_direct;
figure;imagesc(img);colormap gray;colorbar;title('Image')
figure;imagesc(img_rec);colormap gray;colorbar;title('Reconstructed');
figure;imagesc(abs(img_rec-img));colormap gray;colorbar;title('Abs error');
function b=dst(a,n)
%DST Discrete sine transform (Used in Poisson reconstruction)
% Y = DST(X) returns the discrete sine transform of X.
% The vector Y is the same size as X and contains the
% discrete sine transform coefficients.
% Y = DST(X,N) pads or truncates the vector X to length N
% before transforming.
% If X is a matrix, the DST operation is applied to each
% column. This transform can be inverted using IDST.
error(nargchk(1,2,nargin));
if min(size(a))==1
if size(a,2)>1
do_trans = 1;
else
do_trans = 0;
end
a = a(:);
else
do_trans = 0;
end
if nargin==1, n = size(a,1); end
m = size(a,2);
% (The transform body below was missing in the notes; it is the standard
% FFT-of-odd-extension implementation of the type-I discrete sine transform.)
% Pad or truncate a if necessary
if size(a,1)<n
    aa = zeros(n,m); aa(1:size(a,1),:) = a;
else
    aa = a(1:n,:);
end
y = zeros(2*(n+1),m);
y(2:n+1,:) = aa;
y(n+3:2*(n+1),:) = -flipud(aa);
yy = fft(y);
b = yy(2:n+1,:)/(-2*sqrt(-1));
if isreal(a), b = real(b); end
if do_trans, b = b.'; end
function b=idst(a,n)
%IDST Inverse discrete sine transform (Used in Poisson reconstruction)
%
% X = IDST(Y) inverts the DST transform, returning the
% original vector if Y was obtained using Y = DST(X).
% X = IDST(Y,N) pads or truncates the vector Y to length N
% before transforming.
% If Y is a matrix, the IDST operation is applied to
% each column.
if nargin==1
if min(size(a))==1
n=length(a);
else
n=size(a,1);
end
end
nn=n+1; b=2/nn*dst(a,n);
Code B: Matlab Code for Graph Cuts on Images
% Read gray scale image
I = double(imread('test.png')); [H,W,C] = size(I);
figure;imagesc(I);colormap gray;title('Image');
% Parameters (these values were not specified in the notes; they are illustrative)
pad = 2;            % half-width of the neighborhood used for the affinities
a = -0.1;           % negative scale in exp(): similar intensities give weight near 1
ws = 2*pad + 1;
if(ws <= 3) ws = 3; end
% Affinity (weight) matrix between each pixel and the pixels in its neighborhood.
% Note: WM is (H*W) x (H*W) and dense, so this is practical only for small images.
tic
WM = zeros(H*W,H*W); countWM = 0;
for kk = 1:W
    for jj = 1:H
        mask = logical(zeros(H,W));
        cs = kk-pad; ce = kk+pad; rs = jj-pad; re = jj+pad;
        if(cs<1) cs = 1; end;
        if(ce>W) ce = W; end;
        if(rs<1) rs = 1; end;
        if(re>H) re = H; end;
        mask(rs:re,cs:ce) = 1;
        idx = find(mask==1);
        p = abs(I(idx) - I(jj,kk)); p = exp(a*p);
        countWM = countWM + 1; WM(countWM,idx) = p(:)';
    end
end
ttime = toc; disp(sprintf('Time for generating weight matrix = %f',ttime)); clear countWM
% Two-way normalized cut from the second-smallest generalized eigenvector.
% (This solve step was missing in the notes; it follows Shi and Malik, cited in the
% bibliography under Graph Cuts.)
d = sum(WM,2); D = diag(d);
[V,S] = eig(D - WM, D);
[ss,order] = sort(real(diag(S)));
v2 = real(V(:,order(2)));
Ncut = reshape(v2 > median(v2), H, W);
figure;imagesc(Ncut);colormap gray;title('Segmentation');
function img1 = bilateralFilter(img,winsize,sigma)
% Bilateral Filtering(img,winsize,sigma)
% Input -> Image img (gray scale, double)
% -> winsize: spatial filter width
% -> sigma for intensity diff gaussian filter
% -> sigma for spatial filter = winsize/6
% Output -> Filtered Image
% Author: Amit Agrawal, 2004
% (The padding amount, spatial kernel and per-pixel weight computation were missing in
% the notes and are filled in below in the standard way; the function name is
% illustrative. Example: out = bilateralFilter(double(imread('test.png')),5,10);)
disp('Bilateral Filtering');
[H,W] = size(img);
padnum = floor(winsize/2);
sigma_sp = winsize/6;                              % spatial (domain) sigma
[xx,yy] = meshgrid(-padnum:padnum,-padnum:padnum);
G_sp = exp(-(xx.^2 + yy.^2)/(2*sigma_sp^2));       % fixed spatial kernel
A = padarray(img,[padnum padnum],'replicate','both');
img1 = zeros(size(img));
for jj = padnum+1:(padnum+1+H-1)
    for kk = padnum+1:(padnum+1+W-1)
        imgwin = A(jj-padnum:jj+padnum, kk-padnum:kk+padnum);  % local window
        G_int = exp(-(imgwin - A(jj,kk)).^2/(2*sigma^2));      % intensity (range) kernel
        newW = G_sp.*G_int;                                    % combined bilateral weights
        t = sum(sum(newW));
        if(t>0)
            newW = newW/t;
        end
        img1(jj-padnum,kk-padnum) = sum(sum(imgwin.*newW));
    end
end
Bibliography
AKERS, D., LOSASSO, F., KLINGNER, J., AGRAWALA, M., RICK, J., AND HANRAHAN, P. 2003.
Conveying shape and features with image-based relighting. In IEEE Visualization, 349–354.
BURT, P., AND KOLCZYNSKI, R. 1993. Enhanced image capture through fusion. In International Conference on
Computer Vision (ICCV 93), 173–182.
LEVIN, A., ZOMET, A., PELEG, S., AND WEISS, Y. 2004. Seamless image stitching in the gradient domain. In
European Conference on Computer Vision (ECCV 04).
MASSEY, M., AND BENDER, W. 1996. Salient stills: Process and practice. IBM Systems Journal 35, 3&4, 557–
574.
MUTTER, S., AND KRAUSE, M. 1992. Surrational Images: Photomontages. University of Illinois Press.
ROBINSON, H. P. 1869. Pictorial Effect in Photography: Being Hints on Composition and Chiaroscuro for
Photographers. Piper & Carter.
AGARWALA, A., DONTCHEVA, M., AGRAWALA, M., DRUCKER, S., COLBURN, A., CURLESS, B., SALESIN, D.,
AND COHEN, M. Interactive Digital Photomontage. ACM Transactions on Graphics (Proceedings of
SIGGRAPH 2004), 2004.
Time
BRAUN, M. 1992. Picturing Time: The Work of Etienne-Jules Marey. The University of Chicago Press.
FREEMAN, W. T., AND ZHANG, H. 2003. Shape-time photography. In Conference on Computer Vision and
Pattern Recognition (CVPR 03), 151–157.
Exposure
FATTAL, R., LISCHINSKI, D., AND WERMAN, M. 2002. Gradient domain high dynamic range compression.
ACM Transactions on Graphics 21, 3, 249–256.
REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for
digital images. ACM Transactions on Graphics 21, 3, 267– 276.
DEBEVEC, AND MALIK. 1997. Recovering high dynamic range radiance maps from photographs. In Proc.
SIGGRAPH.
DURAND, AND DORSEY. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans.
on Graphics 21, 3.
MANN, AND PICARD. 1995. Being ’undigital’ with digital cameras: Extending dynamic range by combining
differently exposed pictures. In Proc. IS&T 48th ann. conference.
TUMBLIN, AND TURK. 1999. LCIS: A boundary hierarchy for detail-preserving contrast reduction. In Proc.
SIGGRAPH.
DICARLO, J., AND WANDELL, B. 2000. Rendering high dynamic range images. Proc. SPIE: Image Sensors
3965, 392–401.
Focus
MORGAN MCGUIRE, MATUSIK, PFISTER, HUGHES, AND DURAND, Defocus Video Matting, ACM
Transactions on Graphics, Vol 24, No 3, July 2005, (Proceedings of ACM SIGGRAPH 2005).
Passive Illumination
RASKAR, R., ILIE, A., AND YU, J. 2004. Image fusion for context enhancement and video surrealism. In NPAR
2004: Third International Symposium on Non- Photorealistic Rendering.
WEISS, Y. 2001. Deriving intrinsic images from image sequences. In International Conference On Computer Vision
(ICCV 01), 68–75.
Polarization
S. K. NAYAR, X. FANG, and T. E. BOULT, Removal of Specularities using Color and Polarization, Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition,
Wavelength
D. A. SOCOLINSKY, “Dynamic range constraints in image fusion and realization.” Proc. IASTED
Int. Conf. Signal and Image Process, 349-354 (2000).
Location
B. Wilburn, N. Joshi, V. Vaish, E. Talvala, E. Antunez, A. Barth, A. Adams, M. Horowitz,
M. Levoy, High-Performance Imaging Using Large Camera Arrays. ACM Transactions on Graphics,
Vol 24, No 3, July 2005, pp 765-776 (Proceedings of ACM SIGGRAPH 2005).
Matting
CHUANG, Y.-Y., CURLESS, B., SALESIN, D., AND SZELISKI, R. 2001. A Bayesian approach to digital
matting. In Proceedings of Computer Vision and Pattern Recognition (CVPR 2001), vol. II, 264 – 271.
PORTER, T., AND DUFF, T. 1984. Compositing digital images. In Computer Graphics (Proceedings of ACM
SIGGRAPH 84), vol. 18, 253–259.
SMITH, A. R., AND BLINN, J. F. 1996. Blue screen matting. In Proceedings of ACM SIGGRAPH 96, 259–268.
Jian SUN, Jiaya JIA, Chi-Keung TANG and Heung-Yeung SHUM, Poisson Matting, ACM Transactions on
Graphics, also in SIGGRAPH 2004, vol. 23, no. 3, July 2004, pages 315-321.
Techniques
General
DANIELSSON, P.-E. 1980. Euclidean distance mapping. Computer Graphics and Image Processing 14, 227–248.
LUCAS, B. D., AND KANADE, T. 1981. An iterative image registration technique with an application to stereo
vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI ’81), 674–679.
MORTENSEN, E. N., AND BARRETT, W. A. 1995. Intelligent scissors for image composition. In Proceedings of
SIGGRAPH 95, Computer Graphics Proceedings, Annual Conference Series, 191–198.
Graph Cuts
BOYKOV, Y., VEKSLER, O., AND ZABIH, R. 2001. Fast approximate energy minimization via graph cuts. IEEE
Transactions on Pattern Analysis and Machine Intelligence 23, 11, 1222–1239.
KWATRA, V., SCHÖDL, A., ESSA, I., TURK, G., AND BOBICK, A. 2003. Graphcut textures: Image and video
synthesis using graph cuts. ACM Transactions on Graphics 22, 3, 277–286.
SHI, J., AND MALIK, J. Normalized Cuts and Image Segmentation. IEEE Conf. Computer Vision and Pattern
Recognition (CVPR), June 1997, Puerto Rico
Gradient Domain
PEREZ, P., GANGNET, M., AND BLAKE, A. 2003. Poisson image editing. ACM Transactions on Graphics 22, 3,
313–318.
C. TOMASI, AND R. MANDUCHI, Bilateral Filtering for Gray and Color Images, Proc. IEEE Intl. Conference on
Computer Vision, pp. 836-846, 1998.
CHOUDHURY, P., TUMBLIN, J., "The Trilateral Filter for High Contrast Images and Meshes", Proc. of the
Eurographics Symposium on Rendering, Per. H. Christensen and Daniel Cohen eds., pp. 186-196, 2003
Feature Extraction
Shape/Material/Illumination, Surface normals
BASRI, R. JACOBS, D. Photometric stereo with general, unknown lighting, Computer Vision and Pattern
Recognition, 2001
B. K. P. HORN, "Shape from shading: A method for obtaining the shape of a smooth opaque object from one view,"
MIT Project MAC Int. Rep. TR-79 and MIT AI Lab, Tech. Rep. 232, Nov. 1970.
TODD ZICKLER, PETER N. BELHUMEUR, AND DAVID J. KRIEGMAN, "Helmholtz Stereopsis: Exploiting
Reciprocity for Surface Reconstruction." Proc. 7th European Conference on Computer Vision, May 2002. Vol. III,
pp 869-884.
Depth edges
Ramesh RASKAR , Karhan TAN, Rogerio FERIS, Jingyi YU, Matthew TURK, Non-photorealistic Camera: Depth
Edge Detection and Stylized Rendering Using a Multi-Flash Camera, SIGGRAPH 2004
Depth
S. K. NAYAR and Y. NAKAGAWA, Shape from Focus, IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 16, No. 8, pp. 824-831.
Elmar EISEMANN and Frédo DURAND, Flash Photography Enhancement Via Intrinsic Relighting, SIGGRAPH
2004
Georg PETSCHNIGG, Maneesh AGRAWALA, Hugues HOPPE, Richard SZELISKI, Michael COHEN, Kentaro
TOYAMA. Digital Photography with Flash and No-Flash Image Pairs. ACM Transactions on Graphics (Proceedings
of SIGGRAPH 2004), 2004.
DICARLO, J. M., XIAO, F., AND WANDELL, B. A. 2001. Illuminating illumination. In 9th Color Imaging
Conference, 27–34.
Noise
P. MEER, J. JOLION, AND A. ROSENFELD, "A Fast Parallel Algorithm For Blind Estimation of Noise Variance,"
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 2, pp. 216-223, 1990.
Eric P. BENNETT and Leonard McMILLAN "Video Enhancement Using Per-Pixel Virtual Exposures"
SIGGRAPH 2005
Geometric Operations
Panorama
DAVIS, J. 1998. Mosaics of scenes with moving objects. In Computer Vision and Pattern Recognition (CVPR 98),
354–360.
UYTTENDAELE, M., EDEN, A., AND SZELISKI, R. 2001. Eliminating ghosting and exposure artifacts in image
mosaics. In Conference on Computer Vision and Pattern
Recognition (CVPR 01), 509–516.
SZELISKI, R., AND SHUM, H.-Y. 1997. Creating full view panoramic mosaics and environment maps. In
Proceedings of SIGGRAPH 97, Computer Graphics Proceedings,
Annual Conference Series, 251–258.
Synthetic Aperture
Marc LEVOY, Billy CHEN, Vaibhav VAISH, Mark HOROWITZ, Ian MCDOWALL, Mark BOLAS,
Synthetic Aperture Confocal Imaging. ACM SIGGRAPH 2004.
A. STERN and B. JAVIDI, "3-D computational synthetic aperture integral imaging (COMPSAII)," Opt. Express 11,
2446-2451 (2003),
C. OLIVER and S. QUEGAN, Understanding Synthetic Aperture Radar Images. London: Artech House, 1998.
M. BEN-EZRA AND S. K. NAYAR , Motion Deblurring using Hybrid Imaging, In Proc. IEEE Computer Vision
and Pattern Recognition (CVPR), Wisconsin, June 2003.
O. LANDOLT, A. MITROS, AND C. KOCH, “Visual Sensor with Resolution Enhancement by Mechanical
Vibrations,” Proc. 2001 Conf. Advanced Research in VLSI, pp. 249-264, 2001.
MEMS Technology
S. K. NAYAR, V. BRANZOI, AND T. BOULT. Programmable Imaging using a Digital Micromirror Array,
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Washington DC, June 2004.
B. WANDELL, P. CATRYSSE, J. DICARLO, D. YANG AND A. EL GAMAL Multiple Capture Single Image
Architecture with a CMOS Sensor , In Proceedings of the International Symposium on Multispectral Imaging and
Color Reproduction for Digital Archives, pp. 11-17, Chiba, Japan, October 21-22 1999. (Society of Multispectral
Imaging of Japan.)
S. KLEINFELDER, S.H. LIM, X.Q. LIU AND A. EL GAMAL A 10,000 Frames/s CMOS Digital Pixel Sensor, In
IEEE Journal of Solid State Circuits, Vol.36, No.12, Pages 2049-2059, December 2001
X.Q. LIU AND ABBAS EL GAMAL, Synthesis of High Dynamic Range Motion Blur Free Image From Multiple
Captures, In IEEE Transactions on circuits and systems (TCASI), VOL. 50, NO. 4, pp 530-539, APRIL 2003
Ramesh RASKAR, Amit AGRAWAL, Jack TUMBLIN, Coded Exposure Photography: Motion Deblurring using
Fluttered Shutter, ACM SIGGRAPH 2006.
Programmable SIMD
JOHANSSON, R., LINDGREN, L., MELANDER, J., AND MOLLER, B. 2003. A multi-resolution 100 gops 4
gpixels/s programmable cmos image sensor for machine vision. In Proceedings of the 2003 IEEE Workshop on
Charge-Coupled Devices and Advanced Image Sensors, IEEE.
Advanced, Programmable, Demodulating Cameras and Temporal Correlation
F. XIAO, J. DICARLO, P. CATRYSSE AND B. WANDELL, Image Analysis using Modulated Light Sources, In
Proceedings of the SPIE Electronic Imaging '2001 conference, Vol. 4306, San Jose, CA, January 2001.
ANDO, S., K. NAKAMURA, AND T. SAKAGUCHI. Ultrafast Correlation Image Sensor: Concept, Design, and
Applications. In Proc. IEEE CCD/AIS Workshop, 1997, Bruges, Belgium: IEEE.
ANDO, S. AND A. KIMACHI. Time-Domain Correlation Image Sensor: First CMOS Realization of Demodulator
Pixels Array. in Proc. '99 IEEE CCD/AIS Workshop. 1999.