Computer Vision: CSE 576, Ali Farhadi

Computer Vision

CSE 576
Ali Farhadi
Many slides from Steve Seitz, Larry Zitnick, Yang Wang

Course Information
•  Time:
–  Monday, Wednesday 1:30-2:50
•  Location:
–  MGH 238
•  Contact:
–  ali@cs.uw.edu, CSE 652
•  TA:
–  Dun-Yu Hsiao
–  dyhsiao@cs.washington.edu
•  Website:
–  http://www.cs.washington.edu/education/courses/cse576/15sp/
What does it mean to see?
The car is in front of the pole
Sky
Person
Road
White
Horse
Car
Shadow
1m
Wheel
Computer Vision
•  Low Level Vision
–  Measurements
–  Enhancements
–  Region segmentation
–  Features
•  Mid Level Vision
–  Reconstruction
–  Depth
–  Motion Estimation
•  High Level Vision
–  Category detection
–  Activity recognition
–  Deep understandings
Computer Vision
–  Measurements
–  Enhancements
–  Features
–  Depth
–  Motion Estimation
[Figure labels: White, Shadow, 1m]
Vision as Measurement Device
Real-time stereo on Mars

Physics-based Vision
Structure from Motion Virtualized Reality

Slide Credit: Alyosha Efros
Measurement
Brightness
Measurement
Brightness

Measurement
Length
Müller-Lyer Illusion
http://www.michaelbach.de/ot/sze_muelue/index.html
Slide Credit: Alyosha Efros
Image Enhancement
Image Inpainting, M. Bertalmío et al.

http://www.iua.upf.es/~mbertalmio//restoration.html
Image Enhancement

Image Enhancement

Seam Carving
[Avidan & Shamir, SIGGRAPH 2007]

Traditional resizing
Content-aware resizing
[Avidan & Shamir, SIGGRAPH 2007]
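
As a rough illustration of the content-aware resizing named on this slide, the sketch below finds one lowest-energy vertical seam by dynamic programming and removes it. The energy function (sum of absolute horizontal and vertical intensity differences), the function names, and the list-of-lists grayscale image representation are assumptions made for this example; they are not taken from the slides or from the paper's exact formulation.

```python
def energy(img):
    """Assumed energy: |left - right| + |up - down|, with borders clamped."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            e[y][x] = abs(right - left) + abs(down - up)
    return e


def find_vertical_seam(e):
    """Return one column index per row for a minimum-total-energy connected seam."""
    h, w = len(e), len(e[0])
    # cost[y][x] = cheapest total energy of any seam ending at (y, x)
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop the seam pixel from every row, shrinking the width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]
```

Calling `remove_vertical_seam(img, find_vertical_seam(energy(img)))` repeatedly narrows the image one column at a time while tending to avoid high-contrast regions, in contrast to uniform rescaling.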

Computer Vision
–  Measurements
–  Enhancements
–  Features
–  Depth
[Figure label: The car is in front of the pole]
[Figure: input image (1 of 45) and views of the reconstruction]
Source: S. Seitz

Input Image
(1 of 100)
Views of Reconstruction
Yasutaka Furukawa and Jean Ponce, Carved Visual Hulls for Image-Based Modeling,
ECCV 2006.
Google’s 3D Maps
Structure estimation from tourist photos
Apple’s 3D maps
Computer Vision
–  Measurements
–  Enhancements
•  Features
•  Mid Level Vision
–  Depth
–  Pose estimation
[Figure labels: Sky, Person, Road, Car, Horse]
Visual Recognition?
•  What does it mean to “see”?
–  “What” is “where”, Marr 1982
•  Get computers to “see”

Visual Recognition
Verification
Is this a car?
Visual Recognition
Classification:
Is there a car in this picture?
Visual Recognition
Detection:
Where is the car in this picture?
Visual Recognition
Pose Estimation:
Visual Recognition
Activity Recognition:
What is he doing? What is he doing?

Visual Recognition
Object Categorization:
Sky
Person
Tree
Horse
Car
Person
Bicycle
Road
Visual Recognition
Segmentation
Sky
Tree
Car
Person
How hard is computer vision?
“In 1966, Minsky hired a first-year
undergraduate student and assigned him
a problem to solve over the summer:
connect a television camera to a
computer and get the machine to
describe what it sees.”
Crevier 1993, pg. 88
Marvin Minsky, MIT
Turing award, 1969
Gerald Sussman, MIT
“You’ll notice that Sussman never worked
in vision again!” – Berthold Horn
Why is vision so hard?
•  Ill-posed problem
[Sinha and Adelson 1993]

Challenges 1: viewpoint variation
Michelangelo 1475-1564
slide by Fei-Fei, Fergus & Torralba

Challenges 2: illumination
slide credit: S. Ullman

Challenges 3: occlusion
Magritte, 1957
slide by Fei-Fei, Fergus & Torralba

Challenges 4: scale
slide by Fei-Fei, Fergus & Torralba

Challenges 5: deformation
Xu, Beihong 1943
slide by Fei-Fei, Fergus & Torralba

Challenges 6: background clutter
Klimt, 1913
slide by Fei-Fei, Fergus & Torralba

Challenges 7: object intra-class variation
slide by Fei-Fei, Fergus & Torralba

Challenges 8: local ambiguity
slide by Fei-Fei, Fergus & Torralba

Challenges 9: the world behind the image

What Works Today?
•  Reading license plates, zip codes, checks
Svetlana Lazebnik
Biometrics
Fingerprint scanners on many new laptops, other devices
Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/

Source: S. Seitz

Mobile visual search: Google Goggles
Face detection
•  Many new digital cameras now detect faces

–  Canon, Sony, Fuji, …

Source: S. Seitz
Smile detection
Sony Cyber-shot® T70 Digital Still Camera
Source: S. Seitz

Face recognition: Apple iPhoto,
Facebook, Google, etc.
Object recognition (in supermarkets)
LaneHawk by Evolution Robotics

“A smart camera is flush-mounted in the checkout lane, continuously watching
for items. When an item is detected and recognized, the cashier verifies the
quantity of items that were found under the basket, and continues to close the
transaction. The item can remain under the basket, and with LaneHawk, you
are assured to get paid for it…”
Safety
Security
Automotive safety
•  Mobileye: Vision systems in high-end BMW, GM, Volvo models

–  Pedestrian collision warning
–  Forward collision warning
–  Lane departure warning
–  Headway monitoring and warning
Source: A. Shashua, S. Seitz
Google cars
Oct 9, 2010. "Google Cars Drive Themselves, in Traffic". The New York Times. John Markoff
June 24, 2011. "Nevada state law paves the way for driverless cars". Financial Post. Christine Dobby
Aug 9, 2011. "Human error blamed after Google's driverless car sparks five-vehicle crash". The Star (Toronto)
Vision-based interaction: Xbox Kinect
Kinect Fusion
Augmented reality, consumer products
http://nconnex.com/wp/
Special effects: shape and motion capture
Source: S. Seitz

Vision for robotics, space exploration
NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
Vision systems (JPL) used for several tasks
•  Panorama stitching
•  3D terrain modeling
•  Obstacle detection, position tracking
•  For more, read “Computer Vision on Mars” by Matthies et al.
Source: S. Seitz
Medical imaging
3D imaging
MRI, CT
Image guided surgery
Grimson et al., MIT
Computer vision in other scientific fields

Computer vision research in biology
http://www.vision.caltech.edu/visipedia/
http://leafsnap.com/
Computer vision in cosmology
http://astrometry.net/
Computer vision research in healthcare
assisted living, patient monitoring
autism screening
[Lan et al, PAMI 2012]
http://www.gatech.edu/newsroom/release.html?nid=60509
Computer vision in the real world
•  Most examples are less than 5 years old
•  Very active research area. Many new applications to come.
•  A website of computer vision industries maintained by Prof. David Lowe (UBC):
http://www.cs.ubc.ca/~lowe/vision.html
Tentative Syllabus
•  Image Processing (2 weeks)
–  filtering, convolution
–  image pyramids
–  edge detection
–  feature detection (corners, lines)
–  Hough transform
•  Image Transformation (2 weeks)
–  image warping (parametric transformations, texture mapping)
–  image compositing (alpha blending, color mosaics)
–  segmentation and matting (snakes, scissors)
•  Motion Estimation (1 week)
–  optical flow
–  image alignment
–  image mosaics
–  feature tracking
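
To make the first Image Processing topics in the syllabus above (filtering, convolution, edge detection) concrete, here is a minimal sketch in plain Python. It is not the course's assignment code: the function names, the zero-padding border rule, and the choice of Sobel kernels as the edge filter are assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels (assumed choice of edge filter for this sketch).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def convolve2d(img, kernel):
    """Same-size 2D convolution with zero padding (the kernel is flipped)."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Flipped indexing makes this convolution, not correlation.
                    yy, xx = y + oy - i, x + ox - j
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


def sobel_magnitude(img):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = convolve2d(img, SOBEL_X)
    gy = convolve2d(img, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]
```

On an image with a vertical step edge, the magnitude is large along the step and zero in flat interior regions. Zero padding also produces spurious responses at the image border, which is one reason other border rules (clamping, reflection) are often preferred in practice.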
Syllabus
•  3D Modeling (1 week)
–  projective geometry
–  camera modeling
–  single view metrology
–  camera calibration
–  stereo
•  Computational Photography (1 week)
–  Super resolution
–  Alpha Matting
–  Blur removal
–  Poisson Blending
•  Visual Recognition (3 weeks)
–  Eigenfaces
–  Category Recognition
–  Object Detection
–  Kinect
Grading
•  Four assignments (10 points each + extra points)
–  Mix of coding and written answers.
–  Using Qt (cross-platform UI in C++) qt.nokia.com
–  Use of interactive UIs for exploring and gaining intuition
1.  Filters and edge detection
2.  Creating panoramas
3.  Computing depth from stereo
4.  Face detection
•  FINAL PROJECT (60 points + 20 extra points)
Assignment 1: Image Filtering
10 Points
Assignment 2: Panorama Stitching
10 Points
Assignment 3: Stereo Reconstruction
10 Points
Assignment 4: Face Detection
10 Points
Final Project
60 Points + 20 Extra points
•  Big Project
–  Better if related to your own research
–  Demo is a BIG plus
•  Proposal is due on 4/6
–  One paragraph
–  Crisp final outcome/deliverable
•  Progress Reports are due on
–  4/15, 4/29, 5/13, 5/27
–  What has changed since last report
•  Final Presentation will be on 6/3
–  Demo/Posters @ CSE atrium
Sample Projects
From Taskar Center for Accessible Technology
Project: My Kingdom

Sample Projects
Project: Curb Alert

Sample Projects
Project: Silent Movie

Samples of Previous Projects
•  Visual Calculator
•  Seam Carving
•  X-ray bone fracture detection
•  Pipe leak detection
•  Is it gonna be viral?
•  Deep learning for object recognition
•  …
Project Ideas
•  Seam Carving
•  Video Stabilization
•  Detecting Shadows
•  RGBD Object Detection
•  Features
–  Learning Features
–  Features for regions
–  Comparison of features in the literature
•  Action Recognition
–  Human pose
–  Objects and Interactions
–  Using Kinect
–  Detecting unattended luggage
–  Egocentric
•  Language & Vision
•  Grab cut
•  Object Detection in Videos
•  Video Google
•  Matching Images and Videos in the wild
•  Reading Street Signs
•  Wearable Cameras for visually impaired users
•  Auto Zooming
•  Visual Odometer
•  Smart stop lights
Books
Calibration
•  How many of you
–  have taken an undergrad vision course?
–  have taken an ML course?
–  have taken a Graphics course?
–  remember your undergrad linear algebra course?
–  have any concerns about programming?

Do these words remind you of something?
Interest Point, SIFT, Laplacian, Eigenvalue, SVD, SVM, MRF, Stereo, Random Forest, Graph cut
Preferences
•  Low level vision?
•  Mid level vision?

•  High level vision?
