AIS412 - Lecture 1

Fall 24
Deep Learning for Computer Vision
AIS412
MUSTAFA ELATTAR
*This course material is sourced from Carnegie Mellon University for Computer Vision and Stanford University for the CNN for Visual Recognition course.
What is computer vision?
AIS412 - DEEP LEARNING FOR COMPUTER VISION MUSTAFA ELATTAR 2


What a person sees
What a computer sees
Photo by Svetlana Lazebnik
Why are we able to interpret this image?

The goal of computer vision is to give computers (super) human-level perception.
Typical perception pipeline: representation → ‘fancy math’ → output
representation: what should we look at? (image features)
‘fancy math’: easy to get lost in the techniques
output: what can we understand? (semantic segmentation)
The parts that we are most interested in

Important note:
In general, computer vision does not work (except in certain situations/conditions).
Applications of computer vision
Machine Vision
Automated visual inspection

Object Recognition
Toshiba Tech IS-910T 2013
DataLogic LaneHawk LH4000 2012

Face detection
Age recognition
Sony Cyber-shot
Smile recognition
Face makeovers
BMW 5 series
BMW night vision

“Around view” camera
Infiniti EX
Image stitching
Photosynth
Tango
Ball Possession
Virtual Fitting
Deep Face
Deep Dream
Facebook video style transfer 2016
Industry aggressively hiring CV faculty from universities
Industry aggressively hiring CV graduates, or even students!
(strong, dominant industrial presence at conferences for recruitment)
ITCS Vision and Mission
Vision:
To be a world-class school, recognized as one of the top in
the region in research, education and entrepreneurship.
Mission:
The mission of the school is to contribute to the development
of cultural values and to information technology-driven
economies in the region through the pursuit of education,
research, innovation and entrepreneurship at the highest
levels of excellence.

AIS412 - Course Aim
The aim of this course is to provide students with a
comprehensive understanding of deep learning methods
and their practical applications in Computer Vision. By the
end of the course, students will have gained a solid
foundation in the basic components of deep learning
algorithms, and will be able to apply them to real-life
scenarios. The course aims to equip students with the skills
and knowledge necessary to contribute to cutting-edge
developments in the field of computer vision and deep
learning, and to inspire them to think creatively and
innovatively in their use of these methods.
AIS412 - ILOs
A1 Explain the basic components and workings of deep learning algorithms, including CNNs, RNNs, attention mechanisms, encoder-decoder models, and generative models.
A2 List and describe real-life use cases of deep learning, including voice, NLP, and vision applications.
A3 Identify the limitations of traditional machine learning and the advantages of deep learning over machine learning.
B1 Apply deep learning algorithms to real-life scenarios, including developing and training neural networks for various applications.
B2 Investigate and analyze the performance of deep learning models and identify areas for improvement.
B3 Combine and test different deep learning techniques to solve complex problems and explore creative applications.
C1 Design and suggest deep learning solutions for specific use cases and evaluate their effectiveness.
C2 Present and report on the results of deep learning projects, including analysis of performance and areas for improvement.
C3 Link and judge the ethical and social implications of deep learning, including issues of bias and fairness in algorithmic decision-making.
D1 Collaborate with peers in developing and testing deep learning solutions.
D2 Apply entrepreneurial skills to develop and explore creative applications of deep learning.
AIS412 - Syllabus
Date | Topics | Lab | Assignments
30 Sep | Topic 1: Course Introduction; Topic 2: Advanced Image processing - Hough transforms | |
7 Oct | Topic 2: Advanced Image processing - Hough transforms; Topic 3: Feature Detection - Corner detection | Lab 1 |
14 Oct | Topic 3: Feature Detection - Corner detection; Topic 4: Feature Detection - Feature descriptors | Lab 2 | Assignment 1
21 Oct | Topic 5: Stereo Vision | Lab 3 | Assignment 1 Deadline
28 Oct | Topic 6: Motion Tracking: Optical flow (LK, HS) - Tracking (KLT, Mean-Shift); Topic 7: Image Registration: Correspondence Finding | Lab 4 | Assignment 2
4 Nov | Topic 8: Object Features Learning: K-means, Bag of words | Lab 5 | Assignment 2 Deadline
11 Nov | Midterm Week | |
18 Nov | Topic 9: Classification, Loss Functions and Optimization; Topic 10: Convolutional Neural Networks; Topic 11: Training Neural Networks | Lab 6 | Project registration (one project per five students)
25 Nov | Topic 12: CNN Architectures (VGG, GoogLeNet, ResNet, etc.); Topic 13: Recurrent Neural Networks | Lab 7 | Assignment 3
2 Dec | Topic 14: Detection and Segmentation | Lab 8 | Assignment 3 Deadline
9 Dec | Topic 15: Generative Models | Lab 9 |
16 Dec | | |
23 Dec | Backup Week | |
30 Dec | | |
Project Submission, Discussion, and Oral Presentation
6 Jan | Study Week | |
AIS412 – Grading Scheme and Resources
Grading Policy:
● Attendance 3%
● 3 Assignments 18% (3*6)
● Tutorial and Lab 14% (9*1.5)
● Quizzes 5%
● Project 15%
● Midterm 15%
● Final 30%
Handouts:
● Lectures + Labs
● Textbooks:
– Computer Vision: Algorithms and Applications, 2nd ed., by Richard Szeliski
– Deep Learning, by Ian Goodfellow

Team
Instructor:
Dr. Mustafa A. Elattar
Associate Professor
melattar@nu.edu.eg
Office: 210
Office hours: Tuesday 8:30 to 10:00
TA:
Eng. Aly Abdelmegeid
alymohamed@nu.edu.eg
Office: 220
Office hours: Monday 11:00 to 1:00

AIS412
Lecture 2: Hough Transform
MUSTAFA ELATTAR
LECTURE 2: HOUGH TRANSFORM AND CORNER DETECTION

Slide Credits
Most of these slides were adapted from:
• Kris Kitani (15-463, Fall 2016).
Some slides were inspired or taken from:
• Fredo Durand (MIT).
• James Hays (Georgia Tech).

Lecture Overview
Finding boundaries
Line fitting
Line parameterizations
Hough transform
Hough circles
Finding Boundaries
Where are the object boundaries?
Human annotated boundaries
Edge detection
Multi-scale edge detection
Edge strength does not necessarily correspond to our perception of boundaries.
Defining boundaries is hard for us too.
Where is the boundary of the mountain top?
Applications
Autonomous vehicles (lane line detection)
Tissue engineering (blood vessel counting)
Behavioral genetics (earthworm contours)
Autonomous vehicles (semantic scene segmentation)
Computational photography (image inpainting)
Line Fitting
Line fitting
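Given: Many $(x_i, y_i)$ pairs
Find: Parameters $(m, c)$
Minimize: Average square distance: $E = \frac{1}{N}\sum_i (y_i - m x_i - c)^2$
Using: $\frac{\partial E}{\partial m} = 0$ and $\frac{\partial E}{\partial c} = 0$
Note: with $\bar{x} = \frac{1}{N}\sum_i x_i$ and $\bar{y} = \frac{1}{N}\sum_i y_i$, the solution is $c = \bar{y} - m\bar{x}$ and $m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$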
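The closed-form least-squares solution can be checked numerically. The following is a minimal sketch, assuming the slope-intercept model $y = mx + c$ and an error equal to the mean squared vertical distance; the function names and the sample points are illustrative and not part of the lecture.

```python
def fit_line_least_squares(points):
    """Fit y = m*x + c by minimizing the average squared vertical distance."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    # Closed form obtained by setting dE/dm = 0 and dE/dc = 0.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x are equal: a vertical line has no slope-intercept form")
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c


def mean_squared_error(points, m, c):
    """Average squared vertical distance between the points and y = m*x + c."""
    return sum((y - m * x - c) ** 2 for x, y in points) / len(points)


points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # exact samples of y = 2x + 1
m, c = fit_line_least_squares(points)
print(m, c, mean_squared_error(points, m, c))
```

The explicit check for equal x values makes one weakness of this parameterization visible: a vertical line cannot be represented at all.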
What are some problems with the approach?

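Problems with parameterizations
Where is the line that minimizes E?
The line we would draw through near-vertical data has a huge E!
The line that actually minimizes E can look nothing like it!!
(With $y = mx + c$, E measures vertical distance, and a vertical line has no finite slope.)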

Problems with noise
Least-squares error fit: squared error heavily penalizes outliers, so a few outliers can pull the fitted line away from the inliers.
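The sensitivity to outliers is easy to reproduce. Below is a small sketch, assuming a least-squares fit of $y = mx + c$; the data (ten exact samples of $y = x$ plus one invented gross error) and the helper name `fit_line` are made up for illustration.

```python
def fit_line(points):
    """Least-squares fit of y = m*x + c (closed form)."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    m = sxy / sxx
    return m, y_mean - m * x_mean


inliers = [(float(i), float(i)) for i in range(10)]  # exact samples of y = x
outlier = (10.0, 40.0)                               # a single gross measurement error

m_clean, c_clean = fit_line(inliers)
m_noisy, c_noisy = fit_line(inliers + [outlier])
print("without outlier:", m_clean, c_clean)
print("with one outlier:", m_noisy, c_noisy)
```

One bad point out of eleven is enough to more than double the estimated slope, because its residual enters the error squared.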

Model fitting is difficult because…
• Extraneous data: clutter or multiple models
– We do not know what is part of the model
– Can we pull out models with a few parts from much larger
amounts of background clutter?
• Missing data: only some parts of model are present
• Noise
• Cost:
– It is not feasible to check all combinations of features by
fitting a model to each possible subset
So what can we do?

Line parameterizations
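Slope intercept form
$y = mx + c$ ($m$: slope, $c$: y-intercept)
Double intercept form
$\frac{x}{a} + \frac{y}{b} = 1$ ($a$: x-intercept, $b$: y-intercept)
Derivation: (Similar slope) $\frac{y - b}{x - 0} = \frac{0 - b}{a - 0}$, which rearranges to $\frac{x}{a} + \frac{y}{b} = 1$
Normal Form
$x\cos\theta + y\sin\theta = \rho$ ($\rho$: distance from the origin to the line, $\theta$: angle of the line's normal)
Derivation: $\cos\theta = \frac{\rho}{a}$ gives $a = \frac{\rho}{\cos\theta}$, and $\sin\theta = \frac{\rho}{b}$ gives $b = \frac{\rho}{\sin\theta}$;
plug into: $\frac{x}{a} + \frac{y}{b} = 1$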
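The relationship between the three forms can be verified with a few lines of code. This is a sketch under the assumption that the normal form is written $x\cos\theta + y\sin\theta = \rho$ with $\theta$ the angle of the line's normal; the function names are hypothetical.

```python
import math


def slope_intercept_to_normal(m, c):
    """Convert y = m*x + c into (rho, theta) with x*cos(theta) + y*sin(theta) = rho."""
    # Rewrite the line as -m*x + y = c, then scale so the normal (-m, 1) has unit length.
    norm = math.hypot(m, 1.0)
    theta = math.atan2(1.0, -m)  # direction of the unit normal, always in (0, pi)
    rho = c / norm               # signed distance from the origin to the line
    return rho, theta


def intercepts(rho, theta):
    """Intercepts a = rho/cos(theta), b = rho/sin(theta).

    Not meaningful when the line is parallel to an axis (cos or sin is zero).
    """
    return rho / math.cos(theta), rho / math.sin(theta)


rho, theta = slope_intercept_to_normal(1.0, 2.0)  # the line y = x + 2
a, b = intercepts(rho, theta)
print(rho, math.degrees(theta), a, b)
```

For $y = x + 2$ the intercepts come out as $a = -2$ and $b = 2$, which matches the double intercept form $\frac{x}{-2} + \frac{y}{2} = 1$.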
Hough transform
Hough transform
• Generic framework for detecting a parametric model
• Edges don’t have to be connected
• Lines can be occluded

• Key idea: edges vote for the possible models
Image and parameter space
Image space: $y = mx + c$, with variables $(x, y)$ and parameters $(m, c)$
Parameter space: $c = -xm + y$, with variables $(m, c)$ and parameters $(x, y)$
A line in image space becomes a point in parameter space.
What would a point in image space become in parameter space?
A point in image space becomes a line in parameter space.
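This duality can be made concrete. The sketch below assumes the slope-intercept parameterization, so an image point $(x, y)$ maps to the parameter-space line $c = -xm + y$; the function names are illustrative only.

```python
def point_to_parameter_line(x, y):
    """An image point (x, y) becomes the line c = -x*m + y in (m, c) space.

    Returned as (slope, intercept) of that parameter-space line.
    """
    return -x, y


def intersect_parameter_lines(p, q):
    """Intersect the parameter-space lines of two image points p and q."""
    s1, i1 = point_to_parameter_line(*p)
    s2, i2 = point_to_parameter_line(*q)
    if s1 == s2:
        raise ValueError("points share the same x: the image line is vertical")
    m = (i2 - i1) / (s1 - s2)  # solve s1*m + i1 = s2*m + i2
    c = s1 * m + i1
    return m, c


m, c = intersect_parameter_lines((1.0, 3.0), (2.0, 5.0))
print(m, c)

# A third collinear point maps to a parameter-space line through the same (m, c).
s3, i3 = point_to_parameter_line(3.0, 7.0)
print(s3 * m + i3 == c)
```

The intersection of the two parameter-space lines is exactly the $(m, c)$ of the image line through both points, which is what voting exploits.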

Two points become ? Two lines, which intersect at the $(m, c)$ of the line through both points.
Three points become ? Three lines.
Four points become ? Four lines; the lines from collinear points all pass through one common point in parameter space.

How would you find the best fitting line?
Is this method robust to measurement noise?

Is this method robust to outliers?
Line Detection by Hough Transform
Algorithm:
1. Quantize Parameter Space $(m, c)$
2. Create Accumulator Array $A(m, c)$
3. Set $A(m, c) = 0$ for all $(m, c)$
4. For each image edge $(x_i, y_i)$
   For each element in $A(m, c)$
     If $(m, c)$ lies on the line $c = -x_i m + y_i$:
       Increment $A(m, c) = A(m, c) + 1$
5. Find local maxima in $A(m, c)$
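The voting procedure can be sketched in a few lines. The version below is an illustration, not the lecture's reference implementation: it assumes a slope-intercept accumulator, loops over the quantized slopes and computes the matching intercept directly (equivalent to testing every cell), and returns only the single strongest cell rather than all local maxima. All names and the bin ranges are hypothetical.

```python
def hough_lines_mc(edge_points, m_min, m_max, m_bins, c_min, c_max, c_bins):
    """Vote in a quantized (m, c) accumulator for lines y = m*x + c."""
    m_step = (m_max - m_min) / (m_bins - 1)
    c_step = (c_max - c_min) / (c_bins - 1)
    acc = [[0] * c_bins for _ in range(m_bins)]  # accumulator initialized to zero
    for x, y in edge_points:                     # every edge point votes
        for i in range(m_bins):
            m = m_min + i * m_step
            c = -x * m + y                       # cells on the line c = -x*m + y
            j = round((c - c_min) / c_step)
            if 0 <= j < c_bins:
                acc[i][j] += 1
    # Simplification: report the global maximum only.
    votes, i, j = max((acc[i][j], i, j) for i in range(m_bins) for j in range(c_bins))
    return m_min + i * m_step, c_min + j * c_step, votes


# Four points on y = 2x + 1 plus one outlier.
edges = [(0, 1), (1, 3), (2, 5), (3, 7), (1, 0)]
m, c, votes = hough_lines_mc(edges, -5.0, 5.0, 11, -10.0, 10.0, 21)
print(m, c, votes)
```

The outlier still votes, but its votes are spread over cells that the four collinear points do not share, so it does not move the peak.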
Problems with parameterization
How big does the accumulator need to be for the parameterization $(m, c)$?
The space of $m$ is huge! The space of $c$ is huge!
(Both are unbounded: a vertical line has infinite slope.)

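Better Parameterization
Use normal form: $x\cos\theta + y\sin\theta = \rho$
Given points $(x_i, y_i)$ find $(\rho, \theta)$
Image space: a point. Hough space: a sinusoid.
$\theta$ and $\rho$ are both bounded (Finite Accumulator Array Size)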
Image space: variables $(x, y)$, parameters $(\rho, \theta)$
Hough space: variables $(\rho, \theta)$, parameters $(x, y)$
A point becomes? A point becomes a wave (a sinusoid).
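Why a point becomes a wave follows from the normal form $x\cos\theta + y\sin\theta = \rho$: for a fixed point, $\rho$ is a sinusoid in $\theta$ with amplitude $\sqrt{x^2 + y^2}$ and phase $\operatorname{atan2}(y, x)$. A minimal sketch, with illustrative names and sample points:

```python
import math


def point_to_sinusoid(x, y, thetas):
    """rho(theta) = x*cos(theta) + y*sin(theta) for one image point (x, y)."""
    return [x * math.cos(t) + y * math.sin(t) for t in thetas]


thetas = [math.radians(d) for d in range(0, 180, 45)]  # 0, 45, 90, 135 degrees
curve_p = point_to_sinusoid(0.0, 2.0, thetas)           # image point (0, 2)
curve_q = point_to_sinusoid(-2.0, 0.0, thetas)          # image point (-2, 0)

# Both points lie on y = x + 2, whose normal form has theta = 135 degrees and
# rho = sqrt(2), so the two curves take the same value at theta = 135 degrees.
print(curve_p[3], curve_q[3])

# Same curve written as amplitude * cos(theta - phase).
amplitude, phase = math.hypot(3.0, 4.0), math.atan2(4.0, 3.0)
direct = point_to_sinusoid(3.0, 4.0, thetas)
as_cosine = [amplitude * math.cos(t - phase) for t in thetas]
print(direct)
print(as_cosine)
```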

A line becomes? A line becomes a point $(\rho, \theta)$.
Wait … why is rho negative?

There are two ways to write the same line (the same line through the point):
Positive rho version: $x\cos\theta + y\sin\theta = \rho$
Negative rho version: $x\cos(\theta + \pi) + y\sin(\theta + \pi) = -\rho$
Recall: $\cos(\theta + \pi) = -\cos\theta$ and $\sin(\theta + \pi) = -\sin\theta$
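The equivalence of the two versions can be checked directly. The sketch below assumes the normal form $x\cos\theta + y\sin\theta = \rho$ and folds any $(\rho, \theta)$ into the range $\theta \in [0, \pi)$, which is why negative $\rho$ values appear; `canonicalize` and `line_residual` are hypothetical helper names.

```python
import math


def line_residual(x, y, rho, theta):
    """Zero when (x, y) lies on the line x*cos(theta) + y*sin(theta) = rho."""
    return x * math.cos(theta) + y * math.sin(theta) - rho


def canonicalize(rho, theta):
    """Return the equivalent (rho, theta) with theta in [0, pi); rho may become negative."""
    theta = theta % (2.0 * math.pi)
    if theta >= math.pi:
        theta -= math.pi
        rho = -rho
    return rho, theta


rho, theta = math.sqrt(2.0), math.radians(315.0)  # the line x - y = 2
rho2, theta2 = canonicalize(rho, theta)
print(rho2, math.degrees(theta2))

# Both parameter pairs describe the same line: points on x - y = 2 satisfy both.
for x, y in [(2.0, 0.0), (0.0, -2.0), (5.0, 3.0)]:
    print(line_residual(x, y, rho, theta), line_residual(x, y, rho2, theta2))
```

Restricting $\theta$ to half a turn and allowing signed $\rho$ keeps each line at exactly one accumulator cell instead of two.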

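Two points become ? Two sinusoids, which intersect at the $(\rho, \theta)$ of the line through both points.
Three points become ? Three sinusoids.
Four points become ? Four sinusoids; the sinusoids from collinear points all pass through one common $(\rho, \theta)$.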

Implementation
NOTE: Watch your coordinates. Image origin is top left!
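A compact implementation sketch follows. It is not the lecture's code: it assumes a binary edge image stored as a list of rows, treats pixel (row, col) as $(y, x)$ with the origin at the top left, uses the normal form $x\cos\theta + y\sin\theta = \rho$ with $\theta \in [0, \pi)$ and 1-pixel $\rho$ bins, and reports only the strongest cell. Function names are hypothetical.

```python
import math


def hough_lines(edge_image, n_theta=180):
    """Accumulate (theta, rho) votes for every non-zero pixel of a binary edge image."""
    height, width = len(edge_image), len(edge_image[0])
    rho_max = math.ceil(math.hypot(height, width))
    thetas = [k * math.pi / n_theta for k in range(n_theta)]  # [0, pi)
    # rho lies in [-rho_max, rho_max]; the offset rho_max keeps indices non-negative.
    acc = [[0] * (2 * rho_max + 1) for _ in range(n_theta)]
    for row in range(height):
        for col in range(width):
            if edge_image[row][col]:
                x, y = col, row  # image origin is the top-left corner
                for k, theta in enumerate(thetas):
                    rho = x * math.cos(theta) + y * math.sin(theta)
                    acc[k][round(rho) + rho_max] += 1
    return acc, thetas, rho_max


def strongest_line(acc, thetas, rho_max):
    """Return (rho, theta, votes) of the accumulator cell with the most votes."""
    votes, k, r = max((acc[k][r], k, r)
                      for k in range(len(acc)) for r in range(len(acc[0])))
    return r - rho_max, thetas[k], votes


# Synthetic test image: one horizontal edge along row 12 of a 40 x 100 image.
height, width, line_row = 40, 100, 12
image = [[1 if row == line_row else 0 for _ in range(width)] for row in range(height)]
acc, thetas, rho_max = hough_lines(image)
rho, theta, votes = strongest_line(acc, thetas, rho_max)
print(rho, math.degrees(theta), votes)
```

A horizontal edge at row 12 should peak at $\theta = 90°$ and $\rho = 12$; with the origin at the top left, $\rho$ here measures distance downward from the top of the image.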

Image space → Votes
Basic shapes (in parameter space): can you guess the shape?
line, rectangle, circle

More complex image
In practice, measurements are noisy…
Too much noise …

Effects of noise level
[Plot: maximum number of votes vs. noise level, for a line of 20 points with increasing noise]
More noise, fewer votes (in the right bin)

Effects of noise level
[Plot: maximum number of votes vs. number of noise points]
More noise, more votes (in the wrong bin)

Real-world example
Original → Edges → Parameter space → Hough lines

Hough Circles
Let’s assume radius known
Image space: $(x - a)^2 + (y - b)^2 = r^2$, with variables $(x, y)$ and parameters $(a, b)$
Parameter space: $(a - x)^2 + (b - y)^2 = r^2$, with variables $(a, b)$ and parameters $(x, y)$
What is the dimension of the parameter space? Two: $(a, b)$.
What does a point in image space correspond to in parameter space? A circle of radius $r$ centered at $(x, y)$.
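With a known radius, each edge point votes for every center lying at distance $r$ from it, that is, along a circle in $(a, b)$ space. The following sketch assumes a circle model $(x - a)^2 + (y - b)^2 = r^2$, a 1-pixel accumulator, and synthetic edge points; the function names and the angular sampling (`n_angles`) are illustrative choices.

```python
import math


def hough_circles_known_radius(edge_points, radius, width, height, n_angles=360):
    """Accumulate votes for circle centers (a, b) when the radius is known."""
    acc = [[0] * width for _ in range(height)]
    for x, y in edge_points:
        cells = set()  # each edge point votes at most once per cell
        for k in range(n_angles):
            phi = 2.0 * math.pi * k / n_angles
            a = round(x - radius * math.cos(phi))
            b = round(y - radius * math.sin(phi))
            if 0 <= a < width and 0 <= b < height:
                cells.add((a, b))
        for a, b in cells:
            acc[b][a] += 1
    return acc


def strongest_center(acc):
    """Return (a, b, votes) of the accumulator cell with the most votes."""
    votes, a, b = max((acc[b][a], a, b)
                      for b in range(len(acc)) for a in range(len(acc[0])))
    return a, b, votes


# Synthetic edge points on a circle with center (20, 15) and radius 8.
edges = [(20 + 8 * math.cos(math.radians(d)), 15 + 8 * math.sin(math.radians(d)))
         for d in range(0, 360, 10)]
acc = hough_circles_known_radius(edges, radius=8, width=40, height=30)
a, b, votes = strongest_center(acc)
print(a, b, votes)
```

All voting circles pass through the true center, so that cell collects one vote per edge point while every other cell collects fewer.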
What if radius is unknown?
If radius is not known: 3D Hough Space!
Use Accumulator array $A(a, b, r)$
Surface shape in Hough space is complicated (each image point votes on a cone)
Using Gradient Information
Gradient information can save a lot of computation:
Edge Location $(x_i, y_i)$
Edge Direction $\phi_i$
Assume radius is known: $a = x_i - r\cos\phi_i$, $b = y_i - r\sin\phi_i$
Need to increment only one point in accumulator!
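A sketch of gradient-guided voting is shown below. It assumes that each edge comes with a direction $\phi$ pointing away from the circle center, so the center lies at $(x - r\cos\phi,\ y - r\sin\phi)$; if the gradient in your data points toward the center, the signs flip. The function name and the synthetic data are hypothetical.

```python
import math


def hough_circles_with_gradient(edges, radius, width, height):
    """Each edge (x, y, phi) casts a single vote for the center it points away from."""
    acc = [[0] * width for _ in range(height)]
    for x, y, phi in edges:
        a = round(x - radius * math.cos(phi))
        b = round(y - radius * math.sin(phi))
        if 0 <= a < width and 0 <= b < height:
            acc[b][a] += 1  # one increment per edge point
    return acc


# Synthetic edges on a circle with center (20, 15), radius 8, outward directions.
edges = []
for d in range(0, 360, 10):
    phi = math.radians(d)
    edges.append((20 + 8 * math.cos(phi), 15 + 8 * math.sin(phi), phi))

acc = hough_circles_with_gradient(edges, radius=8, width=40, height=30)
total_votes = sum(sum(row) for row in acc)
print(acc[15][20], total_votes)
```

Compared with voting along a full circle per edge point, the work per edge drops from one increment per sampled angle to a single increment.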

Penny Hough detector
Quarter Hough detector
The Hough transform
Deals with occlusion well?
Detects multiple instances?
Robust to noise?
Good computational complexity?
Easy to set parameters?

Application of Hough transforms
Detecting shape features
F. Jurie and C. Schmid, Scale-invariant shape features for recognition of object categories, CVPR 2004
Original images | Laplacian circles | Hough-like circles
Which feature detector is more consistent?

Robustness to scale and clutter
