Lecture 2: Introduction to 2D Projective Geometry


Computer Vision

Dr. Syed Faisal Bukhari


Associate Professor
Department of Data Science
Faculty of Computing and Information Technology
University of the Punjab
Textbook

 Hartley, R., and Zisserman, A., Multiple View Geometry in Computer Vision

 Szeliski, R., Computer Vision: Algorithms and Applications, 2nd edition, 2022

Dr. Syed Faisal Bukhari, DDS, PU


Reference books
Readings for these lecture notes:
Hartley, R., and Zisserman, A. Multiple View Geometry
in Computer Vision, Cambridge University Press, 2004,
Chapters 1-3.

Forsyth, D., and Ponce, J. Computer Vision: A Modern Approach, Prentice-Hall, 2003, Chapter 2.

 Lay, D. C., Linear Algebra and Its Applications

These notes contain material from the above resources.


References
These notes are based on the following resources:

 Dr. Matthew N. Dailey's course AT70.20: Machine Vision for Robotics and HCI

 Dr. Sohaib Ahmad Khan's CS436 / CS5310 Computer Vision Fundamentals at LUMS

 http://rimstar.org/science_electronics_projects/pinhole_camera.htm



Grading breakup
I. Midterm = 35 points
II. Final term = 40 points
III. Quizzes = 6 points (a total of 6 quizzes)
IV. Group project = 15 points
a. Pitch your project idea = 2 points
b. Research paper presentation relevant to your project = 3 points
c. Project prototype and its presentation = 5 points
d. Research paper in IEEE conference template = 5 points
V. OpenCV-based Python presentation = 2.5 points
VI. MATLAB presentation = 2.5 points
Some top-tier conferences in
computer vision
I. Proceedings of the IEEE International Conference
on Computer Vision and Pattern Recognition
(CVPR).
II. Proceedings of the European Conference on
Computer Vision (ECCV).
III. Proceedings of the Asian Conference on
Computer Vision (ACCV).
IV. Proceedings of the International Conference on
Robotics and Automation (ICRA).
V. Proceedings of the IEEE/RSJ International
Conference on Intelligent Robots and Systems
(IROS).
Some well-known journals
I. International Journal of Computer Vision (IJCV).
II. IEEE Transactions on Pattern Analysis and
Machine Intelligence (PAMI).
III. Image and Vision Computing.
IV. Pattern Recognition.
V. Computer Vision and Image Understanding.
VI. IEEE Transactions on Robotics.
VII. Journal of Mathematical Imaging and Vision



Recall: Vector Space
A vector space is a nonempty set V of objects, called
vectors, on which are defined two operations, called
addition and multiplication by scalars (real
numbers), subject to the ten axioms (or rules) listed
in the next slide.



Recall: Vector Space
The axioms must hold for all vectors u, v, and w in V and
for all scalars c and d.
1. The sum of u and v, denoted by u + v, is in V.
2. u + v = v + u.
3. (u + v) + w = u + (v + w).
4. There is a zero vector 0 in V such that u + 0 = u.
5. For each u in V , there is a vector - u in V such that u + (-u) = 0.
6. The scalar multiple of u by c, denoted by cu, is in V .
7. c(u + v) = cu + cv.
8. (c + d) u = cu + du.
9. c(du) = (cd)u
10. 1u = u.
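The axioms above lend themselves to a quick numerical spot check. The sketch below (assuming NumPy, in keeping with the course's Python focus) verifies several of the axioms for concrete vectors in ℝ²; a numerical check illustrates the axioms but is of course not a proof.

```python
import numpy as np

# Spot-check several vector-space axioms for R^2 with concrete vectors.
u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])
w = np.array([0.5, 4.0])
c, d = 2.0, -3.0

assert np.allclose(u + v, v + u)                # axiom 2: commutativity
assert np.allclose((u + v) + w, u + (v + w))    # axiom 3: associativity
assert np.allclose(u + np.zeros(2), u)          # axiom 4: zero vector
assert np.allclose(u + (-u), np.zeros(2))       # axiom 5: additive inverse
assert np.allclose(c * (u + v), c * u + c * v)  # axiom 7: distributivity
assert np.allclose((c + d) * u, c * u + d * u)  # axiom 8
assert np.allclose(c * (d * u), (c * d) * u)    # axiom 9
assert np.allclose(1.0 * u, u)                  # axiom 10
print("all checked axioms hold")
```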
Subspaces of ℝⁿ
Definition: A subspace of ℝⁿ is any set H in ℝⁿ that
has three properties:

a) The zero vector is in H.

b) For each u and v in H, the sum u + v is in H.

c) For each u in H and each scalar c, the vector cu is in H.

In words, a subspace is closed under addition and scalar multiplication.
Subspace vs. vector space
Every subspace is a vector space.

 Conversely, every vector space is a subspace (of itself and possibly of other larger spaces).

 The term subspace is used when at least two vector spaces are in mind, with one inside the other, and the phrase subspace of V identifies V as the larger space. See figure in the next slide.



A subspace of V



Introduction: Vision systems

The kind of information we want is application specific:

 3D models
 Object categories
 Object poses
 Camera poses
Parts of the system [1]
The “vision system” includes:

 Image acquisition hardware
  o Analog camera plus digital frame grabber, or
  o Digital camera with a fast serial interface (FireWire, USB, etc.)

 Image processing support software.

 Computer vision algorithms.



Parts of the system [2]
Different applications have varying requirements for the type of information that needs to be captured and processed.

 To achieve this, we will acquire images, which requires specific hardware.

 Depending on the setup, we may use:
  o An analog camera with a digital frame grabber.
  o Digital cameras equipped with serial interfaces such as FireWire, USB 3.0, or other high-speed connections.
Parts of the system [3]
Modern IP cameras use Ethernet, with some models supporting Gigabit Ethernet (1 GigE, 1 Gbps) or 10 Gigabit Ethernet (10 GigE, 10 Gbps) for high-speed image transmission.

 We will focus more on our image processing support software and vision algorithms to build these systems.



Types of Ethernet
Ethernet is a wired networking technology that allows
devices to communicate over a local area network (LAN)
using cables.

Type                            Speed         Cable type
Fast Ethernet                   100 Mbps      Cat5
Gigabit Ethernet (1 GigE)       1 Gbps        Cat5e, Cat6
10 Gigabit Ethernet (10 GigE)   10 Gbps       Cat6a, Cat7
40/100 Gigabit Ethernet         40-100 Gbps   Fiber optic, Cat8



Our focus in this semester
This semester we focus on algorithms for 3D
reconstruction.

To understand modern 3D reconstruction techniques, we need to understand how cameras transduce the world into images.

 This requires a deep understanding of projective geometry and camera models, which serve as the mathematical foundation of computer vision.

 Once we grasp these concepts, we can begin to invert the camera's transformation process and develop methods to reconstruct 3D scenes from 2D images.
2D Projective Geometry
We begin with 2D projective geometry because it's simple, then we'll generalize to 3D.

 In particular, we consider what happens when we perform projective transformations of the plane.

 Projective transformations model the distortions introduced by projective cameras (more on cameras later).

 In projective cameras, funny things happen. Although straight lines stay straight, parallel lines are no longer parallel. Projective geometry gives us the mathematics for these kinds of transformations.
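The claim that parallel lines stop being parallel can be made concrete with homogeneous coordinates (developed later in the course, following Hartley & Zisserman). In the sketch below, the homography H and the two lines are hypothetical examples chosen for illustration; a line is a 3-vector l with l·x = 0, two lines meet at their cross product, and a line maps as l' = H⁻ᵀ l when points map as x' = Hx:

```python
import numpy as np

# A hypothetical homography with a non-trivial bottom row (any such H
# produces perspective effects; the numbers are illustrative only).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.1, 0.2, 1.0]])

l1 = np.array([0.0, 1.0, 0.0])   # the line y = 0
l2 = np.array([0.0, 1.0, -1.0])  # the line y = 1, parallel to l1

# Parallel lines meet at an ideal point (third coordinate is 0):
p = np.cross(l1, l2)
assert np.isclose(p[2], 0.0)

# A line transforms as l' = H^{-T} l when points transform as x' = H x.
Hinv_T = np.linalg.inv(H).T
m1, m2 = Hinv_T @ l1, Hinv_T @ l2

# The transformed lines now meet at a finite point:
q = np.cross(m1, m2)
print("transformed intersection:", q / q[2])
```

The third coordinate of the original intersection is zero (an "ideal point" at infinity), while the transformed lines intersect at an ordinary finite point, which is exactly the vanishing-point effect described above.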
Introduction to 2D Projective Geometry

o We start with 2D projective geometry because it is simpler, then extend to 3D.

o Our focus is on projective transformations of the plane.



Projective Transformations
o These transformations model the distortions introduced by projective cameras.

o They preserve straight lines, but parallel lines may no longer remain parallel.



Effects of Projective Cameras
o Perspective distortion occurs, leading to changes in how objects appear.

o Straight lines remain straight, but parallel lines converge at vanishing points.

o Projective geometry provides the mathematical framework to handle these transformations.



Parallel lines (in blue) appear to converge at a vanishing point (red) due to perspective distortion.

The horizon line (dashed gray) represents where all vanishing points lie for lines parallel to the ground.
Left (original grid, blue): represents a standard Euclidean grid.

Right (transformed grid, red): the same grid after applying a projective transformation (homography).
Key Takeaways:
o Straight lines remain straight, but the overall shape distorts.

o Perspective warping occurs, resembling how a camera would capture an image with depth.

o This is a fundamental principle in computer vision and image rectification.



2D Projective Geometry

The central concern of 2D projective geometry is projective transformations.

 In a projective transformation, we take an environment of arbitrary dimension and capture images of it on (in this case) a 2D plane.

 Projective transformations: a transformation that maps lines to lines (but does not necessarily preserve parallelism) is a projective transformation.


2D projective geometry
The 2D projective plane: points in ℝ²
A point in the plane can be represented as a pair (x, y) in ℝ².

 We consider ℝ² as a vector space, and we write the point (x, y) as a vector. This step makes it possible to write transformations of points as matrices.

 Generally, we write transformations on the left and points on the right, so we need to write points as column vectors, i.e., (x, y)ᵀ.

 We will typically write column vectors using bold upright symbols, e.g., 𝐱 = (x, y)ᵀ or x⃗ = (x, y)ᵀ.
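The convention above (transformations on the left, points as column vectors) can be sketched numerically. A minimal sketch assuming NumPy; the 90° rotation matrix is just an illustrative choice of transformation:

```python
import numpy as np

# Writing a point as a column vector lets a transformation act by
# left-multiplication: x' = A x.
x = np.array([[2.0], [1.0]])           # the point (2, 1) as a column vector

theta = np.pi / 2                      # rotate 90 degrees counterclockwise
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x_rot = A @ x                          # transformation on the left
print(x_rot.ravel())                   # (2, 1) rotates to (-1, 2)
```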
Representing a Digital Image
[Figure: image coordinate conventions. The origin (0, 0) is at the top-left corner; x increases to the right and y increases downward. The row/column axes (r, c) are a 90° counterclockwise rotation of (x, y); f(8, 15) marks a sample pixel.]


Image Acquisition

Figure Credit: Gonzalez & Woods, Digital Image Processing
Generating a Digital Image
• Digitization of coordinate
values is called sampling

• Digitization of amplitude
values is called
quantization

Figure Credit: Gonzalez & Woods, Digital Image Processing
Generating a Digital Image
• How many samples to
use?
• How many
quantization levels to
use?

Figure Credit: Gonzalez & Woods, Digital Image Processing
Image Size and Resolution

256 × 192

128 × 96

64 × 48

32 × 24

These images were produced simply by picking every n-th sample horizontally and vertically and replicating the value n × n times.
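The subsampling procedure described in the caption can be sketched in a few lines. NumPy is assumed, and `subsample_and_replicate` is a hypothetical helper name, not a library function:

```python
import numpy as np

# Sketch of the downsampling described above: keep every n-th sample,
# then replicate each kept value n x n times to restore the display size.
def subsample_and_replicate(img, n):
    small = img[::n, ::n]  # pick every n-th sample in both directions
    # np.kron replicates each value into an n x n block
    return np.kron(small, np.ones((n, n), dtype=img.dtype))

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
out = subsample_and_replicate(img, 2)
print(out.shape)  # same size as the input, but only 2 x 2 distinct samples
```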
Resolution
o Spatial resolution is the smallest discernible detail in an image.
  o What is the minimum width of lines W such that they are discernible in the image?
o Gray-level resolution is the smallest discernible change in gray level.
Different Number of Gray Levels

[Figure: the same image quantized to 256, 32, 16, 8, 4, and 2 gray levels; false contouring becomes visible at low levels.]
How many gray-levels are required?
[Figure: contouring at 32, 64, 128, and 256 gray levels.]

Digital images are usually quantized to 256 gray levels.
Examples of Common Image
Resolutions

Resolution Common Usage


640 × 480 Old VGA displays
1280 × 720 HD (720p) videos
1920 × 1080 Full HD (1080p) videos
3840 × 2160 4K Ultra HD
7680 × 4320 8K resolution



Introduction to Digital Image
Processing
Digital images are formed by converting analog signals into discrete values.

 This conversion involves two key processes: sampling and quantization.

 Understanding these concepts is essential for image processing and computer vision applications.



What is Sampling?
o Sampling is the process of converting a continuous image into a discrete image by selecting values at fixed intervals.

o It determines the spatial resolution of an image (i.e., the number of pixels per unit area).

o Higher sampling results in more detail and sharper images.

o It is measured in pixels per inch (PPI) or total pixel count (e.g., 1920 × 1080 pixels).



What is Quantization?
o Quantization is the process of mapping pixel intensity values to a finite number of discrete levels.

o It determines the number of colors or shades that can be represented in an image.

o Higher quantization levels result in smoother color transitions.

o It is measured in bits per pixel (bpp), e.g., 8-bit (256 levels), 16-bit, 24-bit (true color).

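Uniform quantization as described above can be sketched in a few lines of NumPy. The `quantize` helper is an illustrative name, not a library function:

```python
import numpy as np

# Sketch of uniform quantization: map 8-bit intensities (0-255) down to
# `levels` discrete gray levels by integer division and rescaling.
def quantize(img, levels):
    step = 256 // levels
    return (img // step) * step

img = np.array([[0, 10, 100], [128, 200, 255]], dtype=np.uint8)
print(quantize(img, 4))   # at most 4 distinct output values remain
```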


Effects of Low Sampling and
Quantization
o Low sampling: results in pixelated and blocky images due to insufficient resolution.

o Low quantization: causes color banding and loss of smooth transitions in grayscale or color images.

o High-quality images require both high sampling and high quantization.



Applications in Computer Vision
• Medical imaging: high-resolution and high-bit-depth images are crucial for accurate diagnosis.

• Facial recognition: image quality impacts feature extraction and recognition accuracy.

• Autonomous vehicles: object detection models require clear and high-quality image inputs.

• Satellite imaging: high sampling and quantization improve remote sensing and analysis.
Key Differences Between Sampling
and Quantization
o Sampling controls the image resolution (spatial detail), while quantization controls color depth.

o Low sampling leads to pixelation, whereas low quantization causes color banding.

o Higher sampling and quantization improve image quality but require more storage and processing power.



Introduction to Image Types
o Digital images are classified based on how they store pixel information.

o The main types are binary, grayscale, and color images.

o Each type has different applications in image processing and computer vision.



What Are Binary Images?
o Binary images use only two values: black and white (0 and 1).

o Each pixel is either ON (white) or OFF (black).

o Used in document scanning, edge detection, and barcodes.

o Requires very low storage space.

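A minimal sketch of producing a binary image by thresholding a grayscale image; the threshold 128 and the tiny 2 × 2 image are arbitrary illustrative choices:

```python
import numpy as np

# Threshold a grayscale image: pixels at or above 128 become ON (1),
# all others become OFF (0).
gray = np.array([[12, 200], [130, 90]], dtype=np.uint8)
binary = (gray >= 128).astype(np.uint8)   # 1 = ON (white), 0 = OFF (black)
print(binary)
```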


What Are Grayscale Images?
o Grayscale images use shades of gray ranging from black (0) to white (255).

o Each pixel has a single intensity value.

o Commonly used in medical imaging, pattern recognition, and image processing.

o Captures more detail than binary images.



What Are Color Images?
o Color images use multiple channels (e.g., RGB: red, green, blue).

o Each pixel has multiple values to define its color.

o Typically 24-bit (8 bits per channel) or higher (e.g., HDR images).

o Used in photography, graphic design, and displays.



Applications of Each Image Type
Binary images: OCR (optical character recognition), QR codes, edge detection.

Grayscale images: medical imaging, security cameras, image enhancement.

Color images: photography, video processing, augmented reality.

The choice of image type depends on the application and the required detail.
Summary and Conclusion
o Binary images are the simplest, storing only black and white.

o Grayscale images provide better detail using intensity levels.

o Color images use multiple channels to represent full color.

o Understanding these differences is key in image processing and computer vision.



Binary Image



Grayscale Image



Color Image



Representing a Digital Image
[Figure: image coordinate conventions, repeated. The origin (0, 0) is at the top-left corner; x increases to the right and y increases downward. The row/column axes (r, c) are a 90° counterclockwise rotation of (x, y); f(8, 15) marks a sample pixel.]


Point in 2D

x⃗ = (x, y)ᵀ, written as a column vector, or simply (x, y).

This point x⃗ is the same as the two Cartesian coordinates x and y. We're doing something funny when we're dealing with images.


Understanding 2D Points in Images
o A point in 2D projective geometry is represented as (x, y), similar to Cartesian coordinates.

o However, in digital images, the origin (0, 0) is at the top-left corner.

o The x-axis increases horizontally to the right, and the y-axis increases vertically downward.

o This is different from the traditional Cartesian system, where the y-axis increases upward.

o Understanding this coordinate system is crucial in computer vision and image processing.
Point in 2D

We can place the origin of our image coordinate system wherever we like, but usually we take the upper-left corner to have coordinates (0, 0).

 Usually, in image processing, when we talk about a point (x, y), we mean a displacement to the right by x and a displacement down by y. We will write these vectors as row vectors or column vectors.

 Generally, we assume that we are modeling points in the image plane in most cases.
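The image coordinate convention above can be sketched with a NumPy array, where arrays are indexed as img[row, col], i.e., img[y, x] (the array shape and pixel value are illustrative):

```python
import numpy as np

# The origin is the top-left pixel; x runs right, y runs down.
img = np.zeros((4, 6), dtype=np.uint8)   # 4 rows (height), 6 columns (width)

x, y = 5, 2            # point (x, y): 5 to the right, 2 down
img[y, x] = 255        # note the reversed order: row index (y) comes first

assert img[2, 5] == 255
print("shape is (height, width):", img.shape)
```

Forgetting this row-first convention (writing `img[x, y]`) is a classic source of bugs when moving between geometry code and image code.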


Key Differences Between Cartesian
(Euclidean) and Projective Coordinates
Feature               Cartesian coordinates                        Projective coordinates
Definition            Standard (x, y) or (x, y, z) coordinates     Homogeneous (x, y, w) or (x, y, z, w)
Extra dimension       No extra dimension                           Includes an extra scale factor w
Usage                 Geometry, physics, machine learning          Computer vision, 3D graphics, camera modeling
Transformations       Linear (rotation, translation, scaling)      Affine and projective (perspective, homographies)
Representation        Fixed point locations                        Defined up to a scale factor
Perspective effect    No perspective effects                       Can represent perspective transformations
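The "defined up to a scale factor" distinction in the table can be made concrete with a small sketch; the helper names `to_homogeneous` and `to_cartesian` are illustrative:

```python
import numpy as np

# Convert between Cartesian (x, y) and homogeneous (x, y, w) coordinates.
def to_homogeneous(p):
    return np.append(p, 1.0)         # append w = 1

def to_cartesian(ph):
    return ph[:2] / ph[2]            # divide out the scale factor w

p = np.array([3.0, 4.0])
ph = to_homogeneous(p)               # (3, 4, 1)

# Any nonzero scaling of ph represents the same 2D point:
assert np.allclose(to_cartesian(5.0 * ph), p)
print(to_cartesian(ph))
```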
