Matrix Method
Nature of light
Until the time of Isaac Newton (1642–1727), most scientists believed that light consisted of a stream of
tiny particles (called corpuscles) emitted by light sources; this is called the corpuscular theory
of light. However, in attempts to describe various properties of light, it seemed more logical to consider light
as a type of wave.
One early problem with the wave theory of light was the lack of understanding of the medium
in which light propagates. To overcome this difficulty, an invisible medium called the ‘aether’ was introduced.
The requirement for a medium disappeared when James Clerk Maxwell introduced his electromagnetic theory, and
in 1873 he predicted the existence of electromagnetic waves, which travel without the assistance of any
medium.
The wave theory of light encountered another problem in explaining the photoelectric effect. To explain
the experimental results associated with the photoelectric effect, it was again necessary to consider light as a
stream of tiny massless particles. Einstein introduced this particle theory of light in 1905; the
particles are called photons.
With the introduction of the photon theory, it became evident that light behaves like waves under some conditions
and like particles under other conditions. This became known as the dual nature of light. During the
development of modern quantum theory, this concept of dual nature (sometimes called wave-particle duality)
was assumed. According to the de Broglie hypothesis, the wavelength of a particle of mass m moving with speed v is given by
$$\lambda = \frac{h}{mv}$$
where h is Planck's constant.
Introduction to optics
We are familiar with the reflection and refraction of a beam of light (uniform plane electromagnetic
waves) when it is incident upon planar boundaries between dissimilar materials. In a broad sense, this can
be considered as a result of the interaction of waves with matter. In general, the interaction of waves
with matter may involve waves that are neither uniform nor plane. Also, the objects that waves
interact with may be of any shape and size. There are two fundamental approaches for describing
wave interactions with matter. In the first approach, the light wave is treated as a ‘ray’. From the wave
point of view, a ray is an imaginary line along the direction of travel of the wave energy. In the
second approach, the light is treated purely as a wave having wave properties. These two
approaches are studied separately under two topics: geometrical optics and physical
optics.
In geometrical optics, also known as ray optics, a wave is represented by a ray denoting the
direction of travel of the wave’s energy.
In physical optics, otherwise known as wave optics, a wave is characterized mathematically by all
its attributes, such as its amplitude, angular frequency, phase, and polarization vector.
Geometrical optics
In geometrical optics, light propagation is described in terms of rays, and it can be used to explain
certain phenomena such as the reflection and refraction of light when it passes through different media.
However, geometrical optics does not account for effects such as interference, diffraction and
polarization of waves.
[Figure: a three-dimensional plane wave, showing the planes that make up the wavefronts and the direction of movement]
A set of planes consisting of three-dimensional wavefronts forms a plane wave moving in the
direction shown in the figure.
A ray of light
In optics, a ray is an idealized model of light, obtained by choosing a line that is perpendicular to
the wavefronts of the actual light and that points in the direction of energy flow.
Light rays propagate in rectilinear paths when they travel in a homogeneous medium. They may bend,
reflect, or split in two at an interface between two dissimilar media.
[Figure: rays of light drawn perpendicular to a spherical wavefront]
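Three cases can be distinguished according to the size d of the object or slit relative to the wavelength λ of the incident light (a plane wave):
(1) When the wavelength of the incident light is very much less than the size of the object or slit
on which the light is incident, $\lambda \ll d$. Geometrical optics (ray optics) is applicable in this situation.
[Figure (a): plane wavefronts of wavelength λ incident on a slit of width d in an opaque screen, and on an object of size d, with d ≫ λ]
(2) When the wavelength of the light is of the order of the width of the slit or object, $d \cong \lambda$.
[Figure (b): wavefronts incident on a slit and on an object, with d = λ]
(3) When $d \ll \lambda$.
[Figure (c): wavefronts of wavelength λ, with d ≪ λ]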
[Figure: Reflection of light, showing the incident ray and incident wavefront together with the refracted ray and refracted wavefront]
Matrix techniques can be used to trace the paths of rays through optical elements (systems).
In this section, we formulate paraxial geometrical optics in terms of 2 × 2 matrices and associated
rays. These rays can be traced through the optical system by matrix-vector multiplication. This
scheme allows complex, multi-element optical systems to be analyzed and simplified.
In this course, we limit ourselves to paraxial ray tracing. This means that all optical systems that
we work with have elements with a common optical axis, and all light rays we use for calculations
travel nearly parallel to the optical axis. Therefore, for all angles $\theta$ that the rays make with the
optical axis, we can make the approximation $\sin\theta \simeq \tan\theta \simeq \theta$. To understand the
technique, let us consider a ray of light making an angle $\theta_1$ with the optical axis and entering an
optical element at a height $y_1$, as shown in the figure.
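[Figure: a ray entering an optical element at height $y_1$ and angle $\theta_1$, and leaving it at height $y_2$ and angle $\theta_2$]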
After passing through the optical element the ray leaves at a height 𝑦2 with an angle 𝜃2 .
Considering the optical element as a linear component which can only make a linear change to the
𝜃 and 𝑦 coordinates, we can relate the input and output coordinates with two linear equations as
follows.
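$$y_2 = A y_1 + B\theta_1$$
$$\theta_2 = C y_1 + D\theta_1$$
These two equations can be written as a single matrix equation, in which the matrix operator acts on the input matrix to give the output matrix:
$$\underbrace{\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}}_{\text{output matrix}} = \underbrace{\begin{bmatrix} A & B \\ C & D \end{bmatrix}}_{\text{matrix operator}} \underbrace{\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}}_{\text{input matrix}}$$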
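The entity $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is called a 2 × 2 matrix. According to the above matrix equation, when it
operates on the input matrix formed from the input parameters $y_1$, $\theta_1$, it produces an output matrix
containing the output parameters $y_2$, $\theta_2$. This 2 × 2 matrix is called the transformation matrix
of the optical element, or the matrix operator (M) for the optical element.
$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ and $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ are column matrices, and when M operates on $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ it produces the column matrix $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$.
If $Y_2 = \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ and $Y_1 = \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$, then $Y_2 = M Y_1$.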
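The relation $Y_2 = M Y_1$ can be tried out numerically. The following is a minimal sketch, not part of the original notes: the function name `apply_matrix` and the numerical values of the matrix and the ray are assumptions chosen only for illustration.

```python
def apply_matrix(M, ray):
    """Apply a 2x2 matrix operator M = ((A, B), (C, D)) to a ray (y, theta)."""
    (A, B), (C, D) = M
    y1, theta1 = ray
    y2 = A * y1 + B * theta1          # first row of the matrix equation
    theta2 = C * y1 + D * theta1      # second row of the matrix equation
    return (y2, theta2)

# Hypothetical element and input ray (illustrative values only).
M = ((1.0, 50.0), (0.0, 1.0))
ray_in = (2.0, 0.01)                  # height 2.0 (length units), angle 0.01 rad
ray_out = apply_matrix(M, ray_in)
print(ray_out)
```

Each output coordinate is a linear combination of the two input coordinates, which is exactly what the pair of linear equations for $y_2$ and $\theta_2$ states.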
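[Figure: a ray $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ enters element 1 (operator $M_1$), leaves it as $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$, then passes through element 2 (operator $M_2$) and emerges as $\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix}$]
For element 1,
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = M_1 \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
For element 2,
$$\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = M_2 \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$$
$$\therefore \begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = M_2 M_1 \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$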
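Suppose we have an optical system consisting of N such elements, each having a matrix operator
$M_i$ (i = 1, 2, 3, …, N). Then, for an input ray $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$, the final ray $\begin{bmatrix} y_N \\ \theta_N \end{bmatrix}$ is given by the matrix
multiplication of the matrix operators $M_N, M_{N-1}, M_{N-2}, \dots, M_1$, which can be written as
$$\begin{bmatrix} y_N \\ \theta_N \end{bmatrix} = M \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}, \qquad M = M_N M_{N-1} M_{N-2} \cdots M_1$$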
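The order of the factors matters: the operator of the first element the ray meets stands furthest to the right. A minimal sketch of building the overall operator is given below. The helper names `matmul2` and `system_matrix` and the two example matrices are assumptions made for illustration only.

```python
from functools import reduce

def matmul2(P, Q):
    """Product PQ of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def system_matrix(elements):
    """Overall operator M = M_N ... M_2 M_1.

    `elements` lists the operators in the order the ray meets them,
    i.e. [M_1, M_2, ..., M_N].
    """
    identity = ((1, 0), (0, 1))
    # Each new operator multiplies from the left, because it acts on the
    # output of everything that came before it.
    return reduce(lambda acc, M_i: matmul2(M_i, acc), elements, identity)

# Two hypothetical elements (illustrative values only).
M1 = ((1, 2), (0, 1))
M2 = ((1, 0), (-3, 1))
M = system_matrix([M1, M2])      # equals M2 M1
print(M)
print(matmul2(M1, M2))           # M1 M2, which is a different matrix
```

Swapping the order gives a different matrix in general, so the elements must be listed in the order in which the ray passes through them.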
In general, a set of mn numbers, real or complex, arranged in a rectangular array of m rows and
n columns like
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
is called a matrix of order m × n.
Here m is the number of rows in the matrix, and n is the number of columns.
In the case m = n the rectangular array becomes square, and the matrix is called a square matrix of
order n. The mn numbers $a_{ij}$ (where i = 1, 2, …, m and j = 1, 2, …, n) are called its
elements or constituents.
E.g. $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ or $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
It is possible that a matrix may have only a single row or a single column.
E.g. $\begin{bmatrix} a_1 & a_2 & \cdots & a_p \end{bmatrix}$ and $\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_q \end{bmatrix}$
The above row matrix is a 1 × p matrix, and the column matrix is a q × 1 matrix.
Illustrative examples
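1. $\begin{bmatrix} 2 & 3 & -1 \\ 4 & -5 & 6 \end{bmatrix}$ is a matrix of order 2 × 3
2. $\begin{bmatrix} 2 & 3 & -5 \end{bmatrix}$ is a 1 × 3 row matrix
3. $\begin{bmatrix} 2 \\ 1 \\ 5 \\ 7 \end{bmatrix}$ is a 4 × 1 column matrix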
Equality of Matrices
Two matrices A and B are said to be equal if both are of the same order m × n, i.e. A has the same
number of rows and columns as B, and each element $a_{ij}$ of A is equal to the corresponding element
$b_{ij}$ of B, i.e. $a_{ij} = b_{ij}$ for each pair of subscripts i and j.
Illustrative examples
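1. The matrices $\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$ and $\begin{bmatrix} 1 & 2 & 5 \\ 3 & -4 & 0 \end{bmatrix}$ are not equal as their orders are different. One is
2 × 2 and the other is 2 × 3.
2. The matrices $\begin{bmatrix} 2 & 3 \\ 0 & 1 \\ 5 & 4 \end{bmatrix}$ and $\begin{bmatrix} 2 & 3 \\ 6 & 1 \\ 5 & 4 \end{bmatrix}$ are of the same order, but they are not equal as one of
the elements in the second row is not equal.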
Addition of matrices
Two matrices A and B are said to be conformable for addition if they are of the same order, i.e. if
they have the same number of rows and the same number of columns. Since both then have equal
numbers of rows and columns, they are eligible for addition, and they are added element by element:
if $A = [a_{ij}]$ and $B = [b_{ij}]$ are both of order m × n, then
$$A + B = [a_{ij} + b_{ij}]$$
Examples
1. Find A + A
2. Find A + A + A
3. Hence show that if k is a number and A is a matrix, then the matrix kA can be obtained by
multiplying every element of A by k.
Illustrative example
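If $A = \begin{bmatrix} 2 & 0 & 3 \\ -4 & 1 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} -1 & 2 & 0 \\ 3 & -4 & -5 \end{bmatrix}$, then
$$A + B = \begin{bmatrix} 2 + (-1) & 0 + 2 & 3 + 0 \\ -4 + 3 & 1 + (-4) & 5 + (-5) \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -3 & 0 \end{bmatrix}$$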
Example
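If $A = \begin{bmatrix} 2 & 3 & 0 \\ 4 & -1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}$, and $C = \begin{bmatrix} 2 & -1 & 5 \\ 3 & 0 & 4 \end{bmatrix}$, then
find 3A − 4B + 2C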
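A result worked out by hand for this example can be checked with a short script. The sketch below assumes the matrices are stored as nested lists; the helper names `scale` and `add` are hypothetical and are introduced only for this check.

```python
def scale(k, M):
    """Multiply every element of the matrix M by the number k."""
    return [[k * x for x in row] for row in M]

def add(M, N):
    """Add two matrices of the same order element by element."""
    if len(M) != len(N) or any(len(r) != len(s) for r, s in zip(M, N)):
        raise ValueError("matrices are not conformable for addition")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(M, N)]

A = [[2, 3, 0], [4, -1, 2]]
B = [[1, -2, 3], [0, 4, 5]]
C = [[2, -1, 5], [3, 0, 4]]

# 3A - 4B + 2C is built from scalar multiples and element-wise sums.
result = add(add(scale(3, A), scale(-4, B)), scale(2, C))
print(result)
```

Subtraction is handled here as addition of the matrix scaled by a negative number, so only two helpers are needed.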
Multiplication of matrices
Two matrices A and B are conformable for multiplication if and only if the number of columns in A
is equal to the number of rows in B. The product of the two matrices A and B, denoted by AB, is
then defined as the matrix whose element in the ith row and jth column is the algebraic sum of the
products of the elements in the ith row of A with the corresponding elements in the jth column of B.
Furthermore, if A is a matrix of order m × n
and B is a matrix of order n × p,
then C = AB is a matrix of order m × p.
That is, if AB = C, then the element $c_{ij}$ of the matrix C is given by
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad i = 1, 2, \dots, m \ \text{ and } \ j = 1, 2, \dots, p$$
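The summation formula translates almost directly into code. The following is a minimal sketch, assuming matrices stored as nested lists; the function name `matmul` and the two example matrices are illustrative assumptions and do not come from the notes.

```python
def matmul(A, B):
    """Product C = AB with c_ij = sum over k of a_ik * b_kj."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        raise ValueError("not conformable: columns of A must equal rows of B")
    p = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]

# Hypothetical matrices: A is 3 x 2 and B is 2 x 2.
A = [[1, 2], [3, 4], [5, 6]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)            # defined, of order 3 x 2
print(AB)

try:
    matmul(B, A)             # B has 2 columns but A has 3 rows
except ValueError as err:
    print("BA is not defined:", err)
```

The conformability check comes first, so an undefined product is reported instead of producing a meaningless result.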
Illustrative examples
$$\therefore C = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \\ c_{31} & c_{32} \end{bmatrix}$$
Find expressions for the rest of the terms in the matrix.
It is worth noting that although the product AB is defined, the product BA is not defined
since the number of columns in B is not equal to the number of rows in A.
Examples
3. If $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}$
(i) Check whether the product AB can be defined
(ii) If so find the matrix C which is equal to AB
(iii) Check if the product BA is also possible, and if so find the resultant matrix
(iv) Is AB = BA?
4. Consider the matrices $A = \begin{bmatrix} 1 & -2 \\ 2 & 3 \\ -3 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \end{bmatrix}$
(i) Check the conformability for multiplications of AB and BA
(ii) Check whether AB = BA
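5. If $C = AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}$
(i) Find the order of the matrix C
(ii) Write down the elements of the matrix C in the form $\sum_{k=1}^{n} a_{ik} b_{kj}$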
Transformation Matrices (Matrix operators) for optical elements
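1. Transformation matrix for a segment of free space
Consider a ray of light going through a segment of free space (optical element) of length L, as
shown in the figure.
[Figure: a ray entering a free-space segment of length L at height $y_1$ and angle $\theta_1$, and leaving it at height $y_2$ with the same angle $\theta_1$]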
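For the ray passing through this optical element we can write,
$$y_2 = y_1 + L\tan\theta_1 \simeq y_1 + L\theta_1 = 1 \cdot y_1 + L\theta_1$$
$$\theta_2 = \theta_1 = 0 \cdot y_1 + 1 \cdot \theta_1$$
In terms of matrix formulation this can be written as
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
Therefore, the transformation matrix or the matrix operator for a segment (element) of free space is
$$\begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix}$$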
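The free-space matrix relies on replacing $\tan\theta_1$ by $\theta_1$. The sketch below compares the paraxial result with the exact one for a few angles, to give a feel for when the approximation is acceptable. The function names and the numerical values are assumptions chosen for illustration.

```python
import math

def propagate_paraxial(y1, theta1, L):
    """Free-space step using the matrix [[1, L], [0, 1]]."""
    return (y1 + L * theta1, theta1)

def propagate_exact(y1, theta1, L):
    """Free-space step without the small-angle approximation."""
    return (y1 + L * math.tan(theta1), theta1)

y1, L = 1.0, 100.0                      # illustrative height and length
for theta1 in (0.01, 0.05, 0.2):        # angles in radians
    y_par, _ = propagate_paraxial(y1, theta1, L)
    y_ex, _ = propagate_exact(y1, theta1, L)
    print(f"theta1={theta1}: paraxial y2={y_par:.5f}, "
          f"exact y2={y_ex:.5f}, difference={abs(y_ex - y_par):.5f}")
```

The difference grows quickly with the angle, which is why the matrix method is restricted to rays that travel nearly parallel to the optical axis.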
2. Transformation matrix for refraction at a plane surface
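[Figure: a ray in medium 1 (refractive index $n_1$) meeting a plane surface at height $y_1$ and angle $\theta_1$, and continuing into medium 2 (refractive index $n_2$) at height $y_2$ and angle $\theta_2$]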
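If we take the plane surface as our optical element, as shown in the figure, it has no thickness. Therefore
$y_2 = y_1$.
Let $n_1$ be the refractive index of medium 1 and $n_2$ the refractive index of medium 2.
Using the small-angle approximation, Snell's law $n_1 \sin\theta_1 = n_2 \sin\theta_2$ becomes $n_1\theta_1 = n_2\theta_2$. Then
$$y_2 = y_1$$
$$\theta_2 = \frac{n_1}{n_2}\theta_1$$
Or, in terms of matrices,
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{n_1}{n_2} \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
∴ The transformation matrix for this case is
$$\begin{bmatrix} 1 & 0 \\ 0 & \frac{n_1}{n_2} \end{bmatrix}$$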
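3. Transformation matrix for refraction at a spherical surface
[Figure: a ray in medium 1 ($n_1$) striking a spherical surface at height $y_1$ and angle $\theta_1$, and refracted into medium 2 ($n_2$) at height $y_2$ and angle $\theta_2$; $\alpha$ is the angle between the surface normal and the optical axis, $\alpha_1$ is the angle of incidence and $\alpha_2$ is the angle of refraction]
R – Radius of curvature
$\alpha_1 = \alpha + \theta_1$, $\alpha_2 = \alpha + \theta_2$
The figure shows a ray of light being refracted at a spherical surface.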
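Using Snell's law, $n_1 \sin\alpha_1 = n_2 \sin\alpha_2$, and the small-angle approximation,
$$\frac{n_2}{n_1} = \frac{\sin\alpha_1}{\sin\alpha_2} \simeq \frac{\alpha_1}{\alpha_2} = \frac{\alpha + \theta_1}{\alpha + \theta_2}$$
Since $\alpha \simeq y_1 / R$ for small angles,
$$\frac{n_2}{n_1} = \frac{\frac{y_1}{R} + \theta_1}{\frac{y_1}{R} + \theta_2} \qquad (1)$$
From equation (1),
$$\theta_2 = \left( \frac{n_1 - n_2}{n_2 R} \right) y_1 + \frac{n_1}{n_2}\theta_1$$
$$y_2 = y_1$$
Therefore, the transformation matrix is
$$\begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_2 R} & \frac{n_1}{n_2} \end{bmatrix}$$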
Similarly, it is possible to show that the transformation matrix for a spherical surface with the
opposite curvature is
$$\begin{bmatrix} 1 & 0 \\ -\frac{n_1 - n_2}{n_2 R} & \frac{n_1}{n_2} \end{bmatrix}$$
Note: The matrix operator for a concave surface can be obtained by inserting a minus sign in front of the
matrix element $\frac{n_1 - n_2}{n_2 R}$ in the matrix operator for a convex surface.
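As a quick numerical check of the convex-surface matrix, the sketch below sends in a ray parallel to the optical axis and finds where the refracted ray crosses the axis. Setting $y_2 + d\,\theta_2 = 0$ with $\theta_1 = 0$ in the matrix gives $d = n_2 R / (n_2 - n_1)$. The function names and the values of $n_1$, $n_2$, R and h are assumptions made for illustration.

```python
def spherical_surface(n1, n2, R):
    """Operator for refraction at a convex spherical surface (n1 -> n2)."""
    return ((1.0, 0.0), ((n1 - n2) / (n2 * R), n1 / n2))

def apply_matrix(M, ray):
    """Apply a 2x2 operator to a ray (y, theta)."""
    (A, B), (C, D) = M
    y, theta = ray
    return (A * y + B * theta, C * y + D * theta)

n1, n2, R = 1.0, 1.5, 10.0     # air to glass, illustrative radius
h = 1.0                        # height of the incoming parallel ray

y2, theta2 = apply_matrix(spherical_surface(n1, n2, R), (h, 0.0))

# The refracted ray reaches the axis after a distance d with y2 + d*theta2 = 0.
d = -y2 / theta2
print(d, n2 * R / (n2 - n1))
```

The crossing distance does not depend on h, so in the paraxial approximation all rays parallel to the axis meet at the same point.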
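4. Thin lenses – Thin convex lens
In a thin lens the light ray travels through two spherical surfaces, one after the other, as shown in
the figure.
[Figure: a ray entering a thin convex lens made of medium 2 ($n_2$) at height $y_1$ and angle $\theta_1$, with angle $\theta_2$ inside the lens and angle $\theta_3$ on leaving it; the two surfaces have radii of curvature $R_1$ and $R_2$]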
Since the lens is thin, both surfaces can be taken to lie at the same position. In other words, we do not need to
consider the passage of the ray through medium 2. However, if it is a thick lens, the travel
of the ray through medium 2 must be considered separately, and another matrix operator must
be introduced for that space.
Consider the convex lens as a combination of two optical elements, one a convex surface with matrix operator $M_1$
and the other a concave surface with matrix operator $M_2$.
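[Figure: the ray $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ in air ($n_1$) passes through surface $M_1$ into glass ($n_2$), where it is $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$, and then through surface $M_2$ back into air ($n_1$), emerging as $\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix}$]
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = M_1 \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
$$\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = M_2 \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$$
$$\therefore \begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = M_2 M_1 \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$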
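$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_2 R_1} & \frac{n_1}{n_2} \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
$$\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\frac{n_2 - n_1}{n_1 R_2} & \frac{n_2}{n_1} \end{bmatrix} \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$$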
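We must remember that, for the second surface, the ray of light starts in medium 2, which has
refractive index $n_2$, and ends up in medium 1, which has refractive index $n_1$. Substituting for the
matrix $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ in this equation from the previous one, we have
$$\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\frac{n_2 - n_1}{n_1 R_2} & \frac{n_2}{n_1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_2 R_1} & \frac{n_1}{n_2} \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
If C = AB then
$$C = \begin{bmatrix} \sum_{k=1}^{2} a_{1k} b_{k1} & \sum_{k=1}^{2} a_{1k} b_{k2} \\ \sum_{k=1}^{2} a_{2k} b_{k1} & \sum_{k=1}^{2} a_{2k} b_{k2} \end{bmatrix}$$
so that
$$\begin{bmatrix} y_3 \\ \theta_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$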
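The product of the two surface matrices can be checked numerically against the closed-form thin-lens matrix. The sketch below does this for one set of values. The helper name `matmul2` and the numbers chosen for $n_1$, $n_2$, $R_1$ and $R_2$ are assumptions made for illustration.

```python
def matmul2(P, Q):
    """Product PQ of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

n1, n2 = 1.0, 1.5              # air and glass (illustrative values)
R1, R2 = 10.0, 15.0            # radii of curvature of the two surfaces

# First surface: convex, ray goes from n1 into n2.
M1 = ((1.0, 0.0), ((n1 - n2) / (n2 * R1), n1 / n2))
# Second surface: opposite curvature, ray goes from n2 back into n1.
M2 = ((1.0, 0.0), (-(n2 - n1) / (n1 * R2), n2 / n1))

M = matmul2(M2, M1)            # the first surface met stands on the right

closed_form = ((1.0, 0.0),
               ((n1 - n2) / n1 * (1.0 / R1 + 1.0 / R2), 1.0))
print(M)
print(closed_form)
```

Up to floating-point rounding the two matrices agree, including the lower-right element, where the factors $n_2/n_1$ and $n_1/n_2$ cancel to give 1.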
It is possible to derive the well-known lens equation from the above result as follows.
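Consider a ray of light parallel to the optical axis coming from the left and passing through the focal
point F, as shown in the figure.
[Figure: a ray at height h parallel to the optical axis, written as $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$, passes through a thin convex lens with surface radii $R_1$ and $R_2$ and reaches the focal point F as $\begin{bmatrix} y_4 \\ \theta_4 \end{bmatrix}$]
In the matrix notation, the incident ray matrix $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ is represented by the matrix $\begin{bmatrix} h \\ 0 \end{bmatrix}$, because $y_1 = h$
and $\theta_1 = 0$.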
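After passing through the convex lens, the emergent ray goes through a free-space segment
of length f before reaching the focal point F. The transformation matrix for a free space of length f is
given by $M_3 = \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix}$, and the transformation matrix of the lens is
$$M = \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) & 1 \end{bmatrix}$$
$$\therefore \begin{bmatrix} y_4 \\ \theta_4 \end{bmatrix} = \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) & 1 \end{bmatrix} \begin{bmatrix} h \\ 0 \end{bmatrix}$$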
(See the derivation of the transformation matrix for free space segment)
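The values of $y_4$ and $\theta_4$ for the emerging beam at F are 0 and $-\frac{h}{f}$ respectively.
$$\therefore \begin{bmatrix} 0 \\ -\frac{h}{f} \end{bmatrix} = \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) & 1 \end{bmatrix} \begin{bmatrix} h \\ 0 \end{bmatrix} \qquad (2)$$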
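Let $A = \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) & 1 \end{bmatrix}$.
If C = AB then
$$C = \begin{bmatrix} \sum_{k=1}^{2} a_{1k} b_{k1} & \sum_{k=1}^{2} a_{1k} b_{k2} \\ \sum_{k=1}^{2} a_{2k} b_{k1} & \sum_{k=1}^{2} a_{2k} b_{k2} \end{bmatrix}$$
Recall also that if $y_2 = A y_1 + B\theta_1$ and $\theta_2 = C y_1 + D\theta_1$, then
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$$
The second row of the matrix equation above therefore gives
$$-\frac{h}{f} = \frac{n_1 - n_2}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right) h + 1 \times 0$$
$$\therefore \frac{1}{f} = \frac{n_2 - n_1}{n_1} \left( \frac{1}{R_1} + \frac{1}{R_2} \right)$$
which is the well-known lens equation.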
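The lens equation can be checked by tracing a parallel ray through the lens and then through a free-space segment of length f. The sketch below is a minimal illustration under the sign convention used in these notes; the function name `focal_length` and all numerical values are assumptions.

```python
def focal_length(n1, n2, R1, R2):
    """Focal length from 1/f = ((n2 - n1)/n1) * (1/R1 + 1/R2)."""
    return 1.0 / ((n2 - n1) / n1 * (1.0 / R1 + 1.0 / R2))

n1, n2 = 1.0, 1.5              # air and glass (illustrative values)
R1, R2 = 10.0, 10.0            # radii of curvature of the two surfaces
h = 2.0                        # height of the incoming parallel ray

f = focal_length(n1, n2, R1, R2)

# Thin lens: y is unchanged, theta picks up C*y (lower-left matrix element).
C = (n1 - n2) / n1 * (1.0 / R1 + 1.0 / R2)
y3, theta3 = h, C * h + 1.0 * 0.0

# Free space of length f: y4 = y3 + f*theta3, theta4 = theta3.
y4, theta4 = y3 + f * theta3, theta3

print(f, y4, theta4, -h / f)
```

The traced ray arrives on the axis after the distance f with slope $-h/f$, which matches the values of $y_4$ and $\theta_4$ assumed in the derivation.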