Math Modelization HMU HANDOUT


Mathematics and modelization

Georges KOEPFLER
Georges.Koepfler@parisdescartes.fr

MAP5, UMR CNRS 8145

UFR de Mathématiques et Informatique


Université Paris Descartes - SPC

Hanoi Metropolitan University, June 2018

G. Koepfler Math. modelization HMU - June 2018 1


Plan:

1 Introduction

2 Curvature and crowd density

3 Modelling camera movement

4 Occlusion in video

5 Video segmentation



Mathematics and modelization

In this presentation we will focus on applying mathematics to various problems associated with image and video processing.

There are often multiple possibilities for the modelization of a given problem; we cannot claim to be exhaustive in this presentation.

In practice, the precision of the results, the complexity, and the speed and ease of use of the model must be studied and taken into consideration.

The aim here is to show how mathematics can help to model practical problems.

We first recall some notations and concepts which will be useful for the presentation.



Numerical Images
[Figure: the same image shown three ways: as a discrete grid, as a grid of numbers or pixels, and as a relief, i.e. a 2D function.]


Representation of Numerical Images

Various possibilities for information representation:

[Figure: image data, relief, and topographic map representations.]



Functions and images

Consider a function of the two variables x ∈ [−2, 2] and y ∈ [−2, 2].
Thus (x, y) ∈ [−2, 2] × [−2, 2] and z = u(x, y) ∈ R.

u(x, y) = α (1 − (x² + y²)/β) e^(−(x² + y²)/(2β))

[Figure: surface plot of z = u(x, y) over [−2, 2] × [−2, 2] and the corresponding grayscale image.]
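To make the correspondence between function and image concrete, here is a small sketch evaluating u on a discrete grid, exactly as needed to display it as a relief or as a grayscale image (the values α = 1 and β = 0.5 are illustrative assumptions, not values from the slides):

```python
import numpy as np

alpha, beta = 1.0, 0.5            # illustrative values, not from the slides
xs = np.linspace(-2, 2, 201)      # discretize [-2, 2]
X, Y = np.meshgrid(xs, xs)        # (x, y) in [-2, 2] x [-2, 2]
r2 = X**2 + Y**2
Z = alpha * (1 - r2 / beta) * np.exp(-r2 / (2 * beta))
# Z is the image seen as a 2D function; at the origin u(0, 0) = alpha.
```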



Modeling crowd density

Question: how to distinguish between these images?

[Figure: empty platform vs. crowded platform.]

Without precise detection of people, faces, …

Find some characterization!

Modeling crowd density

There are numerous possibilities to address this problem. We will propose a modelization based on the following remarks:

Detection of objects and/or people might be complicated.
Contour detection would be needed, which might be a difficult problem.
Contours are contour lines (isoclines, isophotes) of the function associated with a given image.
The background in a metro station often has a lot of linear structures/contours; objects and people have fewer.

Crowd density

[Figure: empty platform and crowded platform.]

Curvature
We propose here to use curvature of isophotes in order to
characterize an image with a lot of people.
Definitions
Let X(s) ∈ R² be a plane curve parameterized by arc length.
Thus the tangent vector T(s) = X′(s) is unitary: ‖T(s)‖ = 1.
With N(s) the normal unit vector we get the Frenet frame, an orthonormal basis spanning R².
One has dT/ds (s) = κ(s) N(s), where κ(s) is the signed curvature of the curve X.
If X(s) = (x(s), y(s)) then

κ(s) = (x′(s) y″(s) − x″(s) y′(s)) / (x′(s)² + y′(s)²)^(3/2)

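As a sketch of the definition above, the signed curvature of a sampled parametric curve can be approximated with finite differences; for a circle of radius R traversed counterclockwise it should return κ ≈ 1/R (the circle and the radius value are illustrative choices):

```python
import numpy as np

# Sample a circle of radius R, traversed counterclockwise.
R = 2.0
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
x, y = R * np.cos(t), R * np.sin(t)

# First and second derivatives by finite differences.
dx, dy = np.gradient(x, t), np.gradient(y, t)
ddx, ddy = np.gradient(dx, t), np.gradient(dy, t)

# General curvature formula (valid for any regular parameterization).
kappa = (dx * ddy - ddx * dy) / (dx**2 + dy**2) ** 1.5
# Away from the array boundaries, kappa is close to 1/R = 0.5.
```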
Curvature of isophotes
But how do we obtain the contour lines and compute κ(s) from an image?
Let u : Ω ⊂ R² → R be the image function.
Then the isophotes are defined as
Ic = {(x, y) / u(x, y) = c}.
It is easy to show that the scalar product ⟨∇u(x, y), X′(s)⟩ = 0.

Curvature of isophotes

1 We have seen that u(x(s), y(s)) = c for an isophote.
2 Derivation yields ux(x, y) x′ + uy(x, y) y′ = ⟨∇u, X′⟩ = 0.
3 One more derivation yields
  uxx (x′)² + 2 uxy x′y′ + uyy (y′)² = −(ux x″ + uy y″).
4 Now T(s) = X′(s) = (x′(s), y′(s)) with ‖T‖² = (x′)² + (y′)² = 1,
  and N(s) = (y′(s), −x′(s)) with ‖N‖ = 1 and T ⊥ N.
5 Thus, as ∇u ⊥ T, we have ∇u = αN:
  (ux, uy) = (α y′, −α x′), with α = ‖∇u‖ = √(ux² + uy²).
Curvature of isophotes

6 If we inject these relations in 3:
  (uxx uy² − 2 uxy ux uy + uyy ux²) / α² = α (x′y″ − y′x″) = α κ(s),
  indeed, as x′² + y′² = 1, κ(s) = x′y″ − y′x″.
7 We thus obtain, by some more computations,
  κ(s) = (uxx uy² − 2 uxy ux uy + uyy ux²) / ‖∇u‖³ = div(∇u/‖∇u‖) = curv(u)
8 This gives a relation between the curvature of the isophotes, which is sometimes called the curvature of the image, and the gradient of the image.

Curvature in numerical images
Using finite difference schemes it is thus possible to compute the curvature of the isophotes of an image.
A regularization/smoothing is needed in order to overcome the non-smoothness of the numerical grid.
A possibility is to smooth using, for example, the mean curvature PDE

∂u/∂t = ‖∇u‖ curv(u)

This gives regularization at high curvature points, e.g. corners.
The finite difference scheme may be written

uⁿ = uⁿ⁻¹ + δt ‖∇uⁿ⁻¹‖ curv(uⁿ⁻¹)

We then compute the total absolute curvature of the smoothed image:

Curv(uⁿ) = Σ_{x,y} |curv(uⁿ)(x, y)|
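The computations above can be sketched with finite differences (a minimal sketch: `np.gradient`'s boundary handling and the regularizing `eps` are implementation choices, not part of the slides):

```python
import numpy as np

def curv(u, h=1.0, eps=1e-8):
    """curv(u) = (u_xx u_y^2 - 2 u_xy u_x u_y + u_yy u_x^2) / |grad u|^3."""
    uy, ux = np.gradient(u, h)          # axis 0 is y, axis 1 is x
    uyy, _ = np.gradient(uy, h)
    uxy, uxx = np.gradient(ux, h)
    num = uxx * uy**2 - 2 * uxy * ux * uy + uyy * ux**2
    return num / (ux**2 + uy**2 + eps) ** 1.5

def mean_curvature_step(u, h=1.0, dt=0.1):
    """One explicit step of u^n = u^{n-1} + dt |grad u| curv(u)."""
    uy, ux = np.gradient(u, h)
    return u + dt * np.sqrt(ux**2 + uy**2) * curv(u, h)

def total_curvature(u, h=1.0):
    """Curv(u) = sum over pixels of |curv(u)|."""
    return np.abs(curv(u, h)).sum()
```

For u(x, y) = x² + y², the isophotes are circles of radius r and curv(u) = 1/r, which gives a quick numerical check of the scheme.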

Modelization of camera movement

In this section we will present a model of camera movement in 3D space.

We will show how suitable algebraic and geometric concepts help to construct a model that allows computation and reconstruction of the movement.

For illustration we will apply the model to some video data. For this, some approximations of the theoretical model have to be made.



Pinhole Camera Model

Camera at C with direction k;
M(X, Y, Z) has coordinates in (C, i, j, k), where (i, j, k) is an orthonormal basis of R³;
R is the image plane at focal distance fc from C, with basis (c, i, j);
m(x, y) is the projection of M onto R:

x = fc X/Z,  y = fc Y/Z



Projection matrix
M has projective coordinates (X : Y : Z : 1) = (λX : λY : λZ : λ);
m has projective coordinates (x : y : 1) = (λx : λy : λ);

(x, y, 1)ᵗ = [ fc 0 0 0 ; 0 fc 0 0 ; 0 0 1 0 ] (X/Z, Y/Z, 1, 1/Z)ᵗ

where the 3×4 matrix is denoted P.
P is the projection (or camera) matrix associated with the camera: m = PM.

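A minimal numerical sketch of this projection (the point coordinates are made up for illustration):

```python
import numpy as np

fc = 1.0  # focal length; later in the talk fc is set to 1
P = np.array([[fc, 0.0, 0.0, 0.0],
              [0.0, fc, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

M = np.array([2.0, 1.0, 4.0, 1.0])  # (X : Y : Z : 1), here (2, 1, 4)
m = P @ M                            # projective coordinates of the projection
x, y = m[0] / m[2], m[1] / m[2]      # x = fc X / Z, y = fc Y / Z
```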


General camera matrix
• Intrinsic parameters: if the image coordinates are not in (c, i, j) but in (o, u, v):

(u, v)ᵗ = [ mu 0 ; 0 mv ] (x, y)ᵗ + (u₀, v₀)ᵗ.

• Extrinsic parameters: if the basis of R³ is not (C, i, j, k) but (O, i′, j′, k′), then M′ = RM_C + t with R a rotation and t a translation.
• Eventually

P = [ mu 0 u₀ ; 0 mv v₀ ; 0 0 1 ] [ fc 0 0 0 ; 0 fc 0 0 ; 0 0 1 0 ] [ R t ; 0 1 ]

Camera calibration consists in the determination of the matrix P.
This topic will not be addressed here.
Modelization of camera movement

1 In order to simplify, we will not take into account intrinsic and extrinsic parameters: we work in an orthonormal basis and set fc = 1;
2 the camera movement is considered as the displacement of the image plane R;
3 a rigid displacement in R³ is a transformation D : R³ → R³, M ↦ RM + t, which preserves distances and angles, with R a rotation, det(R) = +1, and t a translation.
Note: it depends on 6 parameters (3 for the rotation, 3 for the translation).



Influence of movement on projections

1 Motion parallax: the projection of a fixed 3D scene can lead to ambiguities due to depth.
But not
  for objects in the same plane (same distance);
  if the movement is a pure rotation.

2 Epipolar constraint: the projection m of M on R is on the epipolar line, that is, the line passing through e = R ∩ CC′ and R ∩ Cm′.



Illustration of motion parallax

M1 and M2 have same projection m on R.



Plane background

Objects in the plane Π will not be occluded.



Rotation and pure translation

After a rotation, aligned points are projected onto the same point in the image plane.
After a pure translation, there will be two different projections.



Epipolar constraint



Camera Motion: notations
[Figure: camera displacement from frame (C, i, j, k) to (C′, R(i), R(j), R(k)).]

displacement D = (R, t),
translation t = CC′ (vector),
rotation R, axis containing C,
M′ = RM + t.

(C, i, j, k) is transformed into (C′, R(i), R(j), R(k));
S = D(R) with basis (c′, R(i), R(j)), where C′c′ = R(k).

f image in R and g image in S:
for m ∈ R and m′ ∈ S the projections of M,
with constant illumination f(m) = g(m′).
Camera Motion: image transformation
Denote CM = Xi + Yj + Zk,

R = [ a1 b1 c1 ; a2 b2 c2 ; a3 b3 c3 ] with R⁻¹ = Rᵗ, and t = t1 i + t2 j + t3 k.

From C′M = X′R(i) + Y′R(j) + Z′R(k) = C′C + CM:

X′R(i) + Y′R(j) + Z′R(k) = −(t1 i + t2 j + t3 k) + (Xi + Yj + Zk)

X′(a1, a2, a3)ᵗ + Y′(b1, b2, b3)ᵗ + Z′(c1, c2, c3)ᵗ = −(t1, t2, t3)ᵗ + (X, Y, Z)ᵗ

R (X′, Y′, Z′)ᵗ = −(t1, t2, t3)ᵗ + (X, Y, Z)ᵗ  ⇔  (X′, Y′, Z′)ᵗ = Rᵗ (X, Y, Z)ᵗ − Rᵗ (t1, t2, t3)ᵗ
Camera Motion: image transformation
Thus

x′ = X′/Z′ = (a1 X + a2 Y + a3 Z − (a1 t1 + a2 t2 + a3 t3)) / (c1 X + c2 Y + c3 Z − (c1 t1 + c2 t2 + c3 t3))
y′ = Y′/Z′ = (b1 X + b2 Y + b3 Z − (b1 t1 + b2 t2 + b3 t3)) / (c1 X + c2 Y + c3 Z − (c1 t1 + c2 t2 + c3 t3))

and, dividing by Z:

x′ = (a1 x + a2 y + a3 − (a1 t1 + a2 t2 + a3 t3)/Z(x, y)) / (c1 x + c2 y + c3 − (c1 t1 + c2 t2 + c3 t3)/Z(x, y))
y′ = (b1 x + b2 y + b3 − (b1 t1 + b2 t2 + b3 t3)/Z(x, y)) / (c1 x + c2 y + c3 − (c1 t1 + c2 t2 + c3 t3)/Z(x, y))



We obtain also

x = X/Z = (a1 x′ + b1 y′ + c1 + t1/Z′(x′, y′)) / (a3 x′ + b3 y′ + c3 + t3/Z′(x′, y′))
y = Y/Z = (a2 x′ + b2 y′ + c2 + t2/Z′(x′, y′)) / (a3 x′ + b3 y′ + c3 + t3/Z′(x′, y′))

If m′ ∈ S and m ∈ R are the two projections of M ∈ R³, their intensity is the same: g(m′) = f(m).
We have two applications which associate points of image f to points of image g:

g(x′, y′) = f(x, y) = f ∘ ϕ(x′, y′)
f(x, y) = g(x′, y′) = g ∘ ψ(x, y)



 
We have g(x′, y′) = f(x, y) = f(ϕ(x′, y′))

= f( (a1 x′ + b1 y′ + c1 + t1/Z′(x′, y′)) / (a3 x′ + b3 y′ + c3 + t3/Z′(x′, y′)) , (a2 x′ + b2 y′ + c2 + t2/Z′(x′, y′)) / (a3 x′ + b3 y′ + c3 + t3/Z′(x′, y′)) )

f(x, y) = g(x′, y′) = g(ψ(x, y))

= g( (a1 x + a2 y + a3 − ⟨t/Z(x, y), R(i)⟩) / (c1 x + c2 y + c3 − ⟨t/Z(x, y), R(k)⟩) , (b1 x + b2 y + b3 − ⟨t/Z(x, y), R(j)⟩) / (c1 x + c2 y + c3 − ⟨t/Z(x, y), R(k)⟩) )

where ⟨·, ·⟩ denotes the scalar product.

In order to simplify the equations defining ϕ and ψ one usually makes the assumption of constant depth: thus Z(x, y) = Z0 and Z′(x′, y′) = Z0′.
We will even assume that Z0 ∼ Z0′, as the following result shows.



Camera Motion: depth approximation
Hypotheses:
H1: the variation of the optical axis direction and the translation along k are small;
H2: bounded point displacement.

Theorem (depth approximation)
For Zinf ≤ Z′(x′, y′) ≤ Zsup,
if (1/Zinf − 1/Zsup) ‖t‖ C(L) ≤ ε and (‖t‖/Zinf) C′(L) ≤ ε, then
Z′(x′, y′) can be replaced by a constant Z0 in the equations of ϕ with an error bounded by ε.
Moreover Z(x, y) can be replaced with the same Z0.

Note: C(L), C′(L) are constants depending on the image plane size.
Camera Motion: depth approximation

The constant depth assumption is very often used; the preceding result gives a formal frame for its validity.

In practice, if the filmed objects are rather far from the camera, the hypotheses of the theorem are satisfied.



We thus can rewrite

ϕ(x′, y′) = ( (a1 x′ + b1 y′ + c1 + t̃1) / (a3 x′ + b3 y′ + c3 + t̃3) , (a2 x′ + b2 y′ + c2 + t̃2) / (a3 x′ + b3 y′ + c3 + t̃3) )

where t̃ = t/Z0.
Thus ϕ is a projective transform with matrix

Mϕ = [ a1 b1 c1 + t̃1 ; a2 b2 c2 + t̃2 ; a3 b3 c3 + t̃3 ]

Simple computations show that

Mϕ = [ a1 b1 c1 ; a2 b2 c2 ; a3 b3 c3 ] [ 1 0 ⟨t̃, R(i)⟩ ; 0 1 ⟨t̃, R(j)⟩ ; 0 0 1 + ⟨t̃, R(k)⟩ ] = RH

The factorization Mϕ = RH is unique.
We have also

ψ(x, y) = ( (a1 x + a2 y + a3 − ⟨t̃, R(i)⟩) / (c1 x + c2 y + c3 − ⟨t̃, R(k)⟩) , (b1 x + b2 y + b3 − ⟨t̃, R(j)⟩) / (c1 x + c2 y + c3 − ⟨t̃, R(k)⟩) )

with t̃ = t/Z0.
Thus ψ is a projective transform with matrix

Mψ = [ a1 a2 a3 − ⟨t̃, R(i)⟩ ; b1 b2 b3 − ⟨t̃, R(j)⟩ ; c1 c2 c3 − ⟨t̃, R(k)⟩ ]

Some computations show that

Mψ = [ a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ] [ 1 0 −t̃1 ; 0 1 −t̃2 ; 0 0 1 − t̃3 ] = R⁻¹ H̃.

The factorization Mψ = R⁻¹ H̃ is unique.
Projective group : definition and properties

PG2(R) = { φ : R² → R² such that ∀(x, y) ∈ R²,

φ(x, y) = ( (α1 x + β1 y + γ1) / (α3 x + β3 y + γ3) , (α2 x + β2 y + γ2) / (α3 x + β3 y + γ3) ), with det [ α1 β1 γ1 ; α2 β2 γ2 ; α3 β3 γ3 ] ≠ 0 }.

1 PG2(R) is a group for matrix multiplication;
2 the application φ defined by a matrix M is identical to the one defined by the matrix λM, for all λ ∈ R*;
3 thus an element of PG2(R) depends on 8 parameters;
4 a displacement D depends only on 6 parameters, so we have to consider a subset of PG2(R);
5 ψ is not the inverse of ϕ for matrix multiplication: Mϕ⁻¹ ≠ Mψ; we have to introduce a different composition law.



Registration Group : definition


A = { ϕ : R² → R² such that ∀(x, y) ∈ R²,

ϕ(x, y) = ( (a1 x + b1 y + c1 + A) / (a3 x + b3 y + c3 + C) , (a2 x + b2 y + c2 + B) / (a3 x + b3 y + c3 + C) ),

with R = [ a1 b1 c1 ; a2 b2 c2 ; a3 b3 c3 ], det(R) = +1, and t = (A, B, C)ᵗ ∈ R³ }.

ϕ ∈ A ←→ displacement D = (R, t)
ϕ⁻¹ = ψ ∈ A ←→ displacement D⁻¹ = (Rᵗ, −Rᵗ t)



Registration Group : composition law

Composition of displacements D = D1 ∘ D2, for D1 = (R1, t1) and D2 = (R2, t2):

D = D1 ∘ D2 = (R1 R2, R1 t2 + t1).

The composition of displacements can, by isomorphism, be performed through matrix multiplication in R⁴:

[ R1 t1 ; 0 1 ] [ R2 t2 ; 0 1 ] = [ R1 R2  t1 + R1 t2 ; 0 1 ]

The registration group is (A, ⋆): the law ⋆ is deduced from the composition law ∘ of displacements through the isomorphism defined above.

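The isomorphism above is easy to sketch in code: embed each displacement (R, t) as a 4×4 homogeneous matrix and multiply (the particular rotation and translations are illustrative test values):

```python
import numpy as np

def homog(R, t):
    """Embed the displacement (R, t) into a 4x4 homogeneous matrix."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = t
    return M

# D1: rotation of pi/2 about k plus a translation; D2: pure translation.
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
R1 = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
t1 = np.array([1.0, 0.0, 0.0])
R2, t2 = np.eye(3), np.array([0.0, 2.0, 0.0])

M = homog(R1, t1) @ homog(R2, t2)
# The product encodes D1 o D2 = (R1 R2, t1 + R1 t2).
```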


Registration Group : composition
Let ϕ1 and ϕ2 be in A, representing displacements D1 = (R1, t1) and D2 = (R2, t2) (notice that t1, t2 ∈ R³ here).
Then ϕ = ϕ1 ⋆ ϕ2 is associated to the displacement D = (R, t) = (R1 R2, t1 + R1 t2) and

Mϕ = Mϕ1⋆ϕ2 = R1 R2 [ 1 0 ⟨t̃1 + R1 t̃2, R1 R2(i)⟩ ; 0 1 ⟨t̃1 + R1 t̃2, R1 R2(j)⟩ ; 0 0 1 + ⟨t̃1 + R1 t̃2, R1 R2(k)⟩ ]

where t̃1 = t1/Z0 and t̃2 = t2/Z0.

Let f1, f2, f3, …, fn be a sequence of images.
Suppose that for all couples of successive images (fi, fi+1) we know ϕi such that fi+1 = fi ∘ ϕi and thus Di = (Ri, ti).
Then fn = fn−1 ∘ ϕn−1 = … = f1 ∘ (ϕ1 ⋆ ϕ2 ⋆ ⋯ ⋆ ϕn−1) = f1 ∘ ϕ,
with ϕ the application associated to the displacement D = (R, t),
where R = R1 R2 ⋯ Rn−1 and t = t1 + R1 t2 + R1 R2 t3 + ⋯ + R1 R2 ⋯ Rn−2 tn−1.
Registration Group: Application
1 The two images f and g are linked by g = f ◦ ϕ and f = g ◦ ψ,
in the registration group ϕ−1 = ψ.
2 The registration group will be used for the composition of
elementary displacements.
3 We will use a parametric 2D registration algorithm (e.g.
“Motion2D” of Odobez and Bouthemy) to obtain the parameters
which define ϕ : R and t.
4 We will have to change parameters and make some
approximations in order to be able to use the data obtained by the
registration algorithm.
5 First we give a different representation of the rotation R which we
then approximate in the case of small rotations or camera
movements.
6 This only provides an application of the law ⋆ of the registration group; we cannot go into all technical details in the following.
Decomposition of rotation
[Figure: decomposition of the rotation R into R1 (axis in (C, i, j)) followed by R2 (axis R(k)).]

R = R2 R1

R1: axis ∆ ∈ (C, i, j), parameters θ and α:
R1 = Rθ,α = R^k_θ R^i_α R^k_{−θ}
(projective deformation of f)

R2: axis R(k) = R1(k), parameter β:
R2 = Rβ = R^k_θ R^i_α R^k_β R^i_{−α} R^k_{−θ}
(plane rotation of R1(f))

R = Rβ Rθ,α = R^k_θ R^i_α R^k_β R^k_{−θ} = R^k_θ R^i_α R^k_{−θ} R^k_β = Rθ,α R^k_β.


Recall: transformation of images

Displacement D = (R, t) = (Rθ,α R^k_β, t) and t̃ = t/Z0.

The matrix Mϕ associated to ϕ is:

[ a1 b1 c1 + t̃1 ; a2 b2 c2 + t̃2 ; a3 b3 c3 + t̃3 ] = [ a1 b1 c1 ; a2 b2 c2 ; a3 b3 c3 ] [ 1 0 ⟨t̃, R(i)⟩ ; 0 1 ⟨t̃, R(j)⟩ ; 0 0 1 + ⟨t̃, R(k)⟩ ] = Rθ,α R^k_β H,

(x, y) ∈ R²: g(x′, y′) = f(ϕ(x′, y′)) = f(rθ,α ∘ s(x′, y′)).

• rθ,α projective transform;
• s similarity R^k_β H.

New parameters of displacement: (θ, α, β, A, B, C), with (−A, −B, −C) the coordinates of t̃ in (R(i), R(j), R(k)).



Recall: transformation of images
We have g(x′, y′) = f(ϕ(x′, y′)) = f(x, y) and

ϕ(x′, y′) = ( (a1 x′ + b1 y′ + c1 − t̃1) / (a3 x′ + b3 y′ + c3 − t̃3) , (a2 x′ + b2 y′ + c2 − t̃2) / (a3 x′ + b3 y′ + c3 − t̃3) ) = (x, y)

Moreover f(x, y) = g(ψ(x, y)) = g(x′, y′) and

ψ(x, y) = ( (a1 x + a2 y + a3 + A) / (c1 x + c2 y + c3 + C) , (b1 x + b2 y + b3 + B) / (c1 x + c2 y + c3 + C) ) = (x′, y′)

New parameters:

t̃ = −A R(i) − B R(j) − C R(k) and R = Rθ,α R^k_β.



Approximations

parameter: θ (radians) | α (radians) | β (radians) | A, B (focal lengths) | C (focal lengths)
values: ]−π, π] | [0, 0.03] | [−0.05, 0.05] | [−0.09, 0.09] | [−0.03, 0.03]

For small α, β, A, B, C one can compute the following approximation:

x′ − x ≈ −Cx + A + βy + αx(cos θ y − sin θ x)
y′ − y ≈ −Cy + B − βx + αy(cos θ y − sin θ x)

where the first group of terms comes from t̃, the second from R^k_β, and the third from Rθ,α.
For x and y small, the quadratic terms can be neglected.



Example

x′ − x ≈ −Cx + A + βy
y′ − y ≈ −Cy + B − βx

Rotation does not transform the center of the image.

[Figure: movement flow, similarity s, rotation rθ,α.]

Parameters: θ = 3π/4, α = 5·10⁻⁴, β = 5.1·10⁻², A = 3, B = 2, λ = 0.02.



Estimation of movement flow
Motion2D by Bouthémy and Odobez estimates the 2D movement between two images.
Parametric model:

uΘ(x, y) = (c1, c2)ᵗ + [ a1 a2 ; −a2 a1 ] (x, y)ᵗ + [ q1 q2 0 ; 0 q1 q2 ] (x², xy, y²)ᵗ.

The six parameters (c1, c2, a1, a2, q1, q2) are related to the displacement parameters α, β, θ, A, B and C.
The algorithm estimates the parameters by minimizing

Σ_{(x,y)∈f} ρ(DFD(Θ,ξ)(x, y), Γ),

with

DFD(Θ,ξ)(x, y) = g((x, y) + uΘ(x, y)) − f(x, y) + ξ,

using a robust statistical estimator.
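Evaluating the parametric model is straightforward (a sketch; the parameter values in the usage note below are made up for illustration):

```python
import numpy as np

def u_theta(x, y, c1, c2, a1, a2, q1, q2):
    """Quadratic parametric motion model: constant + affine + quadratic terms."""
    u = c1 + a1 * x + a2 * y + q1 * x**2 + q2 * x * y
    v = c2 - a2 * x + a1 * y + q1 * x * y + q2 * y**2
    return np.array([u, v])
```

For example, `u_theta(1.0, 2.0, c1=0.5, c2=0.0, a1=0.1, a2=0.0, q1=0.0, q2=0.0)` gives the displacement (0.6, 0.2) at the point (1, 2).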
Example 1

θ = 0.8; α = 0.0005; β = −0.051; A = 3; B = 1; C = −0.01.

[Figure: f, g, and ½(f + g).]



Example 1

         β         A      B      C        θ            α
initial  −0.051    2      1      −0.01    0.8          0.0005
estim.   −0.05113  2.812  0.075  0.98821  π/4 = 0.785  0.0004

[Figure: flow V and ½(f + g).]



Example 2




Example 3



Movies

Film incrustation

Film simulation



References

F. Dibos, C. Jonchery, G. Koepfler, Camera motion estimation through planar deformation determination, in Journal of
Mathematical Imaging and Vision, vol 32, 1, p. 73-87, September 2008.

C. Jonchery, PhD dissertation, Estimation d’un mouvement de caméra et problèmes connexes. ENS Cachan, November
2006.

F. Dibos, Du groupe projectif au groupe des recalages : une nouvelle modélisation, Comptes Rendus à l’Académie des
Sciences, 2001.

O. Faugeras, Three-Dimensional Computer Vision, MIT Press, Cambridge, 1993.

O. Faugeras, Q.-T. Luong, T. Papadopoulo, The Geometry of Multiple Images, MIT Press, Cambridge, 2000.

J.M. Odobez, P. Bouthemy, Robust Multiresolution Estimation of Parametric Motion Models, Journal of Visual Communication and Image Representation, vol 6, 4, p. 348-365, 1995.



Occlusion in video

Layer decomposition
An image of a natural scene is obtained by projection of the 3D scene.
This is modeled by the superposition of several layers.
An occlusion appears if a layer is projected upon another.

image = occluding person + occluded person + background



Introduction

Problem:
extract and reconstruct layers from a video sequence;
even if total occlusion occurs during several frames;
propose a mathematical model for layers.

[Figure: example of a sequence showing the occluding object in front of the occluded one, with the hidden part.]



Layer model definition

Definition:
image I defined on the domain D;
layer of a 3d moving object: (Ω, o)
- Ω ⊂ D: the region where the object would be projected if there were no occlusions;
- o: gray level function defined on Ω, giving the object's gray level.

Consider an (N + 1)-frame video sequence (Ii)i=0,…,N with:


a fixed background,
an occluded moving object and an occluding moving object.

Three layers:
background layer (D, B);
occluded object layer (Ωi , oΩi );
occluding object layer (Oi , oOi ).



Layer model definition

According to the layer model, the image Ii at pixel x ∈ D is given by:

Ii(x) ≈ oOi(x) if x ∈ Oi;  oΩi(x) if x ∈ Ωi \ Oi;  B(x) else.

The notation ≈ allows for unknown noise effects.
Equivalently, with indicator functions IOi and IΩi:

Ii(x) ≈ oOi(x) IOi(x) + oΩi(x) IΩi(x) (1 − IOi(x)) + B(x) (1 − IΩi(x)) (1 − IOi(x)).



Layer deformation model
Idea: the 3d object motion model yields deformation functions, Ti for the occluded object and T′i for the occluding object.
Ti is such that:
pixel x0 ∈ I0 and pixel xi = Ti(x0) ∈ Ii are the projections of the same moving 3d point X.

Ti and T′i take care of the perspective deformation of the objects.


Layer deformation model
Assuming no occlusion in I0:

Ii(x) ≈ I0(T′i⁻¹(x)) if T′i⁻¹(x) ∈ O0;  I0(Ti⁻¹(x)) if Ti⁻¹(x) ∈ Ω0;  B(x) else.

[Figure: image Ii with occlusion. Using Ti and Ω0 (= the car), the occluded part of Ωi can be restored.]



Layer deformation model

Hypotheses:
fixed camera and known background B;
no occlusion in first image;
moving objects are rigid;
movement is a uniform translation in 3d space.

Thus:
consider only short sequences;
parametric deformation deduced from 3d motion and not from 2d.



Layer deformation model
Construction of the layer deformation function Ti:

moving 3d point X(t) = (X(t), Y(t), Z(t));
uniform translation Ẋ(t) = (A, B, C);
projection x onto Ii by the pin-hole camera model:

x(ti) = (x(ti), y(ti)) = (X(ti)/Z(ti), Y(ti)/Z(ti)) = ((X(0) + A ti)/(Z(0) + C ti), (Y(0) + B ti)/(Z(0) + C ti))
      = (1/(c ti + 1)) [ (x0, y0) + ti (a, b) ],

with a = A/Z(0), b = B/Z(0), c = C/Z(0) and ti = i·∆t = i (up to a parameter change):

xi = (1/(1 + ci)) (x0 + i a) = Ti(x0).
Layer deformation model

We have a model for the deformation of a region Ω0 into the region Ωi in frame number i:

Ωi = Ti(Ω0) = (1/(1 + ci)) (Ω0 + i a)
Ω0 = Ti⁻¹(Ωi) = (1 + ci) Ωi − i a

Using Ti we determine the position occupied by region Ω0 in image number i.
This region will be completed by using information from the first image:

∀x ∈ Ωi, Ii(x) = I0(x0) with x0 = (1 + ci) x − i a
∀x ∉ Ωi, Ii(x) = B(x)

Or, without occlusion:

Ii(x) ≈ oΩ0(Ti⁻¹(x)) IΩ0(Ti⁻¹(x)) + B(x) (1 − IΩ0(Ti⁻¹(x)))

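The deformation Ti and its inverse can be sketched directly from the formulas above (the values of a, c and x0 are made-up test values):

```python
import numpy as np

def T(i, x0, a, c):
    """Forward deformation T_i: point of frame 0 -> point of frame i."""
    return (x0 + i * a) / (1 + c * i)

def T_inv(i, xi, a, c):
    """Inverse deformation: point of frame i -> point of frame 0."""
    return (1 + c * i) * xi - i * a

a = np.array([0.01, 0.003])   # (a, b) = translation / Z(0), test values
c = 0.005                      # C / Z(0), test value
x0 = np.array([0.5, -0.2])
xi = T(3, x0, a, c)            # position of the point in frame 3
```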


Variational model
Ii(x) ≈ oOi(x) if x ∈ Oi;  oΩi(x) if x ∈ Ωi \ Oi;  B(x) else,

with Oi = T′i(O0), Ωi = Ti(Ω0) and Ωi ∩ Oi = Ti{ x0 ∈ Ω0 / T′i⁻¹ Ti(x0) ∈ O0 }.

Terms used in the global energy:

the static difference: ∆0i(x) = Ii(x) − B(x);
the warp motion difference for the occluded object: ∆1i(x) = Ii(x) − I0(Ti⁻¹ x);
the warp motion difference for the occluding object: ∆2i(x) = Ii(x) − I0(T′i⁻¹ x);
the boundary detection for the moving objects: g(|∇I0|) = 1/(1 + |∇I0|²).
Variational model
Compute Ω0, O0 and (a, b, c, a′, b′, c′) from the sequence.

Energy:

E = Σ_{i>0} ∫_{Oi} ρ(∆2i(x)) dx + Σ_{i>0} ∫_{Ωi\Oi} ρ(∆1i(x)) dx + Σ_{i>0} ∫_{D\(Oi∪Ωi)} ρ(∆0i(x)) dx
  + λ ∫_{∂Ω0} g(|∇I0|) ds + λ′ ∫_{∂O0} g(|∇I0|) ds,

with a robust estimator, e.g. ρ(s) = √(2 + s²).
Variational model: Minimization

Initialisation:
get regions R1 and R2: compare I0 and B;
parameters (a, b, c, a′, b′, c′): use parametric optical flow between I0 and I1.
Depth order:
test (Ω0, O0) = (R1, R2) versus (Ω0, O0) = (R2, R1) and keep the one with the lowest energy E.
Iterate:
minimize on regions Ω0 and O0 with I.C.M.;
minimize on parameters (a, b, c, a′, b′, c′) with the simplex method.

As we focus on modelling, we give no details here on the minimization; we illustrate the possibilities of the model with a few examples on the following slides.



Results on synthetic sequence

original sequence

restored sequence



Results on real sequences

Office Sequence

Outdoor Sequence.



Conclusion

extract layers of moving objects from a sequence;
reconstruct occluded parts;
the 3d motion model takes deformations into account;
the variational formulation allows recovery of depth information;
allows for restoration of sequences and inversion of layers.



References

S. Pelletier, PhD dissertation, Modèle multi-couches pour l’analyse de


séquences vidéo, University Paris Dauphine, 2007.

F. Dibos, G. Koepfler and S. Pelletier, Video layer extraction and reconstruction, in Proc. IASTED International Conference on Visualization, Imaging and Image Processing, 2008.



Introduction

Purpose:
Detection of moving objects/people in a video sequence.

We present a simple and fast method with few parameters and low
complexity.
A major hypothesis is that the background is fixed.
This is a common assumption in video surveillance applications.
Again our purpose is to illustrate the use of mathematical
modelling for a precise problem.
The topic of video surveillance has attracted a lot of attention in recent years, and many approaches have been proposed.



Real time approach

Simplification: fixed camera

Constraints:
real time ;
on line processing ;
few parameters.

Difficulties: the changes due to "noise" (e.g. shadows, reflections, …) must not be detected as objects in movement.

Example: Webcam movie



Displacement modelization
Difference between two successive images:
|I(x, t) − I(x, t + 1)|

Optical flow: compute a vector v such that
I(x, t) ≈ I(x + v, t + 1)

Comparison to a given background:
|I(x, t) − B(x)|

⊕ No minimal temporal frequency: we also detect objects which move slowly.
⊖ Needs a good estimation of the background.

Example: algorithm W4 (film).

Overview
Data: an image I and the background B

Step 1: image of differences

Step 2: significant regions



Image of differences
We have to determine where the background B and the current image I differ, while being robust to changes of luminosity.
Use isophotes:
the contour lines give the structure of the images;
the edges of the objects may be associated with high-contrast contour lines.

If an object hides the background, it "breaks" level lines.



Thresholds

In each point x, we compute

nI(x) = |∇I|(x),   nB(x) = |∇B|(x),
angle(x) = angle(∇I(x), ∇B(x)).

We use three thresholds:

e: if nI(x) ≥ e, there is local structure;

E: if nB(x) < e and nI(x) ≥ E, there is a "new" structure;
taking E > e increases the stability of the algorithm;

φ: if angle(x) ≥ φ, there is a change in the local structure.



Compute the image of the differences
If there is a change at x, we set D(x) = white:

[nI(x) ≥ e and nB(x) ≥ e and angle(x) ≥ φ]
or [nI(x) ≥ E and nB(x) < e] or [nB(x) ≥ E and nI(x) < e]

i.e. a change of level line, or the "appearance/disappearance"
of a contrasted level line.
If there is no change at x, then D(x) = black.
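A minimal sketch of this rule on a synthetic pair (I, B), using finite-difference gradients; the threshold values e, E and φ below are illustrative assumptions, not the ones used in the presentation.

```python
import numpy as np

def difference_image(I, B, e=5.0, E=20.0, phi=np.pi / 4):
    """D(x) = white (True) where the gradient structure of I and B disagrees."""
    gyI, gxI = np.gradient(I.astype(float))
    gyB, gxB = np.gradient(B.astype(float))
    nI, nB = np.hypot(gxI, gyI), np.hypot(gxB, gyB)
    # Angle between the two gradients (arccos of the normalized dot product).
    cosang = (gxI * gxB + gyI * gyB) / np.maximum(nI * nB, 1e-12)
    angle = np.arccos(np.clip(cosang, -1.0, 1.0))
    changed = (nI >= e) & (nB >= e) & (angle >= phi)  # level line turned
    appeared = (nI >= E) & (nB < e)                   # new contrasted structure
    vanished = (nB >= E) & (nI < e)                   # structure disappeared
    return changed | appeared | vanished

B = np.zeros((8, 8))
I = np.zeros((8, 8)); I[:, 4:] = 100.0  # a vertical edge appears in I
D = difference_image(I, B)
print(int(D.sum()))  # 16 white pixels along the new edge
```

With a flat background, only the "appeared" clause fires, exactly along the two pixel columns carrying the new edge.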
Example of image of differences

The white points indicate objects in movement, but due to noise they
are distributed almost everywhere in the image.
With different values for φ:



Image of differences: comments

The value of E influences the behavior in case of changes of
contrast. If Ω = ∪i Ωi and

∀Ωi, ∃αi > 0, ∃βi > 0, ∀x ∈ Ωi : I(x) = αi B(x) + βi,

then, if e/E ≤ αi ≤ E/e : ∀x ∈ Ωi, D(x) = black pixel or gray pixel.

Shadows do not disturb the image of differences.

Objects in B which disappear in I are also detected; they are called
phantoms.
It is therefore necessary to compute the image of differences in a
non-symmetric way: the disappearance of a structure,
[nB(x) ≥ E and nI(x) < e], will then give gray pixels.



Example of shadow treatment



Example of phantom treatment

[Figure: top row: background B, symmetric D, non-symmetric D;
bottom row: image I, symmetric segmentation, non-symmetric segmentation.]



Detection of groupings

Moving object/person
= grouping of white pixels among the observable pixels
(white or black)
≠ uniformly distributed white "noise" pixels.



The a priori model
Our aim is to detect groupings which are not likely to appear in a noisy
image.
We measure the average density of white pixels among the observable
pixels (white or black):

p = P(white pixel | observable)
  = (# white pixels in the image) / (# observable pixels in the image).

In the a priori model, an observable pixel is white with probability p
and black with probability 1 − p.
Thus, for a region W with n observable pixels, the probability of having
at least k white pixels is given by the tail of the binomial law:

B(p, n, k) = ∑_{i=k}^{n} (n choose i) p^i (1 − p)^(n−i).
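This binomial tail can be evaluated exactly with Python's standard library. A sketch: for the large n of real windows one would switch to a numerically stable routine, but the formula is just the sum above.

```python
from math import comb

def binom_tail(p, n, k):
    """B(p, n, k): probability of at least k white pixels among n
    observable ones, each white independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Two observable pixels, p = 1/2: P(at least one white) = 3/4.
print(binom_tail(0.5, 2, 1))  # 0.75
```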



Significance of a region

We consider N regions W1, ..., WN in the image where we expect
moving objects; ni denotes the number of observable and ki the
number of white pixels in Wi.

Definitions
Let ε ∈ R*+; we say region Wi is ε-significant if

B(p, ni, ki) < ε/N   and   ki/ni > p.

The significance of Wi is given by:

S(Wi) = − log(B(p, ni, ki) N).

The window Wi is ε-significant if S(Wi) > T, with T = − log(ε) > 0.
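Putting the two definitions together, a window is declared significant when S(Wi) exceeds T = −log(ε) and its white density exceeds p. The numeric values below (p, ni, ki, N, ε) are illustrative assumptions:

```python
from math import comb, log

def binom_tail(p, n, k):
    """Tail of the binomial law, B(p, n, k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def significance(p, n_i, k_i, N):
    """S(W_i) = -log(B(p, n_i, k_i) * N)."""
    return -log(binom_tail(p, n_i, k_i) * N)

def is_significant(p, n_i, k_i, N, eps):
    """W_i is eps-significant if S(W_i) > -log(eps) and k_i/n_i > p."""
    return significance(p, n_i, k_i, N) > -log(eps) and k_i / n_i > p

# N = 100 windows, noise density p = 0.1: 40 white pixels out of
# 50 observable ones is far beyond what noise alone can produce.
print(is_significant(0.1, 50, 40, 100, eps=0.01))  # True
```

A window whose white density barely matches p (say ki = 5 out of ni = 50) is rejected, as the definition requires.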



Interpretation of significance
Define a function f such that f(W) = 1 if W is ε-significant,
and f(W) = 0 else.

If we suppose the regions W1, ..., WN independent, the expectation of
the number of significant regions is given by

E(∑i f(Wi)) = ∑i P(f(Wi) = 1) = ∑i P(Wi is significant).

As P(Wi is significant) = P(B(p, ni, ki) < ε/N)
and k ↦ B(p, ni, k) is decreasing, there exists K such that

B(p, ni, K) < ε/N   and   B(p, ni, K − 1) ≥ ε/N.



Interpretation of significance

Thus

P(Wi is significant) = P(ki ≥ K) = B(p, ni, K) < ε/N

and

E(∑i f(Wi)) < ε.

On average, the number of false alarms (detections in noise) is thus
lower than ε.

A moving object is detected if there exists a window W such that

S(W) = − log(B(p, ni, ki) N) > T = − log(ε).

The higher the value of S(W), the lower the chance that W is uniform
noise.



Maximality criterion

As there are a lot of significant windows, it is necessary to define
a notion of maximal significance.

A window W is maximal ε-significant if

(a) S(W) > T, i.e. W is ε-significant;

(b) for every significant window W′ such that W ∩ W′ ≠ ∅,

either S(W) > S(W′), in which case W is a better approximation of the
object than W′,
or there exists W″ such that W″ ∩ W′ ≠ ∅ and
S(W″) > S(W′) > S(W);
then W′ may be associated to another object, localized in W″.



Maximality - Algorithm

[Figure: left, a single object covered by windows W and W′;
right, two objects with windows W, W′ and W″.]

Algorithm:
1 In the initial list, find the most significant window Wmax, add it to
the list of maximal ε-significant windows and remove it from the
initial list;
2 delete all windows W such that W ∩ Wmax ≠ ∅;
3 repeat these two steps until there is no ε-significant window left
in the initial list.
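The greedy selection above can be sketched as follows; windows are modelled here as (significance, pixel-set) pairs, a simplifying assumption in place of the rectangular windows of the method:

```python
def maximal_windows(windows):
    """Keep the most significant window, discard every window that
    overlaps it, and repeat until no window remains."""
    remaining = sorted(windows, key=lambda w: w[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)          # most significant window left
        kept.append(best)
        # Drop every window whose pixel set intersects the kept one.
        remaining = [w for w in remaining if not (w[1] & best[1])]
    return kept

# W1 and W2 overlap (pixel 3 is shared); W3 is disjoint.
W1, W2, W3 = (9.0, {1, 2, 3}), (5.0, {3, 4}), (7.0, {10, 11})
print([s for s, _ in maximal_windows([W1, W2, W3])])  # [9.0, 7.0]
```

W2 is discarded because it overlaps the more significant W1, while the disjoint W3 survives, matching case (b) of the criterion.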
Significance

[Plot: significance S(W) versus window label (1 to 13); well placed
windows, windows on an object boundary, and noise windows reach
clearly separated significance levels (around 120, 60 and 20).]


Results

test sequence 1.
test sequence 2.
Metro sequence 1.
Metro sequence 2.
Comparison W4 and AWD.



Refine the segmentation

Let O = ∪W be the disjoint union of the maximal significant regions.

We maximize the functional S(O):
remove a block B ⊂ O, situated on the boundary of O, if
S(O \ B) > S(O);
add a block B ⊄ O, adjacent to the boundary of O, if S(O ∪ B) > S(O)
and if B has a density of observable pixels greater than that of the
image of differences D;
repeat these two steps as long as there are blocks to add or
remove.



Results

⇒ Refinement of the segmentation.

⇒ Non-symmetric criterion.

Sequence caviar 2.



Object tracking

The connected components are numbered:

[Figure: two successive label maps with their connected components
numbered 1 to 5.]

Basic algorithm: the barycenter of each connected component allows to
match it with the closest region in the previous image.
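A self-contained sketch of the numbering and barycenter steps (pure Python, 4-connectivity); matching to the previous frame would then pick, for each barycenter, the closest component of the previous image:

```python
from collections import deque

def label_components(mask):
    """Number the 4-connected components of True pixels in a 2D grid."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    n = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and labels[i][j] == 0:
                n += 1                       # start a new component
                labels[i][j] = n
                queue = deque([(i, j)])
                while queue:                 # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = n
                            queue.append((ny, nx))
    return labels, n

def barycenters(labels, n):
    """Mean (row, col) position of each labelled component."""
    acc = {k: [0, 0, 0] for k in range(1, n + 1)}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                acc[lab][0] += y; acc[lab][1] += x; acc[lab][2] += 1
    return {k: (sy / c, sx / c) for k, (sy, sx, c) in acc.items()}

mask = [[True, True, False, False],
        [True, False, False, True],
        [False, False, True, True]]
labels, n = label_components(mask)
print(n)  # 2 components
```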



Tracking results

test sequence CAVIAR


test sequence ETISEO
test sequence APRON



References

S. Pelletier, Modèle multi-couches pour l'analyse de séquences vidéo,
PhD dissertation, University Paris Dauphine, 2007.

F. Dibos, G. Koepfler and S. Pelletier, Adapted Windows Detection of
Moving Objects in Video Scenes, SIAM Journal on Imaging Sciences,
2(1), pp. 1-19, 2009.

