Feature Correspondences From Multiple Views of Coplanar Ellipses
Cecile Barat, Jean-François Menudet, Thierry Fournel, Hanene Louhichi
To cite this version:
Cecile Barat, Jean-François Menudet, Thierry Fournel, Hanene Louhichi. Feature correspondences
From Multiple Views of Coplanar Ellipses. 2nd International Symposium on Visual Computing, Nov
2006, Lake Tahoe, Nevada, United States. pp.364-373, 10.1007/11919629_38. ujm-00192591
HAL Id: ujm-00192591
https://hal-ujm.archives-ouvertes.fr/ujm-00192591
Submitted on 24 Apr 2009
Laboratoire Traitement du Signal et Instrumentation, UMR CNRS-UJM 5516
Bâtiment F, 18 rue du Pr. Benoît Lauras, 42000 Saint-Étienne, France
Abstract. We address the problem of feature correspondences in images of coplanar ellipses, with the objective of benefiting from robust ellipse-fitting algorithms. The main difficulty is the lack of immediately available projectively invariant points. Our key idea is therefore to construct virtual line and point features using the invariance of tangency under perspective projection. The proposed method first requires a robust detection of ellipse edge points in order to fit a parametric model to each ellipse. The feature lines are then obtained by computing the 4 bitangents to each pair of ellipses. The points are derived by considering the tangent points and the intersection points between bitangents. Results of experimental studies are presented to demonstrate the reliability and robustness of the feature extraction process. Subpixel accuracy is easily achieved. A real application to camera self-calibration is also described.
1 Introduction
Finding feature correspondences is an essential preliminary step in many Computer Vision tasks such as Structure From Motion or Stereovision. The most commonly used features are points and line segments. Their relations across multiple views have been extensively studied in the framework of projective geometry and have led to the introduction of the fundamental matrix, the trifocal tensor or the homography matrix for the case of coplanar features [4]. It is well known that these entities can be estimated in a linear fashion from correspondences of points or lines [4].
Nevertheless, line segments and especially points are rather unstable features, not easy to detect reliably in noisy images. Several works have investigated the use of
more complex primitives whose detection in images could be potentially more
stable. Among possible primitives, conics have received a lot of attention because
they play a central role in projective geometry and robust algorithms exist to
compute them [10, 2]. Estimation of the fundamental matrix from at least 4 conics is addressed in [7]. In the case of two views of a conic with known epipolar
geometry (i.e. known fundamental matrix), [5] proposes a solution to compute
the homography associated with the conic plane. When the fundamental matrix
is not available, two coplanar conics are necessary, as described in [8]. These
results on conics are extended to general algebraic curves in [6] but at the expense
of large systems of polynomial equations.
Unlike previous works, this article does not focus on the multiple-view geometry of conics, but rather gives a simple way to take advantage of conic fitting to construct virtual matching features with high accuracy. Essentially, the method we propose exploits bitangents to pairs of coplanar ellipses to obtain 4 lines and 14 points which are projectively invariant, robust to noise and easy to compute. Our method combines the strengths of point/line-based approaches (simplicity, extensive background) and conic-based approaches (stability). It could be used in numerous algorithms based on point correspondences, such as camera self-calibration or motion estimation. It also solves a problem arising when circular targets are used for point correspondences. Circular targets do not present any characteristic points, so the centroids of the targets in images are classically used as feature points. Yet they do not, in theory, correspond to the images of the target centers. This systematic error is not negligible when the viewing distance is less than 20 times the focal length; moreover, the larger the circle, the greater the error. The projective invariance property of our method overcomes this limitation.
The next section details the computation of bitangents and associated points
starting from images of two unknown ellipses. Then, an experimental study of
feature accuracy is presented in section 3. Finally, section 4 shows an application
of the method to a real problem, namely camera self-calibration from multiple
views of unknown coplanar ellipses.
2 Features computation
The method we propose in this paper exploits the notion of bitangent to locate
a set of reliable projective invariant features in images. A bitangent is defined
as a line which is tangent to at least two objects. It has the property of being
invariant to perspective projection.
When the two objects under consideration are two separate coplanar ellipses, there exist four bitangents: two internal (LR, RL) and two external (LL, RR). The letters L and R denote the position (left or right) of each ellipse relative to the bitangent, oriented from left to right in Figure 1(b). These bitangent lines yield 14 projectively invariant points which can be used as feature points for solving correspondences across multiple views of the two ellipses (Figure 1(b)): the 6 intersection points between bitangents and the 8 tangent points on the ellipses remain unchanged under perspective projection.
Given an image of ellipses like the one of figure 1(a), the remaining question is how to compute the bitangents and the derived features with the objectives of subpixel accuracy, robustness to noise, simplicity and efficiency.
Description of the process. The process we suggest is composed of five steps:
- Ellipse edge detection
- Object labeling and ordering
- Ellipse model fitting
Fig. 1. Bitangents and feature point extraction. (a) Original image. (b) Common tangents (solid lines: external tangents, dashed lines: internal tangents) and associated feature points (filled circles: tangent points, rings: bitangent intersection points).
- Bitangents computation
- Feature points computation
The objective of the first three steps is to accurately estimate the parameters of the different ellipses and their relative positions with respect to a given pattern. Starting with ellipse edge-point detection, a connected-component labeling is then applied to identify each ellipse, and a sorting is done to match the ellipses to the target features.
Next, the ellipses are processed in pairs. Their parameter estimates serve as inputs to the algorithm for bitangent computation described later.
Finally, using the bitangent equations, the different intersection points between bitangents and ellipses are calculated.
As the different steps are interlinked, the more robust the ellipse estimation, the more accurate the feature points. It is therefore essential to perform ellipse edge detection with great care, using a robust method. Let us take a closer look at the different steps and algorithms applied.
Ellipse edge detection. The images to process show an ellipse pattern viewed from possibly oblique viewpoints. The amount of blur varies across the image plane with the depth of the image points. Image blur and noise are two critical problems to address in order to guarantee accurate and stable features. A standard strategy is to adopt a multiscale method for edge detection. The algorithm chosen was proposed by Ducottet et al. in [1]. It is a wavelet-based algorithm which detects and locates edges with an automatic adaptation of the scale to their smoothness. It models contours as Gaussian-smoothed singularities of three particular types (transitions, peaks and lines). The algorithm delivers a set of connected edge pixels per ellipse.
Object labeling and ordering. Once we have all the edge points, we need to identify the edge points of each ellipse. We apply a standard object labeling method to assign a number to each connected component. We are then concerned with the problem of putting the ellipses in the image in correspondence with their counterpart features in space. This task starts by identifying the ellipse at the upper-left corner, then consists in sorting the ellipses in order from this reference one (from left to right and from top to bottom).
Ellipse model fitting. The point of this step is to determine the five parameters describing each ellipse. A least-squares ellipse fit is applied to each labeled object, as proposed in [2]. It is a robust method which provides the coordinates of the ellipse center (u0, v0), the axis lengths (a, b) and the orientation θ.
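The direct least-squares fit of [2] can be sketched as follows. This is a minimal numpy illustration of its generalized-eigenproblem formulation, with the standard closed-form center recovery added for convenience; the function names are ours, not from the paper:

```python
import numpy as np

def fit_ellipse(x, y):
    """Direct least-squares ellipse fit in the spirit of [2].

    Returns the conic coefficients (a, b, c, d, e, f) of
    a*X^2 + b*X*Y + c*Y^2 + d*X + e*Y + f = 0.
    """
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    S = D.T @ D                          # scatter matrix
    C = np.zeros((6, 6))                 # constraint matrix for 4ac - b^2 = 1
    C[0, 2] = C[2, 0] = 2.0
    C[1, 1] = -1.0
    # Generalized eigenproblem S a = lambda C a; exactly one eigenvector
    # satisfies the ellipse-specific constraint 4ac - b^2 > 0
    _, vecs = np.linalg.eig(np.linalg.solve(S, C))
    vecs = np.real(vecs)
    constraint = 4 * vecs[0] * vecs[2] - vecs[1] ** 2
    return vecs[:, np.argmax(constraint)]

def ellipse_center(coeffs):
    """Center (u0, v0) of the conic, from the standard closed form."""
    a, b, c, d, e, f = coeffs
    den = b * b - 4 * a * c
    return (2 * c * d - b * e) / den, (2 * a * e - b * d) / den
```

Fitting a few hundred slightly noisy samples of an ellipse recovers its center accurately, which is what the subsequent bitangent computation relies on.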
Bitangents computation. The method described in [3] computes the bitangents of two ellipses and the corresponding tangent points. It assumes that each ellipse is defined by an analytic equation aX² + bXY + cY² + dX + eY + f = 0; the coefficients (a, b, c, d, e, f) can be trivially derived from the parameters (u0, v0, a, b, θ) obtained previously. Bitangents are defined by equations of the form Y = UX + V. All the equations are combined to give a polynomial of degree 4 in U. In the general case of no parallel or vertical bitangents, it has four simple real roots corresponding to the slopes of the four bitangents. Finally, the four corresponding values of V are computed and the bitangents are sorted according to their type (LL, LR, RL and RR).
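This elimination scheme can be sketched numerically. The following is an illustrative implementation, not the exact algorithm of [3]: it encodes tangency through the dual (adjugate) conic, eliminates V with a resultant to obtain the degree-4 polynomial in U, and back-substitutes for V (sympy and numpy are assumed available; non-vertical, non-parallel bitangents only):

```python
import numpy as np
import sympy as sp

def conic_matrix(a, b, c, d, e, f):
    """Symmetric matrix of the conic a*X^2 + b*X*Y + c*Y^2 + d*X + e*Y + f = 0."""
    return sp.Matrix([[a, b / 2, d / 2],
                      [b / 2, c, e / 2],
                      [d / 2, e / 2, f]])

def bitangents(C1, C2):
    """Slopes/intercepts (U, V) of the lines Y = U*X + V tangent to both conics."""
    U, V = sp.symbols('U V')
    l = sp.Matrix([U, -1, V])                 # line U*X - Y + V = 0
    # A line is tangent to a conic iff it lies on the dual conic (adjugate)
    t1 = sp.expand((l.T * C1.adjugate() * l)[0])
    t2 = sp.expand((l.T * C2.adjugate() * l)[0])
    # Eliminating V yields a degree-4 polynomial whose roots are the slopes
    quartic = sp.Poly(sp.resultant(t1, t2, V), U)
    lines = []
    for u in np.roots([float(co) for co in quartic.all_coeffs()]):
        if abs(u.imag) > 1e-9:
            continue
        u = float(u.real)
        # The tangency condition of the first conic is quadratic in V;
        # keep the root that is also (numerically) tangent to the second conic
        cands = sp.solve(t1.subs(U, u), V)
        v = complex(min(cands, key=lambda vv: abs(complex(t2.subs({U: u, V: vv})))))
        if abs(v.imag) < 1e-6:
            lines.append((u, v.real))
    return lines
```

For two separate circles, for instance, this returns the two internal and two external common tangents.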
Feature points computation. In [3], the author also gives equations to obtain the coordinates of the 8 tangent points. Furthermore, knowing the bitangent equations, the intersection points between bitangents can be easily calculated as line intersections, which yields 6 more feature points. We thus get a total of 14 projectively invariant points for a pair of ellipses.
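Once the bitangents are known as (U, V) pairs, the intersection points follow from elementary homogeneous-line algebra; a minimal sketch (function name is ours):

```python
import itertools
import numpy as np

def line_intersections(lines):
    """Pairwise intersections of lines given as (U, V) for Y = U*X + V."""
    pts = []
    for (u1, v1), (u2, v2) in itertools.combinations(lines, 2):
        # Homogeneous line coordinates (U, -1, V); intersection = cross product
        p = np.cross([u1, -1.0, v1], [u2, -1.0, v2])
        if abs(p[2]) > 1e-9:              # skip parallel lines (point at infinity)
            pts.append((p[0] / p[2], p[1] / p[2]))
    return pts
```

The guard on p[2] also matches the remark of section 3: the LL/RR intersection is discarded in practice because it tends to the point at infinity.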
3 Experimental analysis
The question arising naturally is the accuracy of the computed features as a function of the data, i.e. the ellipse edge points detected in images. An analytic derivation seems out of reach because of the equations involved (4th-order polynomial). Therefore, Monte-Carlo simulations are performed in order to estimate the mean error and standard deviation of the feature points. The different experiments we have conducted allow us to understand the influence of several parameters: the number of ellipse edge points, the relative position of the 2 ellipses (distance, ellipse sizes, relative orientation) and the level of noise. We present here some results ascertaining the accuracy of our method. They concern only the 13 feature points pk (k = 1..13) obtained from a setup of 2 ellipses, numbered as depicted in figure 2(a). The point resulting from the intersection of bitangents LL and RR is discarded because of its instability: this feature tends to the point at infinity in many natural configurations (such as 2 circles).
Fig. 2. (a) First setup: 2 ellipses with semi-axes 100 and 150 pixels and orthogonal orientation. (b) Enlargement around feature number 12: the circle (blue) gives the theoretical position, the dots (green) are the 500 computed points with N = 800 and σp = 1 pixel. The cross and ellipse (red) are respectively the mean and the confidence ellipse (at the two-sigma level) of the point cloud.
Protocol description. The study of the influence of the different parameters relies on the following steps:
- sampling of N points evenly spaced on the ellipse boundaries
- corruption of the point coordinates with Gaussian noise of zero mean and standard deviation σp (in pixels)
- ellipse fitting on the noisy points
- computation of the 13 feature points from the fitted ellipses
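This protocol is easy to reproduce. The sketch below illustrates it on circles, with a simple algebraic (Kasa) circle fit standing in for the paper's ellipse fit; this is enough to exhibit the Monte-Carlo machinery and the decrease of the spread with N (function and parameter names are ours):

```python
import numpy as np

def kasa_fit(x, y):
    """Algebraic (Kasa) circle fit: least squares on x^2+y^2 + D*x + E*y + F = 0."""
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x * x + y * y)
    D, E, F = np.linalg.lstsq(A, b, rcond=None)[0]
    return -D / 2, -E / 2                      # circle centre

def center_spread(N, sigma_p, trials=500, seed=0):
    """Average std. dev. of the fitted centre over Monte-Carlo trials."""
    rng = np.random.default_rng(seed)
    centers = []
    for _ in range(trials):
        t = rng.uniform(0, 2 * np.pi, N)       # N points on a radius-100 circle
        x = 100 * np.cos(t) + sigma_p * rng.standard_normal(N)
        y = 100 * np.sin(t) + sigma_p * rng.standard_normal(N)
        centers.append(kasa_fit(x, y))
    return np.array(centers).std(axis=0).mean()
```

With σp = 1 pixel, the spread of the fitted centre shrinks roughly as 1/√N, which is the same averaging effect that gives the paper's features their sub-pixel accuracy.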
For any specific set of parameters (N, σp), 500 trials are performed. The result is 13 point clouds of 500 points, one per feature point. One of these point clouds (k = 12) is displayed in figure 2(b). The circle gives the theoretical position of the feature point pk and the dots represent the computed feature obtained at each trial.
The mean point p̄k (cross) and the covariance matrix of each point cloud are also computed. We denote by Σk² ≥ σk² the variances of a given point cloud along its principal directions. The confidence ellipse centered on the mean point, with semi-axes (2Σk, 2σk), is also drawn in figure 2(b).
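The mean, principal-direction variances and two-sigma confidence ellipse of a point cloud can be computed as follows (a generic numpy sketch, not code from the paper):

```python
import numpy as np

def confidence_ellipse(cloud):
    """Mean and two-sigma confidence ellipse of a 2-D point cloud.

    Returns (mean, semi_axes, axes): the columns of `axes` are the principal
    directions and `semi_axes` holds the (2*Sigma_k, 2*sigma_k) lengths,
    largest first.
    """
    mean = cloud.mean(axis=0)
    w, v = np.linalg.eigh(np.cov(cloud.T))    # eigenvalues in ascending order
    return mean, 2.0 * np.sqrt(w[::-1]), v[:, ::-1]
```

An anisotropic cloud (as observed for the low-curvature features below) directly shows up as Σk clearly larger than σk.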
Ellipse setup influence. The first experiment concerns the 2 ellipses displayed in figure 2, with orthogonal orientations.
Figure 3 presents the results obtained with N = 800 (a point at every pixel) and σp = 1 pixel. The first graph shows the mean error, i.e. the distance between the mean point p̄k and the theoretical point pk. In the second graph, the dark bars (blue) give σk and the lighter bars (green) Σk. The first conclusion is the negligible bias of the method, less than 0.02 pixels with this setup. The second conclusion is
the different behaviors of the features: clearly, features {5, 6, 7, 8, 10, 11} are less stable than the others. In particular, their uncertainty is more anisotropic, i.e. σk ≠ Σk. The reason is the lower curvature of the ellipse at these feature points: they are less constrained in the tangent direction, resulting in an elongated point cloud.
Fig. 3. Mean error (top graph) and standard deviations σk, Σk (bottom graph) of the point clouds associated with the 13 features of figure 2(a), with N = 800 and σp = 1 pixel.
We now consider 2 ellipses having the same orientation (Figure 4). The same experiment as above is repeated and the results are displayed in figure 5.
Fig. 4. Second setup: 2 ellipses with semi-axes 100 and 150 pixels and parallel orientation.
We can see that all the features are now located in areas of highest curvature and behave identically, with approximately σk = 0.085 pixel and Σk = 0.1 pixel. The mean error is still negligible. These values should be compared to the
Fig. 5. Mean error (top graph) and standard deviations σk, Σk (bottom graph) of the point clouds associated with the 13 features of figure 4, with N = 800 and σp = 1 pixel.
uncertainty in edge detection, i.e. σp = 1 pixel. In other words, ellipse fitting brings a gain of about 10 in this case and yields sub-pixel measurements.
Parameter N influence. It seems obvious that the above-mentioned gain is directly related to the number of points N used for ellipse fitting. This assumption is verified by studying Σ̄ (the mean value of the Σk) as a function of N (with σp = 1). The resulting graph is shown in figure 6 (Left), with N varying from 5 (the minimum to fit an ellipse) to 1000. The hyperbolic decrease of Σ̄ is manifest and demonstrates the efficiency of ellipse fitting in our application. In practice, it is quite easy to detect a few hundred contour points at pixel precision in images. Hence, sub-pixel accuracy is guaranteed for the proposed 13 feature points.
Fig. 6. Mean value of the Σk as a function of: (Left) the number of points N used for ellipse fitting, with input noise level σp = 1 pixel; (Right) the input noise level σp, with N = 800.
Level of noise influence. Considering the ellipses of the previous experiment and N = 800, figure 6 (Right) gives Σ̄ for σp varying from 0 to 2 pixels. This last experiment shows the linear degradation of our algorithm with respect to the level of input noise σp.
4 Application to camera self-calibration
Camera calibration aims at computing the camera intrinsic parameters from image(s) of a perfectly known object. This is a basic problem in computer vision when metric information is required from images. The intrinsic parameters relate the coordinates [X Y Z]^T of a 3D point (in the camera frame) to the coordinates [u v]^T (in pixels) of its projection in the image:

    [u]   [τf  0  u0] [X/Z]
    [v] = [ 0   f  v0] [Y/Z]
    [1]   [ 0   0   1] [ 1 ]
where f is the focal length, τ the pixel aspect ratio and [u0 v0]^T the coordinates of the principal point in the image. This is the standard pinhole model, which assumes negligible optical distortion and zero camera skew. In practice, the aspect ratio τ is very close to 1 for most modern CCDs.
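As a quick sanity check of the model, a point can be projected as follows (the numeric intrinsics are arbitrary illustration values, not the calibrated ones of this section):

```python
import numpy as np

# Illustrative intrinsic values (assumptions, not the paper's calibration result)
f, tau, u0, v0 = 4000.0, 1.0, 800.0, 600.0
K = np.array([[tau * f, 0.0, u0],
              [0.0,     f,   v0],
              [0.0,     0.0, 1.0]])

def project(K, X):
    """Pixel coordinates of a 3-D point X = [X, Y, Z] in the camera frame."""
    x = K @ (np.asarray(X, dtype=float) / X[2])   # divide by the depth Z first
    return x[:2]
```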
Camera self-calibration is a (harder) variant of the camera calibration problem occurring when the scene is unknown. In this case, the basic data are point correspondences across images; camera self-calibration could therefore benefit from our method. Nevertheless, it is well known that basic self-calibration algorithms suffer from degeneracy when the correspondences come from coplanar 3D points, which is precisely the case in our approach. Specific self-calibration algorithms have been proposed to handle this case [11]. This technique, known as plane-based self-calibration, only requires the homographies relating the different views of an unknown planar scene.
We propose to find such homographies from a planar scene made of 5 circles, as shown in figure 7. Note that the fact that we have circles in practice is not used by our algorithm, unlike circle-based calibration methods [12]. 6 images (resolution 1600 × 1200) of this scene are taken with a digital still camera from different viewpoints. 104 feature points per image are computed using adjacent ellipses. Then, the inter-view homographies are estimated using these features and the normalized DLT algorithm [4]. The residual RMS error is about 0.15 pixel, which is in agreement with the simulation results.
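A normalized DLT estimate of an inter-view homography from such point correspondences can be sketched as follows (a standard implementation after [4], not the authors' code):

```python
import numpy as np

def dlt_homography(src, dst):
    """Normalised DLT estimate of H such that dst ~ H src (cf. [4])."""
    def normalise(p):
        # Translate to the centroid and scale to mean distance sqrt(2)
        c = p.mean(axis=0)
        s = np.sqrt(2.0) / np.mean(np.linalg.norm(p - c, axis=1))
        T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1.0]])
        return np.c_[p, np.ones(len(p))] @ T.T, T
    x, T1 = normalise(np.asarray(src, dtype=float))
    y, T2 = normalise(np.asarray(dst, dtype=float))
    A = []
    for (x1, x2, x3), (y1, y2, y3) in zip(x, y):
        A.append([0, 0, 0, -y3 * x1, -y3 * x2, -y3 * x3, y2 * x1, y2 * x2, y2 * x3])
        A.append([y3 * x1, y3 * x2, y3 * x3, 0, 0, 0, -y1 * x1, -y1 * x2, -y1 * x3])
    H = np.linalg.svd(np.array(A))[2][-1].reshape(3, 3)   # smallest singular vector
    H = np.linalg.inv(T2) @ H @ T1                        # undo the normalisation
    return H / H[2, 2]
```

With at least 4 correspondences in general position the homography is recovered up to scale; the residuals of this linear estimate are what the 0.15-pixel RMS figure above measures.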
Without changing the camera settings (zoom, focus), 10 images of a planar calibration pattern are acquired, so that Zhang's calibration method [13] can be applied. The result is used as the ground-truth parameter values. They are compared with the output of our plane-based self-calibration algorithm (initialization with Triggs's method followed by bundle adjustment). The results are presented in table 1. Relative errors are computed for the focal length (|∆f/f|), the principal point (|∆u0/f|, |∆v0/f|) and the aspect ratio (|∆τ/τ|). The principal point and aspect ratio are rather well estimated, whereas the focal length result seems less
Fig. 7. 3 views (among 6) of 5 coplanar ellipses (circles in practice).
accurate. This is surprising, because the principal point is usually the most difficult parameter to estimate. Therefore, simulations with known camera parameters have been performed. They did not show any specific instability of the focal length estimation. We conclude that the problem is not inherent to the method but is probably due to the images themselves. More experiments are necessary, but we are convinced that this could be a valuable approach for camera calibration.
Param.   Calib.     Self-Calib.   Relative Error
f        4140.13    4065.76       |∆f/f|  = 1.81%
u0        789.91     787.74       |∆u0/f| = 0.05%
v0        609.22     636.15       |∆v0/f| = 0.65%
τ           0.9988     0.9971     |∆τ/τ|  = 0.17%

Table 1. Comparison between plane-based calibration and self-calibration methods. Relative errors are computed with the calibration method as reference.
5 Conclusion
This article proposes an ellipse-based method for the computation of invariant features in images. The use of bitangent lines to a pair of coplanar ellipses is the key idea that yields projective invariance: invariant points can be derived from the intersections of bitangents with ellipses and with other bitangents. These points and lines finally provide feature correspondences across multiple views. The benefits of this approach are twofold. First, ellipse fitting brings accurate and stable features (typically 0.1-0.2 pixel) from simple edge detection at pixel precision. Secondly, the extensive background of feature-based methods can subsequently be used. For example, the homography relating images of 2 or more ellipses can be directly estimated in this way. Note that the earlier solution of [8] is more complicated, although it also involves the roots of a polynomial of degree 4. Indeed, it
needs the computation of the intersection points between ellipses, which are (in the general case) points with complex coordinates, possibly at infinity.
A practical application of our method was demonstrated with camera calibration from images of several unknown ellipses. Inter-image homographies are estimated from the invariant feature points and a plane-based self-calibration algorithm is applied. The results are quite satisfying, although they can surely be improved, for example with subpixel edge detection of the ellipses. A comparison with circle-based calibration methods should be done to evaluate the potential of our approach. Another important point is the integration of feature uncertainty. Indeed, experiments have shown that the covariance matrix of each feature is related to the ellipse curvature and the angle between bitangents. In future work, we plan to predict such covariance matrices in order to incorporate them into feature-based methods that allow for it (such as bundle adjustment).
References
1. C. Ducottet, T. Fournel and C. Barat, Scale-adaptive detection and local characterization of edges based on wavelet transform, Signal Processing, 84: 2115-2137,
2004.
2. A. Fitzgibbon, M. Pilu, and R. B. Fisher, Direct Least Square Fitting of Ellipses,
In IEEE Trans. PAMI, 21(5), pp. 476-480, 1999.
3. L. Habert, Computing bitangents for ellipses, In Proc. 17th Canadian Conference
on Computational Geometry (CCCG’05), 2005, 294-297.
4. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.
5. C. Schmid and A. Zisserman, The geometry and matching of lines and curves over
multiple views, In International Journal on Computer Vision, 40(3), pp. 199-234,
2000.
6. J.Y. Kaminski and Amnon Shashua, Multiple View Geometry of General Algebraic
Curves, In International Journal on Computer Vision, 56(3), pp. 195-219, 2004.
7. F. Kahl and A. Heyden, Using Conic Correspondence in Two Images to Estimate
the Epipolar Geometry, Proceedings of the International Conference on Computer
Vision, 1998.
8. P. K. Mudigonda, C. V. Jawahar and P. J. Narayanan, Geometric Structure Computation from Conics, In Proceedings of the Indian Conference on Computer Vision, Graphics & Image Processing, 2004.
9. C. A. Rothwell, A. Zisserman, D. A. Forsyth, J. L. Mundy, Planar Object Recognition using Projective Shape Representation, International Journal on Computer
Vision, 16(1), pp. 57-99, 1995.
10. P.D. Sampson, Fitting conic sections to very scattered data: an iterative refinement
of the Bookstein algorithm, In Computer Graphics and Image Processing, vol. 18,
pp. 97-108, 1982.
11. B. Triggs, Autocalibration from planar sequences, In Proc. of the European Conference on Computer Vision, Freiburg, Germany, June 1998.
12. Y. Wu, X. Li, F. Wu, Z. Hu, Coplanar circles, quasi-affine invariance and calibration, Image and Vision Computing, 24(4), pp. 319-326, April 2006.
13. Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. PAMI,
vol. 22, pp. 1330-1334, 2000.