
Feature Correspondences from Multiple Views of Coplanar Ellipses

2006, Advances in Visual Computing

Cécile Barat, Jean-François Menudet, Hanene Louhichi, and Thierry Fournel. Feature Correspondences from Multiple Views of Coplanar Ellipses. 2nd International Symposium on Visual Computing, Nov 2006, Lake Tahoe, Nevada, United States, pp. 364-373. DOI: 10.1007/11919629_38. HAL Id: ujm-00192591, https://hal-ujm.archives-ouvertes.fr/ujm-00192591 (submitted on 24 Apr 2009).

Laboratoire Traitement du Signal et Instrumentation, UMR CNRS-UJM 5516, Bâtiment F, 18 rue du Pr. Benoît Lauras, 42000 Saint-Étienne, France

Abstract. We address the problem of finding feature correspondences in images of coplanar ellipses, with the objective of benefiting from robust ellipse-fitting algorithms. The main difficulty is the lack of immediately available projective-invariant points. Our key idea is therefore to construct virtual line and point features using the invariance of tangency under perspective projection. The proposed method first requires a robust detection of ellipse edge points in order to fit a parametric model to each ellipse. The feature lines are then obtained by computing the four bitangents to each pair of ellipses.
The feature points are derived from the tangent points and the intersection points between bitangents. Experimental results are presented to demonstrate the reliability and robustness of the feature extraction process; subpixel accuracy is easily achieved. A real application to camera self-calibration is also described.

1 Introduction

Finding feature correspondences is an essential preliminary step in many computer vision tasks such as structure from motion or stereovision. The most commonly used features are points and line segments. Their relations across multiple views have been extensively studied in the framework of projective geometry, leading to the introduction of the fundamental matrix, the trifocal tensor and, for the case of coplanar features, the homography matrix [4]. It is well known that these entities can be estimated linearly from correspondences of points or lines [4]. Nevertheless, line segments, and especially points, are rather unstable features that are not easy to detect in noisy images. Several works have investigated the use of more complex primitives whose detection in images could potentially be more stable. Among possible primitives, conics have received much attention because they play a central role in projective geometry and robust algorithms exist to compute them [10, 2]. Estimation of the fundamental matrix from at least 4 conics is addressed in [7]. In the case of two views of a conic with known epipolar geometry (i.e. a known fundamental matrix), [5] proposes a solution to compute the homography associated with the conic plane. When the fundamental matrix is not available, two coplanar conics are necessary, as described in [8]. These results on conics are extended to general algebraic curves in [6], but at the expense of large systems of polynomial equations.
Unlike previous works, this article does not focus on the multiple-view geometry of conics but rather gives a simple way to take advantage of conic fitting to construct virtual matching features with high accuracy. Essentially, the proposed method exploits bitangents to pairs of coplanar ellipses to obtain 4 lines and 14 points which are projectively invariant, robust to noise and easy to compute. Our method combines the strengths of point/line-based approaches (simplicity, extensive background) and conic-based approaches (stability). It could be used in numerous algorithms based on point correspondences, such as camera self-calibration or motion estimation. It also solves a problem arising when circular targets are used for point correspondences. Circular targets do not present any characteristic points, so the centroids of the targets in the images are classically used as feature points. Yet these centroids do not, in theory, correspond to the images of the target centers. This systematic error is not negligible when the viewing distance is less than about 20 times the focal length; moreover, the larger the circle, the greater the error. The projective invariance of our method overcomes this limitation. The next section details the computation of bitangents and associated points starting from images of two unknown ellipses. An experimental study of feature accuracy is then presented in Section 3. Finally, Section 4 shows an application of the method to a real problem, namely camera self-calibration from multiple views of unknown coplanar ellipses.

2 Features computation

The method we propose exploits the notion of bitangent to locate a set of reliable projectively invariant features in images. A bitangent is defined as a line tangent to at least two objects. It has the property of being invariant under perspective projection.
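As an illustration of the centroid bias discussed in the introduction, the following NumPy sketch (the homography values below are arbitrary, chosen only for illustration) projects a circular target through a perspective map and compares the centroid of the imaged region with the true image of the circle's center:

```python
import numpy as np

def polygon_centroid(pts):
    """Area centroid of a closed polygon given as an (N, 2) array (shoelace formula)."""
    x, y = pts[:, 0], pts[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)
    cross = x * yn - xn * y
    area = cross.sum() / 2.0
    cx = ((x + xn) * cross).sum() / (6.0 * area)
    cy = ((y + yn) * cross).sum() / (6.0 * area)
    return np.array([cx, cy])

def apply_homography(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]

# Circle of radius 50 centered at the origin, seen under an oblique perspective map.
t = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
circle = 50 * np.column_stack([np.cos(t), np.sin(t)])
H = np.array([[1.0,  0.1,  400.0],
              [0.05, 0.9,  300.0],
              [1e-3, 5e-4, 1.0]])   # arbitrary illustrative homography

ellipse = apply_homography(H, circle)
centroid = polygon_centroid(ellipse)                      # centroid of the imaged disc
true_center = apply_homography(H, np.zeros((1, 2)))[0]    # image of the circle center
bias = np.linalg.norm(centroid - true_center)             # non-zero systematic error
```

The computed bias is non-zero, confirming that target centroids are not projectively invariant, which is precisely the limitation the bitangent construction avoids.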
When the two objects under consideration are two separated coplanar ellipses, there exist four bitangents: two internal (LR, RL) and two external (LL, RR). The letters L and R denote the position (left or right) of each ellipse relative to the bitangent, oriented from left to right in Figure 1(b). These bitangent lines yield 14 projectively invariant points which can be used as feature points for solving correspondences across multiple views of the two ellipses (Figure 1(b)). Indeed, the 6 intersection points between bitangents and the 8 tangent points on the ellipses remain unchanged under perspective projection. Given an image of ellipses like the one in Figure 1(a), the remaining question is how to compute the bitangents and the derived features with subpixel accuracy, robustness to noise, simplicity and efficiency.

Description of the process. The process we suggest is composed of five steps:
- Ellipse edge detection
- Object labeling and ordering
- Ellipse model fitting
- Bitangents computation
- Feature points computation

Fig. 1. Bitangents and feature point extraction. (a) Original image. (b) Common tangents (solid lines: external tangents, dashed lines: internal tangents) and associated feature points (filled circles: tangent points, rings: bitangent intersection points).

The objective of the first three steps is to accurately estimate the parameters of the different ellipses and their relative positions with respect to a given pattern. Starting with ellipse edge point detection, a connected-component labeling is applied to identify each ellipse, followed by a sorting that matches ellipses to target features. Next, the ellipses are processed by pair; their parameter estimates serve as inputs to the bitangent computation algorithm described later. Finally, using the bitangent equations, the different intersection points between bitangents and ellipses are calculated.
As the different steps are interlinked, the more robust the ellipse estimation, the more accurate the feature points. It is therefore essential to perform ellipse edge detection with great care, using a robust method. Let us take a closer look at the different steps and the algorithms applied.

Ellipse edge detection. The images to process show an ellipse pattern from possibly oblique viewpoints. The amount of blur varies across the image plane according to the depth of the image points. Image blur and noise are the two critical problems to address in order to guarantee accurate and stable features. A standard strategy is to adopt a multiscale method for edge detection. The algorithm chosen was proposed by Ducottet et al. in [1]. It is a wavelet-based algorithm which detects and locates edges with an automatic adaptation of the scale to their smoothness, based on a model of contours as Gaussian-smoothed singularities of three particular types (transitions, peaks and lines). The algorithm delivers a set of connected edge pixels per ellipse.

Object labeling and ordering. Once all edge points are available, we need to identify the edge points of each ellipse. We apply a standard object labeling method to assign a number to each connected component. We are then concerned with putting the ellipses in the image in correspondence with their counterpart features in space. This task starts by identifying the ellipse at the upper-left corner and then sorts the ellipses from this reference one (from left to right and from top to bottom).

Ellipse model fitting. The point of this step is to determine the five parameters describing each ellipse. A least-squares ellipse fit is applied to each labeled object, as proposed in [2]. It is a robust method which provides the coordinates of the ellipse centers (u0, v0), the axis lengths (a, b) and the orientation θ.

Bitangents computation.
The method described in [3] computes the bitangents of two ellipses and the corresponding tangent points. It assumes that each ellipse is defined by an implicit equation aX² + bXY + cY² + dX + eY + f = 0, whose coefficients (a, b, c, d, e, f) are trivially derived from the parameters (u0, v0, a, b, θ) obtained previously. Bitangents are sought in the form Y = UX + V. Combining the tangency conditions for the two ellipses yields a polynomial of degree 4 in U. In the general case of no parallel or vertical bitangents, it has four simple real roots corresponding to the slopes of the four bitangents. Finally, the four corresponding values of V are computed and the bitangents are sorted according to their types (LL, LR, RL and RR).

Feature points computation. In [3], the author also gives equations for the coordinates of the 8 tangent points. Furthermore, knowing the bitangent equations, the intersection points between bitangents are easily calculated as line intersections, providing 6 more feature points. We thus obtain a total of 14 projectively invariant points for a pair of ellipses.

3 Experimental analysis

The question arising naturally is the accuracy of the computed features as a function of the data, i.e. the ellipse edge points detected in the images. An analytic derivation seems out of reach because of the equations involved (a 4th-order polynomial). Therefore, Monte-Carlo simulations are performed in order to estimate the mean error and standard deviation of the feature points. The experiments we have conducted reveal the influence of several parameters: the number of ellipse edge points, the relative position of the 2 ellipses (distance, ellipse sizes, relative orientation) and the level of noise. We present here some results assessing the accuracy of our method. They concern only 13 feature points pk (k = 1..13) obtained from a setup of 2 ellipses, numbered as depicted in Figure 2(a).
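The bitangent computation of Section 2 can be sketched with SymPy. This is a minimal sketch of the same idea, not Habert's exact algorithm [3]: tangency of Y = UX + V to a conic is a zero-discriminant condition, quadratic in V, and the resultant in V of the two conditions gives the degree-4 polynomial in U mentioned above.

```python
import sympy as sp

U, V = sp.symbols('U V')

def tangency(coeffs):
    """Condition for Y = U*X + V to be tangent to aX^2+bXY+cY^2+dX+eY+f = 0.
    Substituting the line gives a quadratic in X; tangency means a double root,
    i.e. zero discriminant. The result is a quadratic in V."""
    a, b, c, d, e, f = coeffs
    A = a + b*U + c*U**2
    B = b*V + 2*c*U*V + d + e*U
    C = c*V**2 + e*V + f
    return sp.expand(B**2 - 4*A*C)

def bitangents(conic1, conic2):
    """(U, V) pairs for the four bitangents of two separated ellipses."""
    g1, g2 = tangency(conic1), tangency(conic2)
    # Eliminating V gives a degree-4 polynomial in U (the paper's quartic).
    quartic = sp.Poly(sp.resultant(g1, g2, V), U)
    lines = []
    for u in quartic.nroots():
        if abs(sp.im(u)) > 1e-4:
            continue  # complex slope: not a real bitangent
        vs1 = sp.Poly(g1.subs(U, u), V).nroots()
        vs2 = sp.Poly(g2.subs(U, u), V).nroots()
        for v1 in vs1:
            # keep V values satisfying both tangency conditions, without duplicates
            if any(abs(v1 - v2) < 1e-6 for v2 in vs2):
                cand = (float(sp.re(u)), float(sp.re(v1)))
                if not any(abs(cand[0] - ub) < 1e-4 and abs(cand[1] - vb) < 1e-4
                           for ub, vb in lines):
                    lines.append(cand)
    return lines
```

For two unit circles centered at (0, 0) and (6, 0), this returns the two horizontal external bitangents Y = ±1 and the two internal bitangents with slopes ±1/(2√2).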
The point resulting from the intersection of bitangents LL and RR is discarded because of its instability: this feature tends to the point at infinity in many natural configurations (such as 2 circles).

Fig. 2. (a) First setup: 2 ellipses with semi-axes 100 and 150 pixels and orthogonal orientations. (b) Enlargement around feature number 12: the circle (blue) gives the theoretical position, the dots (green) are the 500 computed points with N = 800 and σp = 1 pixel. The cross and ellipse (red) are respectively the mean and the confidence ellipse (at the two-sigma level) of the point cloud.

Protocol description. The study of the influence of the different parameters relies on the following steps:
- sampling of N points evenly spaced on the ellipse boundaries
- corruption of the point coordinates with Gaussian noise of zero mean and standard deviation σp (in pixels)
- ellipse fitting on the noisy points
- computation of the 13 feature points from the fitted ellipses

For any specific set of parameters (N, σp), 500 trials are performed. The result is 13 point clouds of 500 points, one per feature point. One of these point clouds (k = 12) is displayed in Figure 2(b). The circle gives the theoretical position of the feature point pk and the dots represent the feature computed at each trial. The mean point p̄k (cross) and the covariance matrix of each point cloud are also computed. We denote Σk² ≥ σk² the variances of a given point cloud along its principal directions. The confidence ellipse centered on the mean point, with semi-axes (2Σk, 2σk), is also drawn in Figure 2(b).

Ellipse setup influence. The first experiment concerns the 2 ellipses displayed in Figure 2, with orthogonal orientations. Figure 3 presents results obtained with N = 800 (a point at every pixel) and σp = 1 pixel.
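The protocol above can be sketched in NumPy. In this sketch the ellipse fit is the direct least-squares method of [2] in its numerically stable Halir-Flusser formulation, and the fitted center stands in for a feature point (the actual study propagates the fits through the bitangent construction); all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ellipse_center(x, y):
    """Direct least-squares ellipse fit (Fitzgibbon et al., Halir-Flusser
    formulation), followed by recovery of the ellipse center."""
    D1 = np.column_stack([x**2, x*y, y**2])
    D2 = np.column_stack([x, y, np.ones_like(x)])
    S1, S2, S3 = D1.T @ D1, D1.T @ D2, D2.T @ D2
    T = -np.linalg.solve(S3, S2.T)
    # constraint matrix encoding 4ac - b^2 = 1
    C = np.array([[0.0, 0.0, 2.0], [0.0, -1.0, 0.0], [2.0, 0.0, 0.0]])
    _, v = np.linalg.eig(np.linalg.solve(C, S1 + S2 @ T))
    v = v.real
    # the elliptical solution is the eigenvector with 4ac - b^2 > 0
    a1 = v[:, (4 * v[0] * v[2] - v[1]**2) > 0][:, 0]
    a, b, c = a1
    d, e, f = T @ a1
    # center: gradient of the conic vanishes there
    return np.linalg.solve([[2*a, b], [b, 2*c]], [-d, -e])

# Protocol: N evenly spaced boundary points, corrupted by N(0, sigma_p^2) noise.
N, sigma_p, trials = 800, 1.0, 500
t = np.linspace(0, 2*np.pi, N, endpoint=False)
cx, cy, sa, sb = 450.0, 500.0, 150.0, 100.0
x0 = cx + sa * np.cos(t)
y0 = cy + sb * np.sin(t)

cloud = np.array([fit_ellipse_center(x0 + sigma_p * rng.standard_normal(N),
                                     y0 + sigma_p * rng.standard_normal(N))
                  for _ in range(trials)])
bias = np.linalg.norm(cloud.mean(axis=0) - [cx, cy])  # well below a pixel
```

The point-cloud statistics of the paper (mean error, principal-axis standard deviations, two-sigma confidence ellipse) then follow from `cloud.mean(axis=0)` and the eigenvalues of `np.cov(cloud, rowvar=False)`.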
The first graph shows the mean error, i.e. the distance between the mean point p̄k and the theoretical point pk. In the second graph, the dark bar (blue) gives σk and the lighter bar (green) Σk. The first conclusion is the negligible bias of the method, less than 0.02 pixels with this setup. The second conclusion is the different behavior of the features: clearly, features {5, 6, 7, 8, 10, 11} are less stable than the others. In particular, their uncertainty is more anisotropic, i.e. σk ≠ Σk. The reason is the lower curvature of the ellipse at these feature points: they are less constrained in the tangent direction, resulting in an elongated point cloud.

Fig. 3. Mean error (top graph) and standard deviations σk, Σk (bottom graph) of the point clouds associated with the 13 features of Figure 2(a), with N = 800 and σp = 1 pixel.

We now consider 2 ellipses having the same orientation (Figure 4). The same experiment as above is repeated and the results are displayed in Figure 5.

Fig. 4. Second setup: 2 ellipses with semi-axes 100 and 150 pixels and parallel orientations.

We can see that all features are located in areas of highest curvature and behave identically, with approximately σk = 0.085 pixel and Σk = 0.1 pixel. The mean error is still negligible. These values should be compared to the uncertainty in edge detection, i.e. σp = 1 pixel.

Fig. 5. Mean error (top graph) and standard deviations σk, Σk (bottom graph) of the point clouds associated with the 13 features of Figure 4, with N = 800 and σp = 1 pixel.
In other words, ellipse fitting here brings a gain of a factor of about 10 and yields sub-pixel measurements.

Parameter N influence. The above-mentioned gain should be directly related to the number of points N used for ellipse fitting. This assumption is verified by studying Σ̄ (the mean value of the Σk) as a function of N (with σp = 1). The resulting graph is shown in Figure 6 (left), with N varying from 5 (the minimum to fit an ellipse) to 1000. The hyperbolic decrease of Σ̄ is manifest and demonstrates the efficiency of ellipse fitting in our application. In practice, it is quite easy to detect a few hundred contour points at pixel precision in images. Hence, sub-pixel accuracy is guaranteed for the proposed 13 feature points.

Fig. 6. Mean value of the Σk as a function of: (left) the number of points N used for ellipse fitting, with input noise level σp = 1 pixel; (right) the input noise level σp, with N = 800.

Level of noise influence. Considering the ellipses of the previous experiment and N = 800, Figure 6 (right) gives Σ̄ for σp varying from 0 to 2 pixels. This last experiment shows the linear degradation of our algorithm with respect to the level of input noise σp.

4 Application to camera self-calibration

Camera calibration aims at computing the camera intrinsic parameters from image(s) of a perfectly known object. This is a basic problem in computer vision whenever metric information is required from images. The intrinsic parameters relate the coordinates [X Y Z]ᵀ of a 3D point (in the camera frame) to the coordinates [u v]ᵀ (in pixels) of its projection in the image:

\[
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
=
\begin{pmatrix} \tau f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} X/Z \\ Y/Z \\ 1 \end{pmatrix}
\]

where f is the focal length, τ the pixel aspect ratio and [u0 v0]ᵀ the image coordinates of the principal point.
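The projection equation above amounts to a one-line matrix product; a minimal NumPy sketch (the intrinsic values below are illustrative, not the paper's calibration results):

```python
import numpy as np

def project(K, X):
    """Pinhole projection: camera-frame 3-D points (N, 3) -> pixel coordinates (N, 2)."""
    xn = X / X[:, 2:3]          # normalized coordinates (X/Z, Y/Z, 1)
    return (xn @ K.T)[:, :2]

# Intrinsic matrix K = [[tau*f, 0, u0], [0, f, v0], [0, 0, 1]] (illustrative values)
f, tau, u0, v0 = 4000.0, 1.0, 800.0, 600.0
K = np.array([[tau * f, 0.0, u0],
              [0.0,     f,   v0],
              [0.0,     0.0, 1.0]])

uv = project(K, np.array([[0.1, -0.05, 2.0]]))  # -> [[1000., 500.]]
```

A point on the optical axis projects to the principal point (u0, v0), which is the sanity check used below for the self-calibration results.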
This is the standard pinhole model, which assumes negligible optical distortion and zero skew. In practice, the aspect ratio τ is very close to 1 for most modern CCDs. Camera self-calibration is a (harder) variant of the camera calibration problem that arises when the scene is unknown. In this case, the basic data are point correspondences between images; therefore, camera self-calibration can benefit from our method. Nevertheless, it is well known that basic self-calibration algorithms suffer from degeneracy when the correspondences come from coplanar 3D points, which is precisely the case in our approach. Specific self-calibration algorithms have been proposed to handle this case [11]. This technique, known as plane-based self-calibration, requires only the homographies relating the different views of an unknown planar scene. We propose to find such homographies from a planar scene made of 5 circles, as shown in Figure 7. Note that the fact that we have circles in practice is not used by our algorithm, unlike circle-based calibration methods [12]. Six images (resolution 1600 × 1200) of this scene are taken with a digital still camera from different viewpoints. 104 feature points per image are computed using adjacent ellipses. Inter-view homographies are then estimated from these features using the normalized DLT algorithm [4]. The residual RMS error is about 0.15 pixel, which is in agreement with the simulation results. Without changing the camera settings (zoom, focus), 10 images of a planar calibration pattern are acquired, so that Zhang's calibration method [13] can be applied; its result is used as ground-truth values of the parameters. These are compared with the output of our plane-based self-calibration algorithm (initialization with Triggs's method followed by bundle adjustment). Results are presented in Table 1. Relative errors are computed for the focal length (|∆f/f|), the principal point (|∆u0/f|, |∆v0/f|) and the aspect ratio (|∆τ/τ|).
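The inter-view homography estimation above uses the normalized DLT of [4]; a minimal sketch (assuming exact correspondences; a real pipeline would add the RMS residual check mentioned above):

```python
import numpy as np

def normalize(pts):
    """Similarity moving the centroid to the origin, mean distance to sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
    return np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])

def dlt_homography(src, dst):
    """Normalized DLT: homography H with dst ~ H @ src (needs >= 4 points)."""
    Ts, Td = normalize(src), normalize(dst)
    s = np.column_stack([src, np.ones(len(src))]) @ Ts.T
    d = np.column_stack([dst, np.ones(len(dst))]) @ Td.T
    A = []
    for (x, y, w), (u, v, q) in zip(s, d):
        A.append([0, 0, 0, -q*x, -q*y, -q*w, v*x, v*y, v*w])
        A.append([q*x, q*y, q*w, 0, 0, 0, -u*x, -u*y, -u*w])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    Hn = Vt[-1].reshape(3, 3)       # null vector of the design matrix
    H = np.linalg.inv(Td) @ Hn @ Ts  # undo the normalizations
    return H / H[2, 2]
```

With exact correspondences the known homography is recovered to machine precision; with the noisy bitangent features of the paper, the residual is what Table 1's pipeline measures.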
The principal point and aspect ratio are rather well estimated, whereas the focal length seems less accurate. This is surprising, because the principal point is usually the most difficult parameter to estimate. Simulations with known camera parameters were therefore performed; they did not show any specific instability of the focal length estimation. We conclude that the problem is not inherent to the method but probably due to the images themselves. More experiments are necessary, but we are convinced that this could be a valuable approach for camera calibration.

Fig. 7. 3 views (among 6) of 5 coplanar ellipses (circles in practice).

Table 1. Comparison between plane-based calibration and self-calibration methods. Relative errors are computed with the calibration method as reference.

Param. | Calib.  | Self-Calib. | Relative error
f      | 4140.13 | 4065.76     | |∆f/f|  = 1.81%
u0     | 789.91  | 787.74      | |∆u0/f| = 0.05%
v0     | 609.22  | 636.15      | |∆v0/f| = 0.65%
τ      | 0.9988  | 0.9971      | |∆τ/τ|  = 0.17%

5 Conclusion

This article proposes an ellipse-based method for the computation of invariant features in images. The use of bitangent lines to a pair of coplanar ellipses is the key idea that yields projective invariance. Invariant points are derived from the intersections of the bitangents with the ellipses and with other bitangents. These points and lines finally provide feature correspondences across multiple views. The benefits of this approach are twofold. First, ellipse fitting brings accurate and stable features (typically 0.1-0.2 pixel) from simple edge detection at pixel precision. Second, the extensive background of feature-based methods can subsequently be used. For example, the homography relating images of 2 or more ellipses can be directly estimated in this way. Note that the earlier solution of [8] is more complicated, although it also involves the roots of a polynomial of degree 4.
Indeed, it needs the computation of the intersection points between the ellipses, which are (in the general case) points with complex coordinates, possibly at infinity. A practical application of our method was demonstrated with camera calibration from images of several unknown ellipses: inter-image homographies are estimated from invariant feature points and a plane-based self-calibration algorithm is applied. The results are quite satisfying, although they can surely be improved, for example with subpixel edge detection of the ellipses. A comparison with circle-based calibration methods should be carried out to evaluate the potential of our approach. Another important point is the integration of feature uncertainty. Indeed, the experiments have shown that the covariance matrix of each feature is related to the ellipse curvature or to the angle between bitangents. In future work, we plan to predict such covariance matrices in order to incorporate them into feature-based methods that can exploit them (such as bundle adjustment).

References

1. C. Ducottet, T. Fournel and C. Barat, Scale-adaptive detection and local characterization of edges based on wavelet transform, Signal Processing, 84: 2115-2137, 2004.
2. A. Fitzgibbon, M. Pilu and R. B. Fisher, Direct Least Square Fitting of Ellipses, IEEE Trans. PAMI, 21(5), pp. 476-480, 1999.
3. L. Habert, Computing bitangents for ellipses, Proc. 17th Canadian Conference on Computational Geometry (CCCG'05), pp. 294-297, 2005.
4. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.
5. C. Schmid and A. Zisserman, The geometry and matching of lines and curves over multiple views, International Journal of Computer Vision, 40(3), pp. 199-234, 2000.
6. J. Y. Kaminski and A. Shashua, Multiple View Geometry of General Algebraic Curves, International Journal of Computer Vision, 56(3), pp. 195-219, 2004.
7. F. Kahl and A. Heyden,
Using Conic Correspondence in Two Images to Estimate the Epipolar Geometry, Proc. International Conference on Computer Vision, 1998.
8. P. K. Mudigonda, C. V. Jawahar and P. J. Narayanan, Geometric Structure Computation from Conics, Proc. Indian Conference on Computer Vision, Graphics & Image Processing, 2004.
9. C. A. Rothwell, A. Zisserman, D. A. Forsyth and J. L. Mundy, Planar Object Recognition using Projective Shape Representation, International Journal of Computer Vision, 16(1), pp. 57-99, 1995.
10. P. D. Sampson, Fitting conic sections to "very scattered" data: an iterative refinement of the Bookstein algorithm, Computer Graphics and Image Processing, vol. 18, pp. 97-108, 1982.
11. B. Triggs, Autocalibration from planar scenes, Proc. European Conference on Computer Vision, Freiburg, Germany, June 1998.
12. Y. Wu, X. Li, F. Wu and Z. Hu, Coplanar circles, quasi-affine invariance and calibration, Image and Vision Computing, 24(4), pp. 319-326, April 2006.
13. Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. PAMI, vol. 22, pp. 1330-1334, 2000.