Abstract
In this paper we propose an extension to the KinectFusion approach that enables both SLAM-graph optimization, usually required on large looping routes, and the discovery of semantic information in the form of object detection and localization. Global optimization is achieved by incorporating the notion of keyframes into a KinectFusion-style approach, which gives the system the ability to explore large environments and maintain a globally consistent map. Moreover, we integrate into the system our recent object detection approach based on a new Semantic Bundle Adjustment paradigm, thereby achieving joint detection, tracking and mapping. Although our current implementation is not optimized for real-time operation, the principles and ideas set forth in this paper can be considered a relevant contribution towards a Semantic KinectFusion system.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fioraio, N., Cerri, G., Di Stefano, L. (2013). Towards Semantic KinectFusion. In: Petrosino, A. (eds) Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41184-7_31
DOI: https://doi.org/10.1007/978-3-642-41184-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41183-0
Online ISBN: 978-3-642-41184-7
eBook Packages: Computer Science (R0)