Towards Semantic KinectFusion

Fioraio, Nicola; Cerri, Gregorio; Di Stefano, Luigi

doi:10.1007/978-3-642-41184-7_31

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8157)

Included in the following conference series:

International Conference on Image Analysis and Processing


Abstract

In this paper we propose an extension to the KinectFusion approach that enables both SLAM-graph optimization, usually required on large looping routes, and the discovery of semantic information in the form of object detection and localization. Global optimization is achieved by incorporating the notion of keyframe into a KinectFusion-style approach, which gives the system the ability to explore large environments while maintaining a globally consistent map. Moreover, we integrate into the system our recent object detection approach based on a new Semantic Bundle Adjustment paradigm, thereby achieving joint detection, tracking and mapping. Although our current implementation is not optimized for real-time operation, the principles and ideas set forth in this paper can be considered a relevant contribution towards a Semantic KinectFusion system.
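
To illustrate why graph optimization matters on looping routes, the following is a minimal toy sketch, not the implementation described in the paper. It assumes translation-only 2D keyframe positions, unit-weight relative constraints, and a single loop closure; the names `solve_linear` and `optimize_positions` are hypothetical. A real system would optimize full 6-DoF poses with a nonlinear solver.

```python
# Toy translation-only pose graph (illustrative assumption, not the paper's method).

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def optimize_positions(num_keyframes, edges):
    """Least-squares keyframe positions, with keyframe 0 fixed at the origin.

    edges: list of (i, j, (dx, dy)), each meaning
    position[j] - position[i] is approximately (dx, dy).
    """
    n = num_keyframes - 1  # unknowns are keyframes 1..N-1
    result = [[0.0, 0.0] for _ in range(num_keyframes)]
    for axis in range(2):  # x and y decouple in the translation-only case
        H = [[0.0] * n for _ in range(n)]
        g = [0.0] * n
        for i, j, z in edges:
            # Residual is x_j - x_i - z; accumulate normal equations H x = g.
            terms = ((i, -1.0), (j, 1.0))
            for a, sa in terms:
                if a == 0:
                    continue  # fixed keyframe contributes no unknown
                g[a - 1] += sa * z[axis]
                for b, sb in terms:
                    if b != 0:
                        H[a - 1][b - 1] += sa * sb
        solution = solve_linear(H, g)
        for k in range(n):
            result[k + 1][axis] = solution[k]
    return result


# Four keyframes around a unit square. The three odometry edges drift;
# the last edge is a loop closure from keyframe 3 back to keyframe 0.
edges = [
    (0, 1, (1.1, 0.0)),
    (1, 2, (0.0, 1.1)),
    (2, 3, (-1.1, 0.0)),
    (3, 0, (0.0, -0.9)),  # loop closure
]
optimized = optimize_positions(4, edges)
for index, position in enumerate(optimized):
    print(index, position)
```

In this toy case the loop is inconsistent by 0.2 along y, and the unit-weight least-squares solution spreads that error evenly over the four edges rather than leaving it all at the point where the loop closes.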



Author information

Authors and Affiliations

Dept. of Computer Science and Engineering, University of Bologna, viale Risorgimento, 2, Bologna, Italy
Nicola Fioraio, Gregorio Cerri & Luigi Di Stefano


Editor information

Editors and Affiliations

Department of Applied Science, University of Naples Parthenope, Centro Direzionale Isola C4, 80133, Napoli, Italy
Alfredo Petrosino


Cite this paper

Fioraio, N., Cerri, G., Di Stefano, L. (2013). Towards Semantic KinectFusion. In: Petrosino, A. (eds) Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41184-7_31


DOI: https://doi.org/10.1007/978-3-642-41184-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41183-0
Online ISBN: 978-3-642-41184-7
eBook Packages: Computer Science, Computer Science (R0)
