
Learning Visual Servo Policies via Planner Cloning

Conference paper in: Experimental Robotics (ISER 2020)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 19)


Abstract

Learning control policies for visual servoing in novel environments is an important problem, but standard model-free policy learning methods learn slowly. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm, and show that it outperforms several baselines and ablations on challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies transfer effectively onto a real robotic platform, achieving a success rate of approximately 87% both in simulation and on a real robot.
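The recipe described in the abstract is: run a full-state motion planner in simulation, record its actions alongside the robot's image observations, and train a visual policy by behavior cloning on those planner-labeled examples. The Python sketch below illustrates that loop under stated assumptions only: PolicyNet, cloning_step, NUM_ACTIONS, and the image size are illustrative names and values rather than the authors' code, PyTorch is an assumed framework, and because the abstract does not spell out the PQC objective, a plain cross-entropy cloning loss stands in for the penalized variant (see the paper or its arXiv preprint for the actual algorithm).

# Minimal planner-cloning sketch (assumed details; see the note above).
# A small CNN policy is trained to reproduce the planner's action labels
# from image observations collected in simulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_ACTIONS = 6                   # assumed discrete action set (e.g. Cartesian steps)
IMG_CHANNELS, IMG_SIZE = 3, 64    # assumed camera observation size

class PolicyNet(nn.Module):
    """Maps an image observation to action logits."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(IMG_CHANNELS, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=2), nn.ReLU(),
        )
        # 64x64 input shrinks to 6x6 feature maps with the strides above.
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 6 * 6, 256), nn.ReLU(),
            nn.Linear(256, NUM_ACTIONS),
        )

    def forward(self, img):
        return self.head(self.features(img))

def cloning_step(policy, optimizer, images, planner_actions):
    """One behavior-cloning update toward the planner's action labels."""
    logits = policy(images)
    loss = F.cross_entropy(logits, planner_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    policy = PolicyNet()
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
    # Fake batch standing in for simulated rollouts labeled by the planner.
    images = torch.randn(8, IMG_CHANNELS, IMG_SIZE, IMG_SIZE)
    planner_actions = torch.randint(0, NUM_ACTIONS, (8,))
    print("cloning loss:", cloning_step(policy, optimizer, images, planner_actions))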



Author information

Correspondence to Ulrich Viereck.


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Viereck, U., Saenko, K., Platt, R. (2021). Learning Visual Servo Policies via Planner Cloning. In: Siciliano, B., Laschi, C., Khatib, O. (eds) Experimental Robotics. ISER 2020. Springer Proceedings in Advanced Robotics, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-71151-1_26

