Abstract
Learning control policies for visual servoing in novel environments is an important problem. However, standard model-free policy learning methods learn slowly. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm, and show that it outperforms several baselines and ablations on challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies transfer effectively onto a real robotic platform, achieving a success rate of approximately 87% both in simulation and on a real robot.
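The core idea of planner cloning, as stated above, is to treat a full-state motion planner as an expert and fit a policy to its actions with supervised learning. The toy sketch below illustrates that loop on a stand-in problem: the planner, observations, model, and all names here are illustrative assumptions, not the paper's actual architecture (which operates on images and uses the PQC loss rather than plain cross-entropy cloning).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a full-state motion planner: in simulation it sees the
# true state and returns an expert action (here, a simple rule on a 2-D state).
def planner_action(state):
    return int(state[0] + state[1] > 0)  # two discrete actions: 0 or 1

# Collect (observation, expert-action) pairs by querying the planner.
# In the paper the observation is an image; here we reuse the state itself.
states = rng.normal(size=(500, 2))
actions = np.array([planner_action(s) for s in states])

# Behavior cloning: fit a softmax policy to imitate the planner by
# minimizing cross-entropy against the planner's actions.
W = np.zeros((2, 2))
for _ in range(200):
    logits = states @ W
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Gradient of mean cross-entropy for a linear softmax model.
    grad = states.T @ (probs - np.eye(2)[actions]) / len(states)
    W -= 1.0 * grad

accuracy = (np.argmax(states @ W, axis=1) == actions).mean()
```

Because the planner provides dense supervision on every visited state, this supervised loop converges far faster than model-free reinforcement learning on the same task, which is the motivation the abstract gives for planner cloning.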
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Viereck, U., Saenko, K., Platt, R. (2021). Learning Visual Servo Policies via Planner Cloning. In: Siciliano, B., Laschi, C., Khatib, O. (eds) Experimental Robotics. ISER 2020. Springer Proceedings in Advanced Robotics, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-71151-1_26
Print ISBN: 978-3-030-71150-4
Online ISBN: 978-3-030-71151-1