Untrimmed Action Anticipation

Rodin, Ivan; Furnari, Antonino; Mavroeidis, Dimitrios; Farinella, Giovanni Maria

doi:10.1007/978-3-031-06433-3_29

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

Egocentric action anticipation consists of predicting a future action that the camera wearer will perform from egocentric video. While the task has recently attracted the attention of the research community, current approaches assume that the input videos are “trimmed”, meaning that a short video sequence is sampled a fixed time before the beginning of the action. We argue that, despite recent advances in the field, trimmed action anticipation has limited applicability in real-world scenarios, where it is important to deal with “untrimmed” video inputs and where the exact moment at which the action will begin cannot be assumed to be known at test time. To overcome these limitations, we propose an untrimmed action anticipation task which, similarly to temporal action detection, assumes that the input video is untrimmed at test time, while still requiring predictions to be made before the actions actually take place. We propose an evaluation procedure for methods designed to address this novel task, and compare several baselines on the EPIC-KITCHENS-100 dataset. Experiments show that the performance of current models designed for trimmed action anticipation is very limited and that more research on this task is required.
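
The difference between the two input settings described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation procedure: the function names, the parameters (`anticipation_time`, `observation_length`, `stride`) and the sliding-window scheme are assumptions made purely for illustration. In the trimmed setting the observed segment is placed relative to a known action start; in the untrimmed setting segments are taken across the whole video with no knowledge of when actions begin.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def trimmed_window(action_start: float,
                   anticipation_time: float,
                   observation_length: float) -> Segment:
    """Trimmed setting: the observed clip ends a fixed time before the
    (known) start of the action. All parameters are hypothetical."""
    end = action_start - anticipation_time
    start = max(0.0, end - observation_length)
    return (start, end)


def untrimmed_windows(video_length: float,
                      observation_length: float,
                      stride: float) -> List[Segment]:
    """Untrimmed setting: observed clips slide over the whole video,
    without using any action start time. Sliding windows are only one
    possible (assumed) way to consume an untrimmed input."""
    if video_length < observation_length:
        return []
    count = int((video_length - observation_length) // stride) + 1
    return [(i * stride, i * stride + observation_length)
            for i in range(count)]


if __name__ == "__main__":
    # Trimmed: requires the action start (10 s) to be known in advance.
    print(trimmed_window(10.0, anticipation_time=1.0, observation_length=2.0))
    # Untrimmed: no action start is supplied at all.
    print(untrimmed_windows(10.0, observation_length=2.0, stride=1.0))
```

In the untrimmed case a model has to produce a prediction (or abstain) for every window, which is why an evaluation procedure different from the trimmed one is needed.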

Acknowledgements

This research has been supported by Marie Skłodowska-Curie Innovative Training Networks - European Industrial Doctorates - PhilHumans Project, European Union - Grant agreement 812882 (http://www.philhumans.eu), project MEGABIT - PIAno di inCEntivi per la RIcerca di Ateneo 2020/2022 (PIACERI) - linea di intervento 2, DMI - University of Catania, and by the MISE - PON I&C 2014-2020 - Progetto ENIGMA - Prog n. F/190050/02/X44 - CUP: B61B19000520008.

Author information

Authors and Affiliations

University of Catania, Viale Andrea Doria 6, 95128, Catania, Italy
Ivan Rodin, Antonino Furnari & Giovanni Maria Farinella
Philips Research, High Tech Campus 34, 5656 AE, Eindhoven, The Netherlands
Ivan Rodin & Dimitrios Mavroeidis
Next Vision s.r.l - Spinoff of the University of Catania, Catania, Italy
Antonino Furnari & Giovanni Maria Farinella

Corresponding author

Correspondence to Antonino Furnari.

Editor information

Editors and Affiliations

Boston University, Boston, MA, USA
Stan Sclaroff
National Research Council, Lecce, Italy
Cosimo Distante
National Research Council, Lecce, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni M. Farinella
Technische Universität München, Garching, Germany
Federico Tombari

About this paper

Cite this paper

Rodin, I., Furnari, A., Mavroeidis, D., Farinella, G.M. (2022). Untrimmed Action Anticipation. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_29

DOI: https://doi.org/10.1007/978-3-031-06433-3_29
Published: 15 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06432-6
Online ISBN: 978-3-031-06433-3
eBook Packages: Computer Science, Computer Science (R0)
