DOI: 10.1145/3172944.3173003

EASEL: Easy Automatic Segmentation Event Labeler

Published: 05 March 2018

Abstract

Video annotation is a vital part of research examining gestural and multimodal interaction as well as computer vision, machine learning, and interface design. However, annotation is a difficult, time-consuming task that requires high cognitive effort. Existing tools for labeling and annotation still require users to manually label most of the data, limiting the tools' helpfulness. In this paper, we present the Easy Automatic Segmentation Event Labeler (EASEL), a tool supporting gesture analysis. EASEL streamlines the annotation process by introducing assisted annotation, using automatic gesture segmentation and recognition to automatically annotate gestures. To evaluate the efficacy of assisted annotation, we conducted a user study with 24 participants and found that assisted annotation decreased the time needed to annotate videos with no difference in accuracy compared with manual annotation. The results of our study demonstrate the benefit of adding computational intelligence to video and audio annotation tasks.
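The paper itself does not include source code; the sketch below is only a rough illustration of the assisted-annotation idea the abstract describes, under the assumption that an automatic segmenter proposes candidate gesture intervals, a recognizer attaches provisional labels, and the annotator merely confirms or corrects each proposal rather than labeling from scratch. All identifiers here (propose_segments, recognize, Annotation) are hypothetical and are not taken from EASEL.

```python
# Illustrative sketch only -- not EASEL's implementation.
# Assumed pipeline: segment a motion signal, label each segment with a
# placeholder recognizer, then let a human confirm or correct each label.
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    start: float   # segment start time (seconds)
    end: float     # segment end time (seconds)
    label: str     # proposed or confirmed gesture label


def propose_segments(motion_energy: List[float], fps: float,
                     threshold: float = 0.5) -> List[Annotation]:
    """Hypothetical segmenter: mark spans where motion energy exceeds a threshold."""
    segments, start = [], None
    for i, e in enumerate(motion_energy):
        if e >= threshold and start is None:
            start = i
        elif e < threshold and start is not None:
            segments.append(Annotation(start / fps, i / fps, "unknown"))
            start = None
    if start is not None:
        segments.append(Annotation(start / fps, len(motion_energy) / fps, "unknown"))
    return segments


def recognize(segment: Annotation) -> str:
    """Placeholder recognizer; a real tool would classify video or skeleton features."""
    return "wave" if (segment.end - segment.start) > 0.5 else "point"


def assisted_annotation(motion_energy: List[float], fps: float) -> List[Annotation]:
    """Assisted loop: the machine proposes segments and labels, the human reviews."""
    proposals = propose_segments(motion_energy, fps)
    for ann in proposals:
        ann.label = recognize(ann)
        answer = input(f"{ann.start:.2f}-{ann.end:.2f}s proposed '{ann.label}' "
                       f"(Enter to accept, or type a correction): ").strip()
        if answer:
            ann.label = answer
    return proposals


if __name__ == "__main__":
    # Toy motion-energy trace sampled at 10 fps.
    trace = [0.1, 0.2, 0.8, 0.9, 0.7, 0.1, 0.1, 0.6, 0.2, 0.1]
    print(assisted_annotation(trace, fps=10.0))
```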



    Information

    Published In

    IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces
    March 2018
    698 pages
    ISBN:9781450349451
    DOI:10.1145/3172944
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. data annotation tools
    2. gesture analysis
    3. gesture segmentation

    Qualifiers

    • Short-paper

    Funding Sources

    • DARPA

    Conference

    IUI'18

    Acceptance Rates

    IUI '18 Paper Acceptance Rate: 43 of 299 submissions, 14%
    Overall Acceptance Rate: 746 of 2,811 submissions, 27%


    Article Metrics

    • Downloads (last 12 months): 81
    • Downloads (last 6 weeks): 14
    Reflects downloads up to 20 Feb 2025


    Cited By

    • (2025) Body Language Between Humans and Machines. Body Language Communication, 443-476. https://doi.org/10.1007/978-3-031-70064-4_18
    • (2024) Man and the Machine: Effects of AI-assisted Human Labeling on Interactive Annotation of Real-time Video Streams. ACM Transactions on Interactive Intelligent Systems 14, 2, 1-22. https://doi.org/10.1145/3649457
    • (2023) SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices. Proceedings of the ACM on Human-Computer Interaction 7, MHCI, 1-19. https://doi.org/10.1145/3604273
    • (2022) ARiana: Augmented Reality Based In-Situ Annotation of Assembly Videos. IEEE Access 10, 111704-111724. https://doi.org/10.1109/ACCESS.2022.3216015
    • (2021) Examining the Use of Nonverbal Communication in Virtual Agents. International Journal of Human-Computer Interaction 37, 17, 1648-1673. https://doi.org/10.1080/10447318.2021.1898851
    • (2019) A methodology for gestural interaction relying on user-defined gestures sets following a one-shot learning approach. Journal of Intelligent & Fuzzy Systems 36, 5, 5001-5010. https://doi.org/10.3233/JIFS-179046
    • (2019) Assisting group activity analysis through hand detection and identification in multiple egocentric videos. Proceedings of the 24th International Conference on Intelligent User Interfaces, 570-574. https://doi.org/10.1145/3301275.3302297
