DOI: 10.1145/3172944.3173003

EASEL: Easy Automatic Segmentation Event Labeler

Published: 05 March 2018

Abstract

Video annotation is a vital part of research examining gestural and multimodal interaction as well as computer vision, machine learning, and interface design. However, annotation is a difficult, time-consuming task that requires high cognitive effort. Existing tools for labeling and annotation still require users to manually label most of the data, limiting the tools' helpfulness. In this paper, we present the Easy Automatic Segmentation Event Labeler (EASEL), a tool supporting gesture analysis. EASEL streamlines the annotation process by introducing assisted annotation, using automatic gesture segmentation and recognition to automatically annotate gestures. To evaluate the efficacy of assisted annotation, we conducted a user study with 24 participants and found that assisted annotation decreased the time needed to annotate videos with no difference in accuracy compared with manual annotation. The results of our study demonstrate the benefit of adding computational intelligence to video and audio annotation tasks.
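The paper itself does not include source code; the sketch below is only a rough illustration of the assisted-annotation idea the abstract describes, under the assumption that an automatic segmenter proposes candidate gesture intervals, a recognizer attaches provisional labels, and the annotator merely confirms or corrects each proposal rather than labeling from scratch. All identifiers here (propose_segments, recognize, Annotation) are hypothetical and are not taken from EASEL.

```python
# Illustrative sketch only -- not EASEL's implementation.
# Assumed pipeline: segment a motion signal, label each segment with a
# placeholder recognizer, then let a human confirm or correct each label.
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    start: float   # segment start time (seconds)
    end: float     # segment end time (seconds)
    label: str     # proposed or confirmed gesture label


def propose_segments(motion_energy: List[float], fps: float,
                     threshold: float = 0.5) -> List[Annotation]:
    """Hypothetical segmenter: mark spans where motion energy exceeds a threshold."""
    segments, start = [], None
    for i, e in enumerate(motion_energy):
        if e >= threshold and start is None:
            start = i
        elif e < threshold and start is not None:
            segments.append(Annotation(start / fps, i / fps, "unknown"))
            start = None
    if start is not None:
        segments.append(Annotation(start / fps, len(motion_energy) / fps, "unknown"))
    return segments


def recognize(segment: Annotation) -> str:
    """Placeholder recognizer; a real tool would classify video or skeleton features."""
    return "wave" if (segment.end - segment.start) > 0.5 else "point"


def assisted_annotation(motion_energy: List[float], fps: float) -> List[Annotation]:
    """Assisted loop: the machine proposes segments and labels, the human reviews."""
    proposals = propose_segments(motion_energy, fps)
    for ann in proposals:
        ann.label = recognize(ann)
        answer = input(f"{ann.start:.2f}-{ann.end:.2f}s proposed '{ann.label}' "
                       f"(Enter to accept, or type a correction): ").strip()
        if answer:
            ann.label = answer
    return proposals


if __name__ == "__main__":
    # Toy motion-energy trace sampled at 10 fps.
    trace = [0.1, 0.2, 0.8, 0.9, 0.7, 0.1, 0.1, 0.6, 0.2, 0.1]
    print(assisted_annotation(trace, fps=10.0))
```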



    Information

    Published In

    IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces
    March 2018
    698 pages
    ISBN:9781450349451
    DOI:10.1145/3172944
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. data annotation tools
    2. gesture analysis
    3. gesture segmentation

    Qualifiers

    • Short-paper

    Funding Sources

    • DARPA

    Conference

    IUI'18

    Acceptance Rates

    IUI '18 Paper Acceptance Rate: 43 of 299 submissions, 14%
    Overall Acceptance Rate: 746 of 2,811 submissions, 27%


    Article Metrics

    • Downloads (last 12 months): 81
    • Downloads (last 6 weeks): 14
    Reflects downloads up to 20 Feb 2025


    Cited By

    • (2025) Body Language Between Humans and Machines. Body Language Communication, 443-476. https://doi.org/10.1007/978-3-031-70064-4_18
    • (2024) Man and the Machine: Effects of AI-assisted Human Labeling on Interactive Annotation of Real-time Video Streams. ACM Transactions on Interactive Intelligent Systems 14, 2, 1-22. https://doi.org/10.1145/3649457
    • (2023) SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices. Proceedings of the ACM on Human-Computer Interaction 7, MHCI, 1-19. https://doi.org/10.1145/3604273
    • (2022) ARiana: Augmented Reality Based In-Situ Annotation of Assembly Videos. IEEE Access 10, 111704-111724. https://doi.org/10.1109/ACCESS.2022.3216015
    • (2021) Examining the Use of Nonverbal Communication in Virtual Agents. International Journal of Human-Computer Interaction 37, 17, 1648-1673. https://doi.org/10.1080/10447318.2021.1898851
    • (2019) A methodology for gestural interaction relying on user-defined gestures sets following a one-shot learning approach. Journal of Intelligent & Fuzzy Systems 36, 5, 5001-5010. https://doi.org/10.3233/JIFS-179046
    • (2019) Assisting group activity analysis through hand detection and identification in multiple egocentric videos. Proceedings of the 24th International Conference on Intelligent User Interfaces, 570-574. https://doi.org/10.1145/3301275.3302297
