DOI: 10.1145/3411764.3445233
Open access

What Makes Videos Accessible to Blind and Visually Impaired People?

Published: 07 May 2021


User-generated videos are an increasingly important source of information online, yet most online videos are inaccessible to blind and visually impaired (BVI) people. To find videos that are accessible, or understandable without additional description of the visual content, BVI people in our formative studies reported that they used a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process. BVI people also reported video accessibility heuristics that characterize accessible and inaccessible videos. We instantiate 7 of the identified heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. We collected a dataset of accessibility ratings of videos by BVI people and found that our automatic video accessibility metrics correlated with the accessibility ratings (Adjusted R² = 0.642). We augmented a video search interface with our video accessibility metrics and predictions. BVI people using our augmented video search interface selected an accessible video more efficiently than when using the original search interface. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible products, improving video accessibility for all.
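For readers unfamiliar with the adjusted R² statistic reported above, the following is a minimal sketch of how it is conventionally computed from a plain R², the number of observations, and the number of predictors. The function names and all numbers are hypothetical and chosen only for illustration; they are not the paper's data, model, or code. The only detail taken from the abstract is the predictor count of 7, matching the seven metrics.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors used in the model."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical ratings and model predictions (illustration only, not paper data).
ratings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 6.0, 5.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.5, 4.5, 6.0, 6.5, 6.0, 5.5, 4.0]

r2 = r_squared(ratings, predicted)
# Seven predictors, mirroring the seven metrics named in the abstract.
adj = adjusted_r_squared(r2, n_samples=len(ratings), n_predictors=7)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
```

Because the adjustment scales the unexplained variance by (n - 1) / (n - p - 1), the adjusted value is never higher than the plain R² when at least one predictor is used, and the gap widens as predictors are added relative to the sample size.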





Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
This work is licensed under a Creative Commons Attribution International 4.0 License.



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. accessibility
  2. blind
  3. online videos
  4. visual impairments


  • Research-article
  • Research
  • Refereed limited


CHI '21

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


