Abstract
Onfocus detection aims to identify whether the focus of an individual captured by a camera is on the camera. Behavioral research shows that such focus during face-to-camera communication produces a special type of eye contact, i.e., individual-camera eye contact, which is a powerful signal in social communication and plays a crucial role in recognizing irregular individual status (e.g., lying or suffering from mental illness) and special purposes (e.g., seeking help or attracting fans). Developing effective onfocus detection algorithms is therefore of significance for assisting criminal investigation, disease discovery, and social behavior analysis. However, a review of the literature shows that very few efforts have been made toward developing onfocus detectors, owing to the lack of large-scale publicly available datasets as well as the challenging nature of the task. To this end, this paper engages in onfocus detection research by addressing these two issues. First, we build a large-scale onfocus detection dataset, named onfocus detection in the wild (OFDIW). It consists of 20623 images captured under unconstrained conditions (hence "in the wild") and contains individuals with diverse emotions, ages, and facial characteristics, as well as rich interactions with surrounding objects and background scenes. On top of that, we propose a novel end-to-end deep model, the eye-context interaction inferring network (ECIIN), for onfocus detection, which explores eye-context interaction via dynamic capsule routing. Finally, comprehensive experiments are conducted on the proposed OFDIW dataset to benchmark existing learning models and demonstrate the effectiveness of the proposed ECIIN.
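To make the "dynamic capsule routing" mechanism concrete, the following is a minimal NumPy sketch of generic routing-by-agreement between lower-level (e.g., eye) capsules and higher-level (e.g., context) capsules. It illustrates the standard dynamic routing procedure, not the authors' exact ECIIN formulation; all shapes, names, and the iteration count here are illustrative assumptions.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Non-linear squashing: preserves the vector's direction while
    # mapping its norm into [0, 1), so the norm can act as a probability.
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement between n_in lower and n_out upper capsules.

    u_hat: prediction vectors of shape (n_in, n_out, dim_out), i.e. each
           lower capsule's prediction for each upper capsule's pose.
    Returns upper-capsule output vectors of shape (n_out, dim_out).
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits, updated each iteration
    for _ in range(num_iters):
        # Coupling coefficients: softmax over upper capsules (stable form).
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions for each upper capsule.
        s = (c[..., None] * u_hat).sum(axis=0)        # (n_out, dim_out)
        v = squash(s)                                 # squashed outputs
        # Increase logits where predictions agree with the output.
        b = b + np.einsum('iod,od->io', u_hat, v)
    return v
```

The agreement term (the dot product between each prediction and the current output) is what lets the routing dynamically strengthen the coupling between parts and wholes that are consistent with each other, which is how eye features and contextual features could be made to interact.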
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61876140, 61773301), the Fundamental Research Funds for the Central Universities (Grant No. JBZ170401), and the China Postdoctoral Support Scheme for Innovative Talents (Grant No. BX20180236).
Rights and permissions
Open access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, D., Wang, B., Wang, G. et al. Onfocus detection: identifying individual-camera eye contact from unconstrained images. Sci. China Inf. Sci. 65, 160101 (2022). https://doi.org/10.1007/s11432-020-3181-9