Abstract
Most existing studies focus on the expression recognition of micro-expressions, while little research addresses how to recognize their action units. The reason is that the facial action units in micro-expressions have low intensity and are therefore difficult to recognize. To address this problem, we propose a micro-expression action unit recognition algorithm based on dynamic images and spatial pyramids. First, the video is passed through a dynamic image generation module, which produces a dynamic image that captures the motion information contained in all frames. Then, because micro-expression movements are subtle, semantic features at different levels are obtained through spatial pyramids. Because micro-expressions also have a small range of motion and are concentrated in local areas of the face, a regional feature network and an attention mechanism are applied to the image features of each layer. Finally, the models are trained separately, because the correlation between action units is weak. Experiments on the CASME and CAS(ME)\(^2\) datasets show that the proposed algorithm achieves better action unit recognition performance than other advanced methods.
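The abstract does not say how the dynamic image is computed. One common construction is approximate rank pooling, in which each frame is multiplied by a fixed time-dependent weight and the weighted frames are summed into a single image. The sketch below illustrates that idea only, under the assumption of the simple linear weights \(\alpha_t = 2t - T - 1\); the function name `dynamic_image`, the clip layout `(T, H, W, C)`, and the rescaling to 8-bit values are illustrative choices, not details taken from the article.

```python
import numpy as np


def dynamic_image(frames: np.ndarray) -> np.ndarray:
    """Collapse a clip of shape (T, H, W, C) into one (H, W, C) uint8 image.

    Assumption: linear approximate-rank-pooling weights alpha_t = 2t - T - 1,
    so later frames get positive weight and earlier frames negative weight.
    """
    num_frames = frames.shape[0]
    t = np.arange(1, num_frames + 1, dtype=np.float64)
    alpha = 2.0 * t - num_frames - 1.0  # the weights sum to zero

    # Weighted sum over the time axis; static content cancels out,
    # leaving only what changes across the clip.
    pooled = np.tensordot(alpha, frames.astype(np.float64), axes=(0, 0))

    # Rescale to [0, 255] so the result can be fed to an image network.
    lo, hi = pooled.min(), pooled.max()
    if hi == lo:
        # No motion at all: return an empty image instead of dividing by zero.
        return np.zeros(pooled.shape, dtype=np.uint8)
    return ((pooled - lo) / (hi - lo) * 255.0).round().astype(np.uint8)


if __name__ == "__main__":
    clip = np.zeros((3, 2, 2, 1), dtype=np.uint8)
    clip[2, 0, 0, 0] = 255  # one pixel lights up in the last frame
    print(dynamic_image(clip)[..., 0])
```

Because the weights sum to zero, a clip with no motion maps to an all-zero image, while pixels that change over time stand out. This is why such a representation is a plausible fit for low-intensity facial movements.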







About this article
Cite this article
Zhou, G., Yuan, S., Xing, H. et al. Micro-expression action unit recognition based on dynamic image and spatial pyramid. J Supercomput 79, 19879–19902 (2023). https://doi.org/10.1007/s11227-023-05409-7
DOI: https://doi.org/10.1007/s11227-023-05409-7