A Directional Attention Fusion and Multi-Head Spatial-Channel Attention Network for Facial Expression Recognition
  • Yukun Shao
  • Yang Li
  • Baiqiang Wu *

Received: 15 Sep 2025 | Revised: 21 Oct 2025 | Accepted: 22 Oct 2025 | Published: 05 Nov 2025

Abstract

Facial expression recognition (FER) is an affective computing technology with significant application value in human–computer interaction. However, FER in unconstrained scenarios remains challenging because of intra-class variations, subtle inter-class differences, and environmental interference. To address these challenges, this paper proposes a Directional Attention Fusion and Multi-Head Spatial-Channel Attention Network (DAF-MHSCA). First, a ResNet18 backbone extracts coarse-grained facial features, and an adaptive feature calibration (AFC) mechanism then captures detailed expression features. Next, a directional attention fusion (DAF) module generates spatial attention maps through both self-attention and cross-attention along the width and height directions. Finally, a multi-head spatial-channel attention (MHSCA) module integrates these spatial attention maps to apply channel-wise and spatial-wise attention to the features, which a classifier then uses to recognize the emotion. Experiments on five datasets show that the proposed method achieves competitive results and notable improvements over state-of-the-art methods.
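
The abstract does not give the equations behind the DAF and MHSCA modules, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a backbone feature map of shape (C, H, W), pools it along the width and height directions, applies self-attention within each direction and cross-attention between them, and combines the result into one spatial attention map that is followed by a simple channel gate. All function names, the single-head design, the mean pooling, and the sigmoid gating are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attend(query, key, value):
    """Scaled dot-product attention; rows are positions, columns are channels."""
    scores = query @ key.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ value

def directional_spatial_attention(feat):
    """Hypothetical helper: (C, H, W) feature map -> (H, W) attention map."""
    # Pool along one axis so each remaining position summarises a row or column.
    rows = feat.mean(axis=2).T  # (H, C): one descriptor per height position
    cols = feat.mean(axis=1).T  # (W, C): one descriptor per width position

    # Self-attention within each direction.
    rows_self = attend(rows, rows, rows)   # (H, C)
    cols_self = attend(cols, cols, cols)   # (W, C)

    # Cross-attention: each direction queries the other one.
    rows_cross = attend(rows, cols, cols)  # (H, C)
    cols_cross = attend(cols, rows, rows)  # (W, C)

    # Collapse channels to one score per position, then combine both directions.
    row_score = (rows_self + rows_cross).mean(axis=1)  # (H,)
    col_score = (cols_self + cols_cross).mean(axis=1)  # (W,)
    return sigmoid(row_score[:, None] + col_score[None, :])

def spatial_channel_reweight(feat, spatial_map):
    """Hypothetical helper: spatial attention followed by a channel gate."""
    attended = feat * spatial_map[None, :, :]           # (C, H, W)
    channel_gate = sigmoid(attended.mean(axis=(1, 2)))  # (C,)
    return attended * channel_gate[:, None, None]

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 5))  # random stand-in for a backbone feature map
spatial_map = directional_spatial_attention(feat)
out = spatial_channel_reweight(feat, spatial_map)
```

In this sketch the attention map has one value per spatial location and the output keeps the shape of the input feature map, so it could be passed on to a pooling layer and classifier. A multi-head variant would run several such branches with separate learned projections, which the sketch omits.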

How to Cite
Shao, Y.; Li, Y.; Wu, B. A Directional Attention Fusion and Multi-Head Spatial-Channel Attention Network for Facial Expression Recognition. Journal of Machine Learning and Information Security 2025, 1 (1), 7.
Copyright & License
Copyright (c) 2025 by the authors.