Abstract
Automatic recognition of human emotions from facial expressions is a core component of affective computing, with applications in human-computer interaction, customer service, and mental health. Traditional methods rely on handcrafted features, whereas deep learning approaches, particularly Convolutional Neural Networks (CNNs), have demonstrated superior performance. Challenges remain, however, in model interpretability and robustness to real-world variation. To address these limitations, this study develops a web-based facial emotion recognition system for real-time use, built on a deep CNN integrated with a Convolutional Block Attention Module (CBAM) and Grad-CAM explainability. The model was trained on the Extended Cohn-Kanade (CK+) dataset, comprising 981 images across seven emotion classes, with preprocessing and data augmentation. The proposed model achieved 98.71% accuracy, 98.7% recall, and an F1-score of 98.9%, with AUC and Average Precision scores of 1.00 for all classes. CBAM improved the model's focus on salient facial regions, while Grad-CAM provided visual explanations of its predictions, supporting trust in clinical and practical settings. The system was deployed as a browser-based application and demonstrated real-time inference. These results highlight the potential of attention-enhanced deep learning models for transparent and efficient emotion diagnostics in real-world deployment.
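The accuracy, recall, and F1-score reported above are standard multi-class metrics that can be derived from a confusion matrix. The abstract does not state how the per-class values were averaged across the seven emotion classes, so the following minimal sketch assumes macro averaging. The function name macro_metrics and the three-class toy matrix are hypothetical and are not the study's code or data.

```python
from typing import List, Tuple


def macro_metrics(cm: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    cm[i][j] counts samples whose true class is i and predicted class is j.
    Macro averaging is an assumption; the abstract does not specify it.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return (
        correct / total,
        sum(precisions) / n,
        sum(recalls) / n,
        sum(f1s) / n,
    )


# Toy three-class confusion matrix with illustrative values only.
toy_cm = [
    [9, 1, 0],
    [0, 10, 0],
    [0, 1, 9],
]
accuracy, precision, recall, f1 = macro_metrics(toy_cm)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

Under macro averaging each class contributes equally regardless of its sample count, which matters for datasets with uneven class sizes. A weighted average would instead scale each class by its support.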