Leveraging recent advances in deep learning for audio-Visual emotion recognition

Liam Schoneveld; Alice Othmani; Hazem Abdelkawy

doi:10.1016/j.patrec.2021.03.007

Journal article, Pattern Recognition Letters, Year: 2021


Liam Schoneveld

Role: Author
PersonId : 1238215
ORCID : 0000-0002-7324-6234

Alice Othmani

Role: Author

Laboratoire Images, Signaux et Systèmes Intelligents

SYNAPSE

Hazem Abdelkawy

Role: Author

Laboratoire Images, Signaux et Systèmes Intelligents

SIRIUS

Abstract

Emotional expressions are the behaviors that communicate our emotional state or attitude to others. They are expressed through verbal and non-verbal communication. Complex human behavior can be understood by studying physical features from multiple modalities, mainly facial expressions, voice, and physical gestures. Recently, spontaneous multi-modal emotion recognition has been extensively studied for human behavior analysis. In this paper, we propose a new deep learning-based approach for audio-visual emotion recognition. Our approach leverages recent advances in deep learning such as knowledge distillation and high-performing deep architectures. The deep feature representations of the audio and visual modalities are fused using a model-level fusion strategy. A recurrent neural network is then used to capture the temporal dynamics. Our proposed approach substantially outperforms state-of-the-art approaches in predicting valence on the RECOLA dataset. Moreover, our proposed visual facial expression feature extraction network outperforms state-of-the-art results on the AffectNet and Google Facial Expression Comparison datasets.
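
The fusion-then-recurrence pipeline described in the abstract can be illustrated with a minimal sketch. It assumes one common reading of model-level fusion, where per-frame audio and visual feature vectors are concatenated and passed to a shared recurrent model. The names `fuse` and `TinyRecurrentRegressor`, the feature dimensions, the random weights, and the tanh output range are all hypothetical choices for illustration; this is not the authors' architecture or training procedure.

```python
import math
import random


def fuse(audio_feat, visual_feat):
    """Concatenate one frame's audio and visual feature vectors."""
    return list(audio_feat) + list(visual_feat)


class TinyRecurrentRegressor:
    """Untrained Elman-style recurrent layer with a scalar output (hypothetical)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_dim = hidden_dim
        self.w_in = init(hidden_dim, input_dim)    # input-to-hidden weights
        self.w_rec = init(hidden_dim, hidden_dim)  # hidden-to-hidden weights
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def step(self, x, h):
        """Update the hidden state from the fused frame x and previous state h."""
        new_h = []
        for i in range(self.hidden_dim):
            pre = sum(w * xi for w, xi in zip(self.w_in[i], x))
            pre += sum(w * hi for w, hi in zip(self.w_rec[i], h))
            new_h.append(math.tanh(pre))
        return new_h

    def predict(self, audio_seq, visual_seq):
        """Return one valence-like score per frame, squashed by tanh (assumption)."""
        h = [0.0] * self.hidden_dim
        outputs = []
        for a, v in zip(audio_seq, visual_seq):
            h = self.step(fuse(a, v), h)
            score = sum(w * hi for w, hi in zip(self.w_out, h))
            outputs.append(math.tanh(score))
        return outputs


# Usage: 5 frames with 4 audio and 6 visual features each (made-up sizes).
rng = random.Random(1)
audio_seq = [[rng.random() for _ in range(4)] for _ in range(5)]
visual_seq = [[rng.random() for _ in range(6)] for _ in range(5)]
model = TinyRecurrentRegressor(input_dim=10, hidden_dim=3)
valence = model.predict(audio_seq, visual_seq)
```

Because the hidden state is carried from frame to frame, each output depends on the frames that preceded it, which is the sense in which a recurrent layer captures temporal dynamics. In practice the feature vectors would come from pretrained audio and visual networks and the weights would be learned rather than random.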

Domains

Artificial Intelligence [cs.AI]

Main file

S0167865521000878.pdf (1.6 MB)

Origin: Files produced by the author(s)

Elsevier CCSD agreement

https://hal.u-pec.fr/hal-04032955

Submitted on: Monday, April 24, 2023, 09:26:42

Last modified on: Friday, December 27, 2024, 19:55:51

Long-term archiving on: Tuesday, July 25, 2023, 18:12:59

Dates and versions

hal-04032955 , version 1 (24-04-2023)

License

Attribution - NonCommercial

Identifiers

HAL Id : hal-04032955 , version 1
ARXIV : 2103.09154
DOI : 10.1016/j.patrec.2021.03.007
PII : S0167-8655(21)00087-8

Cite

Liam Schoneveld, Alice Othmani, Hazem Abdelkawy. Leveraging recent advances in deep learning for audio-Visual emotion recognition. Pattern Recognition Letters, 2021, 146, pp.1-7. ⟨10.1016/j.patrec.2021.03.007⟩. ⟨hal-04032955⟩


Collections

LISSI UPEC

40 views

196 downloads
