
dc.rights.license: open (en_US)
dc.contributor.author: PEÑA, Diego
dc.contributor.author: AGUILERA, Ana
hal.structure.identifier: ESTIA INSTITUTE OF TECHNOLOGY
dc.contributor.author: DONGO, Irvin
dc.contributor.author: HEREDIA, Juanpablo
dc.contributor.author: CARDINALE, Yudith
dc.date.accessioned: 2024-11-02T10:48:34Z
dc.date.available: 2024-11-02T10:48:34Z
dc.date.issued: 2023
dc.identifier.issn: 2169-3536 (en_US)
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/203107
dc.description.abstractEn: Multimodal methods for emotion recognition consider several sources of data to predict emotions; thus, a fusion method is needed to aggregate the individual results. In the literature, there is a wide variety of fusion methods to perform this task, but they are not suitable for all scenarios. In particular, there are two relevant aspects that can vary from one application to another: (i) in many scenarios, individual modalities can have different levels of data quality or even be absent, which demands fusion methods able to discriminate non-useful from relevant data; and (ii) in many applications, there are hardware restrictions that limit the use of complex fusion methods (e.g., a deep learning model), which can be quite computationally intensive. In this context, developers and researchers need metrics, guidelines, and a systematic process to evaluate and compare different fusion methods that fit their particular application scenarios. In response to this need, this paper presents a framework that establishes a base for a comparative evaluation of fusion methods, to demonstrate how they adapt to the quality differences of individual modalities and to evaluate their performance. The framework provides equivalent conditions for a fair assessment of fusion methods. Based on this framework, we evaluate several fusion methods for multimodal emotion recognition. Results demonstrate that, for the architecture and dataset selected, the best-fitting methods are Self-Attention and Weighted when all modalities are available, and Self-Attention and EmbraceNet+ when a modality is missing. Concerning execution time, the best results correspond to the Multilayer Perceptron (MLP) and Self-Attention models, due to their small number of operations. Thus, the proposed framework provides insights for researchers in this area to identify which fusion methods best fit their requirements, and thus to justify their selection.
dc.language.iso: EN (en_US)
dc.rights: Attribution 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/us/
dc.subject.en: Emotion recognition
dc.subject.en: Fusion methods
dc.subject.en: Multimodality
dc.title.en: A Framework to Evaluate Fusion Methods for Multimodal Emotion Recognition
dc.type: Journal article (en_US)
dc.identifier.doi: 10.1109/ACCESS.2023.3240420 (en_US)
dc.subject.hal: Computer Science [cs] (en_US)
bordeaux.journal: IEEE Access (en_US)
bordeaux.page: 10218-10237 (en_US)
bordeaux.volume: 11 (en_US)
bordeaux.hal.laboratories: ESTIA - Recherche (en_US)
bordeaux.institution: Université de Bordeaux (en_US)
bordeaux.peerReviewed: yes (en_US)
bordeaux.inpress: no (en_US)
bordeaux.import.source: hal
hal.identifier: hal-04745555
hal.version: 1
hal.popular: no (en_US)
hal.audience: International (en_US)
hal.export: false
workflow.import.source: hal
dc.rights.cc: CC BY (en_US)
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=IEEE%20Access&rft.date=2023&rft.volume=11&rft.spage=10218-10237&rft.epage=10218-10237&rft.eissn=2169-3536&rft.issn=2169-3536&rft.au=PE%C3%91A,%20Diego&AGUILERA,%20Ana&DONGO,%20Irvin&HEREDIA,%20Juanpablo&CARDINALE,%20Yudith&rft.genre=article
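
For readers who want a concrete picture of the fusion step described in the abstract above, the short Python sketch below illustrates a quality-weighted late fusion of per-modality class probabilities that simply skips absent modalities. It is a minimal illustration under assumed names and inputs (the function weighted_fusion, the dictionary arguments, and the example reliability weights are hypothetical), not the implementation or the exact Weighted method evaluated in the article.

import numpy as np

def weighted_fusion(modality_probs, weights):
    """Aggregate per-modality class probabilities into one fused prediction.

    modality_probs: dict of modality name -> probability vector, or None if absent.
    weights:        dict of modality name -> non-negative reliability weight.
    """
    total, weight_sum = None, 0.0
    for name, probs in modality_probs.items():
        if probs is None:                     # missing modality: skip it
            continue
        w = weights.get(name, 0.0)
        contribution = w * np.asarray(probs, dtype=float)
        total = contribution if total is None else total + contribution
        weight_sum += w
    if total is None or weight_sum == 0.0:
        raise ValueError("No usable modality available for fusion.")
    return total / weight_sum                 # renormalised fused probabilities

# Hypothetical example: the audio channel is missing, so the fused result
# relies on the face and text modalities only.
fused = weighted_fusion(
    {"face": [0.7, 0.2, 0.1], "audio": None, "text": [0.5, 0.3, 0.2]},
    {"face": 0.6, "audio": 0.3, "text": 0.4},
)
print(fused)  # probabilities over three emotion classes

Renormalising by the summed weights of the modalities that are actually present mirrors the missing-modality scenario the abstract highlights: the fused output remains a valid probability distribution even when a source drops out.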

