Speech Emotion Recognition using Time-frequency Random Circular Shift and Deep Neural Networks
Field | Value
---|---
hal.structure.identifier | Informatique, BioInformatique, Systèmes Complexes [IBISC]
dc.contributor.author | XIA, Sylvain
hal.structure.identifier | Informatique, BioInformatique, Systèmes Complexes [IBISC]
dc.contributor.author | FOURER, Dominique
hal.structure.identifier | Laboratoire de l'intégration, du matériau au système [IMS]
dc.contributor.author | AUDIN, Liliana
hal.structure.identifier | Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author | ROUAS, Jean-Luc
hal.structure.identifier | Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author | SHOCHI, Takaaki
dc.date.accessioned | 2022-03-07T14:26:26Z
dc.date.available | 2022-03-07T14:26:26Z
dc.date.conference | 2022-05-23
dc.identifier.uri | https://oskar-bordeaux.fr/handle/20.500.12278/129775
dc.description.abstractEn | This paper addresses the problem of emotion recognition from a speech signal. We investigate a data augmentation technique based on a random circular shift of the input time-frequency representation, which significantly improves the emotion prediction results obtained with a deep convolutional neural network. After investigating the best combination of method parameters, we comparatively assess several neural network architectures (AlexNet, ResNet and Inception) using our approach on two publicly available datasets: eNTERFACE05 and EMO-DB. Our results show an improvement in prediction accuracy over a more complex state-of-the-art technique based on Discriminant Temporal Pyramid Matching (DCNN-DTPM).
dc.language.iso | en
dc.subject.en | Speech Emotion Recognition (SER)
dc.subject.en | Deep Convolutional Neural Networks
dc.subject.en | Time-frequency
dc.subject.en | Random Circular Shift (RCS)
dc.title.en | Speech Emotion Recognition using Time-frequency Random Circular Shift and Deep Neural Networks
dc.type | Conference paper with published proceedings
dc.subject.hal | Computer Science [cs]/Sound [cs.SD]
dc.subject.hal | Computer Science [cs]/Signal and Image Processing
dc.subject.hal | Computer Science [cs]/Artificial Intelligence [cs.AI]
bordeaux.hal.laboratories | CLLE Montaigne : Cognition, langues, Langages, Ergonomie - UMR 5263
bordeaux.institution | Université Bordeaux Montaigne
bordeaux.country | PT
bordeaux.title.proceeding | Speech Prosody 2022
bordeaux.conference.city | Lisbon
bordeaux.peerReviewed | yes
hal.identifier | hal-03583535
hal.version | 1
hal.origin.link | https://hal.archives-ouvertes.fr//hal-03583535v1
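For readers wondering what the random circular shift (RCS) augmentation summarised in the abstract might look like in practice, here is a minimal sketch that rolls a spectrogram along its time axis by a random offset. The choice of axis, the shift range, and the log-mel input shape are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def random_circular_shift(tf_rep: np.ndarray, rng: np.random.Generator,
                          axis: int = -1) -> np.ndarray:
    """Roll a time-frequency representation by a random offset along `axis`
    (assumed here to be the time axis), wrapping around circularly."""
    n_frames = tf_rep.shape[axis]
    offset = int(rng.integers(0, n_frames))  # random shift in [0, n_frames)
    return np.roll(tf_rep, offset, axis=axis)

# Hypothetical usage: augment a (mel bins x time frames) spectrogram before
# feeding it to a CNN such as AlexNet, ResNet or Inception. The 128 x 300
# shape is a placeholder, not a parameter reported in the paper.
rng = np.random.default_rng(seed=0)
spectrogram = rng.random((128, 300))
augmented = random_circular_shift(spectrogram, rng)
assert augmented.shape == spectrogram.shape
```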