Categorisation of spoken social affects in Japanese: human vs. machine
SHOCHI, Takaaki
Cognition, Langues, Langage, Ergonomie [CLLE-ERSS]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
RILLIARD, Albert
Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur [LIMSI]
Language
en
Conference paper with proceedings
This document was published in
International Congress of Phonetic Sciences ICPhS 2019, International Congress of Phonetic Sciences ICPhS, 2019-08-04, Melbourne.
Abstract
In this paper, we investigate the abilities of both human listeners and computers to categorise social affects using only speech. The database used is composed of speech recorded by 19 native Japanese speakers. It is first evaluated perceptually to rank speakers according to their perceived performance. The four best speakers are then selected for a categorisation experiment covering nine social affects spread across four broad categories. An automatic classification experiment is then carried out using prosodic and voice-quality features. The automatic classification system takes advantage of a feature selection algorithm and uses Linear Discriminant Analysis. The results show that automatic classification using only eight features performs comparably to our set of listeners: three out of four broad categories are quite well identified, whereas the seduction affect is poorly recognised by both the listeners and the computer.
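The classification setup named in the abstract (feature selection followed by Linear Discriminant Analysis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than the paper's acoustic measurements, the four classes merely stand in for the four broad categories, and greedy forward selection from scikit-learn is swapped in because the abstract does not say which feature selection algorithm was used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a feature table: one row per utterance,
# one column per acoustic feature (hypothetical sizes).
X, y = make_classification(
    n_samples=360,
    n_features=20,
    n_informative=8,
    n_redundant=4,
    n_classes=4,              # stands in for the four broad categories
    n_clusters_per_class=1,
    random_state=0,
)

# Greedy forward selection that keeps eight features, scored by
# cross-validated LDA accuracy (an assumed choice of algorithm).
selector = SequentialFeatureSelector(
    LinearDiscriminantAnalysis(),
    n_features_to_select=8,
    direction="forward",
    cv=3,
)

# Selection sits inside the pipeline so that each outer fold
# re-selects features on its own training data only.
model = make_pipeline(selector, LinearDiscriminantAnalysis())
scores = cross_val_score(model, X, y, cv=5)

# Fit once on all data to inspect which columns were retained.
selector.fit(X, y)
selected = np.flatnonzero(selector.get_support())

print("selected feature indices:", selected)
print("mean cross-validated accuracy:", scores.mean())
```

Placing the selector inside the pipeline keeps the accuracy estimate from being inflated by features chosen with knowledge of the test folds.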
Keywords
Prosodic analysis
Expressive speech
Speech perception
Social attitudes
Origin
Imported from HAL