Categorisation of spoken social affects in Japanese: human vs. machine
SHOCHI, Takaaki
Cognition, Langues, Langage, Ergonomie [CLLE-ERSS]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
RILLIARD, Albert
Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur [LIMSI]
Language
en
Conference paper
This item is published in
19th International Congress of Phonetic Sciences (ICPhS 2019), 2019-08-05, Melbourne.
English abstract
In this paper, we investigate the abilities of both human listeners and computers to categorise social affects using only speech. The database used is composed of speech recorded by 19 native Japanese speakers. It is first evaluated perceptually to rank speakers according to their perceived performance. The four best speakers are then selected for a categorisation experiment on nine social affects spread across four broad categories. An automatic classification experiment is then carried out using prosodic and voice-quality features. The automatic classification system takes advantage of a feature selection algorithm and uses Linear Discriminant Analysis. The results show that the performance obtained by automatic classification using only eight features is comparable to that of our set of listeners: three of the four broad categories are quite well identified, whereas the seduction affect is poorly recognised by both the listeners and the computer.
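As a rough illustration of the kind of pipeline the abstract describes (feature selection followed by Linear Discriminant Analysis), the sketch below uses scikit-learn on synthetic data. It is not the authors' system: the abstract does not name the selection algorithm, so greedy forward selection is assumed here, and the random feature matrix, the four-class labels standing in for the four broad categories, and the helper name `build_classifier` are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def build_classifier(n_features_to_select=8):
    """Forward feature selection followed by an LDA classifier.

    Assumption: forward selection stands in for the unspecified
    feature selection algorithm mentioned in the abstract.
    """
    selector = SequentialFeatureSelector(
        LinearDiscriminantAnalysis(),
        n_features_to_select=n_features_to_select,
        direction="forward",
        cv=3,
    )
    return make_pipeline(selector, LinearDiscriminantAnalysis())


# Synthetic stand-in for per-utterance prosodic and voice-quality features;
# the four classes mimic the four broad affect categories (hypothetical data).
X, y = make_classification(
    n_samples=240,
    n_features=20,
    n_informative=8,
    n_redundant=2,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

model = build_classifier()

# Cross-validated accuracy, with selection redone inside each training fold.
scores = cross_val_score(model, X, y, cv=4)

# Refit on all data to inspect which feature columns were kept.
model.fit(X, y)
selected = model.named_steps["sequentialfeatureselector"].get_support()

print("mean accuracy:", scores.mean())
print("selected feature indices:", [i for i, keep in enumerate(selected) if keep])
```

Placing the selector inside the pipeline keeps feature selection within each training fold, so the cross-validated score is not inflated by features chosen with knowledge of the held-out data.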
English keywords
Social attitudes
Speech perception
Expressive speech
Prosodic analysis
Source
Imported from HAL