Abstract
We investigate the usefulness of DISfluencies and Non-verbal Vocalisations (DIS-NV) for recognizing human emotions in dialogues. The proposed features measure filled pauses, fillers, stutters, laughter, and breath in utterances. The predictiveness of DIS-NV features is compared with that of lexical features and state-of-the-art low-level acoustic features. Our experimental results show that DIS-NV features alone are not as predictive as lexical or acoustic features. However, adding them to the lexical or acoustic feature set yields an improvement over using lexical or acoustic features alone. This indicates that disfluencies and non-verbal vocalisations provide information useful for emotion recognition that the other two feature types overlook.
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th Interdisciplinary Workshop on Laughter and Other Non-verbal Vocalisations in Speech 2015 |
Subtitle of host publication | 14–15 April 2015 |
Editors | Khiet Truong, Dirk Heylen, Jürgen Trouvain, Nick Campbell |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 39-41 |
Number of pages | 3 |
Publication status | Published - 2015 |
Externally published | Yes |
Event | Interdisciplinary Workshop on Laughter and Other Non-verbal Vocalisations in Speech 2015 - Enschede, Netherlands; Duration: 14 Apr 2015 → 15 Apr 2015; Conference number: 4th; https://laughterworkshop2015.wordpress.com/ |
Conference
Conference | Interdisciplinary Workshop on Laughter and Other Non-verbal Vocalisations in Speech 2015 |
---|---|
Country/Territory | Netherlands |
City | Enschede |
Period | 14/04/15 → 15/04/15 |
Internet address | https://laughterworkshop2015.wordpress.com/ |
Keywords
- emotion recognition
- dialogue
- disfluency
- speech processing
- HCI