A review of evaluation practices of gesture generation in embodied conversational agents

Pieter Wolfert, Nicole Robinson, Tony Belpaeme

Research output: Contribution to journal, Article, Research, peer-review

30 Citations (Scopus)

Abstract

Embodied conversational agents (ECAs) are often designed to produce nonverbal behavior to complement or enhance their verbal communication. One such form of nonverbal behavior is co-speech gesturing, which involves movements that the agent makes with its arms and hands that are paired with verbal communication. Co-speech gestures for ECAs can be created using different generation methods, divided into rule-based and data-driven processes, with the latter gaining traction because of increasing interest from the applied machine learning community. However, reports on gesture generation methods use a variety of evaluation measures, which hinders comparison. To address this, we present a systematic review of co-speech gesture generation methods for iconic, metaphoric, deictic, and beat gestures, including reported evaluation methods. We review 22 studies that have an ECA with a human-like upper body that uses co-speech gesturing in social human-agent interaction. This includes studies that use human participants to evaluate performance. We found that most studies use a within-subject design and rely on a form of subjective evaluation, but without a systematic approach. We argue that the field requires more rigorous and uniform tools for co-speech gesture evaluation, and formulate recommendations for empirical evaluation, including standardized phrases and example scenarios to help systematically test generative models across studies. We also propose a checklist that can be used to report relevant information for the evaluation of generative models, as well as to evaluate co-speech gesture use.

Original language: English
Pages (from-to): 379-389
Number of pages: 11
Journal: IEEE Transactions on Human-Machine Systems
Volume: 52
Issue number: 3
Publication status: Published - Jun 2022

Keywords

  • Avatars
  • Data mining
  • Databases
  • Human–computer interface
  • Human–robot interaction
  • Measurement
  • Neural networks
  • Protocols
  • Social robotics
  • Systematics
  • Virtual interaction
