Human perception of intended addressee during computer-assisted meetings

Rebecca Lunsford, Sharon Oviatt

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

15 Citations (Scopus)

Abstract

Recent research aims to develop new open-microphone engagement techniques capable of identifying when a speaker is addressing a computer versus a human partner, including during computer-assisted group interactions. The present research explores: (1) how accurately people can judge whether an intended interlocutor is a human versus a computer, (2) which linguistic, acoustic-prosodic, and visual information sources they use to make these judgments, and (3) what types of systematic errors are present in their judgments. Sixteen participants were asked to determine a speaker's intended addressee based on actual videotaped utterances matched on illocutionary force, which were played back as: (1) lexical transcriptions only, (2) audio only, (3) visual only, and (4) combined audio-visual information. Perhaps surprisingly, people's accuracy in judging human versus computer addressees did not exceed chance levels with lexical-only content (46%). As predicted, accuracy improved significantly with audio (58%), visual (57%), and especially audio-visual information (63%). Overall, accuracy in detecting human interlocutors was significantly worse than in judging computer ones, and specifically worse when only visual information was present, because speakers often looked at the computer when addressing peers. In contrast, accuracy in judging computer interlocutors was significantly better whenever visual information was present than with audio alone, and it yielded the highest accuracy levels observed (86%). Questionnaire data also revealed that speakers' gaze, peers' gaze, and tone of voice were considered the most valuable information sources. These results reveal that people rely on cues appropriate for interpersonal interactions when determining computer- versus human-directed speech during mixed human-computer interactions, even though this degrades their accuracy. Future systems that process actual rather than expected communication patterns could potentially be designed to outperform human judges.

Original language: English
Title of host publication: ICMI '06 - Proceedings of the 8th international conference on Multimodal interfaces
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 20-27
Number of pages: 8
ISBN (Electronic): 159593541X
ISBN (Print): 9781595935410
DOIs
Publication status: Published - 2006
Externally published: Yes
Event: International Conference on Multimodal Interfaces 2006 - Banff, AB, Canada
Duration: 2 Nov 2006 - 4 Nov 2006
Conference number: 8th

Conference

Conference: International Conference on Multimodal Interfaces 2006
Abbreviated title: ICMI'06
Country: Canada
City: Banff, AB
Period: 2/11/06 - 4/11/06

Keywords

  • Acoustic-prosodic cues
  • Dialogue style
  • Gaze
  • Human-computer teamwork
  • Intended addressee
  • Multiparty interaction
  • Open-microphone engagement

Cite this

Lunsford, R., & Oviatt, S. (2006). Human perception of intended addressee during computer-assisted meetings. In ICMI '06 - Proceedings of the 8th international conference on Multimodal interfaces (pp. 20-27). Association for Computing Machinery (ACM). https://doi.org/10.1145/1180995.1181002