Statistical multimodal integration for intelligent HCI

Lizhong Wu, Sharon L. Oviatt, Philip R. Cohen

Research output: Contribution to conference › Paper › peer-review

4 Citations (Scopus)

Abstract

This paper presents a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of parallel input signals involved in the multimodal system. We first derive the performance bounds of multimodal recognition probabilities, and identify the primary factors that influence multimodal recognition performance. We then develop a technique, a Members-Teams-Committee (MTC) recognition approach, designed to optimize accurate recognition during the multimodal integration process. We evaluate these methods using QuickSet, a speech/gesture multimodal system, and report evaluation results based on an empirical corpus collected with QuickSet. From an architectural perspective, the integration technique presented here offers enhanced robustness. It is also premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems.

Original language: English
Pages: 487-496
Number of pages: 10
Publication status: Published - 1 Dec 1999
Externally published: Yes
Event: Proceedings of the 1999 9th IEEE Workshop on Neural Networks for Signal Processing (NNSP'99) - Madison, WI, USA
Duration: 23 Aug 1999 → 25 Aug 1999

Conference

Conference: Proceedings of the 1999 9th IEEE Workshop on Neural Networks for Signal Processing (NNSP'99)
City: Madison, WI, USA
Period: 23/08/99 → 25/08/99
