Multimodal integration - A statistical view

Lizhong Wu, Sharon L. Oviatt, Philip R. Cohen

Research output: Contribution to journal, Article, Research, peer-review

122 Citations (Scopus)

Abstract

This paper presents a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of parallel input signals involved in the multimodal system. We first identify the primary factors that influence multimodal recognition performance by evaluating the multimodal recognition probabilities. We then develop two techniques, an estimate approach and a learning approach, which are designed to optimize accurate recognition during the multimodal integration process. We evaluate these methods using QuickSet, a speech/gesture multimodal system, and report evaluation results based on an empirical corpus collected with QuickSet. From an architectural perspective, the integration technique presented here offers enhanced robustness. It also is premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems.

Original language: English
Pages (from-to): 334-341
Number of pages: 8
Journal: IEEE Transactions on Multimedia
Volume: 1
Issue number: 4
Publication status: Published - 1 Dec 1999
Externally published: Yes

Keywords

  • Combination of multiple classifiers
  • Decision making
  • Gesture recognition
  • Learning
  • Multimodal integration
  • Speech recognition
  • Uncertainty
