Theoretical foundations of multimodal interfaces and systems

Sharon Oviatt

Research output: Chapter in Book/Report/Conference proceeding; Chapter (Book); Other; peer-review


This chapter discusses the theoretical foundations of multisensory perception and multimodal communication. It provides a basis for understanding the performance advantages of multimodal interfaces, as well as how to design them to reap these advantages. Historically, the major theories that have influenced contemporary views of multimodal interaction and interface design include Gestalt theory, Working Memory theory, and Activity theory. They include perception-action dynamic theories and also limited resource theories that focus on constraints involving attention and short-term memory. This chapter emphasizes these theories in part because they are supported heavily by neuroscience findings. Their predictions also have been corroborated by studies on multimodal human-computer interaction. In addition to summarizing these three main theories and their impact, several related theoretical frameworks will be described that have influenced multimodal interface design, including Multiple Resource theory, Cognitive Load theory, Embodied Cognition, Communication Accommodation theory, and Affordance theory.

The large and multidisciplinary body of research on multisensory perception, production, and multimodal interaction confirms many Gestalt, Working Memory, and Activity theory predictions that will be discussed in this chapter. These theories provide conceptual anchors. They create a path for understanding how to design more powerful systems, so we can gain better control over our own future. In spite of this, it is surprising how many systems are developed from an engineering perspective that is sophisticated, yet in a complete theoretical vacuum that Leonardo da Vinci would have ridiculed:

Those who fall in love with practice without science are like a sailor who enters a ship without helm or compass, and who never can be certain whither he is going. Richter and Wells [2008]

This chapter aims to provide a better basis for motivating and accelerating future multimodal system design, and the quality of its impact on human users. For a definition of highlighted terms in this chapter, see the Glossary. For other related terms and concepts, also see the textbook on multimodal interfaces by [Oviatt and Cohen 2015]. Focus Questions to aid comprehension are available at the end of this chapter.
Original language: English
Title of host publication: The Handbook of Multimodal-Multisensor Interfaces, Volume 1
Subtitle of host publication: Foundations, User Modeling, and Common Modality Combinations
Editors: Sharon Oviatt, Bjorn Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos, Antonio Kruger
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 32
ISBN (Electronic): 9781970001655, 9781970001662
ISBN (Print): 9781970001679, 9781970001648
Publication status: Published - 2017
Externally published: Yes
