Multimodal signal processing in naturalistic noisy environments

Sharon Oviatt

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

15 Citations (Scopus)

Abstract

When a system must process spoken language in natural environments that involve different types and levels of noise, the problem of supporting robust recognition is a very difficult one. In the present studies, over 2,600 multimodal utterances were collected during both mobile and stationary use of a multimodal pen/voice system. The results confirmed that multimodal signal processing supports significantly improved robustness over spoken language processing alone, with the largest improvement during mobile use. The multimodal architecture decreased the spoken language error rate by 19-35%. In addition, data collected on a command-by-command basis while users were mobile emphasized the adverse impact of users' Lombard adaptation on system processing, even when a noise-canceling microphone was used. Implications of these findings are discussed for improving the reliability and stability of spoken language processing in mobile environments.

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association (ISCA)
Number of pages: 4
ISBN (Electronic): 7801501144, 9787801501141
Publication status: Published - 2000
Externally published: Yes
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 16 Oct 2000 to 20 Oct 2000

Conference

Conference: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country/Territory: China
City: Beijing
Period: 16/10/00 to 20/10/00
