Improving the understanding of spoken referring expressions through syntactic-semantic and contextual-phonetic error-correction

Ingrid Zukerman, Andisheh Partovi

    Research output: Contribution to journal › Article › Research › peer-review

    2 Citations (Scopus)

    Abstract

    Despite recent advances in automatic speech recognition, one of the main stumbling blocks to the widespread adoption of Spoken Dialogue Systems is the lack of reliability of automatic speech recognizers. In this paper, we offer a two-tier error-correction process that harnesses syntactic, semantic and pragmatic information to improve the understanding of spoken referring expressions, specifically descriptions of objects in physical spaces. A syntactic-semantic tier offers generic corrections to perceived ASR errors on the basis of syntactic expectations of a semantic model, and passes the corrected texts to a language understanding system. The output of this system, which consists of pragmatic interpretations, is then refined by a contextual-phonetic tier, which prefers interpretations that are phonetically similar to the mis-heard words. Our results, obtained on a corpus of 341 referring expressions, show that syntactic-semantic error correction significantly improves interpretation performance, and contextual-phonetic refinements yield further improvements.

    Original language: English
    Pages (from-to): 284-310
    Number of pages: 27
    Journal: Computer Speech and Language
    Volume: 46
    DOI: 10.1016/j.csl.2017.05.005
    Publication status: Published - 1 Nov 2017

    Keywords

    • Contextual-phonetic model
    • Error correction
    • Physical spaces
    • Pragmatic interpretation
    • Referring expressions
    • Spoken language understanding
    • Syntactic-semantic model

    Cite this

    @article{aaba6d1191a8468cb8d82e2b82167923,
    title = "Improving the understanding of spoken referring expressions through syntactic-semantic and contextual-phonetic error-correction",
    abstract = "Despite recent advances in automatic speech recognition, one of the main stumbling blocks to the widespread adoption of Spoken Dialogue Systems is the lack of reliability of automatic speech recognizers. In this paper, we offer a two-tier error-correction process that harnesses syntactic, semantic and pragmatic information to improve the understanding of spoken referring expressions, specifically descriptions of objects in physical spaces. A syntactic-semantic tier offers generic corrections to perceived ASR errors on the basis of syntactic expectations of a semantic model, and passes the corrected texts to a language understanding system. The output of this system, which consists of pragmatic interpretations, is then refined by a contextual-phonetic tier, which prefers interpretations that are phonetically similar to the mis-heard words. Our results, obtained on a corpus of 341 referring expressions, show that syntactic-semantic error correction significantly improves interpretation performance, and contextual-phonetic refinements yield further improvements.",
    keywords = "Contextual-phonetic model, Error correction, Physical spaces, Pragmatic interpretation, Referring expressions, Spoken language understanding, Syntactic-semantic model",
    author = "Ingrid Zukerman and Andisheh Partovi",
    year = "2017",
    month = "11",
    day = "1",
    doi = "10.1016/j.csl.2017.05.005",
    language = "English",
    volume = "46",
    pages = "284--310",
    journal = "Computer Speech and Language",
    issn = "0885-2308",
    publisher = "Elsevier",
    }

    Improving the understanding of spoken referring expressions through syntactic-semantic and contextual-phonetic error-correction. / Zukerman, Ingrid; Partovi, Andisheh.

    In: Computer Speech and Language, Vol. 46, 01.11.2017, p. 284-310.

    TY - JOUR

    T1 - Improving the understanding of spoken referring expressions through syntactic-semantic and contextual-phonetic error-correction

    AU - Zukerman, Ingrid

    AU - Partovi, Andisheh

    PY - 2017/11/1

    Y1 - 2017/11/1

    AB - Despite recent advances in automatic speech recognition, one of the main stumbling blocks to the widespread adoption of Spoken Dialogue Systems is the lack of reliability of automatic speech recognizers. In this paper, we offer a two-tier error-correction process that harnesses syntactic, semantic and pragmatic information to improve the understanding of spoken referring expressions, specifically descriptions of objects in physical spaces. A syntactic-semantic tier offers generic corrections to perceived ASR errors on the basis of syntactic expectations of a semantic model, and passes the corrected texts to a language understanding system. The output of this system, which consists of pragmatic interpretations, is then refined by a contextual-phonetic tier, which prefers interpretations that are phonetically similar to the mis-heard words. Our results, obtained on a corpus of 341 referring expressions, show that syntactic-semantic error correction significantly improves interpretation performance, and contextual-phonetic refinements yield further improvements.

    KW - Contextual-phonetic model

    KW - Error correction

    KW - Physical spaces

    KW - Pragmatic interpretation

    KW - Referring expressions

    KW - Spoken language understanding

    KW - Syntactic-semantic model

    UR - http://www.scopus.com/inward/record.url?scp=85020699823&partnerID=8YFLogxK

    U2 - 10.1016/j.csl.2017.05.005

    DO - 10.1016/j.csl.2017.05.005

    M3 - Article

    VL - 46

    SP - 284

    EP - 310

    JO - Computer Speech and Language

    JF - Computer Speech and Language

    SN - 0885-2308

    ER -