Topic model or topic twaddle? Re-evaluating semantic interpretability measures

Caitlin Doogan, Wray Buntine

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

72 Citations (Scopus)

Abstract

When developing topic models, a critical question that should be asked is: How well will this model work in an applied setting? Because standard performance evaluation of topic interpretability uses automated measures modeled on human evaluation tests that are dissimilar to applied usage, these models’ generalizability remains in question. In this paper, we probe the issue of validity in topic model evaluation and assess how informative coherence measures are for specialized collections used in an applied setting. Informed by the literature, we propose four understandings of interpretability. We evaluate these using a novel experimental framework reflective of varied applied settings, including human evaluations using open labeling, typical of applied research. These evaluations show that for some specialized collections, standard coherence measures may not inform the most appropriate topic model or the optimal number of topics, and current interpretability performance validation methods are challenged as a means to confirm model quality in the absence of ground truth data.

Original language: English
Title of host publication: NAACL-HLT 2021, The 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
Editors: Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Place of Publication: Stroudsburg PA USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 3824-3848
Number of pages: 25
ISBN (Electronic): 9781954085466
Publication status: Published - 2021
Event: North American Association for Computational Linguistics 2021 - Online, United States of America
Duration: 6 Jun 2021 to 11 Jun 2021
https://2021.naacl.org (Website)
https://www.aclweb.org/anthology/volumes/2021.naacl-main/ (Proceedings)

Conference

Conference: North American Association for Computational Linguistics 2021
Abbreviated title: NAACL-HLT 2021
Country/Territory: United States of America
Period: 6/06/21 to 11/06/21