Leveraging meta information in short text aggregation

He Zhao, Lan Du, Guanfeng Liu, Wray Buntine

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Short texts such as tweets often contain insufficient word co-occurrence information for training conventional topic models. To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. Our model can generate more interpretable topics as well as document clusters. We develop an effective Gibbs sampling algorithm favoured by the fully local conjugacy in the model. Extensive experiments demonstrate that our model achieves better performance in terms of document clustering and topic coherence.
Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Place of Publication: Florence, Italy
Publisher: Association for Computational Linguistics (ACL)
Pages: 4042-4049
Number of pages: 8
ISBN (Electronic): 9781950737482
DOIs: https://doi.org/10.18653/v1/P19-1396
Publication status: Published - Jul 2019
Event: Annual Meeting of the Association for Computational Linguistics 2019 - Florence, Italy
Duration: 28 Jul 2019 to 2 Aug 2019
Conference number: 57th
http://www.acl2019.org/EN/index.xhtml

Conference

Conference: Annual Meeting of the Association for Computational Linguistics 2019
Abbreviated title: ACL 2019
Country: Italy
City: Florence
Period: 28/07/19 to 2/08/19
Internet address: http://www.acl2019.org/EN/index.xhtml

Keywords

  • Clustering algorithms
  • Short texts
  • Topic Models

Cite this

Zhao, H., Du, L., Liu, G., & Buntine, W. (2019). Leveraging meta information in short text aggregation. In A. Korhonen, D. Traum, & L. Màrquez (Eds.), Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 4042-4049). [P19-1396] Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/P19-1396