Leveraging meta information in short text aggregation

He Zhao, Lan Du, Guanfeng Liu, Wray Buntine

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

1 Citation (Scopus)


Short texts such as tweets often contain insufficient word co-occurrence information for training conventional topic models. To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. Our model can generate more interpretable topics as well as document clusters. We develop an effective Gibbs sampling algorithm favoured by the fully local conjugacy in the model. Extensive experiments demonstrate that our model achieves better performance in terms of document clustering and topic coherence.
Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Place of Publication: Florence, Italy
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 8
ISBN (Electronic): 9781950737482
Publication status: Published - Jul 2019
Event: Annual Meeting of the Association of Computational Linguistics 2019 - Florence, Italy
Duration: 28 Jul 2019 to 2 Aug 2019
Conference number: 57th
https://www.aclweb.org/anthology/events/acl-2019/ (Proceedings)


Conference: Annual Meeting of the Association of Computational Linguistics 2019
Abbreviated title: ACL 2019


  • Clustering algorithms
  • Short texts
  • Topic Models
