Learning semantic textual similarity via topic-informed discrete latent variables

Erxin Yu, Lan Du, Yuan Jin, Zhepei Wei, Yi Chang

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

6 Citations (Scopus)

Abstract

Recently, discrete latent variable models have received a surge of interest in both Natural Language Processing (NLP) and Computer Vision (CV), attributed to their comparable performance to the continuous counterparts in representation learning, while being more interpretable in their predictions. In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization. Compared with previous models limited to local semantic contexts, our model can explore richer semantic information via topic modeling. We further boost the performance of semantic similarity by injecting the quantized representation into a transformer-based language model with a well-designed semantic-driven attention mechanism. We demonstrate, through extensive experiments across various English language datasets, that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
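The abstract refers to learning a shared latent space via vector quantization. As a rough illustration only, the core lookup step of vector quantization (mapping a continuous vector to its nearest codebook entry) can be sketched as below. This is a minimal sketch and not the authors' model: the function names `squared_distance` and `quantize`, the toy codebook, and the choice of squared Euclidean distance are all assumptions made for illustration, and the topic-informed and attention components described in the abstract are not represented.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    vector: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index and a copy of the codebook entry nearest to `vector`.

    Hypothetical helper: a trained model would learn the codebook jointly
    with an encoder; here the codebook is fixed by hand.
    """
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(vector, codebook[k]),
    )
    return index, list(codebook[index])


# Toy codebook with three 2-dimensional entries (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

index, code = quantize([0.9, 1.2], codebook)
print(index, code)  # prints: 1 [1.0, 1.0]
```

The discrete index is what makes the representation inspectable: each input is assigned to exactly one codebook entry, which can then be examined directly.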

Original language: English
Title of host publication: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Place of Publication: Stroudsburg PA USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 4937-4948
Number of pages: 12
Publication status: Published - 2022
Event: Empirical Methods in Natural Language Processing 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 to 11 Dec 2022
https://preview.aclanthology.org/emnlp-22-ingestion/volumes/2022.emnlp-main/ (Proceedings)
https://2022.emnlp.org/ (Website)

Conference

Conference: Empirical Methods in Natural Language Processing 2022
Abbreviated title: EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 to 11/12/22
