Sentiment-based candidate selection for NMT

Alexander Jones, Derry Wijaya

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

The explosion of user-generated content (UGC), e.g. social media posts, comments, and reviews, has motivated the development of NLP applications tailored to these types of informal texts. Prevalent among these applications have been sentiment analysis and machine translation (MT). Grounded in the observation that UGC features highly idiomatic, sentiment-charged language, we propose a decoder-side approach that incorporates automatic sentiment scoring into the MT candidate selection process. We train monolingual sentiment classifiers in English and Spanish, in addition to a multilingual sentiment model, by fine-tuning BERT and XLM-RoBERTa. Using n-best candidates generated by a baseline MT model with beam search, we select the candidate that minimizes the absolute difference between the sentiment score of the source sentence and that of the translation, and perform two human evaluations to assess the produced translations. Unlike previous work, we select this minimally divergent translation by considering the sentiment scores of the source sentence and translation on a continuous interval, rather than using e.g. binary classification, allowing for more fine-grained selection of translation candidates. The results of human evaluations show that, in comparison to the open-source MT baseline model on top of which our sentiment-based pipeline is built, our pipeline produces more accurate translations of colloquial, sentiment-heavy source texts.
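The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_candidate`, the `[0, 1]` score range, the tie-breaking behaviour, and the lookup-table scorer `toy_score` (a stand-in for a fine-tuned sentiment classifier) are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def select_candidate(
    source_score: float,
    candidates: Sequence[str],
    score_target: Callable[[str], float],
) -> str:
    """Return the candidate whose sentiment score is closest to the source's.

    Scores are treated as continuous values rather than binary labels.
    """
    if not candidates:
        raise ValueError("need at least one translation candidate")
    # min() returns the first of any tied candidates, so ties fall back to
    # the order of the n-best list (an assumption, not stated in the paper).
    return min(candidates, key=lambda c: abs(source_score - score_target(c)))


# Hypothetical scores standing in for a target-language sentiment model;
# the [0, 1] range is an assumption.
TOY_SCORES = {
    "The movie was fine.": 0.55,
    "The movie was amazing!": 0.95,
    "The movie was terrible.": 0.05,
}


def toy_score(text: str) -> float:
    return TOY_SCORES[text]


# n-best list as it might come out of beam search, best-first.
n_best = list(TOY_SCORES)

# Assume the source sentence was scored 0.9 by the source-language model.
best = select_candidate(0.9, n_best, toy_score)
print(best)  # The movie was amazing!
```

In a real pipeline, `score_target` would wrap the target-language classifier and `source_score` would come from the source-language (or multilingual) classifier applied to the input sentence.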
Original language: English
Title of host publication: Proceedings of Machine Translation Summit XVIII
Editors: Elaine O’Curran
Place of Publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 188-201
Number of pages: 14
Publication status: Published - 2021
Externally published: Yes
Event: Machine Translation Summit 2021, United States of America
Duration: 16 Aug 2021 to 20 Aug 2021
https://aclanthology.org/2021.mtsummit-research.0/ (Proceedings)
https://amtaweb.org/mt-summit2021/ (Website)

Conference

Conference: Machine Translation Summit 2021
Abbreviated title: MT Summit XVIII
Country/Territory: United States of America
Period: 16/08/21 to 20/08/21