Can domains be transferred across languages in multi-domain multilingual Neural Machine Translation?

Thuy Trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, Gholamreza Haffari

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-reviewed

1 Citation (Scopus)

Abstract

Previous work has mostly focused on either the multilingual or the multi-domain aspect of neural machine translation (NMT). This paper investigates whether domain information can be transferred across languages when multi-domain and multilingual NMT are combined, particularly under the incomplete data condition where in-domain bitext is missing for some language pairs. Our results in curated leave-one-domain-out experiments show that multi-domain multilingual (MDML) NMT can boost zero-shot translation performance by up to +10 BLEU, as well as aid the generalisation of multi-domain NMT to the missing domain. We also explore strategies for effectively integrating multilingual and multi-domain NMT, including combining language and domain tags and auxiliary-task training. We find that learning domain-aware representations and adding target-language tags to the encoder lead to effective MDML-NMT.
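As a hedged illustration of the tag-combination strategy the abstract describes, the sketch below prepends a target-language tag and a domain tag to the source text before it is fed to the encoder. The tag formats (`<2de>`, `<med>`) and the function name are assumptions for illustration, not taken from the paper or any particular toolkit.

```python
# Minimal sketch of tag-based input preparation for MDML-NMT.
# Assumes an NMT toolkit that consumes plain tokenized text; the tag
# spellings below are illustrative placeholders, not the paper's exact scheme.

def add_tags(src_sentence: str, tgt_lang: str, domain: str) -> str:
    """Prepend a target-language tag and a domain tag on the source side,
    so the encoder sees both the language and the domain signal."""
    return f"<2{tgt_lang}> <{domain}> {src_sentence}"

print(add_tags("the patient was discharged", "de", "med"))
# -> <2de> <med> the patient was discharged
```

In such schemes the tags are ordinary vocabulary items, so the model learns their meaning purely from the training distribution; the paper compares where such tags are attached (e.g., to the encoder input) and how they interact with domain-aware representation learning.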

Original language: English
Title of host publication: WMT 2022 - Seventh Conference on Machine Translation - Proceedings of the Conference
Place of Publication: Stroudsburg PA USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 381-396
Number of pages: 16
ISBN (Electronic): 9781959429296
Publication status: Published - 2022
Event: Conference on Machine Translation 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 - 8 Dec 2022
Conference number: 7th
https://aclanthology.org/volumes/2022.wmt-1/ (Proceedings)

Conference

Conference: Conference on Machine Translation 2022
Abbreviated title: WMT 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 - 8/12/22