Adaptive Knowledge Sharing in Multi-Task Learning: Improving Low-Resource Neural Machine Translation

Poorya Zaremoodi, Wray Buntine, Gholamreza Haffari

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    26 Citations (Scopus)


    Neural Machine Translation (NMT) is notorious for its need for large amounts of bilingual data. An effective approach to compensate for this requirement is Multi-Task Learning (MTL), which leverages different linguistic resources as a source of inductive bias. Current MTL architectures are based on Seq2Seq transduction and (partially) share different components of the models among the tasks. However, this MTL approach often suffers from task interference and is not able to fully capture commonalities among subsets of tasks. We address this issue by extending the recurrent units with multiple blocks along with a trainable routing network. The routing network enables adaptive collaboration by dynamic sharing of blocks conditioned on the task at hand, the input, and the model state. Empirical evaluation on two low-resource translation tasks, English to Vietnamese and English to Farsi, shows +1 BLEU score improvements compared to strong baselines.
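The routing idea described in the abstract can be sketched as follows: a recurrent unit is split into several parallel blocks, and a small router produces soft weights over those blocks conditioned on the task, the current input, and the previous hidden state. This is a minimal, hypothetical sketch in NumPy (all dimensions, parameter names, and the feed-forward stand-ins for recurrent blocks are illustrative assumptions, not the paper's actual architecture or trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes; the paper's real dimensions are not given here.
d_in, d_hid, d_task, n_blocks, n_tasks = 8, 16, 4, 3, 2

# Each "block" is a simple tanh transform standing in for one of the
# paper's recurrent-unit blocks (a deliberate simplification).
blocks = [rng.standard_normal((d_hid, d_in + d_hid)) * 0.1
          for _ in range(n_blocks)]

# Task embeddings and router parameters; in the paper these would be
# trained end-to-end, here they are random for illustration.
task_emb = rng.standard_normal((n_tasks, d_task)) * 0.1
router_w = rng.standard_normal((n_blocks, d_in + d_hid + d_task)) * 0.1

def step(x, h, task_id):
    """One recurrent step: route among blocks conditioned on the task,
    the input, and the previous state, then mix the block outputs."""
    route_in = np.concatenate([x, h, task_emb[task_id]])
    weights = softmax(router_w @ route_in)   # soft routing over blocks
    inp = np.concatenate([x, h])
    outs = np.stack([np.tanh(W @ inp) for W in blocks])
    return weights @ outs                    # weighted mixture of blocks

h = np.zeros(d_hid)
for _ in range(5):
    h = step(rng.standard_normal(d_in), h, task_id=0)
print(h.shape)  # (16,)
```

Because the routing weights depend on the input and state at every step, different tasks (and even different time steps within one task) can draw on different subsets of blocks, which is the adaptive sharing the abstract refers to.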
    Original language: English
    Title of host publication: ACL 2018 - The 56th Annual Meeting of the Association for Computational Linguistics
    Subtitle of host publication: Proceedings of the Conference, Vol. 2 (Short Papers)
    Editors: Iryna Gurevych, Yusuke Miyao
    Place of publication: Stroudsburg PA USA
    Publisher: Association for Computational Linguistics (ACL)
    Number of pages: 6
    ISBN (Print): 9781948087346
    Publication status: Published - 2018
    Event: Annual Meeting of the Association for Computational Linguistics 2018 - Melbourne, Australia
    Duration: 15 Jul 2018 – 20 Jul 2018
    Conference number: 56th


    Conference: Annual Meeting of the Association for Computational Linguistics 2018
    Abbreviated title: ACL 2018
