Adaptive knowledge sharing in multi-task learning: improving low-resource neural machine translation

Poorya Zaremoodi, Wray Buntine, Gholamreza Haffari

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    Abstract

    Neural Machine Translation (NMT) is notorious for its need for large amounts of bilingual data. An effective approach to compensate for this requirement is Multi-Task Learning (MTL), which leverages different linguistic resources as a source of inductive bias. Current MTL architectures are based on SEQ2SEQ transduction and (partially) share different components of the models among the tasks. However, this MTL approach often suffers from task interference and is not able to fully capture commonalities among subsets of tasks. We address this issue by extending the recurrent units with multiple blocks along with a trainable routing network. The routing network enables adaptive collaboration by dynamic sharing of blocks conditioned on the task at hand, the input, and the model state. Empirical evaluation on two low-resource translation tasks, English to Vietnamese and English to Farsi, shows +1 BLEU score improvements compared to strong baselines.
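    The abstract describes recurrent units extended with multiple blocks and a trainable routing network that mixes the blocks conditioned on the task, the input, and the model state. The following is a minimal sketch of that idea only, not the authors' implementation: block count, cell form (a plain tanh-RNN per block), and all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class RoutedRecurrentUnit:
    """Recurrent unit with several blocks mixed by a routing network.

    Hypothetical sketch: each block is a simple tanh-RNN cell, and the
    router produces soft mixture weights from (task embedding, input,
    previous hidden state), as the abstract describes.
    """

    def __init__(self, input_dim, hidden_dim, num_blocks, num_tasks):
        # One tanh-RNN cell per block: h' = tanh(W x + U h + b).
        self.W = rng.normal(0, 0.1, (num_blocks, hidden_dim, input_dim))
        self.U = rng.normal(0, 0.1, (num_blocks, hidden_dim, hidden_dim))
        self.b = np.zeros((num_blocks, hidden_dim))
        # Routing network: [task embedding; input; state] -> block logits.
        self.task_emb = rng.normal(0, 0.1, (num_tasks, hidden_dim))
        self.R = rng.normal(
            0, 0.1, (num_blocks, hidden_dim + input_dim + hidden_dim)
        )

    def step(self, x, h, task_id):
        router_in = np.concatenate([self.task_emb[task_id], x, h])
        weights = softmax(self.R @ router_in)  # one weight per block
        # All blocks run; their outputs are shared softly via the router.
        block_outs = np.tanh(self.W @ x + self.U @ h + self.b)
        h_new = (weights[:, None] * block_outs).sum(axis=0)
        return h_new, weights

# Illustrative usage: one recurrent step for task 0.
unit = RoutedRecurrentUnit(input_dim=4, hidden_dim=8, num_blocks=3, num_tasks=2)
h = np.zeros(8)
x = rng.normal(size=4)
h, w = unit.step(x, h, task_id=0)
```

    Because the routing weights depend on the task embedding as well as the input and state, different tasks can share some blocks while keeping others effectively private, which is the mechanism the paper proposes against task interference.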
    Original language: English
    Title of host publication: ACL 2018 - The 56th Annual Meeting of the Association for Computational Linguistics
    Subtitle of host publication: Proceedings of the Conference, Vol. 2 (Short Papers)
    Editors: Iryna Gurevych, Yusuke Miyao
    Place of Publication: Stroudsburg PA USA
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 656-661
    Number of pages: 6
    ISBN (Print): 9781948087346
    Publication status: Published - 2018
    Event: Annual Meeting of the Association for Computational Linguistics 2018 - Melbourne, Australia
    Duration: 15 Jul 2018 - 20 Jul 2018
    Conference number: 56th
    https://aclanthology.info/events/acl-2018

    Conference

    Conference: Annual Meeting of the Association for Computational Linguistics 2018
    Abbreviated title: ACL 2018
    Country: Australia
    City: Melbourne
    Period: 15/07/18 - 20/07/18
    Internet address: https://aclanthology.info/events/acl-2018

    Cite this

    Zaremoodi, P., Buntine, W., & Haffari, G. (2018). Adaptive knowledge sharing in multi-task learning: improving low-resource neural machine translation. In I. Gurevych, & Y. Miyao (Eds.), ACL 2018 - The 56th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference, Vol. 2 (Short Papers) (pp. 656-661). Stroudsburg PA USA: Association for Computational Linguistics (ACL).