Abstract
Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence. We propose a method to alleviate this problem by augmenting NMT systems with discrete translation lexicons that efficiently encode translations of these low-frequency words. We describe a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word lexical probabilities the model should focus on. We test two methods to combine this probability with the standard NMT probability: (1) using it as a bias, and (2) linear interpolation. Experiments on two corpora show an improvement of 2.0-2.3 BLEU and 0.13-0.44 NIST score, and faster convergence time.
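The combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the lexicon is stored as a matrix with one column of target-word probabilities per source word, that the attention vector weights those columns, and that the bias variant adds the log of the lexicon probability (plus a small constant `eps`) to the decoder's pre-softmax scores. The function names, `eps`, the fixed interpolation weight `lam`, and the toy numbers are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def lexicon_probability(lex_matrix, attention):
    # lex_matrix: (V, S); column j holds lexicon probabilities of each
    # target word given source word j. attention: (S,), sums to 1.
    # The attention weights select which source words' lexical
    # probabilities contribute to the next-word distribution.
    return lex_matrix @ attention

def combine_as_bias(logits, p_lex, eps=1e-3):
    # Assumed form of the "bias" variant: add log lexicon probability
    # to the pre-softmax scores; eps avoids log(0) for unseen words.
    return softmax(logits + np.log(p_lex + eps))

def combine_by_interpolation(logits, p_lex, lam=0.5):
    # Linear interpolation between lexicon and NMT probabilities.
    # lam is a fixed hypothetical weight in this sketch.
    return lam * p_lex + (1.0 - lam) * softmax(logits)

# Toy setup: 3 target words, 2 source words (made-up values).
lex_matrix = np.array([[0.9, 0.1],
                       [0.1, 0.2],
                       [0.0, 0.7]])
attention = np.array([0.8, 0.2])
logits = np.zeros(3)  # an uninformative NMT score vector

p_lex = lexicon_probability(lex_matrix, attention)
p_bias = combine_as_bias(logits, p_lex)
p_interp = combine_by_interpolation(logits, p_lex)
```

With uninformative NMT scores, both combined distributions lean toward the target word favoured by the lexicon entry of the most strongly attended source word, which is the behaviour the abstract describes for low-frequency words.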
Original language | English |
---|---|
Title of host publication | Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing |
Editors | Jian Su, Kevin Duh, Xavier Carreras |
Place of Publication | Austin, Texas |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 1557-1567 |
Number of pages | 11 |
Publication status | Published - Nov 2016 |
Externally published | Yes |
Event | Empirical Methods in Natural Language Processing 2016, Austin, United States of America. Duration: 1 Nov 2016 → 5 Nov 2016. https://www.aclweb.org/mirror/emnlp2016/ https://www.aclweb.org/anthology/volumes/D16-1/ (Proceedings) |
Conference
Conference | Empirical Methods in Natural Language Processing 2016 |
---|---|
Abbreviated title | EMNLP 2016 |
Country/Territory | United States of America |
City | Austin |
Period | 1 Nov 2016 → 5 Nov 2016 |