Abstract
Recurrent neural network language models (RNNLMs) have recently demonstrated vast potential in modelling long-term dependencies for NLP problems, ranging from speech recognition to machine translation. In this work, we propose methods for conditioning RNNLMs on external side information, e.g., metadata such as keywords, descriptions, document titles or topic headlines. Our experiments show consistent improvements over baseline RNNLMs on two datasets of different genres in two languages. Interestingly, we found that side information in one language can be highly beneficial in modelling texts in another language, serving as a form of cross-lingual language modelling.
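One common way to condition an RNNLM on side information, as the abstract describes, is to embed the metadata (e.g., a keyword or topic headline) and feed that vector into the recurrence alongside each word embedding. The following is a minimal sketch of that idea, not the authors' implementation: all sizes, weights, and names are illustrative assumptions, and concatenating the side vector to the input at every timestep is just one plausible conditioning scheme.

```python
import numpy as np

# Illustrative sketch of an Elman-style RNN LM step whose input at each
# timestep is the word embedding concatenated with a fixed side-information
# vector (e.g., an embedded keyword or title). Sizes are toy values.
rng = np.random.default_rng(0)

V, E, S, H = 20, 8, 4, 16  # vocab, word-emb, side-emb, hidden sizes

emb = rng.normal(scale=0.1, size=(V, E))        # word embedding table
W_in = rng.normal(scale=0.1, size=(H, E + S))   # (word + side) -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))       # hidden -> hidden
W_out = rng.normal(scale=0.1, size=(V, H))      # hidden -> vocab logits

def softmax(z):
    z = z - z.max()               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def rnnlm_step(word_id, h, side_vec):
    """One timestep, conditioned on side_vec via input concatenation."""
    x = np.concatenate([emb[word_id], side_vec])
    h_new = np.tanh(W_in @ x + W_hh @ h)
    probs = softmax(W_out @ h_new)  # next-word distribution
    return h_new, probs

side = rng.normal(scale=0.1, size=S)  # e.g. embedding of a topic headline
h = np.zeros(H)
for w in [3, 7, 1]:                   # a toy word-id sequence
    h, probs = rnnlm_step(w, h, side)
```

Because the side vector enters every step, the model can bias its next-word distribution toward the metadata's topic throughout the sequence; alternatives such as initialising the hidden state from the side vector would serve the same purpose.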
Original language | English |
---|---|
Title of host publication | The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016) |
Subtitle of host publication | Proceedings of the Conference, June 12-17, 2016, San Diego, California, USA |
Editors | Ani Nenkova, Owen Rambow |
Place of Publication | Stroudsburg, PA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 1250-1255 |
Number of pages | 6 |
ISBN (Print) | 9781941643914 |
Publication status | Published - 2016 |
Event | North American Association for Computational Linguistics 2016, Sheraton San Diego Hotel & Marina, San Diego, United States of America; duration: 12 Jun 2016 → 17 Jun 2016; conference number: 15th; http://naacl.org/naacl-hlt-2016/ |
Conference
Conference | North American Association for Computational Linguistics 2016 |
---|---|
Abbreviated title | NAACL HLT 2016 |
Country/Territory | United States of America |
City | San Diego |
Period | 12/06/16 → 17/06/16 |
Internet address | http://naacl.org/naacl-hlt-2016/ |