Abstract
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a discourse-informed language model, which improves over a strong LSTM baseline.
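The abstract's core idea, conditioning a word-level RNN on a discrete discourse relation that can be predicted or marginalized out, can be illustrated with a minimal sketch. The snippet below is a hypothetical PyTorch reconstruction, not the authors' released code: the class name `DiscourseRelationLM`, the conditioning scheme (a relation embedding concatenated to each word embedding), and all parameter names are illustrative assumptions; the paper's exact parameterization may differ.

```python
# Hypothetical sketch (not the authors' code): a sentence-level LSTM
# language model conditioned on a discrete discourse relation z, which
# can be classified (argmax over z) or marginalized out for LM scoring.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscourseRelationLM(nn.Module):
    def __init__(self, vocab_size, num_relations, emb=64, hid=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb)
        self.rel_emb = nn.Embedding(num_relations, emb)  # one vector per relation
        self.lstm = nn.LSTM(2 * emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)            # p(word | history, z)
        self.rel_prior = nn.Linear(hid, num_relations)   # p(z | previous context)
        self.num_relations = num_relations

    def sentence_logprob(self, words, context, z):
        """log p(words | context, z); words: (B, T) token ids,
        context: (B, hid) summary of the previous sentence, z: (B,) ids."""
        T = words.size(1) - 1                            # predict words[1:] from words[:-1]
        w = self.word_emb(words[:, :-1])                 # (B, T, emb)
        r = self.rel_emb(z).unsqueeze(1).expand(-1, T, -1)
        h0 = context.unsqueeze(0)                        # init hidden state from context
        h, _ = self.lstm(torch.cat([w, r], dim=-1), (h0, torch.zeros_like(h0)))
        logp = F.log_softmax(self.out(h), dim=-1)
        tgt = words[:, 1:].unsqueeze(-1)
        return logp.gather(-1, tgt).squeeze(-1).sum(dim=1)  # (B,)

    def marginal_logprob(self, words, context):
        """log p(words | ctx) = logsumexp_z [log p(z|ctx) + log p(words|ctx,z)]."""
        prior = F.log_softmax(self.rel_prior(context), dim=-1)     # (B, Z)
        B = words.size(0)
        per_z = torch.stack(
            [self.sentence_logprob(words, context,
                                   torch.full((B,), z, dtype=torch.long))
             for z in range(self.num_relations)], dim=-1)          # (B, Z)
        return torch.logsumexp(prior + per_z, dim=-1)              # (B,)

# Usage: score a sentence while summing over all candidate relations.
model = DiscourseRelationLM(vocab_size=1000, num_relations=8)
words = torch.randint(0, 1000, (2, 12))        # batch of 2 sentences, 12 tokens
context = torch.zeros(2, 128)                  # stand-in for prior-sentence encoding
ll = model.marginal_logprob(words, context)    # (2,) marginal log-likelihoods
```

Under these assumptions, training with observed relations would maximize `sentence_logprob` plus the relation prior term, while `marginal_logprob` sums the relation out exactly for language modeling; this is tractable because the discourse relation inventory is small.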
Original language | English |
---|---|
Title of host publication | The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016) |
Subtitle of host publication | Proceedings of the Conference, June 12-17, 2016, San Diego, California, USA |
Editors | Ani Nenkova, Owen Rambow |
Place of Publication | Stroudsburg, PA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 332-342 |
Number of pages | 11 |
ISBN (Print) | 9781941643914 |
Publication status | Published - 2016 |
Event | North American Association for Computational Linguistics 2016 (15th conference), Sheraton San Diego Hotel & Marina, San Diego, United States of America. Duration: 12 Jun 2016 → 17 Jun 2016. http://naacl.org/naacl-hlt-2016/ |
Conference
Conference | North American Association for Computational Linguistics 2016 |
---|---|
Abbreviated title | NAACL HLT 2016 |
Country/Territory | United States of America |
City | San Diego |
Period | 12/06/16 → 17/06/16 |
Internet address | http://naacl.org/naacl-hlt-2016/ |