Decoding as dynamic programming for recurrent autoregressive models

Najam Zaidi, Trevor Cohn, Reza Haffari

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review


Decoding in autoregressive models (ARMs) consists of searching for a high-scoring output sequence under the trained model. Standard decoding methods, based on a unidirectional greedy algorithm or beam search, are suboptimal due to error propagation and myopic decisions that do not account for future steps in the generation process. In this paper we present a novel decoding approach based on the method of auxiliary coordinates (Carreira-Perpinan & Wang, 2014) to address these shortcomings. Our method introduces discrete variables for output tokens, and auxiliary continuous variables representing the states of the underlying ARM. The auxiliary variables lead to a factor graph approximation of the ARM, whose maximum a posteriori (MAP) solution is found exactly using dynamic programming. The MAP solution is then used to recreate an improved factor graph approximation of the ARM via updated auxiliary variables. We then extend our approach to decode with an ensemble of ARMs, possibly with different generation orders, which is out of reach for standard unidirectional decoding algorithms. Experiments on the text infilling task over the SWAG and Daily Dialogue datasets show that our decoding method is superior to strong competing decoding methods.
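To illustrate the exact-MAP step the abstract refers to, the sketch below runs Viterbi-style dynamic programming on a simple chain-structured factor graph. This is only a minimal, assumed stand-in: the `unary` and `pairwise` score arrays are hypothetical placeholders for the factors the paper derives from the auxiliary continuous variables, not the authors' actual factorization.

```python
import numpy as np

def viterbi_map(unary, pairwise):
    """Exact MAP decoding on a chain factor graph via dynamic programming.

    unary:    (T, V) array - per-position token scores (illustrative stand-in
              for factors built from the auxiliary ARM states)
    pairwise: (V, V) array - transition scores between adjacent tokens
    Returns the highest-scoring token sequence and its total score.
    """
    T, V = unary.shape
    score = unary[0].copy()            # best score ending in each token at t=0
    back = np.zeros((T, V), dtype=int)  # backpointers for path recovery
    for t in range(1, T):
        # cand[i, j]: best sequence ending in token i at t-1, extended with j at t
        cand = score[:, None] + pairwise + unary[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # Backtrace the argmax path
    seq = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        seq.append(int(back[t, seq[-1]]))
    return seq[::-1], float(score.max())

# Toy example: 3 positions, vocabulary of 2 tokens
unary = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
pairwise = np.array([[0.0, 0.5], [0.5, 0.0]])
path, best_score = viterbi_map(unary, pairwise)
```

Because the factor graph is a chain, this recovers the globally optimal sequence in O(T V^2) time, in contrast to greedy or beam search, which may commit to locally good but globally suboptimal prefixes.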
Original language: English
Title of host publication: International Conference on Learning Representations, ICLR 2020
Editors: Shakir Mohamed
Place of Publication: Portland OR USA
Number of pages: 11
Publication status: Published - 2020
Event: International Conference on Learning Representations 2020 - Virtual, Addis Ababa, Ethiopia
Duration: 26 Apr 2020 to 30 Apr 2020
Conference number: 8th


Conference: International Conference on Learning Representations 2020
Abbreviated title: ICLR 2020
City: Addis Ababa
