Dynamic choice of state abstraction in Q-learning

Marco Tamassia, Fabio Zambetta, William L. Raffe, Florian 'Floyd' Mueller, Xiaodong Li

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Q-learning associates states and actions of a Markov Decision Process with expected future reward through online learning. In practice, however, when the state space is large and experience is still limited, the algorithm will not find a match between the current state and past experience unless some of the details describing states are ignored. On the other hand, reducing state information hurts long-term performance because decisions must then be made on less informative inputs. We propose a variation of Q-learning that gradually enriches state descriptions once enough experience has accumulated. This is coupled with an ad hoc exploration strategy that aims to collect the key information that allows the algorithm to enrich state descriptions earlier. Experimental results obtained by applying our algorithm to the arcade game Pac-Man show that our approach significantly outperforms Q-learning during the learning process while not penalizing long-term performance.
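
The idea of moving from coarse to richer state descriptions as experience grows can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class name MultiAbstractionQ, the visit-count threshold min_visits, and the rule "use the finest abstraction that has been visited often enough" are all assumptions made for illustration, and the exploration strategy mentioned in the abstract is left out entirely.

```python
from collections import defaultdict


class MultiAbstractionQ:
    """Tabular Q-learning over several state abstractions, ordered coarse to fine.

    Hypothetical sketch: the visit-count rule below is an assumption,
    not the selection criterion used in the paper.
    """

    def __init__(self, abstractions, actions, alpha=0.1, gamma=0.9, min_visits=5):
        self.abstractions = abstractions  # functions: raw state -> hashable key
        self.actions = actions
        self.alpha = alpha                # learning rate
        self.gamma = gamma                # discount factor
        self.min_visits = min_visits      # assumed trust threshold
        # One Q-table and one visit counter per abstraction level.
        self.q = [defaultdict(float) for _ in abstractions]
        self.visits = [defaultdict(int) for _ in abstractions]

    def level(self, state):
        """Finest level whose abstract state has enough visits (0 if none)."""
        chosen = 0
        for i, phi in enumerate(self.abstractions):
            if self.visits[i].get(phi(state), 0) >= self.min_visits:
                chosen = i
        return chosen

    def value(self, state, action):
        """Q-value read from the currently trusted abstraction level."""
        i = self.level(state)
        return self.q[i][(self.abstractions[i](state), action)]

    def greedy_action(self, state):
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning target, bootstrapped from the trusted level.
        best_next = max(self.value(next_state, a) for a in self.actions)
        target = reward + self.gamma * best_next
        # Update every level so finer tables are ready once they are trusted.
        for i, phi in enumerate(self.abstractions):
            key = (phi(state), action)
            self.q[i][key] += self.alpha * (target - self.q[i][key])
            self.visits[i][phi(state)] += 1


# Usage: a coarse abstraction keeps only the first feature, a fine one keeps all.
agent = MultiAbstractionQ(
    [lambda s: s[0], lambda s: s], actions=["left", "right"], min_visits=2
)
agent.update((1, 7), "right", 1.0, (2, 0))
agent.update((1, 7), "right", 1.0, (2, 0))
```

After these two updates the agent reads values for state (1, 7) from the fine table, while the unseen state (1, 8) still falls back to the coarse table and so inherits what was learned for every state whose first feature is 1. That fallback is the trade-off the abstract describes: coarse descriptions generalize early, and richer ones take over once enough experience supports them.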

Original language: English
Title of host publication: Frontiers in Artificial Intelligence and Applications
Subtitle of host publication: ECAI 2016 - 22nd European Conference on Artificial Intelligence 29 August–2 September 2016, The Hague, The Netherlands
Editors: Gal A. Kaminka, Maria Fox, Paolo Bouquet, Eyke Hullermeier, Virginia Dignum, Frank Dignum, Frank van Harmelen
Place of Publication: Amsterdam, Netherlands
Publisher: IOS Press
Pages: 46-54
Number of pages: 9
ISBN (Electronic): 9781614996729
ISBN (Print): 9781614996712
DOIs: https://doi.org/10.3233/978-1-61499-672-9-46
Publication status: Published - 2016
Externally published: Yes
Event: European Conference on Artificial Intelligence 2016 - The Hague, Netherlands
Duration: 29 Aug 2016 – 2 Sep 2016
Conference number: 22nd
http://www.ecai2016.org/

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 285
ISSN (Print): 0922-6389

Conference

Conference: European Conference on Artificial Intelligence 2016
Abbreviated title: ECAI 2016
Country: Netherlands
City: The Hague
Period: 29/08/16 – 2/09/16
Cite this

Tamassia, M., Zambetta, F., Raffe, W. L., Mueller, F. F., & Li, X. (2016). Dynamic choice of state abstraction in Q-learning. In G. A. Kaminka, M. Fox, P. Bouquet, E. Hullermeier, V. Dignum, F. Dignum, & F. van Harmelen (Eds.), Frontiers in Artificial Intelligence and Applications: ECAI 2016 - 22nd European Conference on Artificial Intelligence 29 August–2 September 2016, The Hague, The Netherlands (pp. 46-54). (Frontiers in Artificial Intelligence and Applications; Vol. 285). IOS Press. https://doi.org/10.3233/978-1-61499-672-9-46