Locality-sensitive state-guided experience replay optimization for sparse rewards in online recommendation

Xiaocong Chen, Lina Yao, Julian McAuley, Weili Guan, Xiaojun Chang, Xianzhi Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

9 Citations (Scopus)

Abstract

Online recommendation requires handling rapidly changing user preferences. Deep reinforcement learning (DRL) is an effective means of capturing users' dynamic interests during interactions with recommender systems. Training a DRL agent in online recommender systems is generally challenging because rewards are sparse: the action space (e.g., the candidate item space) is large and user interactions are comparatively few. Experience replay (ER) has been extensively studied as a way to overcome sparse rewards. However, existing ER methods adapt poorly to the complex environment of online recommender systems and are inefficient at learning an optimal strategy from past experience. As a step toward filling this gap, we propose a novel state-aware experience replay model, in which the agent selectively discovers the most relevant and salient experiences and is guided to find the optimal policy for online recommendations. In particular, a locality-sensitive hashing method is proposed to selectively retain the most meaningful experiences at scale, and a prioritized reward-driven strategy is designed to replay more valuable experiences with higher probability. We formally show that the proposed method guarantees upper and lower bounds on experience replay and optimizes space complexity, and we empirically demonstrate our model's superiority over several existing experience replay methods on three benchmark simulation platforms.
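
The two ideas named in the abstract, hashing similar states into buckets to decide what to retain and replaying higher-reward experiences more often, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hash family, the retention rule, or the priority function, so the random-hyperplane hash, the "evict the lowest-reward transition" rule, the reward-shifted sampling weights, and all names (LSHReplayBuffer, n_bits, bucket_cap, eps) are assumptions made for illustration only.

```python
import random
from collections import defaultdict


class LSHReplayBuffer:
    """Illustrative sketch only: LSH-bucketed storage with reward-weighted replay."""

    def __init__(self, state_dim, n_bits=8, bucket_cap=4, seed=0):
        self.rng = random.Random(seed)
        # Assumption: one Gaussian random hyperplane per hash bit (SimHash-style).
        self.planes = [
            [self.rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(n_bits)
        ]
        self.bucket_cap = bucket_cap
        self.buckets = defaultdict(list)

    def hash_state(self, state):
        # Each bit records which side of a hyperplane the state vector falls on,
        # so states pointing in similar directions tend to share a bucket.
        return tuple(
            int(sum(w * x for w, x in zip(plane, state)) >= 0.0)
            for plane in self.planes
        )

    def add(self, state, action, reward, next_state):
        bucket = self.buckets[self.hash_state(state)]
        bucket.append((state, action, reward, next_state))
        if len(bucket) > self.bucket_cap:
            # Assumed retention rule: drop the lowest-reward transition
            # once a bucket of similar states is over capacity.
            bucket.remove(min(bucket, key=lambda t: t[2]))

    def __len__(self):
        return sum(len(b) for b in self.buckets.values())

    def sample(self, batch_size, eps=1e-3):
        items = [t for b in self.buckets.values() for t in b]
        if not items:
            return []
        # Assumed priority: shift rewards so every weight is positive, then
        # sample with replacement in proportion to the shifted reward.
        min_reward = min(t[2] for t in items)
        weights = [t[2] - min_reward + eps for t in items]
        return self.rng.choices(items, weights=weights, k=batch_size)
```

In this sketch, memory is bounded by the number of buckets times bucket_cap (at most 2^n_bits buckets), which is one simple way to cap the buffer size; the bounds and the space-complexity result stated in the abstract refer to the authors' own method and are not reproduced here.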

Original language: English
Title of host publication: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Luke Gallagher, Qingyun Wu
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1316-1325
Number of pages: 10
ISBN (Electronic): 9781450387323
Publication status: Published - 2022
Event: ACM International Conference on Research and Development in Information Retrieval 2022 - Madrid, Spain
Duration: 11 Jul 2022 to 15 Jul 2022
Conference number: 45th
https://dl.acm.org/doi/proceedings/10.1145/3477495 (Proceedings)
https://sigir.org/sigir2022/ (Website)

Conference

Conference: ACM International Conference on Research and Development in Information Retrieval 2022
Abbreviated title: SIGIR 2022
Country/Territory: Spain
City: Madrid
Period: 11/07/22 to 15/07/22

Keywords

  • deep reinforcement learning
  • experience replay
  • recommender systems
