Abstract
We consider the continuum-armed bandit problem in a novel setting: recommending the best arms within a fixed budget under aggregated feedback. This is motivated by applications where precise rewards are impossible or expensive to obtain, while an aggregated reward or feedback, such as the average over a subset, is available. We constrain the set of reward functions by assuming that they are drawn from a Gaussian process, and propose the Gaussian Process Optimistic Optimisation (GPOO) algorithm. We adaptively construct a tree whose nodes are subsets of the arm space, where the feedback for a node is the aggregated reward of its representatives. We propose a new notion of simple regret with respect to aggregated feedback on the recommended arms. We provide a theoretical analysis of the proposed algorithm and recover single-point feedback as a special case. We illustrate GPOO and compare it with related algorithms on simulated data.
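
To make the feedback model in the abstract concrete, the following is a minimal sketch of aggregated feedback on a tree node. It is not the GPOO algorithm itself and omits the Gaussian process model entirely. The arm space `[0, 1]`, the evenly spaced representatives, the equal-width split, and all function names (`representatives`, `aggregated_feedback`, `split`, `toy_reward`) are assumptions made for illustration only.

```python
import random
from statistics import mean


def representatives(lo, hi, s):
    """Return s evenly spaced points inside the cell [lo, hi].

    Assumption: the choice of representatives is illustrative only.
    """
    width = (hi - lo) / s
    return [lo + (i + 0.5) * width for i in range(s)]


def aggregated_feedback(f, lo, hi, s, noise_std=0.0, rng=None):
    """Average of f over the cell's representatives, plus optional Gaussian noise."""
    avg = mean(f(x) for x in representatives(lo, hi, s))
    if noise_std == 0.0:
        return avg
    rng = rng or random.Random(0)
    return avg + rng.gauss(0.0, noise_std)


def split(lo, hi, k=2):
    """Split the cell [lo, hi] into k equal-width child cells."""
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]


def toy_reward(x):
    """Hypothetical reward function with its maximum at x = 0.25."""
    return 1.0 - (x - 0.25) ** 2


if __name__ == "__main__":
    # Query the root node, then each of its two children.
    print(aggregated_feedback(toy_reward, 0.0, 1.0, s=2))
    for child_lo, child_hi in split(0.0, 1.0):
        print((child_lo, child_hi), aggregated_feedback(toy_reward, child_lo, child_hi, s=2))
```

Setting `s=1` reduces each query to a single representative per node, which corresponds to the single-point feedback special case mentioned in the abstract.
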
Original language | English |
---|---|
Title of host publication | Thirty-Sixth AAAI Conference on Artificial Intelligence |
Editors | Vasant Honavar, Matthijs Spaan |
Place of Publication | Palo Alto CA USA |
Publisher | Association for the Advancement of Artificial Intelligence (AAAI) |
Pages | 9074-9081 |
Number of pages | 8 |
ISBN (Electronic) | 1577358767, 9781577358763 |
Publication status | Published - 2022 |
Externally published | Yes |
Event | AAAI Conference on Artificial Intelligence 2022 (Online, United States of America); Duration: 22 Feb 2022 → 1 Mar 2022; Conference number: 36th; https://aaai-2022.virtualchair.net/index.html (Website); https://aaai.org/conference/aaai/aaai-22/; https://ojs.aaai.org/index.php/AAAI/issue/view/510 (Proceedings) |
Publication series
Name | Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 |
---|---|
Publisher | Association for the Advancement of Artificial Intelligence (AAAI) |
Volume | 36 |
ISSN (Print) | 2159-5399 |
ISSN (Electronic) | 2374-3468 |
Conference
Conference | AAAI Conference on Artificial Intelligence 2022 |
---|---|
Abbreviated title | AAAI 2022 |
Country/Territory | United States of America |
City | Online |
Period | 22/02/22 → 1/03/22 |
Keywords
- Machine Learning (ML)