Abstract
Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In this paper, we model the problem of finding optimal potential bounds for learned NN models as a bilevel program, and solve it using a novel finite-time constraint generation algorithm. We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials. Experimentally, we show that our algorithm efficiently computes reward potentials for learned NN models, and that the overhead of computing reward potentials is justified by the overall strengthening of the underlying MILP model for the task of planning over long horizons.
Original language | English |
---|---|
Title of host publication | Principles and Practice of Constraint Programming |
Subtitle of host publication | 25th International Conference, CP 2019 Stamford, CT, USA, September 30 – October 4, 2019 Proceedings |
Editors | Thomas Schiex, Simon de Givry |
Place of Publication | Cham Switzerland |
Publisher | Springer |
Pages | 674-689 |
Number of pages | 16 |
ISBN (Electronic) | 9783030300487 |
ISBN (Print) | 9783030300470 |
DOIs | |
Publication status | Published - 2019 |
Externally published | Yes |
Event | International Conference on Principles and Practice of Constraint Programming 2019 - Stamford, United States of America Duration: 30 Sept 2019 → 4 Oct 2019 Conference number: 25th https://cp2019.a4cp.org/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 11802 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | International Conference on Principles and Practice of Constraint Programming 2019 |
---|---|
Abbreviated title | CP 2019 |
Country/Territory | United States of America |
City | Stamford |
Period | 30/09/19 → 4/10/19 |
Internet address |
Keywords
- Neural networks
- Potential heuristics
- Planning
- Constraint generation