Abstract
Bayesian optimization (BO) is an efficient method for optimizing expensive black-box functions. In real-world applications, BO often has to cope with missing values in its inputs. Missing inputs arise in two cases. First, the historical data used to train BO often contain missing values. Second, errors may occur during function evaluation (e.g., a thermostat stops working while alloy strength is being computed in a heat treatment process), so that the function is evaluated at an unknown random value instead of the suggested one. A common approach to this problem simply skips the data points that contain missing values. This naive method uses data inefficiently and often leads to poor performance. In this paper, we propose a novel BO method that handles missing inputs. We first find a probability distribution for each missing value, so that we can impute the value by drawing a sample from its distribution. We then develop a new acquisition function, based on the well-known Upper Confidence Bound (UCB) acquisition function, that accounts for the uncertainty of the imputed values when suggesting the next point for function evaluation. We conduct comprehensive experiments on both synthetic and real-world applications to show the usefulness of our method.
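The two steps described in the abstract (sample-based imputation, then a UCB-style acquisition that reflects imputation uncertainty) can be illustrated with a minimal sketch. This is not the paper's method: the per-column Gaussian imputation, the plain averaging of UCB scores over imputation draws, and every name here (`impute`, `averaged_ucb`, `beta`, `n_draws`) are assumptions made purely for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.5):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(X, y, X_cand, noise=1e-4):
    # Standard zero-mean GP regression: posterior mean and std at X_cand.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def impute(X, rng):
    # Hypothetical imputation: draw each NaN from a normal distribution
    # fitted to the observed values of the same column.
    X_filled = X.copy()
    for j in range(X.shape[1]):
        missing = np.isnan(X[:, j])
        observed = X[~missing, j]
        X_filled[missing, j] = rng.normal(
            observed.mean(), observed.std() + 1e-6, missing.sum()
        )
    return X_filled

def averaged_ucb(X, y, X_cand, n_draws=20, beta=2.0, seed=0):
    # Average the UCB score (mean + sqrt(beta) * std) over several imputed
    # copies of the data, so imputation uncertainty affects the suggestion.
    rng = np.random.default_rng(seed)
    scores = np.zeros(len(X_cand))
    for _ in range(n_draws):
        mean, std = gp_posterior(impute(X, rng), y, X_cand)
        scores += mean + np.sqrt(beta) * std
    return scores / n_draws

# Toy data: four observations in 2-D, two of them with a missing input.
X = np.array([[0.1, 0.2], [0.4, np.nan], [0.9, 0.8], [np.nan, 0.5]])
y = np.array([0.3, 0.7, 0.1, 0.6])
grid = np.linspace(0.0, 1.0, 11)
X_cand = np.array([[a, b] for a in grid for b in grid])

scores = averaged_ucb(X, y, X_cand)
x_next = X_cand[np.argmax(scores)]  # suggested next evaluation point
```

Unlike the skip-the-row baseline, this sketch keeps all four observations in the surrogate. The paper instead derives a distribution for each missing value and builds a dedicated acquisition function on top of it.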
| Original language | English |
|---|---|
| Title of host publication | European Conference, ECML PKDD 2020 Ghent, Belgium, September 14–18, 2020 Proceedings, Part II |
| Editors | Frank Hutter, Kristian Kersting, Jefrey Lijffijt, Isabel Valera |
| Place of Publication | Cham, Switzerland |
| Publisher | Springer |
| Pages | 691-706 |
| Number of pages | 16 |
| ISBN (Electronic) | 9783030676612 |
| ISBN (Print) | 9783030676605 |
| Publication status | Published - 2021 |
| Externally published | Yes |
| Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020, Ghent, Belgium. Duration: 14 Sept 2020 → 18 Sept 2020. https://ecmlpkdd2020.net/ https://link.springer.com/book/10.1007/978-3-030-67661-2 (Proceedings) |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Publisher | Springer |
| Volume | 12458 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 |
|---|---|
| Abbreviated title | ECML PKDD 2020 |
| Country/Territory | Belgium |
| City | Ghent |
| Period | 14/09/20 → 18/09/20 |
| Internet address | https://ecmlpkdd2020.net/ |
Keywords
- Bayesian optimization
- Missing data
- Matrix factorization
- Gaussian process