Bayesian optimization with missing inputs

Phuc Luong, Dang Nguyen, Sunil Kumar Gupta, Santu Rana, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Bayesian optimization (BO) is an efficient method for optimizing expensive black-box functions. In real-world applications, BO often faces the problem of missing values in its inputs. Missing inputs arise in two cases. First, the historical data used to train BO often contain missing values. Second, errors may occur during function evaluation (e.g., a thermostat stops working while computing alloy strength in a heat treatment process), so that the function is evaluated at a random unknown value instead of the suggested value. A common approach to this problem simply skips the data points where values are missing. This naive method uses data inefficiently and often leads to poor performance. In this paper, we propose a novel BO method to handle missing inputs. We first find a probability distribution for each missing value so that we can impute the value by drawing a sample from its distribution. We then develop a new acquisition function, based on the well-known Upper Confidence Bound (UCB) acquisition function, that accounts for the uncertainty of the imputed values when suggesting the next point for function evaluation. We conduct comprehensive experiments on both synthetic and real-world applications to show the usefulness of our method.
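The two steps named in the abstract (sample imputations for each missing value, then score candidates with a UCB-style rule that reflects imputation uncertainty) can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm. Everything here is an assumption made for illustration: the imputer draws each missing entry from a Gaussian fitted to the observed entries of its column (a stand-in for the paper's distribution over missing values), the Gaussian process uses a fixed RBF kernel, and the score combines per-imputation posteriors via the law of total variance. The function names, `beta`, `length_scale`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.3):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(X, y, X_cand, noise=1e-4):
    # Standard GP regression posterior with a constant mean and unit prior variance.
    y_mean = y.mean()
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - y_mean))
    mean = y_mean + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return mean, var

def impute_samples(X, n_samples, rng):
    # Stand-in imputer (assumption): each NaN is drawn from a Gaussian
    # fitted to the observed entries of the same column.
    col_mean = np.nanmean(X, axis=0)
    col_std = np.nanstd(X, axis=0) + 1e-6
    missing = np.isnan(X)
    samples = []
    for _ in range(n_samples):
        draw = rng.normal(col_mean, col_std, size=X.shape)
        samples.append(np.where(missing, draw, X))
    return samples

def ucb_over_imputations(X, y, X_cand, n_samples=20, beta=4.0, seed=0):
    # Fit one GP per imputed dataset, then pool the posteriors.
    rng = np.random.default_rng(seed)
    means, variances = [], []
    for X_imp in impute_samples(X, n_samples, rng):
        m, v = gp_posterior(X_imp, y, X_cand)
        means.append(m)
        variances.append(v)
    means, variances = np.array(means), np.array(variances)
    mean = means.mean(axis=0)
    # Law of total variance: average GP variance plus the spread of the
    # GP means across imputations (the imputation uncertainty).
    total_var = variances.mean(axis=0) + means.var(axis=0)
    return mean + np.sqrt(beta) * np.sqrt(total_var), mean

# Toy data: five observations in 2-D, two of them with a missing coordinate.
X = np.array([[0.1, 0.2], [0.4, np.nan], [np.nan, 0.9], [0.8, 0.5], [0.3, 0.7]])
y = np.array([0.5, 0.9, 0.2, 0.7, 0.6])
grid = np.linspace(0.0, 1.0, 11)
X_cand = np.array([[a, b] for a in grid for b in grid])

ucb, mean = ucb_over_imputations(X, y, X_cand)
x_next = X_cand[np.argmax(ucb)]  # candidate suggested for the next evaluation
```

Unlike the skip-the-row baseline described in the abstract, this sketch keeps all five observations. Rows with missing coordinates contribute through their sampled imputations, and disagreement between imputations widens the confidence bound instead of being ignored.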

Original language: English
Title of host publication: European Conference, ECML PKDD 2020 Ghent, Belgium, September 14–18, 2020 Proceedings, Part II
Editors: Frank Hutter, Kristian Kersting, Jefrey Lijffijt, Isabel Valera
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 691-706
Number of pages: 16
ISBN (Electronic): 9783030676612
ISBN (Print): 9783030676605
Publication status: Published - 2021
Externally published: Yes
Event: European Conference on Machine Learning European Conference on Principles and Practice of Knowledge Discovery in Databases 2020 - Ghent, Belgium
Duration: 14 Sept 2020 to 18 Sept 2020
https://ecmlpkdd2020.net/
https://link.springer.com/book/10.1007/978-3-030-67661-2 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 12458
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning European Conference on Principles and Practice of Knowledge Discovery in Databases 2020
Abbreviated title: ECML PKDD 2020
Country/Territory: Belgium
City: Ghent
Period: 14/09/20 to 18/09/20

Keywords

  • Bayesian optimization
  • Missing data
  • Matrix factorization
  • Gaussian process
