GoGP: fast online regression with Gaussian processes

Trung Le, Khanh Nguyen, Vu Nguyen, Tu Dinh Nguyen, Dinh Phung

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

8 Citations (Scopus)

Abstract

One of the most challenging current problems in Gaussian process regression (GPR) is handling large-scale datasets and accommodating an online learning setting in which data arrive irregularly on the fly. In this paper, we introduce a novel online Gaussian process model that can scale to massive datasets. Our approach is formulated on an alternative representation of the Gaussian process under geometric and optimization views, hence termed geometric-based online GP (GoGP). We develop theory guaranteeing that, with a good convergence rate, our proposed algorithm always produces a (sparse) solution that is close to the true optimum, to any level of approximation accuracy specified a priori. Furthermore, our method is proven not only to scale seamlessly to large-scale datasets but also to adapt accurately to streaming data. We extensively evaluated our proposed model against state-of-the-art baselines on several large-scale datasets for the online regression task. The experimental results show that GoGP delivers comparable or slightly better predictive performance while achieving an order-of-magnitude computational speedup over its rivals in the online setting. More importantly, our theoretical analysis guarantees its convergence behavior, which is rapid and stable while achieving lower errors.

Original language: English
Title of host publication: Proceedings - 17th IEEE International Conference on Data Mining, ICDM 2017
Editors: Vijay Raghavan, Srinivas Aluru, George Karypis, Lucio Miele, Xindong Wu
Place of Publication: Los Alamitos CA USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 257-266
Number of pages: 10
ISBN (Print): 9781538638347
Publication status: Published - 15 Dec 2017
Externally published: Yes
Event: IEEE International Conference on Data Mining 2017 - New Orleans, United States of America
Duration: 18 Nov 2017 to 21 Nov 2017
Conference number: 17th
http://icdm2017.bigke.org/
https://ieeexplore.ieee.org/xpl/conhome/8211002/proceeding (Proceedings)

Conference

Conference: IEEE International Conference on Data Mining 2017
Abbreviated title: ICDM 2017
Country: United States of America
City: New Orleans
Period: 18/11/17 to 21/11/17

Keywords

  • Gaussian Process regression
  • Online learning
