Large-scale heteroscedastic regression via Gaussian process

Haitao Liu, Yew-Soon Ong, Jianfei Cai

Research output: Contribution to journal / Article / Research / peer-review

10 Citations (Scopus)


Heteroscedastic regression, which accounts for noise that varies across observations, has many applications in fields such as machine learning and statistics. Here, we focus on heteroscedastic Gaussian process (HGP) regression, which integrates the latent function and the noise function in a unified nonparametric Bayesian framework. Despite its remarkable performance, HGP suffers from cubic time complexity in the number of training points, which severely limits its application to big data. To improve scalability, we first develop a variational sparse inference algorithm, named VSHGP, to handle large-scale data sets. Two variants are then developed to further improve the scalability and capability of VSHGP. The first is stochastic VSHGP (SVSHGP), which derives a factorized evidence lower bound and thus enables efficient stochastic variational inference. The second is distributed VSHGP (DVSHGP), which follows the Bayesian committee machine formalism to distribute computations over multiple local VSHGP experts with many inducing points, and adopts hybrid parameters for the experts to guard against overfitting and capture local variety. The superiority of DVSHGP and SVSHGP over existing scalable HGP and homoscedastic GP methods is then extensively verified on various data sets.
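
As a rough illustration of the Bayesian committee machine formalism mentioned in the abstract, the sketch below combines Gaussian predictions from several local experts at a single test point. It implements only the generic BCM aggregation rule, not the authors' DVSHGP; the function name `bcm_combine` and its arguments are hypothetical, and the sketch assumes every expert shares the same prior variance at the test point.

```python
import numpy as np


def bcm_combine(means, variances, prior_variance):
    """Combine M Gaussian expert predictions at one test point (generic BCM rule).

    Hypothetical helper for illustration only; not the DVSHGP implementation.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = means.shape[0]
    # Expert precisions add up; (M - 1) copies of the prior precision are
    # subtracted so the shared prior is not counted more than once.
    precision = np.sum(1.0 / variances) - (m - 1) / prior_variance
    combined_variance = 1.0 / precision
    # The combined mean is the precision-weighted sum of the expert means.
    combined_mean = combined_variance * np.sum(means / variances)
    return combined_mean, combined_variance


# Two experts that are equally confident but disagree on the mean.
mean, var = bcm_combine([1.0, 3.0], [0.5, 0.5], prior_variance=2.0)
print(mean, var)
```

The combined precision stays positive as long as no expert is less certain than the prior, which is the usual situation for a GP posterior. Because each expert only needs its own data subset, the per-expert predictions can be computed in parallel before this cheap aggregation step.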

Original language: English
Pages (from-to): 708-721
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Issue number: 2
Publication status: Published - Feb 2021


  • Distributed learning
  • heteroscedastic GP (HGP)
  • large scale
  • sparse approximation
  • stochastic variational inference
