Robust variational learning for multiclass kernel models with Stein Refinement

Khanh Nguyen, Trung Le, Tu Nguyen, Geoffrey Webb, Dinh Phung

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Kernel-based models generalize well, but most of them, including SVMs, are vulnerable to the curse of kernelization. Moreover, their predictive performance is sensitive to hyperparameter tuning, which demands substantial computational resources. These problems make kernel methods difficult to apply to large-scale datasets. To address them, we first formulate the optimization problem in a kernel-based learning setting as a posterior inference problem, and then develop a rich family of Recurrent Neural Network-based variational inference techniques. Unlike the existing literature, which stops at the variational distribution and uses it as a surrogate for the true posterior distribution, we further leverage Stein Variational Gradient Descent to bring the variational distribution closer to the true posterior; we refer to this step as Stein Refinement. Putting these together, we arrive at a robust and efficient variational learning method for multiclass kernel machines with an extremely accurate approximation. Moreover, our formulation enables efficient learning of kernel parameters and hyperparameters, which makes the proposed method robust to data uncertainties. Extensive experimental results show that our method, without tuning any parameter, achieves performance comparable to LIBSVM, a well-known implementation of SVM, and outperforms other baselines while scaling seamlessly to large datasets.
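The refinement idea in the abstract, taking samples from a variational distribution and moving them closer to the true posterior with Stein Variational Gradient Descent, can be illustrated with a minimal sketch. This is not the authors' implementation: the target below is a toy two-dimensional Gaussian standing in for a posterior, the RBF bandwidth is fixed rather than tuned, and the names `rbf_kernel`, `svgd_step`, and `stein_refine` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def rbf_kernel(particles, bandwidth):
    """Return the RBF kernel matrix and its gradients for a set of particles.

    k[i, j]      = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))
    grad_k[i, j] = gradient of k(x_j, x_i) with respect to x_j
    """
    diffs = particles[:, None, :] - particles[None, :, :]  # diffs[i, j] = x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    grad_k = k[:, :, None] * diffs / bandwidth ** 2
    return k, grad_k


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """Apply one SVGD update to all particles (fixed bandwidth is an assumption)."""
    n = particles.shape[0]
    k, grad_k = rbf_kernel(particles, bandwidth)
    scores = grad_log_p(particles)  # shape (n, d): score of the target at each particle
    # Attraction toward high-density regions plus repulsion between particles.
    phi = (k @ scores + grad_k.sum(axis=1)) / n
    return particles + step_size * phi


def stein_refine(particles, grad_log_p, n_steps=500, step_size=0.1, bandwidth=1.0):
    """Repeatedly apply SVGD steps to samples drawn from an initial approximation."""
    for _ in range(n_steps):
        particles = svgd_step(particles, grad_log_p, step_size, bandwidth)
    return particles


# Toy target (an assumption for illustration): N(target_mean, I).
target_mean = np.array([2.0, -1.0])


def toy_grad_log_p(x):
    # Score function of N(target_mean, I).
    return -(x - target_mean)


rng = np.random.default_rng(0)
# Samples from a deliberately poor initial approximation: N(0, 0.5^2 I).
initial_particles = 0.5 * rng.standard_normal((50, 2))
refined_particles = stein_refine(initial_particles, toy_grad_log_p)
```

In this sketch the initial particles play the role of samples from the variational distribution, and the score function `toy_grad_log_p` plays the role of the gradient of the log posterior. In a real kernel model that gradient would come from the model's likelihood and prior, which the abstract does not specify.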

Original language: English
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Publication status: Accepted/In press - 1 Dec 2020

Keywords

  • Bayes methods
  • big data
  • Data models
  • Kernel
  • Kernel method
  • Optimization
  • random feature
  • Stein variational gradient descent
  • Support vector machines
  • Training
  • Tuning
  • variational inference
