TY - JOUR
T1 - Robust variational learning for multiclass kernel models with Stein Refinement
AU - Nguyen, Khanh
AU - Le, Trung
AU - Nguyen, Tu
AU - Webb, Geoffrey
AU - Phung, Dinh
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Kernel-based models have strong generalization ability, but most of them, including SVMs, are vulnerable to the curse of kernelization. Moreover, their predictive performance is sensitive to hyperparameter tuning, which demands substantial computational resources. These problems render kernel methods problematic on large-scale datasets. To this end, we first formulate the optimization problem of kernel-based learning as a posterior inference problem, and then develop a rich family of Recurrent Neural Network-based variational inference techniques. Unlike existing work, which stops at the variational distribution and uses it as a surrogate for the true posterior, we further leverage Stein Variational Gradient Descent to bring the variational distribution closer to the true posterior; we refer to this step as Stein Refinement. Putting these together, we arrive at a robust and efficient variational learning method for multiclass kernel machines with a highly accurate posterior approximation. Moreover, our formulation enables efficient learning of kernel parameters and hyperparameters, which robustifies the proposed method against data uncertainties. Extensive experimental results show that our method, without tuning any parameter, obtains performance comparable to LIBSVM, a well-known implementation of SVM, outperforms other baselines, and scales seamlessly to large-scale datasets.
AB - Kernel-based models have strong generalization ability, but most of them, including SVMs, are vulnerable to the curse of kernelization. Moreover, their predictive performance is sensitive to hyperparameter tuning, which demands substantial computational resources. These problems render kernel methods problematic on large-scale datasets. To this end, we first formulate the optimization problem of kernel-based learning as a posterior inference problem, and then develop a rich family of Recurrent Neural Network-based variational inference techniques. Unlike existing work, which stops at the variational distribution and uses it as a surrogate for the true posterior, we further leverage Stein Variational Gradient Descent to bring the variational distribution closer to the true posterior; we refer to this step as Stein Refinement. Putting these together, we arrive at a robust and efficient variational learning method for multiclass kernel machines with a highly accurate posterior approximation. Moreover, our formulation enables efficient learning of kernel parameters and hyperparameters, which robustifies the proposed method against data uncertainties. Extensive experimental results show that our method, without tuning any parameter, obtains performance comparable to LIBSVM, a well-known implementation of SVM, outperforms other baselines, and scales seamlessly to large-scale datasets.
KW - Bayes methods
KW - big data
KW - Data models
KW - Kernel
KW - Kernel method
KW - Optimization
KW - random feature
KW - Stein variational gradient descent
KW - Support vector machines
KW - Training
KW - Tuning
KW - variational inference
UR - http://www.scopus.com/inward/record.url?scp=85097386680&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2020.3041509
DO - 10.1109/TKDE.2020.3041509
M3 - Article
AN - SCOPUS:85097386680
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
ER -