Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data

Yuan Jin, Ming Liu, Yunfeng Li, Ruohua Xu, Lan Du, Longxiang Gao, Yong Xiang

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Non-negative tensor factorization models enable predictive analysis on count data. Among them, Bayesian Poisson–Gamma models can derive full posterior distributions of latent factors and are less sensitive to sparse count data. However, current inference methods for these Bayesian models adopt restricted update rules for the posterior parameters. They also fail to share update information across parameters, which would help them cope with data sparsity. Moreover, these models have no component that handles the imbalance in count data values. In this paper, we propose a novel variational auto-encoder framework called VAE-BPTF which addresses the above issues. It uses multi-layer perceptron networks to encode and share complex update information. The encoded information is then reweighted per data instance to penalize common data values before being aggregated to compute the posterior parameters for the latent factors. In evaluations on synthetic data, VAE-BPTF tended to recover the right number of latent factors and posterior parameter values. It also outperformed current models in both reconstruction error and latent factor (semantic) coherence across five real-world datasets. Furthermore, a qualitative analysis found the latent factors inferred by VAE-BPTF to be meaningful and coherent.
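To make the modelling setup concrete, the following is a minimal sketch of a generic Poisson–Gamma tensor factorization in CP form, together with one possible way of down-weighting common count values. It is not the VAE-BPTF implementation: the tensor sizes, the Gamma hyperparameters, and the inverse-frequency weighting rule are all assumptions chosen for illustration, and the abstract does not specify the actual reweighting scheme or the encoder networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 3-way count tensor with R latent factors.
I, J, K, R = 6, 5, 4, 3

# Non-negative factor matrices drawn from Gamma priors.
# The shape and rate values are illustrative assumptions.
shape, rate = 0.5, 1.0
theta = rng.gamma(shape, 1.0 / rate, size=(I, R))
phi = rng.gamma(shape, 1.0 / rate, size=(J, R))
psi = rng.gamma(shape, 1.0 / rate, size=(K, R))

# Poisson rate of each cell: sum over factors of the product of the loadings.
rates = np.einsum("ir,jr,kr->ijk", theta, phi, psi)
counts = rng.poisson(rates)

# Assumed reweighting rule (not taken from the paper): weight each cell by
# the inverse frequency of its count value, so common values count for less.
values, freq = np.unique(counts, return_counts=True)
inv_freq = 1.0 / freq
weights = inv_freq[np.searchsorted(values, counts)]
weights = weights * counts.size / weights.sum()  # normalise to mean 1

# Weighted Poisson log-likelihood, dropping the constant log(y!) term.
eps = 1e-10
loglik = counts * np.log(rates + eps) - rates
weighted_loglik = float((weights * loglik).sum())
```

In sparse count tensors the most frequent value is typically zero, so under this assumed rule the zero cells receive the smallest weights and the rarer, larger counts contribute more to the objective.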
Original language: English
Pages (from-to): 505-532
Number of pages: 28
Journal: Data Mining and Knowledge Discovery
Volume: 35
Publication status: Published - 2021
