A Bayesian framework for learning shared and individual subspaces from multiple data sources

Sunil Kumar Gupta, Dinh Phung, Brett Adams, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

11 Citations (Scopus)

Abstract

This paper presents a novel Bayesian formulation to exploit shared structures across multiple data sources, constructing foundations for effective mining and retrieval across disparate domains. We jointly analyze diverse data sources using a unifying piece of metadata (textual tags). We propose a method based on Bayesian Probabilistic Matrix Factorization (BPMF) which is able to explicitly model the partial knowledge common to the datasets using shared subspaces and the knowledge specific to each dataset using individual subspaces. For the proposed model, we derive an efficient algorithm for learning the joint factorization based on Gibbs sampling. The effectiveness of the model is demonstrated by social media retrieval tasks across single and multiple media. The proposed solution is applicable to a wider context, providing a formal framework suitable for exploiting individual as well as mutual knowledge present across heterogeneous data sources of many kinds.
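
The joint factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes two fully observed tag-by-item matrices `X1` and `X2` that share their rows (tags), fixed Gaussian priors with precision `lam`, and a fixed noise precision `alpha`, whereas BPMF places Normal-Wishart hyperpriors on the factors. Each source is modeled as `X_i ≈ [W0, W_ind[i]] @ H[i]`, where `W0` is the shared subspace and `W_ind[i]` the individual one. All function and parameter names here are hypothetical.

```python
import numpy as np


def sample_rows(targets, factors, alpha, lam, rng):
    """Draw W from its Gaussian conditional given targets[j] ~ W @ factors[j].

    Every row of W has the same posterior covariance because the
    matrices are assumed fully observed (a simplification).
    """
    k = factors[0].shape[0]
    precision = lam * np.eye(k) + alpha * sum(F @ F.T for F in factors)
    cov = np.linalg.inv(precision)
    mean = alpha * sum(T @ F.T for T, F in zip(targets, factors)) @ cov
    chol = np.linalg.cholesky(cov)
    # Row-wise noise z @ chol.T has covariance chol @ chol.T = cov.
    return mean + rng.standard_normal(mean.shape) @ chol.T


def gibbs_shared_subspace(X1, X2, k_shared, k1, k2,
                          n_iter=200, alpha=10.0, lam=1.0, seed=0):
    """Gibbs sweeps over H[i], the shared basis W0, and individual bases."""
    rng = np.random.default_rng(seed)
    Xs = [X1, X2]
    n_tags = X1.shape[0]
    W0 = rng.standard_normal((n_tags, k_shared))
    W_ind = [rng.standard_normal((n_tags, k)) for k in (k1, k2)]
    H = [None, None]
    for _ in range(n_iter):
        # 1. Coefficients of each source given its full basis [W0, W_ind[i]].
        for i in range(2):
            W_full = np.hstack([W0, W_ind[i]])
            H[i] = sample_rows([Xs[i].T], [W_full.T], alpha, lam, rng).T
        # 2. Shared basis: pools the residuals of both sources.
        resid = [Xs[i] - W_ind[i] @ H[i][k_shared:] for i in range(2)]
        shared_coef = [H[i][:k_shared] for i in range(2)]
        W0 = sample_rows(resid, shared_coef, alpha, lam, rng)
        # 3. Individual bases: each sees only its own source's residual.
        for i in range(2):
            r = Xs[i] - W0 @ H[i][:k_shared]
            W_ind[i] = sample_rows([r], [H[i][k_shared:]], alpha, lam, rng)
    return W0, W_ind, H
```

The function returns only the final sample. In practice one would discard burn-in sweeps and average predictions over the retained samples; step 2 is where information from both sources is combined, since `W0` is the only factor conditioned on both residuals.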

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 15th Pacific-Asia Conference, PAKDD 2011, Proceedings
Pages: 136-147
Number of pages: 12
Edition: PART 1
Publication status: Published - 8 Jun 2011
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2011 - Shenzhen, China
Duration: 24 May 2011 to 27 May 2011
Conference number: 15th
https://link.springer.com/book/10.1007/978-3-642-20841-6 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6634 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2011
Abbreviated title: PAKDD 2011
Country/Territory: China
City: Shenzhen
Period: 24/05/11 to 27/05/11