Learning multi-faceted activities from heterogeneous data with the product space hierarchical Dirichlet processes

Thanh Binh Nguyen, Vu Nguyen, Svetha Venkatesh, Dinh Phung

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research

1 Citation (Scopus)


The hierarchical Dirichlet process (HDP) was originally designed and evaluated for a single data channel. In this paper we enhanced its ability to model heterogeneous data by using a richer, product-space structure for the base measure. The enhanced model, called Product Space HDP (PS-HDP), can (1) simultaneously model heterogeneous data from multiple sources in a Bayesian nonparametric framework and (2) discover multilevel latent structures in the data, yielding different types of topics/latent structures that can be explained jointly. We experimented with the MDC dataset, a large real-world dataset collected from mobile phones. Our goal was to discover identity–location–time (a.k.a. who-where-when) patterns at different levels (globally for all groups and locally for each group). We analyzed the activities and patterns learned by our model, and visualized, compared, and contrasted them with the ground truth to demonstrate the merit of the proposed framework. We further quantitatively evaluated its performance using standard metrics including F1-score, NMI, RI, and purity. We also compared the performance of the PS-HDP model with that of popular existing clustering methods (K-Means, NNMF, GMM, DP-Means, and AP). Lastly, we demonstrated the ability of the model to learn activities with missing data, a common problem in pervasive and ubiquitous computing applications.
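
The abstract names purity, RI, and NMI among its evaluation metrics but does not define them. As a minimal sketch, assuming the standard textbook definitions (the exact variants used in the paper, such as the NMI normalization, are not stated here), they could be computed as follows. The function names are illustrative and do not come from the paper; F1-score is omitted because its clustering definition varies between papers.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def purity(truth, pred):
    """Fraction of points that carry the majority true label of their cluster."""
    clusters = {}
    for t, p in zip(truth, pred):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(truth)


def rand_index(truth, pred):
    """Fraction of point pairs on which the two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum(
        (truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs
    )
    return agree / len(pairs)


def nmi(truth, pred):
    """Mutual information normalized by sqrt(H(truth) * H(pred)).
    Assumption: geometric-mean normalization; other variants exist."""
    n = len(truth)
    joint = Counter(zip(truth, pred))
    count_t, count_p = Counter(truth), Counter(pred)
    mi = sum(
        c / n * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t = -sum(c / n * log(c / n) for c in count_t.values())
    h_p = -sum(c / n * log(c / n) for c in count_p.values())
    if h_t == 0 and h_p == 0:
        return 1.0  # both labelings are a single cluster: identical partitions
    if h_t == 0 or h_p == 0:
        return 0.0  # one labeling carries no information about the other
    return mi / sqrt(h_t * h_p)


# Toy labels (made up for illustration, not taken from the MDC dataset).
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [1, 1, 0, 0, 0, 2]
scores = {
    "purity": purity(labels_true, labels_pred),
    "RI": rand_index(labels_true, labels_pred),
    "NMI": nmi(labels_true, labels_pred),
}
```

All three scores are invariant to how cluster IDs are numbered, which matters when comparing nonparametric models such as PS-HDP against baselines whose cluster labels are arbitrary.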

Original language: English
Title of host publication: Trends and Applications in Knowledge Discovery and Data Mining
Subtitle of host publication: PAKDD 2016 Workshops, BDM, MLSDA, PACC, WDMBF - Auckland, New Zealand, April 19, 2016 - Revised Selected Papers
Editors: Huiping Cao, Jinyan Li, Ruili Wang
Place of Publication: Cham, Switzerland
Number of pages: 13
ISBN (Electronic): 9783319429960
ISBN (Print): 9783319429953
Publication status: Published - 2016
Externally published: Yes
Event: PAKDD 2016 Workshops: BDM, MLSDA, PACC, WDMBF - Auckland, New Zealand
Duration: 19 Apr 2016 → 19 Apr 2016

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: PAKDD 2016 Workshops
Country/Territory: New Zealand
