Small-variance asymptotics for Bayesian nonparametric models with constraints

Cheng Li, Santu Rana, Dinh Phung, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

2 Citations (Scopus)


Users often have additional knowledge when Bayesian nonparametric (BNP) models are employed. For clustering, there may be prior knowledge that some data instances should be in the same cluster (must-link constraint) or in different clusters (cannot-link constraint); similarly, for topic modeling, some words should be grouped together or kept apart because of their underlying semantics. Such knowledge can be incorporated by imposing appropriate sampling probabilities based on the constraints. However, the traditional inference technique for BNP models, Gibbs sampling, is time-consuming and does not scale to large data. Variational approximations are faster but often do not yield good solutions. To address this, we present a small-variance asymptotic analysis of the maximum a posteriori (MAP) estimates of BNP models with constraints. We derive the objective function for the Dirichlet process mixture model with constraints and devise a simple and efficient K-means-type algorithm. We further extend the small-variance analysis to hierarchical BNP models with constraints and derive a similarly simple objective function. Experiments on synthetic and real data sets demonstrate the efficiency and effectiveness of our algorithms.
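
For orientation, the K-means-type procedure that small-variance asymptotics yields in the unconstrained Dirichlet process mixture case (commonly known as DP-means) can be sketched as follows. This is only an illustration of the general technique, not the constrained objective or algorithm derived in the paper, which the abstract does not state. The function name `dp_means`, the penalty parameter `lam`, and the toy data are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(data: List[Point], lam: float,
             max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Unconstrained DP-means sketch (illustrative, not the paper's method).

    Locally minimises: sum of squared distances to assigned centers
    plus lam times the number of clusters.
    """
    centers = [mean(data)]          # start with one global cluster
    labels = [0] * len(data)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(data):
            dists = [sq_dist(x, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                # Point is too far from every center: open a new cluster.
                centers.append(list(x))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Drop empty clusters, relabel contiguously, recompute the means.
        used = sorted(set(labels))
        remap = {old: new for new, old in enumerate(used)}
        labels = [remap[l] for l in labels]
        centers = [mean([x for x, l in zip(data, labels) if l == k])
                   for k in range(len(used))]
        if not changed:
            break
    return labels, centers


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups in the plane.
    points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    labels, centers = dp_means(points, lam=1.0)
    print(labels, centers)
```

A larger `lam` makes opening a new cluster more expensive and so produces fewer clusters; the number of clusters is not fixed in advance, which is the main difference from standard K-means.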

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19–22, 2015, Proceedings, Part II
Editors: Tru Cao, Ee-Peng Lim, Tu-Bao Ho, Zhi-Hua Zhou, Hiroshi Motoda, David Cheung
Place of Publication: Cham, Switzerland
Number of pages: 14
ISBN (Electronic): 9783319180328
ISBN (Print): 9783319180311
Publication status: Published - 2015
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2015 - Ho Chi Minh City, Vietnam
Duration: 19 May 2015 to 22 May 2015
Conference number: 19th (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2015
Abbreviated title: PAKDD 2015
City: Ho Chi Minh City