Discovery of latent subcommunities in a blog's readership

Brett Adams, Dinh Phung, Svetha Venkatesh

Research output: Contribution to journal › Article › Research › peer-review

7 Citations (Scopus)


The blogosphere has grown to be a mainstream forum of social interaction as well as a commercially attractive source of information and influence. Tools are needed to better understand how the communities that adhere to individual blogs are constituted, both to facilitate new personal, socially focused browsing paradigms and to understand how blog content is consumed, which is of interest to blog authors, big media, and search. We present a novel approach to blog subcommunity characterization: we model individual blog readers using mixtures of an extension to the LDA family that jointly models phrases and time, Ngram Topic over Time (NTOT), and cluster the readers with a number of similarity measures using Affinity Propagation. We experiment with two datasets: a small set of blogs whose authors provide feedback, and a set of popular, highly commented blogs, which provide indicators of algorithm scalability and interpretability without prior knowledge of a given blog. The results offer useful insight to the blog authors about their commenting community, and are observed to offer an integrated perspective on the topics of discussion and the members engaged in those discussions for unfamiliar blogs. Our approach also holds promise as a component of solutions to related problems, such as online entity resolution and role discovery.

Original language: English
Article number: 12
Number of pages: 30
Journal: ACM Transactions on the Web
Issue number: 3
Publication status: Published - 1 Jul 2010
Externally published: Yes


Keywords:
  • Affinity Propagation
  • Blog
  • Topic models
  • Web communities
