Variational extensions to EM and multinomial PCA

Wray Buntine

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

62 Citations (Scopus)


Several authors in recent years have proposed discrete analogues to principal component analysis intended to handle discrete or positive-only data, for instance suited to analyzing sets of documents. Methods include non-negative matrix factorization, probabilistic latent semantic analysis, and latent Dirichlet allocation. This paper begins with a review of the basic theory of the variational extension to the expectation-maximization algorithm, and then presents discrete component finding algorithms in that light. Experiments are conducted on both bigram word data and document bag-of-words data to expose some of the subtleties of this new class of algorithms.

Original language: English
Title of host publication: Machine Learning
Subtitle of host publication: ECML 2002 - 13th European Conference on Machine Learning, Proceedings
Editors: Tapio Elomaa, Heikki Mannila, Hannu Toivonen
Publisher: Springer-Verlag London Ltd.
Number of pages: 12
ISBN (Print): 9783540440369
Publication status: Published - 1 Jan 2002
Event: European Conference on Machine Learning 2002 - Helsinki, Finland
Duration: 19 Aug 2002 → 23 Aug 2002
Conference number: 13th (Proceedings)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: European Conference on Machine Learning 2002
Abbreviated title: ECML 2002