A randomized approach for approximating the number of frequent sets

Mario Boley, Henrik Grosskreutz

Research output: Chapter in Book/Report/Conference proceedingConference PaperResearchpeer-review

15 Citations (Scopus)

Abstract

We investigate the problem of counting the number of frequent (item)sets-a problem known to be intractable in terms of an exact polynomial time computation. In this paper, we show that it is in general also hard to approximate. Subsequently, a randomized counting algorithm is developed using the Markov chain Monte Carlo method. While for general inputs an exponential running time is needed in order to guarantee a certain approximation bound, we empirically show that the algorithm still has the desired accuracy on real-world datasets when its running time is capped polynomially.

Original languageEnglish
Title of host publicationProceedings - 8th IEEE International Conference on Data Mining, ICDM 2008
PublisherIEEE, Institute of Electrical and Electronics Engineers
Pages43-52
Number of pages10
ISBN (Print)9780769535029
DOIs
Publication statusPublished - 2008
Externally publishedYes
Event8th IEEE International Conference on Data Mining, ICDM 2008 - Pisa, Italy
Duration: 15 Dec 200819 Dec 2008

Conference

Conference8th IEEE International Conference on Data Mining, ICDM 2008
CountryItaly
CityPisa
Period15/12/0819/12/08

Cite this

Boley, M., & Grosskreutz, H. (2008). A randomized approach for approximating the number of frequent sets. In Proceedings - 8th IEEE International Conference on Data Mining, ICDM 2008 (pp. 43-52). [4781099] IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICDM.2008.85