Instant exceptional model mining using weighted controlled pattern sampling

Sandy Moens, Mario Boley

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

15 Citations (Scopus)

Abstract

When plugged into instant interactive data analytics processes, pattern mining algorithms are required to produce small collections of high-quality patterns in short amounts of time. In the case of Exceptional Model Mining (EMM), even heuristic approaches like beam search can fail to meet this requirement, because in EMM each search step requires a relatively expensive model induction. In this work, we extend previous work on high-performance controlled pattern sampling by introducing extra weighting functionality that gives more importance to certain data records in a dataset. We use the extended framework to quickly obtain patterns that are likely to show highly deviating models. Additionally, we combine this randomized approach with a heuristic pruning procedure that further optimizes the pattern quality. Experiments show that, in contrast to traditional beam search, this combined method is able to find higher-quality patterns within short time budgets.
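The record-weighting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a plain two-step sampler in which a record is drawn with probability proportional to its weight times the number of its sub-patterns, and a uniformly random subset of that record is returned, so that each itemset is drawn with probability proportional to its weighted support. The names `sample_pattern`, `records`, and `weights` are hypothetical.

```python
import random


def sample_pattern(records, weights, rng):
    """Draw one itemset with probability proportional to its weighted support.

    Assumed two-step scheme (illustrative only):
      1. pick a record r with probability proportional to w(r) * 2**len(r);
      2. return a uniformly random subset of r.
    Summing over all records that contain a pattern p gives
    P(p) proportional to the total weight of the records supporting p.
    """
    record_mass = [w * 2 ** len(r) for r, w in zip(records, weights)]
    record = rng.choices(records, weights=record_mass, k=1)[0]
    # Keeping each item independently with probability 1/2 yields a
    # uniform draw over the 2**len(record) subsets of the record.
    return frozenset(item for item in sorted(record) if rng.random() < 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: three records over items a..d; the weights are made up and
    # stand in for "importance" assigned to individual data records.
    records = [frozenset("abc"), frozenset("ab"), frozenset("cd")]
    weights = [5.0, 1.0, 0.5]
    for _ in range(5):
        print(sorted(sample_pattern(records, weights, rng)))
```

Raising the weight of a record makes the patterns it supports more likely to be drawn, which is the lever the abstract describes for steering the sampler towards records expected to yield deviating models. How the weights are derived is left open here.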

Original language: English
Title of host publication: Advances in Intelligent Data Analysis XIII
Subtitle of host publication: 13th International Symposium, IDA 2014, Leuven, Belgium, October 30 – November 1, 2014, Proceedings
Editors: Hendrik Blockeel, Matthijs van Leeuwen, Veronica Vinciotti
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 203-214
Number of pages: 12
ISBN (Electronic): 9783319125718
ISBN (Print): 9783319125701
Publication status: Published - 2014
Externally published: Yes
Event: International Symposium on Intelligent Data Analysis 2014 - Leuven, Belgium
Duration: 30 Oct 2014 – 1 Nov 2014
Conference number: 13th
http://www.ida2014.org/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 8819
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Symposium on Intelligent Data Analysis 2014
Abbreviated title: IDA 2014
Country: Belgium
City: Leuven
Period: 30/10/14 – 1/11/14

Keywords

  • Controlled pattern sampling
  • Exceptional Model Mining
  • Subgroup discovery
