Efficient discovery of sets of co-occurring items in event sequences

Boris Cule, Len Feremans, Bart Goethals

    Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research

    4 Citations (Scopus)

    Abstract

    Discovering patterns in long event sequences is an important data mining task. Most existing work focuses on frequency-based quality measures that allow algorithms to use the anti-monotonicity property to prune the search space and efficiently discover the most frequent patterns. In this work, we step away from such measures, and evaluate patterns using cohesion, a measure of how close to each other the items making up the pattern appear in the sequence on average. We tackle the fact that cohesion is not an anti-monotonic measure by developing a novel pruning technique in order to reduce the search space. By doing so, we are able to efficiently unearth rare, but strongly cohesive, patterns that existing methods often fail to discover. The data and software related to this paper are available at https://bitbucket.org/len_feremans/sequencepatternmining_public.
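The abstract describes cohesion only informally. As a rough illustration, the sketch below assumes one possible formalisation that is not stated on this page: for every occurrence of a pattern item, take the length of the shortest window that covers that occurrence and contains all items of the pattern, average those lengths, and divide the pattern size by the average. The function names `min_window_containing` and `cohesion` are hypothetical, and the brute-force search is for readability only; it is not the pruning technique the paper proposes.

```python
from typing import Hashable, Iterable, Optional, Sequence


def min_window_containing(seq: Sequence[Hashable], items: set, t: int) -> Optional[int]:
    """Length of the shortest window that covers position t and every item in `items`.

    Returns None if no such window exists. Brute force, for illustration only.
    """
    best = None
    for start in range(t + 1):  # the window must start at or before t
        seen = set()
        for end in range(start, len(seq)):
            if seq[end] in items:
                seen.add(seq[end])
            # The first end >= t at which all items are seen is minimal for this start.
            if end >= t and len(seen) == len(items):
                length = end - start + 1
                if best is None or length < best:
                    best = length
                break
    return best


def cohesion(seq: Sequence[Hashable], pattern: Iterable[Hashable]) -> float:
    """Assumed definition: pattern size divided by the mean minimal window length."""
    items = set(pattern)
    positions = [i for i, event in enumerate(seq) if event in items]
    # A pattern with a missing item never fully occurs, so score it as 0.
    if not positions or not items.issubset(seq):
        return 0.0
    windows = [min_window_containing(seq, items, t) for t in positions]
    return len(items) / (sum(windows) / len(windows))


if __name__ == "__main__":
    print(cohesion(list("abxxab"), {"a", "b"}))  # items always adjacent
    print(cohesion(list("axxxb"), {"a", "b"}))   # items far apart
```

Under this assumed definition the score does not depend on how often the pattern occurs, only on how tightly its items cluster, which is consistent with the abstract's point that rare patterns can still be strongly cohesive.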

    Original language: English
    Title of host publication: Machine Learning and Knowledge Discovery in Databases
    Subtitle of host publication: European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19–23, 2016, Proceedings, Part I
    Editors: Paolo Frasconi, Niels Landwehr, Giuseppe Manco, Jilles Vreeken
    Publisher: Springer
    Pages: 361-377
    Number of pages: 17
    ISBN (Electronic): 9783319461281
    ISBN (Print): 9783319461274
    Publication status: Published - 2016
    Event: European Conference on Machine Learning and European Conference on Principles and Practice of Knowledge Discovery in Databases 2016, Riva del Garda, Italy
    Duration: 19 Sep 2016 to 23 Sep 2016
    Conference number: 15th
    http://www.ecmlpkdd2016.org/
    https://link.springer.com/book/10.1007/978-3-319-46128-1 (Proceedings)

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 9851 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: European Conference on Machine Learning and European Conference on Principles and Practice of Knowledge Discovery in Databases 2016
    Abbreviated title: ECML PKDD 2016
    Country: Italy
    City: Riva del Garda
    Period: 19/09/16 to 23/09/16