Discovering reliable dependencies from data: hardness and improved algorithms

Panagiotis Mandros, Mario Boley, Jilles Vreeken

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

4 Citations (Scopus)

Abstract

The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data. In this paper, we systematically explore the algorithmic implications of using this measure for optimization. We show that the problem is NP-hard, which justifies the use of worst-case exponential-time as well as heuristic search methods. We then substantially improve the practical performance of both optimization styles by deriving a novel admissible bounding function that has unbounded potential for additional pruning over the previously proposed one. Finally, we empirically investigate the approximation ratio of the greedy algorithm and show that it produces highly competitive results in a fraction of the time needed for complete branch-and-bound style search.
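The abstract does not define the score or the search procedures, so the following is only an illustrative sketch under stated assumptions. It assumes the fraction of information is F(X;Y) = (H(Y) - H(Y|X)) / H(Y) with plug-in entropies, and that the "reliable" variant subtracts the score's expected value under random permutations of the target. That expectation is estimated here by Monte Carlo sampling, which is a simplification chosen for brevity and not necessarily how the paper computes it. The greedy routine is a plain forward selection without any bounding function; all function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Plug-in (empirical) Shannon entropy in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def fraction_of_information(xs, ys):
    """Plug-in estimate of F(X;Y) = (H(Y) - H(Y|X)) / H(Y)."""
    h_y = entropy(ys)
    if h_y == 0:
        return 0.0  # a constant target carries no information to explain
    # H(Y|X) = H(X,Y) - H(X)
    h_y_given_x = entropy(list(zip(xs, ys))) - entropy(xs)
    return (h_y - h_y_given_x) / h_y


def reliable_fraction(xs, ys, permutations=200, seed=0):
    """Plug-in score minus its estimated expectation under permuted targets.

    Assumption: the expectation is approximated by sampling permutations.
    """
    rng = random.Random(seed)
    shuffled = list(ys)
    total = 0.0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        total += fraction_of_information(xs, shuffled)
    return fraction_of_information(xs, ys) - total / permutations


def greedy_search(columns, ys, **kwargs):
    """Forward selection: add the best-scoring column until no improvement.

    columns maps a (hypothetical) attribute name to its list of values.
    """
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        scored = []
        for name in sorted(remaining):
            # Treat the candidate attribute set as one joint variable.
            joint = list(zip(*(columns[c] for c in selected + [name])))
            scored.append((reliable_fraction(joint, ys, **kwargs), name))
        score, name = max(scored)
        if score <= best:
            break
        selected.append(name)
        best = score
        remaining.remove(name)
    return selected, best
```

On a toy table where one attribute determines the target and another is a unique row identifier, the uncorrected score rates both as perfect, while the corrected score drops the identifier to zero because every permutation of the target is explained equally well by it. This is the overfitting to high-cardinality attributes that a bias-corrected score is meant to avoid.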

Original language: English
Title of host publication: 2018 IEEE International Conference on Data Mining (ICDM 2018)
Editors: Dacheng Tao, Bhavani Thuraisingham
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 317-326
Number of pages: 10
ISBN (Electronic): 9781538691588, 9781538691595
ISBN (Print): 9781538691601
DOI: https://doi.org/10.1109/ICDM.2018.00047
Publication status: Published - 2018
Externally published: Yes
Event: IEEE International Conference on Data Mining 2018 - Singapore, Singapore
Duration: 17 Nov 2018 to 20 Nov 2018
http://icdm2018.org/

Conference

Conference: IEEE International Conference on Data Mining 2018
Abbreviated title: ICDM 2018
Country: Singapore
City: Singapore
Period: 17/11/18 to 20/11/18
Internet address: http://icdm2018.org/

Keywords

  • Approximate functional dependency
  • Branch-and-bound
  • Information theory
  • Knowledge discovery
  • Optimization

Cite this

Mandros, P., Boley, M., & Vreeken, J. (2018). Discovering reliable dependencies from data: hardness and improved algorithms. In D. Tao, & B. Thuraisingham (Eds.), 2018 IEEE International Conference on Data Mining (ICDM 2018) (pp. 317-326). IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICDM.2018.00047