Extremely Fast Hoeffding Adaptive Tree

Chaitanya Manapragada, Mahsa Salehi, Geoffrey I. Webb

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

2 Citations (Scopus)

Abstract

Many real-world data streams are non-stationary. Subject to concept drift, their distributions change over time. To retain accuracy in the face of such drift, online decision tree learners must discard parts of the tree that are no longer accurate and replace them with new subtrees that reflect the new distribution. The longstanding state-of-the-art online decision tree learner for non-stationary streams is Hoeffding Adaptive Tree (HAT), which adds a drift detection and response mechanism to the classic Very Fast Decision Tree (VFDT) online decision tree learner. However, for stationary distributions, VFDT has been superseded by Extremely Fast Decision Tree (EFDT), which uses a statistically more efficient learning mechanism than VFDT. This learning mechanism needs to be coupled with a revision mechanism that compensates for circumstances where the learning mechanism is too eager. The current work develops a strategy to combine the best of both these state-of-the-art approaches, exploiting both the statistically efficient learning mechanism from EFDT and the highly effective drift detection and response mechanism of HAT. Doing so requires decoupling the EFDT splitting and revision mechanisms, as the latter incorrectly triggers the HAT drift detection mechanism. The resulting learner, Extremely Fast Hoeffding Adaptive Tree, responds to drift more rapidly and effectively than either HAT or EFDT, and attains a statistically significant advantage in accuracy even on stationary streams.
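
As a rough illustration of why the EFDT learning mechanism is described as more eager than VFDT's, the following minimal sketch contrasts the two split tests as they are commonly stated in the Hoeffding tree literature. It is not taken from the paper: the function names, the delta value, and the gain figures are hypothetical, and it omits VFDT's tie-breaking threshold, EFDT's later re-evaluation of splits, and the HAT drift detector.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound epsilon after n observations of a variable with the given range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def vfdt_should_split(best_gain: float, second_best_gain: float, eps: float) -> bool:
    # VFDT-style test: the best attribute must beat the runner-up by more than eps.
    return best_gain - second_best_gain > eps


def efdt_should_split(best_gain: float, no_split_gain: float, eps: float) -> bool:
    # EFDT-style test: the best attribute only needs to beat not splitting at all.
    return best_gain - no_split_gain > eps


# Illustrative (made-up) numbers: two-class information gain has range 1.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=200)
vfdt_decision = vfdt_should_split(best_gain=0.30, second_best_gain=0.25, eps=eps)
efdt_decision = efdt_should_split(best_gain=0.30, no_split_gain=0.0, eps=eps)
print(round(eps, 4), vfdt_decision, efdt_decision)
```

With these numbers the EFDT-style test fires while the VFDT-style test still waits for more examples, which is why an eager learner of this kind needs a way to revise splits later.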

Original language: English
Title of host publication: Proceedings - 22nd IEEE International Conference on Data Mining, ICDM 2022
Editors: Xingquan Zhu, Sanjay Ranka, My T. Thai, Takashi Washio, Xindong Wu
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 319-328
Number of pages: 10
ISBN (Electronic): 9781665450997
ISBN (Print): 9781665451000
Publication status: Published - 2022
Event: IEEE International Conference on Data Mining 2022 - Orlando, United States of America
Duration: 28 Nov 2022 → 1 Dec 2022
Conference number: 22nd
https://ieeexplore.ieee.org/xpl/conhome/10027565/proceeding (Proceedings)
https://icdm22.cse.usf.edu/ (Website)

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Volume: 2022-November
ISSN (Print): 1550-4786
ISSN (Electronic): 2374-8486

Conference

Conference: IEEE International Conference on Data Mining 2022
Abbreviated title: ICDM 2022
Country/Territory: United States of America
City: Orlando
Period: 28/11/22 → 1/12/22

Keywords

  • concept drift
  • data mining
  • decision trees
  • online learning
