MIR_MAD: An efficient and on-line approach for anomaly detection in dynamic data stream

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

Abstract

Anomaly detection in a dynamic data stream is a challenging task. Because the stream is unbounded and observations arrive at a high rate, anomaly detection models cannot store all observations in memory for processing. In addition, the changing properties of the data stream give rise to concept drift. While recent studies focus on feature extraction for anomaly detection, the majority of them assume that data streams are static and ignore the possibility of concept drift. Anomaly detection models must operate efficiently to handle high-volume, high-velocity data; that is, they must have low complexity and learn incrementally from each arriving observation. Incremental learning allows the model to adapt to concept drift. When the drift rate exceeds the adaptation rate, the ability to detect concept drift and retrain a new model is preferable, as it minimizes the loss in performance. In this paper, we propose MIR_MAD, an approach based on multiple incremental robust Mahalanobis estimators that is efficient, learns incrementally, and can detect concept drift. MIR_MAD is fast, can be initialized with a small amount of data, and is able to estimate the location of drift in the data stream. Our empirical results show that MIR_MAD achieves state-of-the-art performance and is significantly faster. We also performed a case study to show that detecting concept drift is critical to minimizing the reduction in the model's performance.
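
To make the idea of incremental, Mahalanobis-based scoring concrete, here is a minimal Python sketch. It is not the MIR_MAD algorithm: it keeps a single, non-robust running mean and covariance (a Welford-style update) and scores each observation by its squared Mahalanobis distance, with no drift detection. The class name `IncrementalMahalanobis`, the helper `_solve`, and the `ridge` parameter are assumptions introduced purely for illustration.

```python
from typing import List, Sequence


class IncrementalMahalanobis:
    """Running mean/covariance with Mahalanobis scoring (illustrative only).

    Assumption: a single, non-robust estimator, not the authors' method.
    """

    def __init__(self, dim: int, ridge: float = 1e-6) -> None:
        self.dim = dim
        self.ridge = ridge  # small diagonal term keeping the covariance invertible
        self.n = 0
        self.mean = [0.0] * dim
        # Running sum of outer products of deviations (multivariate Welford term).
        self.m2 = [[0.0] * dim for _ in range(dim)]

    def update(self, x: Sequence[float]) -> None:
        """Fold one observation into the statistics without storing it."""
        self.n += 1
        delta = [xi - mi for xi, mi in zip(x, self.mean)]    # deviation from old mean
        self.mean = [mi + di / self.n for mi, di in zip(self.mean, delta)]
        delta2 = [xi - mi for xi, mi in zip(x, self.mean)]   # deviation from new mean
        for i in range(self.dim):
            for j in range(self.dim):
                self.m2[i][j] += delta[i] * delta2[j]

    def score(self, x: Sequence[float]) -> float:
        """Squared Mahalanobis distance of x from the current estimate."""
        if self.n < 2:
            return 0.0  # covariance is undefined with fewer than two points
        cov = [[self.m2[i][j] / (self.n - 1) + (self.ridge if i == j else 0.0)
                for j in range(self.dim)] for i in range(self.dim)]
        diff = [xi - mi for xi, mi in zip(x, self.mean)]
        z = _solve(cov, diff)  # z = cov^{-1} diff, without forming the inverse
        return sum(d * zi for d, zi in zip(diff, z))


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


# Usage: learn from 1, 2, 3 (mean 2, variance 1), then score a new point.
est = IncrementalMahalanobis(dim=1)
for v in (1.0, 2.0, 3.0):
    est.update([v])
print(est.score([4.0]))  # about 4: two standard deviations away, squared
```

In a streaming setting one would typically score each arriving observation before calling `update`, so that an anomaly does not influence its own score. Each update costs O(d^2) time and memory for d features, independent of how many observations have been seen.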

Original language: English
Title of host publication: Proceedings - 20th IEEE International Conference on Data Mining Workshops, ICDMW 2020
Editors: Giuseppe Di Fatta, Victor Sheng, Alfredo Cuzzocrea, Carlo Zaniolo, Xindong Wu
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 424-431
Number of pages: 8
ISBN (Electronic): 9781728190129
Publication status: Published - 2020
Event: ICDM Workshop on Continual Learning and Adaptation for Time Evolving Data 2020 - Virtual, Sorrento, Italy
Duration: 17 Nov 2020 to 20 Nov 2020
http://icdm2020.bigke.org/ (Website)
https://ieeexplore.ieee.org/xpl/conhome/9346288/proceeding (Proceedings)

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Publisher: The Institute of Electrical and Electronics Engineers, Inc
Volume: 2020-November
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: ICDM Workshop on Continual Learning and Adaptation for Time Evolving Data 2020
Abbreviated title: CLEATED 2020
Country: Italy
City: Sorrento
Period: 17/11/20 to 20/11/20

Keywords

  • Anomaly detection
  • Concept drift
  • Dynamic data stream
  • Incremental learning
  • Unsupervised
