Fast memory efficient local outlier detection in data streams

Mahsa Salehi, Christopher Leckie, James Bezdek, Tharshan Vaithianathan, Xuyun Zhang

Research output: Contribution to journal, Article, Research, peer-review

133 Citations (Scopus)

Abstract

Outlier detection is an important task in data mining, with applications ranging from intrusion detection to human gait analysis. With the growing need to analyze high-speed data streams, outlier detection becomes even more challenging, because traditional techniques can no longer assume that all the data can be stored for processing. While the well-known Local Outlier Factor (LOF) algorithm has an incremental version, that version assumes unbounded memory to keep all previous data points. In this paper, we propose a memory-efficient incremental local outlier (MiLOF) detection algorithm for data streams, and a more flexible version (MiLOF-F). Both achieve accuracy close to that of Incremental LOF while staying within a fixed memory bound. Our experimental results show that both proposed approaches have better memory and time complexity than Incremental LOF while having comparable accuracy. In addition, we show that MiLOF-F is robust to changes in the number of data points, the number of underlying clusters, and the number of dimensions in the data stream. These results show that MiLOF and MiLOF-F are well suited to application environments with limited memory (e.g., wireless sensor networks) and can be applied to high-volume data streams.
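
For orientation, the sketch below computes the standard batch LOF score that the abstract takes as its starting point. It is not MiLOF or MiLOF-F, whose details this record does not give, and it keeps every point in memory, which is exactly the assumption the paper sets out to remove. The function name `lof_scores` and the parameter `k` are illustrative choices, and ties at the k-th neighbour distance are simplified to exactly k neighbours.

```python
import math


def lof_scores(points, k):
    """Batch Local Outlier Factor for every point (hypothetical helper, O(n^2)).

    Simplifications: exactly k neighbours are used even when distances tie,
    and duplicate points (zero reachability distance) get no special handling.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances between all points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach(i, j) = max(k_distance of j, distance between i and j).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        mean_reach = sum(reach) / k
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)

    # LOF: average density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 2))
```

On the toy data in the usage lines, the four clustered points score close to 1 and the isolated point scores far above 1, which is the usual reading of LOF: values near 1 indicate a density similar to that of the neighbours, and larger values indicate a local outlier.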

Original language: English
Pages (from-to): 3246-3260
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 28
Issue number: 12
Publication status: Published - 1 Dec 2016
Externally published: Yes

Keywords

  • Local outlier
  • Memory efficiency
  • Outlier detection
  • Stream data mining
