Fast memory efficient local outlier detection in data streams

Mahsa Salehi, Christopher Leckie, James Bezdek, Tharshan Vaithianathan, Xuyun Zhang

Research output: Contribution to journalArticleResearchpeer-review

Abstract

Outlier detection is an important task in data mining, with applications ranging from intrusion detection to human gait analysis. With the growing need to analyze high speed data streams, the task of outlier detection becomes even more challenging as traditional outlier detection techniques can no longer assume that all the data can be stored for processing. While the well-known Local Outlier Factor (LOF) algorithm has an incremental version, it assumes unbounded memory to keep all previous data points. In this paper, we propose a memory efficient incremental local outlier (MiLOF) detection algorithm for data streams, and a more flexible version (MiLOF-F), both have an accuracy close to Incremental LOF but within a fixed memory bound. Our experimental results show that both proposed approaches have better memory and time complexity than Incremental LOF while having comparable accuracy. In addition, we show that MiLOF-F is robust to changes in the number of data points, the number of underlying clusters and the number of dimensions in the data stream. These results show that MiLOF/MiLOF-F are well suited to application environments with limited memory (e.g., wireless sensor networks), and can be applied to high volume data streams.

Original languageEnglish
Article number7530918
Pages (from-to)3246-3260
Number of pages15
JournalIEEE Transactions on Knowledge and Data Engineering
Volume28
Issue number12
DOIs
Publication statusPublished - 1 Dec 2016
Externally publishedYes

Keywords

  • Local outlier
  • Memory efficiency
  • Outlier detection
  • Stream data mining

Cite this

Salehi, Mahsa ; Leckie, Christopher ; Bezdek, James ; Vaithianathan, Tharshan ; Zhang, Xuyun. / Fast memory efficient local outlier detection in data streams. In: IEEE Transactions on Knowledge and Data Engineering. 2016 ; Vol. 28, No. 12. pp. 3246-3260.
@article{c2e3e352c23e4c44a2e1fb5bcb70fbcd,
title = "Fast memory efficient local outlier detection in data streams",
abstract = "Outlier detection is an important task in data mining, with applications ranging from intrusion detection to human gait analysis. With the growing need to analyze high speed data streams, the task of outlier detection becomes even more challenging as traditional outlier detection techniques can no longer assume that all the data can be stored for processing. While the well-known Local Outlier Factor (LOF) algorithm has an incremental version, it assumes unbounded memory to keep all previous data points. In this paper, we propose a memory efficient incremental local outlier (MiLOF) detection algorithm for data streams, and a more flexible version (MiLOF-F), both have an accuracy close to Incremental LOF but within a fixed memory bound. Our experimental results show that both proposed approaches have better memory and time complexity than Incremental LOF while having comparable accuracy. In addition, we show that MiLOF-F is robust to changes in the number of data points, the number of underlying clusters and the number of dimensions in the data stream. These results show that MiLOF/MiLOF-F are well suited to application environments with limited memory (e.g., wireless sensor networks), and can be applied to high volume data streams.",
keywords = "Local outlier, Memory efficiency, Outlier detection, Stream data mining",
author = "Mahsa Salehi and Christopher Leckie and James Bezdek and Tharshan Vaithianathan and Xuyun Zhang",
year = "2016",
month = "12",
day = "1",
doi = "10.1109/TKDE.2016.2597833",
language = "English",
volume = "28",
pages = "3246--3260",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
number = "12",

}

Fast memory efficient local outlier detection in data streams. / Salehi, Mahsa; Leckie, Christopher; Bezdek, James ; Vaithianathan, Tharshan; Zhang, Xuyun.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 28, No. 12, 7530918, 01.12.2016, p. 3246-3260.

Research output: Contribution to journalArticleResearchpeer-review

TY - JOUR

T1 - Fast memory efficient local outlier detection in data streams

AU - Salehi, Mahsa

AU - Leckie, Christopher

AU - Bezdek, James

AU - Vaithianathan, Tharshan

AU - Zhang, Xuyun

PY - 2016/12/1

Y1 - 2016/12/1

N2 - Outlier detection is an important task in data mining, with applications ranging from intrusion detection to human gait analysis. With the growing need to analyze high speed data streams, the task of outlier detection becomes even more challenging as traditional outlier detection techniques can no longer assume that all the data can be stored for processing. While the well-known Local Outlier Factor (LOF) algorithm has an incremental version, it assumes unbounded memory to keep all previous data points. In this paper, we propose a memory efficient incremental local outlier (MiLOF) detection algorithm for data streams, and a more flexible version (MiLOF-F), both have an accuracy close to Incremental LOF but within a fixed memory bound. Our experimental results show that both proposed approaches have better memory and time complexity than Incremental LOF while having comparable accuracy. In addition, we show that MiLOF-F is robust to changes in the number of data points, the number of underlying clusters and the number of dimensions in the data stream. These results show that MiLOF/MiLOF-F are well suited to application environments with limited memory (e.g., wireless sensor networks), and can be applied to high volume data streams.

AB - Outlier detection is an important task in data mining, with applications ranging from intrusion detection to human gait analysis. With the growing need to analyze high speed data streams, the task of outlier detection becomes even more challenging as traditional outlier detection techniques can no longer assume that all the data can be stored for processing. While the well-known Local Outlier Factor (LOF) algorithm has an incremental version, it assumes unbounded memory to keep all previous data points. In this paper, we propose a memory efficient incremental local outlier (MiLOF) detection algorithm for data streams, and a more flexible version (MiLOF-F), both have an accuracy close to Incremental LOF but within a fixed memory bound. Our experimental results show that both proposed approaches have better memory and time complexity than Incremental LOF while having comparable accuracy. In addition, we show that MiLOF-F is robust to changes in the number of data points, the number of underlying clusters and the number of dimensions in the data stream. These results show that MiLOF/MiLOF-F are well suited to application environments with limited memory (e.g., wireless sensor networks), and can be applied to high volume data streams.

KW - Local outlier

KW - Memory efficiency

KW - Outlier detection

KW - Stream data mining

UR - http://www.scopus.com/inward/record.url?scp=84996567093&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2016.2597833

DO - 10.1109/TKDE.2016.2597833

M3 - Article

VL - 28

SP - 3246

EP - 3260

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 12

M1 - 7530918

ER -