Anomaly detection in streaming nonstationary temporal data

Priyanga Dilini Talagala, Rob J. Hyndman, Kate Smith-Miles, Sevvandi Kandanaarachchi, Mario A. Muñoz

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This article proposes a framework that provides early detection of anomalous series within a large collection of nonstationary streaming time-series data. We define an anomaly as an observation that is very unlikely given the recent distribution of a given system. The proposed framework first calculates a boundary for the system’s typical behavior using extreme value theory. Then a sliding window is used to test for anomalous series within a newly arrived collection of series. The model uses time series features as inputs, and a density-based comparison to detect any significant changes in the distribution of the features. Using various synthetic and real-world datasets, we demonstrate the wide applicability and usefulness of our proposed framework. We show that the proposed algorithm can work well in the presence of noisy nonstationary data within multiple classes of time series. This framework is implemented in the open-source R package oddstream. R code and data are available in the online supplementary materials.
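The pipeline the abstract describes (features per series, a density-based score, a boundary for typical behavior, a sliding window over new data) can be illustrated with a minimal Python sketch. This is not the oddstream implementation and does not use its API. Everything here is an assumption made for illustration: the two features (mean and standard deviation), the fixed kernel bandwidth, the function names, and above all the boundary, where a low empirical quantile of leave-one-out density scores stands in for the extreme value theory boundary used in the article. The typical-behavior model is also fitted once, so the sketch does not handle the nonstationarity the article addresses.

```python
import math
import random
import statistics


def extract_features(window):
    """Summarise one series window by two simple features (mean, std dev)."""
    return (statistics.fmean(window), statistics.pstdev(window))


def kernel_density(point, sample, bandwidth):
    """Gaussian kernel density estimate at `point` from 2-D feature points."""
    norm = len(sample) * 2.0 * math.pi * bandwidth ** 2
    total = 0.0
    for other in sample:
        sq_dist = sum(((a - b) / bandwidth) ** 2 for a, b in zip(point, other))
        total += math.exp(-0.5 * sq_dist)
    return total / norm


def fit_boundary(typical, bandwidth, alpha=0.05):
    """Density threshold from leave-one-out scores of typical feature points.

    Assumption: a low empirical quantile stands in for the extreme value
    theory boundary described in the abstract.
    """
    scores = sorted(
        kernel_density(p, typical[:i] + typical[i + 1:], bandwidth)
        for i, p in enumerate(typical)
    )
    return scores[int(alpha * len(scores))]


def flag_anomalies(windows, typical, threshold, bandwidth):
    """Indices of series whose feature density falls below the threshold."""
    return [
        i for i, w in enumerate(windows)
        if kernel_density(extract_features(w), typical, bandwidth) < threshold
    ]


# Synthetic demo (hypothetical data): 30 noise series; series 7 shifts
# its level by +5 from t = 150 onward.
random.seed(1)
WIDTH, BANDWIDTH = 50, 0.1
series = [[random.gauss(0.0, 1.0) for _ in range(200)] for _ in range(30)]
series[7] = [x + 5.0 if t >= 150 else x for t, x in enumerate(series[7])]

# Learn typical behaviour from the first window of every series.
typical = [extract_features(s[:WIDTH]) for s in series]
threshold = fit_boundary(typical, BANDWIDTH)

# Slide a non-overlapping window over the remaining observations.
flagged_by_window = {
    start: flag_anomalies([s[start:start + WIDTH] for s in series],
                          typical, threshold, BANDWIDTH)
    for start in range(WIDTH, 200, WIDTH)
}
print(flagged_by_window)
```

In the window starting at t = 150, the shifted series lies far from every typical feature point, so its estimated density is effectively zero and it is flagged. A quantile threshold can also flag a few ordinary series by chance, which is one reason an extreme-value-based boundary is attractive in practice.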

Original language: English
Number of pages: 15
Journal: Journal of Computational and Graphical Statistics
DOIs: 10.1080/10618600.2019.1617160
Publication status: Accepted/In press - 2019

Keywords

  • Concept drift
  • Extreme value theory
  • Feature-based time series analysis
  • Kernel-based density estimation
  • Multivariate time series
  • Outlier detection

Cite this

@article{196efd2cf2984e73b5e710a12d8c9868,
title = "Anomaly detection in streaming nonstationary temporal data",
abstract = "This article proposes a framework that provides early detection of anomalous series within a large collection of nonstationary streaming time-series data. We define an anomaly as an observation that is very unlikely given the recent distribution of a given system. The proposed framework first calculates a boundary for the system’s typical behavior using extreme value theory. Then a sliding window is used to test for anomalous series within a newly arrived collection of series. The model uses time series features as inputs, and a density-based comparison to detect any significant changes in the distribution of the features. Using various synthetic and real-world datasets, we demonstrate the wide applicability and usefulness of our proposed framework. We show that the proposed algorithm can work well in the presence of noisy nonstationary data within multiple classes of time series. This framework is implemented in the open-source R package oddstream. R code and data are available in the online supplementary materials.",
keywords = "Concept drift, Extreme value theory, Feature-based time series analysis, Kernel-based density estimation, Multivariate time series, Outlier detection",
author = "Talagala, {Priyanga Dilini} and Hyndman, {Rob J.} and Smith-Miles, Kate and Kandanaarachchi, Sevvandi and Mu{\~n}oz, {Mario A.}",
year = "2019",
doi = "10.1080/10618600.2019.1617160",
language = "English",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "Taylor \& Francis",
}

Anomaly detection in streaming nonstationary temporal data. / Talagala, Priyanga Dilini; Hyndman, Rob J.; Smith-Miles, Kate; Kandanaarachchi, Sevvandi; Muñoz, Mario A.

In: Journal of Computational and Graphical Statistics, 2019.

TY - JOUR

T1 - Anomaly detection in streaming nonstationary temporal data

AU - Talagala, Priyanga Dilini

AU - Hyndman, Rob J.

AU - Smith-Miles, Kate

AU - Kandanaarachchi, Sevvandi

AU - Muñoz, Mario A.

PY - 2019

Y1 - 2019

N2 - This article proposes a framework that provides early detection of anomalous series within a large collection of nonstationary streaming time-series data. We define an anomaly as an observation that is very unlikely given the recent distribution of a given system. The proposed framework first calculates a boundary for the system’s typical behavior using extreme value theory. Then a sliding window is used to test for anomalous series within a newly arrived collection of series. The model uses time series features as inputs, and a density-based comparison to detect any significant changes in the distribution of the features. Using various synthetic and real-world datasets, we demonstrate the wide applicability and usefulness of our proposed framework. We show that the proposed algorithm can work well in the presence of noisy nonstationary data within multiple classes of time series. This framework is implemented in the open-source R package oddstream. R code and data are available in the online supplementary materials.

AB - This article proposes a framework that provides early detection of anomalous series within a large collection of nonstationary streaming time-series data. We define an anomaly as an observation that is very unlikely given the recent distribution of a given system. The proposed framework first calculates a boundary for the system’s typical behavior using extreme value theory. Then a sliding window is used to test for anomalous series within a newly arrived collection of series. The model uses time series features as inputs, and a density-based comparison to detect any significant changes in the distribution of the features. Using various synthetic and real-world datasets, we demonstrate the wide applicability and usefulness of our proposed framework. We show that the proposed algorithm can work well in the presence of noisy nonstationary data within multiple classes of time series. This framework is implemented in the open-source R package oddstream. R code and data are available in the online supplementary materials.

KW - Concept drift

KW - Extreme value theory

KW - Feature-based time series analysis

KW - Kernel-based density estimation

KW - Multivariate time series

KW - Outlier detection

UR - http://www.scopus.com/inward/record.url?scp=85068173207&partnerID=8YFLogxK

U2 - 10.1080/10618600.2019.1617160

DO - 10.1080/10618600.2019.1617160

M3 - Article

JO - Journal of Computational and Graphical Statistics

JF - Journal of Computational and Graphical Statistics

SN - 1061-8600

ER -