Abstract
Clustering data streams is an emerging challenge with a wide range of applications in areas including Wireless Sensor Networks, the Internet of Things, finance and social media. In an evolving data stream, a clustering algorithm should both (a) assign observations to clusters and (b) identify anomalies in real time. Current state-of-the-art algorithms do not address requirement (b) because they consider only the spatial proximity of data, which results in (1) poor clustering and (2) a poor representation of the temporal evolution of data in noisy environments. In this paper, we propose an online clustering algorithm that considers the temporal proximity of observations as well as their spatial proximity to identify anomalies in real time. It identifies the evolution of clusters in noisy streams, incrementally updates the model, and calculates the minimum window length over the evolving data stream without jeopardizing performance. To the best of our knowledge, this is the first online clustering algorithm that both identifies anomalies in real time and discovers the temporal evolution of clusters. Our contributions are supported by experiments on synthetic as well as real-world data.
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle of host publication | 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3-6, 2018 - Proceedings, Part II |
Editors | Dinh Phung, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 508-521 |
Number of pages | 14 |
ISBN (Electronic) | 9783319930374 |
ISBN (Print) | 9783319930367 |
Publication status | Published - 2018 |
Event | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018, Grand Hyatt, Melbourne, Australia. Duration: 3 Jun 2018 → 6 Jun 2018. Conference number: 22nd. http://pakdd2018.medmeeting.org/Content/92892, https://link.springer.com/book/10.1007/978-3-319-93034-3 (Proceedings) |
Publication series
Name | Lecture Notes in Artificial Intelligence |
---|---|
Publisher | Springer |
Volume | 10938 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018 |
---|---|
Abbreviated title | PAKDD 2018 |
Country/Territory | Australia |
City | Melbourne |
Period | 3/06/18 → 6/06/18 |