CGStream: Continuous correlated graph query for data streams

Shirui Pan, Xingquan Zhu

Research output: Chapter in Book/Report/Conference proceedingConference PaperResearchpeer-review

4 Citations (Scopus)

Abstract

In this paper, we propose to query correlated graph in a data stream scenario, where given a query graph q an algorithm is required to retrieve all the subgraphs whose Pearson's correlation coefficients with q are greater than a threshold θ over some graph data flowing in a stream fashion. Due to the dynamic changing nature of the stream data and the inherent complexity of the graph query process, treating graph streams as static datasets is computationally infeasible or ineffective. In the paper, we propose a novel algorithm, CGStream, to identify correlated graphs from data stream, by using a sliding window which covers a number of consecutive batches of stream data records. Our theme is to regard stream query as the traversing along a data stream and the query is achieved at a number of outlooks over the data stream. For each outlook, we derive a lower frequency bound to mine a set of frequent subgraph candidates, where the lower bound guarantees that no pattern is missing from the current outlook to the next outlook. On top of that, we derive an upper correlation bound and a heuristic rule to prune the candidate size, which helps reduce the computation cost at each outlook. Experimental results demonstrate that the proposed algorithm is several times, or even an order of magnitude, more efficient than the straightforward algorithm. Meanwhile, our algorithm achieves good performance in terms of query precision.

Original languageEnglish
Title of host publicationCIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
Pages1183-1192
Number of pages10
DOIs
Publication statusPublished - 19 Dec 2012
Externally publishedYes
EventACM International Conference on Information and Knowledge Management 2012 - Maui, United States of America
Duration: 29 Oct 20122 Nov 2012
Conference number: 21st
https://dl.acm.org/doi/proceedings/10.1145/2396761

Conference

ConferenceACM International Conference on Information and Knowledge Management 2012
Abbreviated titleCIKM 2012
CountryUnited States of America
CityMaui
Period29/10/122/11/12
Internet address

Keywords

  • correlated graph
  • data stream
  • pearson's correlation coefficient

Cite this