Probabilistic n-of-N skyline computation over uncertain data streams

Wenjie Zhang, Aiping Li, Muhammad Aamir Cheema, Ying Zhang, Lijun Chang

Research output: Contribution to journal › Article › Research › peer-review

10 Citations (Scopus)

Abstract

The skyline operator is a useful tool for multi-criteria decision making in many applications, and uncertainty is inherent in real applications. In this paper, we consider the problem of efficiently computing probabilistic skylines over the most recent N uncertain elements seen so far in a data stream. Specifically, we study the problem in the n-of-N model; that is, computing the probabilistic skyline for the most recent n (∀ n ≤ N) elements, where an element is a probabilistic skyline element if its skyline probability is not below a given probability threshold q. First, we develop an effective pruning technique to minimize the number of uncertain elements that must be kept. It can be shown that, on average, storing only O(log^d N) uncertain elements from the most recent N elements is sufficient to support the precise computation of all probabilistic n-of-N skyline queries in a d-dimensional space if the data distribution on each dimension is independent. A novel encoding scheme is then proposed, together with efficient update techniques, so that the cost of computing a probabilistic n-of-N skyline query in a d-dimensional space is reduced to O(d log log N + s) if the data distribution is independent, where s is the number of skyline points. A trigger-based technique is provided to process continuous n-of-N skyline queries. Extensive experiments demonstrate that the new techniques can support on-line probabilistic skyline query computation over rapid uncertain data streams.
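To make the query definition concrete, the following is a minimal brute-force sketch of a probabilistic n-of-N skyline query. It is not the pruning or encoding technique of the paper. It assumes, since the abstract does not spell it out, that each uncertain element carries an occurrence probability, that smaller values are preferred on every dimension, and that the skyline probability of an element is its own occurrence probability multiplied by the probability that none of its dominators in the window occurs. The names dominates, skyline_probability, and n_of_N_skyline are hypothetical.

```python
from math import prod
from typing import List, Sequence, Tuple

# Assumed model: an uncertain element is (point, occurrence probability).
Element = Tuple[Sequence[float], float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse on every dimension, strictly better on one.
    Assumes smaller values are preferred."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(i: int, window: List[Element]) -> float:
    """Assumed definition: the element occurs and none of its dominators does."""
    point, p = window[i]
    return p * prod(
        1.0 - pb
        for j, (other, pb) in enumerate(window)
        if j != i and dominates(other, point)
    )


def n_of_N_skyline(stream: List[Element], n: int, q: float) -> List[Element]:
    """Brute-force query over the most recent n elements, O(n^2 * d) time."""
    if n <= 0:
        return []
    window = stream[-n:]  # most recent n elements
    return [
        window[i]
        for i in range(len(window))
        if skyline_probability(i, window) >= q
    ]


# Illustrative data only, not taken from the paper.
stream = [((1, 1), 0.5), ((2, 2), 0.9), ((3, 1), 0.8), ((0.5, 3), 0.6)]
recent_all = n_of_N_skyline(stream, n=4, q=0.5)
recent_two = n_of_N_skyline(stream, n=2, q=0.5)
```

This baseline rescans the whole window for every query, which is the cost the paper's pruning and encoding techniques are designed to avoid.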
Original language: English
Pages (from-to): 1331-1350
Number of pages: 20
Journal: World Wide Web-Internet and Web Information Systems
Volume: 18
Issue number: 5
Publication status: Published - Sept 2015
Externally published: Yes

Keywords

  • Skyline
  • Stream
  • Query processing
  • Uncertain
