Abstract

Efficient processing of skyline queries has been an area of growing interest. Many of the earlier skyline techniques assumed that the skyline query is applied to a single data table. Naturally, these algorithms were not suitable for the many applications in which the skyline query involves attributes belonging to multiple data sources. In other words, if the data used in the skyline query are stored in multiple tables, then join operations are required before the skyline can be computed. The task of computing skylines over multiple data sources has been termed the skyline-join problem, and various skyline-join algorithms have been proposed. However, the current proposals suffer from several drawbacks: they often need to scan the input tables exhaustively in order to obtain the set of skyline-join results; moreover, the pruning techniques employed to eliminate tuples are largely based on expensive pairwise tuple-to-tuple comparisons. In this article, we aim to address these shortcomings by proposing two novel skyline-join algorithms, namely skyline-sensitive join (S²J) and symmetric skyline-sensitive join (S³J), to process skyline queries over two data sources. Our approaches compute the results using a novel layer/region pruning technique (LR-pruning) that prunes the join space in blocks as opposed to individual data points, thereby avoiding excessive pairwise point-to-point dominance checks. Furthermore, the S³J algorithm utilizes an early stopping condition that allows it to compute the skyline results by accessing only a subset of the input tables. In addition to S²J and S³J, we also propose the S²J-M and S³J-M algorithms, which extend the two-way skyline-join ability of S²J and S³J to efficiently process skyline-join queries over more than two data sources. S²J-M and S³J-M leverage an extended form of LR-pruning, called M-way LR-pruning, to compute multi-way skyline-joins in which more than two data sources are integrated during skyline processing. We report extensive experimental results that confirm the advantages of the proposed algorithms over state-of-the-art skyline-join techniques.
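
To make the skyline-join problem concrete, the following is a minimal Python sketch of the naive baseline the abstract argues against: join the two tables first, then filter the joined tuples with pairwise dominance checks. It is not the S²J or S³J algorithm and uses no LR-pruning. The function names, the `(join_key, attributes)` row layout, and the convention that smaller attribute values are preferred are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical row layout: (join key, tuple of attributes where smaller is better).
Row = Tuple[str, Tuple[float, ...]]


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def naive_skyline_join(left: List[Row], right: List[Row]) -> List[Row]:
    # Step 1: exhaustive equi-join on the key, concatenating the attribute tuples.
    joined = [(kl, al + ar) for kl, al in left for kr, ar in right if kl == kr]
    # Step 2: keep a joined tuple only if no other joined tuple dominates it.
    # This is the quadratic tuple-to-tuple comparison that block-level pruning aims to avoid.
    return [t for t in joined if not any(dominates(o[1], t[1]) for o in joined)]


# Toy data: two tables sharing a join key, all attributes to be minimized.
left = [("A", (1, 5)), ("A", (3, 2)), ("B", (2, 2))]
right = [("A", (4,)), ("B", (1,)), ("B", (5,))]
result = naive_skyline_join(left, right)
print(result)  # [('A', (1, 5, 4)), ('B', (2, 2, 1))]
```

Even on this toy input, the baseline scans both tables fully and compares every joined tuple against every other, which illustrates the two drawbacks the abstract attributes to earlier skyline-join proposals.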

Original language: English (US)
Article number: 10
Journal: ACM Transactions on Database Systems
Volume: 40
Issue number: 2
State: Published - Jun 1 2015

Keywords

  • Algorithms
  • Design
  • Performance

ASJC Scopus subject areas

  • Information Systems
