TY - JOUR
T1 - Efficient processing of skyline-join queries over multiple data sources
AU - Nagendra, Mithila
AU - Candan, Kasim
PY - 2015/6/1
Y1 - 2015/6/1
N2 - Efficient processing of skyline queries has been an area of growing interest. Many of the earlier skyline techniques assumed that the skyline query is applied to a single data table. Naturally, these algorithms were not suitable for many applications in which the skyline query may involve attributes belonging to multiple data sources. In other words, if the data used in the skyline query are stored in multiple tables, then join operations would be required before the skyline can be searched. The task of computing skylines on multiple data sources has been coined as the skyline-join problem and various skyline-join algorithms have been proposed. However, the current proposals suffer several drawbacks: they often need to scan the input tables exhaustively in order to obtain the set of skyline-join results; moreover, the pruning techniques employed to eliminate the tuples are largely based on expensive pairwise tuple-to-tuple comparisons. In this article, we aim to address these shortcomings by proposing two novel skyline-join algorithms, namely skyline-sensitive join (S2J) and symmetric skyline-sensitive join (S3J), to process skyline queries over two data sources. Our approaches compute the results using a novel layer/region pruning technique (LR-pruning) that prunes the join space in blocks as opposed to individual data points, thereby avoiding excessive pairwise point-to-point dominance checks. Furthermore, the S3J algorithm utilizes an early stopping condition in order to successfully compute the skyline results by accessing only a subset of the input tables. In addition to S2J and S3J, we also propose the S2J-M and S3J-M algorithms. These algorithms extend S2J's and S3J's two-way skyline-join ability to efficiently process skyline-join queries over more than two data sources. S2J-M and S3J-M leverage the extended concept of LR-pruning, called M-way LR-pruning, to compute multi-way skyline-joins in which more than two data sources are integrated during skyline processing. We report extensive experimental results that confirm the advantages of the proposed algorithms over state-of-the-art skyline-join techniques.
AB - Efficient processing of skyline queries has been an area of growing interest. Many of the earlier skyline techniques assumed that the skyline query is applied to a single data table. Naturally, these algorithms were not suitable for many applications in which the skyline query may involve attributes belonging to multiple data sources. In other words, if the data used in the skyline query are stored in multiple tables, then join operations would be required before the skyline can be searched. The task of computing skylines on multiple data sources has been coined as the skyline-join problem and various skyline-join algorithms have been proposed. However, the current proposals suffer several drawbacks: they often need to scan the input tables exhaustively in order to obtain the set of skyline-join results; moreover, the pruning techniques employed to eliminate the tuples are largely based on expensive pairwise tuple-to-tuple comparisons. In this article, we aim to address these shortcomings by proposing two novel skyline-join algorithms, namely skyline-sensitive join (S2J) and symmetric skyline-sensitive join (S3J), to process skyline queries over two data sources. Our approaches compute the results using a novel layer/region pruning technique (LR-pruning) that prunes the join space in blocks as opposed to individual data points, thereby avoiding excessive pairwise point-to-point dominance checks. Furthermore, the S3J algorithm utilizes an early stopping condition in order to successfully compute the skyline results by accessing only a subset of the input tables. In addition to S2J and S3J, we also propose the S2J-M and S3J-M algorithms. These algorithms extend S2J's and S3J's two-way skyline-join ability to efficiently process skyline-join queries over more than two data sources. S2J-M and S3J-M leverage the extended concept of LR-pruning, called M-way LR-pruning, to compute multi-way skyline-joins in which more than two data sources are integrated during skyline processing. We report extensive experimental results that confirm the advantages of the proposed algorithms over state-of-the-art skyline-join techniques.
KW - Algorithms
KW - Design
KW - Performance
UR - http://www.scopus.com/inward/record.url?scp=84934768251&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84934768251&partnerID=8YFLogxK
U2 - 10.1145/2699483
DO - 10.1145/2699483
M3 - Article
AN - SCOPUS:84934768251
SN - 0362-5915
VL - 40
JO - ACM Transactions on Database Systems
JF - ACM Transactions on Database Systems
IS - 2
M1 - 10
ER -