TY - GEN
T1 - Layered processing of skyline-window-join (SWJ) queries using iteration-fabric
AU - Nagendra, Mithila
AU - Candan, Kasim
PY - 2013
Y1 - 2013
AB - The problem of finding interesting tuples in a data set, more commonly known as the skyline problem, has been extensively studied in scenarios where the data is static. More recently, skyline research has moved towards data streaming environments, where tuples arrive/expire in a continuous manner. Several algorithms have been developed to track skyline changes over sliding windows; however, existing methods focus on skyline analysis in which all required skyline attributes belong to a single incoming data stream. This constraint renders current algorithms unsuitable for applications that require a real-time "join" operation to be carried out between multiple incoming data streams, arriving from different sources, before the skyline query can be answered. Based on this motivation, in this paper, we address the problem of computing skyline-window-join (SWJ) queries over pairs of data streams, considering sliding windows that take into account only the most recent tuples. In particular, we propose a Layered Skyline-window-Join (LSJ) operator that (a) partitions the overall process into processing layers and (b) maintains skyline-join results in an incremental manner by continuously monitoring the changes in all layers of the process. We combine the advantages of existing skyline methods (including those that efficiently maintain skyline results over a single stream, and those that compute the skyline of pairs of static data sets) to develop a novel iteration-fabric skyline-window-join processing structure. Using the iteration-fabric, LSJ eliminates redundant work across consecutive windows by leveraging shared data across all iteration layers of the windowed skyline-join processing. To the best of our knowledge, this is the first paper that addresses join-based skyline queries over sliding windows. Extensive experimental evaluations over real and simulated data show that LSJ provides large gains over naive extensions of existing schemes which are not designed to eliminate redundant work across multiple processing layers.
UR - http://www.scopus.com/inward/record.url?scp=84881336326&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881336326&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2013.6544891
DO - 10.1109/ICDE.2013.6544891
M3 - Conference contribution
AN - SCOPUS:84881336326
SN - 9781467349086
T3 - Proceedings - International Conference on Data Engineering
SP - 985
EP - 996
BT - ICDE 2013 - 29th International Conference on Data Engineering
T2 - 29th International Conference on Data Engineering, ICDE 2013
Y2 - 8 April 2013 through 11 April 2013
ER -