TY - GEN
T1 - Safety guarantee of continuous join queries over punctuated data streams
AU - Li, Hua Gang
AU - Chen, Songting
AU - Tatemura, Junichi
AU - Agrawal, Divyakant
AU - Candan, K. Selçuk
AU - Hsiung, Wang Pin
PY - 2006
Y1 - 2006
N2 - Continuous join queries (CJQs) are needed for correlating data from multiple streams. One fundamental problem in processing such queries is that, since the data streams are infinite, the join operator would have to store infinite state and would eventually run out of space. Punctuation semantics has been proposed to specifically address this problem. In particular, punctuations explicitly mark the end of a subset of data and, hence, enable purging of stored data that will not contribute to any new query results. Given a set of available punctuation schemes, if one can identify that a CJQ still requires unbounded storage, then this query can be flagged as unsafe and prevented from running. Unfortunately, while punctuation semantics is clearly useful, the mechanisms to identify if and how a particular CJQ could benefit from a given set of punctuation schemes are not yet known. In this paper, we provide sufficient and necessary conditions for checking whether a CJQ can be safely executed under a given set of punctuation schemes. In particular, we introduce a novel punctuation graph to aid the analysis of the safety of a given query. We show that the safety checking problem can be solved in polynomial time based on this punctuation graph construct. In addition, various issues and challenges related to the safety checking of CJQs are highlighted.
AB - Continuous join queries (CJQs) are needed for correlating data from multiple streams. One fundamental problem in processing such queries is that, since the data streams are infinite, the join operator would have to store infinite state and would eventually run out of space. Punctuation semantics has been proposed to specifically address this problem. In particular, punctuations explicitly mark the end of a subset of data and, hence, enable purging of stored data that will not contribute to any new query results. Given a set of available punctuation schemes, if one can identify that a CJQ still requires unbounded storage, then this query can be flagged as unsafe and prevented from running. Unfortunately, while punctuation semantics is clearly useful, the mechanisms to identify if and how a particular CJQ could benefit from a given set of punctuation schemes are not yet known. In this paper, we provide sufficient and necessary conditions for checking whether a CJQ can be safely executed under a given set of punctuation schemes. In particular, we introduce a novel punctuation graph to aid the analysis of the safety of a given query. We show that the safety checking problem can be solved in polynomial time based on this punctuation graph construct. In addition, various issues and challenges related to the safety checking of CJQs are highlighted.
UR - http://www.scopus.com/inward/record.url?scp=34547990984&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547990984&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:34547990984
SN - 1595933859
SN - 9781595933850
T3 - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
SP - 19
EP - 30
BT - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
PB - Association for Computing Machinery
T2 - 32nd International Conference on Very Large Data Bases, VLDB 2006
Y2 - 12 September 2006 through 15 September 2006
ER -