TY - JOUR
T1 - Detecting common subexpressions for multiple query optimization over loosely-coupled heterogeneous data sources
AU - Chaudhari, Mahesh B.
AU - Dietrich, Suzanne
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. 0915325. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
PY - 2016/6/1
Y1 - 2016/6/1
AB - The research presented in this paper supports the identification of common subexpressions as candidates for potential materialized views that form the basis of multiple query optimization in a loosely-coupled distributed system where query expressions access heterogeneous data sources, including relations and data-centric XML. This paper introduces a unifying mixed multigraph formalism to represent SQL, XQuery, and LINQ queries in a common query graph model and a heuristics-based algorithm to detect common subexpressions. The identified common subexpressions represent an opportunity for defining a materialized view to avoid repeating computation. The common subexpressions may access only relations, only XML, or a combination of relations and XML. The mixed multigraph model and the heuristic rules presented in this paper have distinguished advantages over the existing approaches that consider only relational or XML data sources individually. The mixed multigraph model can present SQL, XQuery, and LINQ queries in a single graph model and the heuristic rules are designed to consider the identical and subsumed conditions at the same time. A prototype implementation of the algorithm illustrates the applicability of the approach using various examples from the research literature as well as scenarios over a Criminal Justice enterprise that include common subexpressions across relational and XML data sources.
KW - Common subexpressions
KW - Distributed databases
KW - Event and stream processing
KW - Heuristic rules
KW - LINQ
KW - SQL
KW - XQuery
UR - http://www.scopus.com/inward/record.url?scp=84914124860&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84914124860&partnerID=8YFLogxK
U2 - 10.1007/s10619-014-7166-6
DO - 10.1007/s10619-014-7166-6
M3 - Article
AN - SCOPUS:84914124860
VL - 34
SP - 119
EP - 143
JO - Distributed and Parallel Databases
JF - Distributed and Parallel Databases
SN - 0926-8782
IS - 2
ER -