Query Caching and Optimization in Distributed Mediator Systems

S. Adali, Kasim Candan, Y. Papakonstantinou, V. S. Subrahmanian

Research output: Contribution to journal, Article

204 Citations (Scopus)

Abstract

Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and furthermore it may not be easy to model the source's performance. At the same time, querying remote sources may be very expensive because of high connection overhead, long computation time, financial charges, and temporary unavailability. We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. We investigate issues pertaining to the design of the statistics cache and experimentally analyze various tradeoffs. We also present a query result caching mechanism that allows us to effectively use results of prior queries when the source is not readily available. We employ the novel invariants mechanism, which shows how semantic information about data sources may be used to discover cached query results of interest.
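The statistics-cache idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class `StatisticsCache`, the functions `plan_cost` and `cheapest_plan`, the use of a plain average as the estimator, and the fallback `default` cost are all assumptions made here for illustration. The sketch records the observed cost of each call to a source and estimates the cost of a candidate plan by summing the average observed cost of its calls.

```python
from collections import defaultdict
from statistics import mean


class StatisticsCache:
    """Hypothetical cache of observed call costs, keyed by (source, call pattern)."""

    def __init__(self):
        self._observations = defaultdict(list)

    def record(self, source, pattern, seconds):
        # Store the measured cost of one actual call to a source.
        self._observations[(source, pattern)].append(seconds)

    def estimate(self, source, pattern, default=1.0):
        # Assumption: the estimate is the mean of past observations;
        # calls never seen before fall back to an arbitrary default.
        samples = self._observations.get((source, pattern))
        return mean(samples) if samples else default


def plan_cost(plan, cache):
    # A plan is modeled here as a list of (source, pattern) calls.
    return sum(cache.estimate(source, pattern) for source, pattern in plan)


def cheapest_plan(plans, cache):
    # Pick the candidate plan with the lowest estimated total cost.
    return min(plans, key=lambda plan: plan_cost(plan, cache))


if __name__ == "__main__":
    cache = StatisticsCache()
    cache.record("source_a", "lookup", 2.0)
    cache.record("source_a", "lookup", 4.0)
    cache.record("source_b", "scan", 0.5)

    plan_a = [("source_a", "lookup"), ("source_b", "scan")]
    plan_b = [("source_a", "lookup"), ("source_a", "lookup")]
    print(cheapest_plan([plan_a, plan_b], cache))
```

A real mediator would likely need a richer estimator (for example, one that accounts for argument values or result sizes); the plain average is used only to keep the sketch short.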

Original language: English (US)
Pages (from-to): 137-148
Number of pages: 12
Journal: SIGMOD Record (ACM Special Interest Group on Management of Data)
Volume: 25
Issue number: 2
State: Published - Jun 1996
Externally published: Yes

Fingerprint

Statistics
Costs
Query processing
Semantics

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software

Cite this

Query Caching and Optimization in Distributed Mediator Systems. / Adali, S.; Candan, Kasim; Papakonstantinou, Y.; Subrahmanian, V. S.

In: SIGMOD Record (ACM Special Interest Group on Management of Data), Vol. 25, No. 2, 06.1996, p. 137-148.

@article{c949103c0f014ec5b695de8e49c0acec,
title = "Query Caching and Optimization in Distributed Mediator Systems",
author = "Adali, S. and Candan, Kasim and Papakonstantinou, Y. and Subrahmanian, V. S.",
year = "1996",
month = "6",
language = "English (US)",
volume = "25",
pages = "137--148",
journal = "SIGMOD Record",
issn = "0163-5808",
publisher = "Association for Computing Machinery (ACM)",
number = "2",

}

TY - JOUR

T1 - Query Caching and Optimization in Distributed Mediator Systems

AU - Adali, S.

AU - Candan, Kasim

AU - Papakonstantinou, Y.

AU - Subrahmanian, V. S.

PY - 1996/6

Y1 - 1996/6

UR - http://www.scopus.com/inward/record.url?scp=0030156987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030156987&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0030156987

VL - 25

SP - 137

EP - 148

JO - SIGMOD Record

JF - SIGMOD Record

SN - 0163-5808

IS - 2

ER -