Query Caching and Optimization in Distributed Mediator Systems

S. Adali, K. S. Candan, Y. Papakonstantinou, V. S. Subrahmanian

Research output: Contribution to journal, Article

204 Scopus citations

Abstract

Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and furthermore it may not be easy to model the source's performance. At the same time, querying remote sources may be very expensive because of high connection overhead, long computation time, financial charges, and temporary unavailability. We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. We investigate issues pertaining to the design of the statistics cache and experimentally analyze various tradeoffs. We also present a query result caching mechanism that allows us to effectively use results of prior queries when the source is not readily available. We employ the novel invariants mechanism, which shows how semantic information about data sources may be used to discover cached query results of interest.
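
The cost-estimation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `StatisticsCache`, the functions `plan_cost` and `cheapest_plan`, the averaging rule, the per-source fallback, and the source and call names in the usage note are all assumptions made for illustration. The sketch records the observed cost of each actual call to a source, estimates the cost of a new call from those records, and picks the candidate execution plan with the lowest estimated total.

```python
from statistics import mean


class StatisticsCache:
    """Hypothetical statistics cache: observed costs of actual source calls."""

    def __init__(self, default_cost=1.0):
        # Maps (source, call) -> list of observed costs (e.g. seconds).
        self._samples = {}
        # Assumed cost for a source that has never been called.
        self.default_cost = default_cost

    def record(self, source, call, cost):
        """Store the measured cost of one actual call to a source."""
        self._samples.setdefault((source, call), []).append(cost)

    def estimate(self, source, call):
        """Estimate the cost of a call from cached statistics."""
        exact = self._samples.get((source, call))
        if exact:
            return mean(exact)
        # Assumed fallback: average over every recorded call to the same source.
        same_source = [
            cost
            for (src, _), costs in self._samples.items()
            if src == source
            for cost in costs
        ]
        return mean(same_source) if same_source else self.default_cost


def plan_cost(cache, plan):
    """Estimated cost of a plan, modeled here as a list of (source, call) pairs."""
    return sum(cache.estimate(source, call) for source, call in plan)


def cheapest_plan(cache, plans):
    """Return the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda plan: plan_cost(cache, plan))
```

As a usage example under the same assumptions: after `cache.record("flights_db", "by_route", 2.0)` and `cache.record("flights_db", "by_route", 4.0)`, the call `cache.estimate("flights_db", "by_route")` averages the two observations, while a call to a source with no recorded statistics falls back to `default_cost`. Summing per-call estimates is the simplest possible cost model; a real mediator would likely also weigh result sizes, connection overhead, and source availability.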

Original language: English (US)
Pages (from-to): 137-148
Number of pages: 12
Journal: SIGMOD Record (ACM Special Interest Group on Management of Data)
Volume: 25
Issue number: 2
State: Published - Jun 1 1996
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Information Systems