Joint optimization of cost and coverage of query plans in data integration

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

15 Citations (Scopus)

Abstract

Existing approaches for optimizing queries in data integration use decoupled strategies, attempting to optimize coverage and cost in two separate phases. Since sources tend to have a variety of access limitations, such phased optimization of cost and coverage can unfortunately lead to expensive planning as well as highly inefficient plans. In this paper we present techniques for joint optimization of the cost and coverage of query plans. Our algorithms search in the space of parallel query plans that support multiple sources for each subgoal conjunct. The refinement of partial plans takes into account the potential parallelism between source calls and the binding compatibilities between the sources included in the plan. We start by introducing and motivating our query plan representation. We then briefly review how to compute the cost and coverage of a parallel plan. Next, we provide both a System-R style query optimization algorithm and a greedy local search algorithm for searching in the space of such query plans. Finally, we present a simulation study demonstrating that the plans generated by our approach are significantly better than those of existing approaches, in terms of both planning cost and plan execution cost.
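To make the idea of weighing cost and coverage together more concrete, here is a minimal Python sketch of a greedy search over parallel plans. It is an illustration only, not the algorithm from the paper: the `Source` fields, the independence assumption behind `subgoal_coverage`, the "slowest parallel call dominates" cost model, the linear `weight` trade-off, and the function names are all assumptions made for this example. Binding compatibilities between sources are not modeled.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Source:
    name: str
    coverage: float  # assumed: fraction of a subgoal's answers held (0..1)
    cost: float      # assumed: normalized time to call the source (0..1)


def subgoal_coverage(sources):
    # Coverage of a set of sources, assuming their contents overlap independently.
    return 1.0 - prod(1.0 - s.coverage for s in sources)


def subgoal_cost(sources):
    # Sources for one subgoal are called in parallel, so the slowest call dominates.
    return max(s.cost for s in sources)


def plan_utility(plan, weight):
    # plan holds one list of sources per subgoal; subgoals run one after another.
    coverage = prod(subgoal_coverage(srcs) for srcs in plan)
    cost = sum(subgoal_cost(srcs) for srcs in plan)
    return weight * coverage - (1.0 - weight) * cost


def greedy_plan(candidates, weight):
    # Seed every subgoal with its highest-coverage source.
    plan = [[max(srcs, key=lambda s: s.coverage)] for srcs in candidates]
    best = plan_utility(plan, weight)
    improved = True
    while improved:
        improved = False
        for i, srcs in enumerate(candidates):
            for s in srcs:
                if s in plan[i]:
                    continue
                # Try adding source s to subgoal i and keep it only if utility rises.
                trial = [p + [s] if j == i else p for j, p in enumerate(plan)]
                utility = plan_utility(trial, weight)
                if utility > best:
                    plan, best, improved = trial, utility, True
    return plan, best


if __name__ == "__main__":
    candidates = [[Source("A", 0.6, 0.1), Source("B", 0.5, 0.1), Source("C", 0.1, 0.9)]]
    plan, utility = greedy_plan(candidates, weight=0.5)
    print([s.name for s in plan[0]])  # prints ['A', 'B']
```

In this toy run the slow, low-coverage source C is left out because the added coverage does not pay for the added cost, which is the kind of trade-off a decoupled, coverage-first strategy cannot see.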

Original language: English (US)
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Editors: H. Paques, L. Liu
Pages: 223-230
Number of pages: 8
State: Published - 2001
Event: Proceedings of the 2001 ACM CIKM: 10th International Conference on Information and Knowledge Management - Atlanta, GA, United States
Duration: Nov 5 2001 to Nov 10 2001

Fingerprint

Data integration
Query
Costs
Planning
Execution costs
Query optimization
Local search
Simulation study
Compatibility

ASJC Scopus subject areas

  • Business, Management and Accounting (all)

Cite this

Nie, Z., & Kambhampati, S. (2001). Joint optimization of cost and coverage of query plans in data integration. In H. Paques & L. Liu (Eds.), International Conference on Information and Knowledge Management, Proceedings (pp. 223-230).

@inproceedings{fce24260f9674b4b857c2aa9d6a1727e,
title = "Joint optimization of cost and coverage of query plans in data integration",
author = "Zaiqing Nie and Subbarao Kambhampati",
year = "2001",
language = "English (US)",
pages = "223--230",
editor = "H. Paques and L. Liu",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}

TY - GEN

T1 - Joint optimization of cost and coverage of query plans in data integration

AU - Nie, Zaiqing

AU - Kambhampati, Subbarao

PY - 2001

Y1 - 2001

UR - http://www.scopus.com/inward/record.url?scp=0035747459&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035747459&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0035747459

SP - 223

EP - 230

BT - International Conference on Information and Knowledge Management, Proceedings

A2 - Paques, H.

A2 - Liu, L.

ER -