Benchmarking the chase

Michael Benedikt, Boris Motik, George Konstantinidis, Paolo Papotti, Efthymia Tsamoura, Giansalvatore Mecca, Donatello Santoro

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

14 Citations (Scopus)

Abstract

The chase is a family of algorithms used in a number of data management tasks, such as data exchange, answering queries under dependencies, query reformulation with constraints, and data cleaning. It is well established as a theoretical tool for understanding these tasks, and in addition a number of prototype systems have been developed. While individual chase-based systems and particular optimizations of the chase have been experimentally evaluated in the past, we provide the first comprehensive and publicly available benchmark - test infrastructure and a set of test scenarios - for evaluating chase implementations across a wide range of assumptions about the dependencies and the data. We used our benchmark to compare chase-based systems on data exchange and query answering tasks with one another, as well as with systems that can solve similar tasks developed in closely related communities. Our evaluation provided us with a number of new insights concerning the factors that impact the performance of chase implementations.

Original language: English (US)
Title of host publication: PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Publisher: Association for Computing Machinery
Pages: 37-52
Number of pages: 16
Volume: Part F127745
ISBN (Electronic): 9781450341981
DOI: https://doi.org/10.1145/3034786.3034796
State: Published - May 9 2017
Event: 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017 - Chicago, United States
Duration: May 14 2017 to May 19 2017

Other

Other: 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017
Country: United States
City: Chicago
Period: 5/14/17 to 5/19/17

Fingerprint

Electronic data interchange
Benchmarking
Information management
Cleaning

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture

Cite this

Benedikt, M., Motik, B., Konstantinidis, G., Papotti, P., Tsamoura, E., Mecca, G., & Santoro, D. (2017). Benchmarking the chase. In PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (Vol. Part F127745, pp. 37-52). Association for Computing Machinery. https://doi.org/10.1145/3034786.3034796

Benchmarking the chase. / Benedikt, Michael; Motik, Boris; Konstantinidis, George; Papotti, Paolo; Tsamoura, Efthymia; Mecca, Giansalvatore; Santoro, Donatello.

PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. Vol. Part F127745 Association for Computing Machinery, 2017. p. 37-52.

Benedikt, M, Motik, B, Konstantinidis, G, Papotti, P, Tsamoura, E, Mecca, G & Santoro, D 2017, Benchmarking the chase. in PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. vol. Part F127745, Association for Computing Machinery, pp. 37-52, 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, United States, 5/14/17. https://doi.org/10.1145/3034786.3034796

Benedikt M, Motik B, Konstantinidis G, Papotti P, Tsamoura E, Mecca G et al. Benchmarking the chase. In PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. Vol. Part F127745. Association for Computing Machinery. 2017. p. 37-52. https://doi.org/10.1145/3034786.3034796

Benedikt, Michael ; Motik, Boris ; Konstantinidis, George ; Papotti, Paolo ; Tsamoura, Efthymia ; Mecca, Giansalvatore ; Santoro, Donatello. / Benchmarking the chase. PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. Vol. Part F127745 Association for Computing Machinery, 2017. pp. 37-52.

@inproceedings{62d4905dab244450a480b5905acbece1,
title = "Benchmarking the chase",
abstract = "The chase is a family of algorithms used in a number of data management tasks, such as data exchange, answering queries under dependencies, query reformulation with constraints, and data cleaning. It is well established as a theoretical tool for understanding these tasks, and in addition a number of prototype systems have been developed. While individual chase-based systems and particular optimizations of the chase have been experimentally evaluated in the past, we provide the first comprehensive and publicly available benchmark - test infrastructure and a set of test scenarios - for evaluating chase implementations across a wide range of assumptions about the dependencies and the data. We used our benchmark to compare chase-based systems on data exchange and query answering tasks with one another, as well as with systems that can solve similar tasks developed in closely related communities. Our evaluation provided us with a number of new insights concerning the factors that impact the performance of chase implementations.",
author = "Michael Benedikt and Boris Motik and George Konstantinidis and Paolo Papotti and Efthymia Tsamoura and Giansalvatore Mecca and Donatello Santoro",
year = "2017",
month = "5",
day = "9",
doi = "10.1145/3034786.3034796",
language = "English (US)",
volume = "Part F127745",
pages = "37--52",
booktitle = "PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems",
publisher = "Association for Computing Machinery",
}

TY  - GEN
T1  - Benchmarking the chase
AU  - Benedikt, Michael
AU  - Motik, Boris
AU  - Konstantinidis, George
AU  - Papotti, Paolo
AU  - Tsamoura, Efthymia
AU  - Mecca, Giansalvatore
AU  - Santoro, Donatello
PY  - 2017/5/9
Y1  - 2017/5/9
AB  - The chase is a family of algorithms used in a number of data management tasks, such as data exchange, answering queries under dependencies, query reformulation with constraints, and data cleaning. It is well established as a theoretical tool for understanding these tasks, and in addition a number of prototype systems have been developed. While individual chase-based systems and particular optimizations of the chase have been experimentally evaluated in the past, we provide the first comprehensive and publicly available benchmark - test infrastructure and a set of test scenarios - for evaluating chase implementations across a wide range of assumptions about the dependencies and the data. We used our benchmark to compare chase-based systems on data exchange and query answering tasks with one another, as well as with systems that can solve similar tasks developed in closely related communities. Our evaluation provided us with a number of new insights concerning the factors that impact the performance of chase implementations.
UR  - http://www.scopus.com/inward/record.url?scp=85021227498&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85021227498&partnerID=8YFLogxK
U2  - 10.1145/3034786.3034796
DO  - 10.1145/3034786.3034796
M3  - Conference contribution
VL  - Part F127745
SP  - 37
EP  - 52
BT  - PODS 2017 - Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
PB  - Association for Computing Machinery
ER  - 