The LLUNATIC data-cleaning framework

Floris Geerts, Giansalvatore Mecca, Paolo Papotti, Donatello Santoro

Research output: Contribution to journal › Article

94 Citations (Scopus)

Abstract

Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper, we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.

Original language: English (US)
Pages (from-to): 625-636
Number of pages: 12
Journal: Unknown Journal
Volume: 6
Issue number: 9
State: Published - 2013
Externally published: Yes

Fingerprint

Cleaning
Repair
Databases
Preferred numbers
Semantics
Scalability

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Geerts, F., Mecca, G., Papotti, P., & Santoro, D. (2013). The LLUNATIC data-cleaning framework. Unknown Journal, 6(9), 625-636.

@article{464f38080dda47cdae158d8aea581757,
title = "The LLUNATIC data-cleaning framework",
abstract = "Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper, we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.",
author = "Floris Geerts and Giansalvatore Mecca and Paolo Papotti and Donatello Santoro",
year = "2013",
language = "English (US)",
volume = "6",
pages = "625--636",
journal = "Scanning Electron Microscopy",
issn = "0586-5581",
publisher = "Scanning Microscopy International",
number = "9",

}

TY - JOUR

T1 - The LLUNATIC data-cleaning framework

AU - Geerts, Floris

AU - Mecca, Giansalvatore

AU - Papotti, Paolo

AU - Santoro, Donatello

PY - 2013

Y1 - 2013

N2 - Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper, we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.

UR - http://www.scopus.com/inward/record.url?scp=84882696854&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84882696854&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84882696854

VL - 6

SP - 625

EP - 636

JO - Scanning Electron Microscopy

JF - Scanning Electron Microscopy

SN - 0586-5581

IS - 9

ER -