SmartInt: Using mined attribute dependencies to integrate fragmented web databases

Ravi Gummadi, Anupam Khulbe, Aravind Kalavagattu, Sanil Salvi, Subbarao Kambhampati

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citations

Abstract

Many web databases can be seen as providing partial and overlapping information about entities in the world. To answer queries effectively, we need to integrate the information about individual entities that is fragmented over multiple sources. At first blush this is just the inverse of the traditional database normalization problem: rather than going from a universal relation to normalized tables, we want to reconstruct the universal relation given the tables (sources). The standard way of reconstructing the entities involves joining the tables. Unfortunately, because of the autonomous and decentralized way in which the sources are populated, they often do not have Primary Key-Foreign Key (PK-FK) relationships. While tables do share attributes, direct joins over these shared attributes can reconstruct many spurious entities, seriously compromising precision. We present a unified approach that supports intelligent retrieval over fragmented web databases by mining and using inter-table dependencies. Experiments with the prototype implementation, SmartInt, show that its retrieval strikes a good balance between precision and recall.
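The precision problem described in the abstract can be illustrated with a small sketch. The car data, attribute names, and helper functions below are hypothetical and are not taken from the paper; the code shows only the naive direct join over a shared non-key attribute, not SmartInt's dependency-mining approach.

```python
# Hypothetical universal relation: each dict is one real-world entity.
universal = [
    {"model": "Civic", "engine": "1.8L", "price": 18000},
    {"model": "Civic", "engine": "2.0L", "price": 21000},
    {"model": "Accord", "engine": "2.4L", "price": 24000},
]


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left, right, shared):
    """Join two tables on the attributes they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


# Two autonomous sources, each holding a fragment of every entity.
# "model" is shared by both sources but is not a key in either one.
source_a = project(universal, ["model", "engine"])
source_b = project(universal, ["model", "price"])

# Direct join over the shared attribute.
joined = natural_join(source_a, source_b, ["model"])

# Joined tuples that correspond to real entities.
correct = [t for t in joined if t in universal]
precision = len(correct) / len(joined)
recall = len(correct) / len(universal)

print(len(joined), "joined tuples,", len(correct), "of them real entities")
print("precision:", precision, "recall:", recall)
```

Because "Civic" matches two rows in each source, the join pairs every Civic engine with every Civic price. All three real entities are recovered, but two of the five joined tuples describe cars that do not exist.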

Original language: English (US)
Title of host publication: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011
Pages: 51-52
Number of pages: 2
DOI: https://doi.org/10.1145/1963192.1963219
State: Published - Apr 29 2011
Event: 20th International Conference Companion on World Wide Web, WWW 2011 - Hyderabad, India
Duration: Mar 28 2011 to Apr 1 2011

Publication series

Name: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011

Other

Other: 20th International Conference Companion on World Wide Web, WWW 2011
Country: India
City: Hyderabad
Period: 3/28/11 to 4/1/11

Keywords

  • entity completion
  • loss of pk-fk
  • web databases

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems

Cite this

Gummadi, R., Khulbe, A., Kalavagattu, A., Salvi, S., & Kambhampati, S. (2011). SmartInt: Using mined attribute dependencies to integrate fragmented web databases. In Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 (pp. 51-52). (Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011). https://doi.org/10.1145/1963192.1963219