Automatically building probabilistic databases from the web

Lorenzo Blanco, Mirko Bronzi, Valter Crescenzi, Paolo Merialdo, Paolo Papotti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

A significant number of web sites publish structured data about recognizable concepts (such as stock quotes, movies, and restaurants). This creates a great opportunity to build applications that rely on huge amounts of data taken from the Web. We present an automatic, domain-independent system that performs all the steps required to benefit from these data: it discovers data-intensive web sites containing information about an entity of interest, extracts and integrates the published data, and finally performs a probabilistic analysis to characterize the imprecision of the data and the accuracy of the sources. The results of the processing can be used to populate a probabilistic database.
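The final step named in the abstract, a probabilistic analysis of conflicting values and source accuracy, can be illustrated with a minimal sketch. It assumes a simple accuracy-weighted voting scheme iterated for a fixed number of rounds; this is an illustrative stand-in, not necessarily the model used in the paper. The function name `estimate_beliefs`, the parameters `rounds` and `initial_accuracy`, and the sample sites and quotes are all hypothetical.

```python
from collections import defaultdict


def estimate_beliefs(observations, rounds=20, initial_accuracy=0.8):
    """Jointly estimate value probabilities and source accuracies.

    observations maps source -> {object: value asserted by that source}.
    Assumption: a plain accuracy-weighted vote, repeated for a fixed
    number of rounds (illustrative only, not the paper's method).
    """
    accuracy = {source: initial_accuracy for source in observations}
    beliefs = {}
    for _ in range(rounds):
        # Each source votes for the values it asserts, weighted by its accuracy.
        votes = defaultdict(lambda: defaultdict(float))
        for source, claims in observations.items():
            for obj, value in claims.items():
                votes[obj][value] += accuracy[source]

        # Normalize the votes per object into a probability distribution.
        beliefs = {}
        for obj, candidates in votes.items():
            total = sum(candidates.values())
            if total > 0:
                beliefs[obj] = {v: w / total for v, w in candidates.items()}
            else:
                # Degenerate case: fall back to a uniform distribution.
                beliefs[obj] = {v: 1.0 / len(candidates) for v in candidates}

        # A source's accuracy is the mean probability of the values it asserts.
        for source, claims in observations.items():
            if claims:
                accuracy[source] = sum(
                    beliefs[obj][value] for obj, value in claims.items()
                ) / len(claims)
    return beliefs, accuracy


# Hypothetical data: three sites publishing stock quotes, one disagreeing.
observations = {
    "siteA": {"AAPL": "350.1", "GOOG": "590.0"},
    "siteB": {"AAPL": "350.1", "GOOG": "590.0"},
    "siteC": {"AAPL": "351.0", "GOOG": "590.0"},
}
beliefs, accuracy = estimate_beliefs(observations)
for obj, distribution in beliefs.items():
    for value, probability in distribution.items():
        # Each line corresponds to one candidate tuple of a probabilistic table.
        print(obj, value, round(probability, 3))
```

Each `(object, value, probability)` triple printed at the end corresponds to one uncertain tuple that could populate a probabilistic database, while `accuracy` holds the per-source scores. In this sample the value reported by two sites receives the higher probability, and the dissenting site ends up with the lower accuracy.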

Original language: English (US)
Title of host publication: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011
Pages: 185-188
Number of pages: 4
State: Published - 2011
Externally published: Yes
Event: 20th International Conference Companion on World Wide Web, WWW 2011 - Hyderabad, India
Duration: Mar 28 2011 → Apr 1 2011

Publication series

Name: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011

Other

Other: 20th International Conference Companion on World Wide Web, WWW 2011
Country/Territory: India
City: Hyderabad
Period: 3/28/11 → 4/1/11

Keywords

  • data integration
  • probabilistic data
  • web data extraction

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
