SourceRank: Relevance and trust assessment for deep web sources based on inter-source agreement

Raju Balakrishnan, Subbarao Kambhampati

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Scopus citations

Abstract

We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specifically, they fail to account for the fact that sources can vary in trustworthiness and individual results can vary in importance. In response, we formulate a global measure to calculate relevance and trustworthiness of a source based on agreement between the answers provided by different sources. Agreement is modeled as a graph with sources at the vertices. On this agreement graph, source quality scores, namely SourceRank, are calculated as the stationary visit probability of a weighted random walk. Our experiments on online databases and 675 book sources from Google Base show that SourceRank improves relevance of the results by 25-40% over existing methods and Google Base ranking. SourceRank also decreases linearly with the corruption levels of the sources.
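The random-walk step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how agreement weights are computed or whether the walk is smoothed, so the `source_rank` function, the `damping` parameter, the edge direction (weight on edge i -> j meaning how strongly source j's answers agree with source i's), and the toy weights below are all assumptions made for illustration.

```python
import numpy as np


def source_rank(agreement, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary visit probabilities of a weighted random walk.

    agreement[i][j] is a hypothetical non-negative weight on edge i -> j,
    read here as how strongly source j's answers agree with source i's.
    """
    w = np.asarray(agreement, dtype=float)
    n = w.shape[0]
    row_sums = w.sum(axis=1, keepdims=True)
    # Normalize each row into transition probabilities; a row with no
    # outgoing agreement falls back to a uniform jump.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    transition = np.where(row_sums > 0, w / safe_sums, 1.0 / n)
    # Smoothing with a uniform jump is an assumption of this sketch; it
    # keeps the chain irreducible so the stationary distribution is unique.
    transition = damping * transition + (1.0 - damping) / n
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = rank @ transition  # one step of power iteration
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


# Toy agreement graph (made-up weights): sources 0 and 1 agree strongly
# with each other, while source 2 agrees only weakly with either.
agreement = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
ranks = source_rank(agreement)
```

In this toy graph the weakly agreed-with source ends up with the lowest score, which matches the intuition in the abstract that a source whose answers few others corroborate should be trusted less.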

Original language: English (US)
Title of host publication: Proceedings of the 19th International Conference on World Wide Web, WWW '10
Pages: 1055-1056
Number of pages: 2
DOIs: https://doi.org/10.1145/1772690.1772801
State: Published - Jul 20 2010
Event: 19th International World Wide Web Conference, WWW2010 - Raleigh, NC, United States
Duration: Apr 26 2010 to Apr 30 2010

Publication series

Name: Proceedings of the 19th International Conference on World Wide Web, WWW '10

Other

Other: 19th International World Wide Web Conference, WWW2010
Country: United States
City: Raleigh, NC
Period: 4/26/10 to 4/30/10

Keywords

  • deep web
  • source selection
  • source trust
  • web databases

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Balakrishnan, R., & Kambhampati, S. (2010). SourceRank: Relevance and trust assessment for deep web sources based on inter-source agreement. In Proceedings of the 19th International Conference on World Wide Web, WWW '10 (pp. 1055-1056). (Proceedings of the 19th International Conference on World Wide Web, WWW '10). https://doi.org/10.1145/1772690.1772801