Discovering web document associations for web site summarization

Kasim Candan, Wen Syan Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Complex web information structures prevent search engines from providing satisfactory context-sensitive retrieval. We see that, to overcome this obstacle, it is essential to use techniques that recover the web authors' intentions and superimpose them with the users' retrieval contexts in summarizing web sites. Therefore, in this paper, we present a framework for discovering implicit associations among web documents for effective web site summarization. In the proposed framework, associations of web documents are induced by the web structure embedding them, as well as by the contents of the documents and the users' interests. We analyze the semantics of document associations and describe an algorithm that captures these semantics for enumerating and ranking possible document associations. We then use these associations in creating context-sensitive summaries of web neighborhoods.

Original language: English (US)
Title of host publication: Data Warehousing and Knowledge Discovery - 3rd International Conference, DaWaK 2001, Proceedings
Editors: Werner Winiwarter, Yahiko Kambayashi, Masatoshi Arikawa
Publisher: Springer Verlag
Pages: 152-161
Number of pages: 10
ISBN (Print): 3540425535, 9783540425533
State: Published - 2001
Event: 3rd International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2001 - Munich, Germany
Duration: Sep 5 2001 to Sep 7 2001

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2114
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2001
Country/Territory: Germany
City: Munich
Period: 9/5/01 to 9/7/01

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
