WebDB: A system for querying semi-structured data on the web

Wen Syan Li, Junho Shim, K. Selçuk Candan

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

The World-Wide Web can be viewed as a collection of semi-structured multimedia documents in the form of Web pages connected through hyperlinks. Unlike most Web search engines, which primarily focus on information retrieval functionality, WebDB aims to support comprehensive database-like query functionality, including selection, aggregation, sorting, summary, grouping, and projection. WebDB allows users to access (1) document-level information, such as title, URL, length, keywords types and last modified date; (2) intra-document structures, such as tables, forms, and images; and (3) inter-document linkage information, such as destination URLs and anchors. With these three types of information, comprehensive queries for complex Web-based applications, such as Web mining and Web site management, can be answered. WebDB is based on object-relational concepts: object-oriented modeling and a relational query language. In this paper, we present the data model, language, and implementation of WebDB. We also present a novel visual query/browsing interface for the semi-structured Web and Web documents. Our system provides high usability compared with other existing systems.
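As a rough illustration of the kind of database-like query the abstract describes, the sketch below combines document-level information with inter-document linkage information and applies selection, grouping, aggregation, sorting, and projection. It is not WebDB's data model or query language: it uses Python's standard `sqlite3` module and plain SQL, and the table names (`document`, `link`), the column names, the sample rows, and the helper functions are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical document and link tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE document (url TEXT PRIMARY KEY, title TEXT, length INTEGER);
        CREATE TABLE link (src_url TEXT, dest_url TEXT, anchor TEXT);
        """
    )
    # Document-level information (made-up sample rows).
    conn.executemany(
        "INSERT INTO document VALUES (?, ?, ?)",
        [
            ("http://example.org/a", "Page A", 1200),
            ("http://example.org/b", "Page B", 800),
            ("http://example.org/c", "Page C", 300),
        ],
    )
    # Inter-document linkage information (made-up sample rows).
    conn.executemany(
        "INSERT INTO link VALUES (?, ?, ?)",
        [
            ("http://example.org/a", "http://example.org/b", "to B"),
            ("http://example.org/a", "http://example.org/c", "to C"),
            ("http://example.org/b", "http://example.org/c", "also to C"),
        ],
    )
    return conn


def outgoing_link_counts(conn, min_length):
    """Return (title, number of outgoing links) for documents of at least min_length."""
    query = """
        SELECT d.title, COUNT(l.dest_url) AS n_links   -- projection + aggregation
        FROM document AS d
        LEFT JOIN link AS l ON l.src_url = d.url
        WHERE d.length >= ?                            -- selection
        GROUP BY d.url, d.title                        -- grouping
        ORDER BY n_links DESC, d.title ASC             -- sorting
    """
    return conn.execute(query, (min_length,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for title, n_links in outgoing_link_counts(conn, 500):
        print(title, n_links)
```

A site-management question such as "which sufficiently long pages link out the most?" then reduces to a single call like `outgoing_link_counts(conn, 500)`. The `LEFT JOIN` keeps pages with no outgoing links in the result, with a count of zero.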

Original language: English (US)
Pages (from-to): 3-33
Number of pages: 31
Journal: Journal of Visual Languages and Computing
Volume: 13
Issue number: 1
State: Published - Feb 2002
Externally published: Yes

Keywords

  • Object-relational DBMS
  • SQL3
  • Semi-structured data
  • Visual user interface
  • WWW
  • Web database
  • Web query language

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Computer Science Applications
