OntoMiner: Bootstrapping ontologies from overlapping domain specific web sites

Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nagarajan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant, overlapping Web sites provided by the user. We develop algorithms that detect and utilize HTML regularities in the Web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomic relationships. We also extract semi-structured concept instances, annotated with their labels whenever these are available. Experimental evaluation for the News, Travel, and Shopping domains indicates that our algorithms can bootstrap and populate domain-specific ontologies with high precision and recall.
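As a rough illustration of the idea of mining key concepts from overlapping sites, the sketch below counts how many sites share each label and keeps the labels that recur often enough. It is not the authors' algorithm: the nested-dictionary input format, the names `iter_labels` and `frequent_concepts`, the `min_support` threshold, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def iter_labels(tree):
    """Yield every label in a nested {label: subtree} dict, lowercased."""
    for label, children in tree.items():
        yield label.strip().lower()
        yield from iter_labels(children)


def frequent_concepts(site_trees, min_support=0.5):
    """Return labels that occur in at least min_support of the sites."""
    counts = Counter()
    for tree in site_trees:
        # A set ensures each label is counted at most once per site.
        counts.update(set(iter_labels(tree)))
    threshold = min_support * len(site_trees)
    return sorted(label for label, n in counts.items() if n >= threshold)


# Hypothetical label hierarchies for three news sites (illustrative only).
sites = [
    {"World": {"Europe": {}, "Asia": {}}, "Sports": {"Soccer": {}}},
    {"world": {"Asia": {}}, "Business": {}, "Sports": {}},
    {"Sports": {"Tennis": {}}, "World": {}, "Weather": {}},
]

key_concepts = frequent_concepts(sites, min_support=0.6)
print(key_concepts)
```

Labels that appear on only one site are dropped here, which loosely reflects why overlapping sites are useful: agreement across sources is taken as evidence that a label names a domain concept.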

Original language: English (US)
Title of host publication: Thirteenth International World Wide Web Conference Proceedings, WWW2004
Pages: 1232-1233
Number of pages: 2
State: Published - Dec 1 2004
Event: Thirteenth International World Wide Web Conference Proceedings, WWW2004 - New York, NY, United States
Duration: May 17 2004 to May 22 2004

Publication series

Name: Thirteenth International World Wide Web Conference Proceedings, WWW2004

Other

Other: Thirteenth International World Wide Web Conference Proceedings, WWW2004
Country: United States
City: New York, NY
Period: 5/17/04 to 5/22/04

Keywords

  • Data Mining
  • Ontology
  • Semantic Web
  • Web Mining

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Davulcu, H., Vadrevu, S., & Nagarajan, S. (2004). OntoMiner: Bootstrapping ontologies from overlapping domain specific web sites. In Thirteenth International World Wide Web Conference Proceedings, WWW2004 (pp. 1232-1233). (Thirteenth International World Wide Web Conference Proceedings, WWW2004).