Abstract

RDF/XML has been widely recognized as the standard for annotating online Web documents and for transforming the HTML Web into the so-called Semantic Web. To enable widespread usability of the Semantic Web, there is a need to bootstrap large, rich, and up-to-date domain ontologies that organize the most relevant concepts, their relationships, and their instances. In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant Web sites provided by the user. We develop algorithms that detect and utilize HTML regularities in the Web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomical relationships. We also extract semi-structured concept instances annotated with their labels whenever they are available. Experimental evaluation for the News and Hotels domains indicates that our algorithms can bootstrap and populate domain-specific ontologies with high precision and recall.

Original language: English (US)
Title of host publication: Proceedings of the 1st International Conference on Semantic Web and Databases, SWDB 2003
Publisher: Association for Computing Machinery, Inc
Pages: 245-262
Number of pages: 18
State: Published - 2003
Event: 1st International Conference on Semantic Web and Databases, SWDB 2003 - Berlin, Germany
Duration: Sep 7 2003 to Sep 8 2003

Other

Other: 1st International Conference on Semantic Web and Databases, SWDB 2003
Country: Germany
City: Berlin
Period: 9/7/03 to 9/8/03

Fingerprint

World Wide Web
Ontology
Websites
HTML
Semantic Web
XML
Hotels
Labels
Semantics

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications

Cite this

Davulcu, H., Vadrevu, S., & Nagarajan, S. (2003). OntoMiner: Bootstrapping and populating ontologies from domain specific web sites. In Proceedings of the 1st International Conference on Semantic Web and Databases, SWDB 2003 (pp. 245-262). Association for Computing Machinery, Inc.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

@inproceedings{b94221387a0542ec999a12c0fd529403,
title = "OntoMiner: Bootstrapping and populating ontologies from domain specific web sites",
abstract = "RDF/XML has been widely recognized as the standard for annotating online Web documents and for transforming the HTML Web into the so-called Semantic Web. To enable widespread usability of the Semantic Web, there is a need to bootstrap large, rich, and up-to-date domain ontologies that organize the most relevant concepts, their relationships, and their instances. In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant Web sites provided by the user. We develop algorithms that detect and utilize HTML regularities in the Web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomical relationships. We also extract semi-structured concept instances annotated with their labels whenever they are available. Experimental evaluation for the News and Hotels domains indicates that our algorithms can bootstrap and populate domain-specific ontologies with high precision and recall.",
author = "Hasan Davulcu and Srinivas Vadrevu and Saravanakumar Nagarajan",
year = "2003",
language = "English (US)",
pages = "245--262",
booktitle = "Proceedings of the 1st International Conference on Semantic Web and Databases, SWDB 2003",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - OntoMiner

T2 - Bootstrapping and populating ontologies from domain specific web sites

AU - Davulcu, Hasan

AU - Vadrevu, Srinivas

AU - Nagarajan, Saravanakumar

PY - 2003

Y1 - 2003

N2 - RDF/XML has been widely recognized as the standard for annotating online Web documents and for transforming the HTML Web into the so-called Semantic Web. To enable widespread usability of the Semantic Web, there is a need to bootstrap large, rich, and up-to-date domain ontologies that organize the most relevant concepts, their relationships, and their instances. In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant Web sites provided by the user. We develop algorithms that detect and utilize HTML regularities in the Web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomical relationships. We also extract semi-structured concept instances annotated with their labels whenever they are available. Experimental evaluation for the News and Hotels domains indicates that our algorithms can bootstrap and populate domain-specific ontologies with high precision and recall.

UR - http://www.scopus.com/inward/record.url?scp=84899383095&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899383095&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84899383095

SP - 245

EP - 262

BT - Proceedings of the 1st International Conference on Semantic Web and Databases, SWDB 2003

PB - Association for Computing Machinery, Inc

ER -