TY - GEN
T1 - Semantic partitioning of web pages
AU - Vadrevu, Srinivas
AU - Gelgi, Fatih
AU - Davulcu, Hasan
PY - 2005
Y1 - 2005
N2 - In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchical content structures. These content structures enable us to automatically annotate labels in the Web pages with their semantic roles, thus yielding meta-data and instance information for the Web pages. Experimental results with the TAP knowledge base and computer science department Web sites, comprising 16,861 Web pages, indicate that our algorithm is able to gather meta-data accurately from various types of Web pages. The algorithm is able to achieve this performance without any domain-specific engineering requirement.
AB - In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchical content structures. These content structures enable us to automatically annotate labels in the Web pages with their semantic roles, thus yielding meta-data and instance information for the Web pages. Experimental results with the TAP knowledge base and computer science department Web sites, comprising 16,861 Web pages, indicate that our algorithm is able to gather meta-data accurately from various types of Web pages. The algorithm is able to achieve this performance without any domain-specific engineering requirement.
UR - http://www.scopus.com/inward/record.url?scp=33744788630&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33744788630&partnerID=8YFLogxK
U2 - 10.1007/11581062_9
DO - 10.1007/11581062_9
M3 - Conference contribution
AN - SCOPUS:33744788630
SN - 3540300171
SN - 9783540300175
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 107
EP - 118
BT - Web Information Systems Engineering, WISE 2005 - 6th International Conference on Web Information Systems Engineering, Proceedings
T2 - 6th International Conference on Web Information Systems Engineering, WISE 2005
Y2 - 20 November 2005 through 22 November 2005
ER -