Mining "hidden phrase" definitions from the web

Hung V. Nguyen, P. Velamuru, D. Kolippakkam, Hasan Davulcu, Huan Liu, M. Ates

Research output: Contribution to journal > Article

5 Citations (Scopus)

Abstract

Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phrases in order to improve their placement and ranking. A "hidden phrase" is defined as a phrase that occurs in the META tag of a Web page but not in its body. In this paper we present an algorithm that mines the definitions of hidden phrases from Web documents. Phrase definitions allow (i) publishers to find relevant phrases with high query frequency, and (ii) search engines to test whether the content of the body of a document matches the phrases. We use co-occurrence clustering and association rule mining algorithms to learn phrase definitions from high-dimensional data sets. We also provide experimental results.
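The definition of a "hidden phrase" above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that phrases are the comma-separated entries of a `<meta name="keywords">` tag and that "occurs in the body" means a case-insensitive substring match against the visible body text. The names `MetaBodySplitter` and `hidden_phrases` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class MetaBodySplitter(HTMLParser):
    """Collects META keyword phrases and visible body text separately."""

    def __init__(self):
        super().__init__()
        self.meta_phrases = []   # phrases from <meta name="keywords">
        self.body_chunks = []    # text fragments found inside <body>
        self._in_body = False
        self._skip_depth = 0     # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            # Assumption: phrases are comma-separated in the content attribute.
            content = attrs.get("content") or ""
            self.meta_phrases += [
                p.strip().lower() for p in content.split(",") if p.strip()
            ]
        elif tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that a reader would see in the page body.
        if self._in_body and not self._skip_depth:
            self.body_chunks.append(data)


def hidden_phrases(html):
    """Return META keyword phrases that never appear in the body text."""
    parser = MetaBodySplitter()
    parser.feed(html)
    parser.close()
    # Normalise whitespace and case before the substring test.
    body = " ".join(" ".join(parser.body_chunks).lower().split())
    return [p for p in parser.meta_phrases if p not in body]


page = (
    '<html><head><meta name="keywords" content="cheap flights, hotel deals">'
    "</head><body><p>Book hotel deals today.</p></body></html>"
)
print(hidden_phrases(page))  # ['cheap flights']
```

Substring matching is a deliberate simplification: a short phrase can match inside a longer word, so a tokenised comparison would be more faithful for real pages.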

Original language: English (US)
Pages (from-to): 156-165
Number of pages: 10
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2642
State: Published - 2003

Fingerprint

Mining
Search Engine
Association rules
Search engines
Cluster Analysis
Websites
Association Rule Mining
High-dimensional Data
Placement
Ranking
Clustering
Query
Experimental Results
Datasets

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Mining "hidden phrase" definitions from the web. / Nguyen, Hung V.; Velamuru, P.; Kolippakkam, D.; Davulcu, Hasan; Liu, Huan; Ates, M.

In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 2642, 2003, p. 156-165.

@article{f9aba66ec76a457bbe113bb42269f6f9,
title = "Mining {"}hidden phrase{"} definitions from the web",
abstract = "Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phrases in order to improve their placement and ranking. A {"}hidden phrase{"} is defined as a phrase that occurs in the META tag of a Web page but not in its body. In this paper we present an algorithm that mines the definitions of hidden phrases from the Web documents. Phrase definitions allow (i) publishers to find relevant phrases with high query frequency, and, (ii) search engines to test if the content of the body of a document matches the phrases. We use co-occurrence clustering and association rule mining algorithms to learn phrase definitions from high-dimensional data sets. We also provide experimental results.",
author = "Nguyen, {Hung V.} and P. Velamuru and D. Kolippakkam and Hasan Davulcu and Huan Liu and M. Ates",
year = "2003",
language = "English (US)",
volume = "2642",
pages = "156--165",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - Mining "hidden phrase" definitions from the web

AU - Nguyen, Hung V.

AU - Velamuru, P.

AU - Kolippakkam, D.

AU - Davulcu, Hasan

AU - Liu, Huan

AU - Ates, M.

PY - 2003

Y1 - 2003

N2 - Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phrases in order to improve their placement and ranking. A "hidden phrase" is defined as a phrase that occurs in the META tag of a Web page but not in its body. In this paper we present an algorithm that mines the definitions of hidden phrases from the Web documents. Phrase definitions allow (i) publishers to find relevant phrases with high query frequency, and, (ii) search engines to test if the content of the body of a document matches the phrases. We use co-occurrence clustering and association rule mining algorithms to learn phrase definitions from high-dimensional data sets. We also provide experimental results.

UR - http://www.scopus.com/inward/record.url?scp=33748519101&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33748519101&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:33748519101

VL - 2642

SP - 156

EP - 165

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -