Acclimatizing taxonomic semantics for hierarchical content classification

Lei Tang, Jianping Zhang, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Scopus citations

Abstract

Hierarchical models have been shown to be effective in content classification. However, we observe through empirical study that the performance of a hierarchical model varies with given taxonomies; even a semantically sound taxonomy has potential to change its structure for better classification. By scrutinizing typical cases, we elucidate why a given semantics-based hierarchy does not work well in content classification, and how it could be improved for accurate hierarchical classification. With these understandings, we propose effective localized solutions that modify the given taxonomy for accurate classification. We conduct extensive experiments on both toy and real-world data sets, report improved performance and interesting findings, and provide further analysis of algorithmic issues such as time complexity, robustness, and sensitivity to the number of features.

Original language: English (US)
Title of host publication: KDD 2006
Subtitle of host publication: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 384-393
Number of pages: 10
State: Published - 2006
Event: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Philadelphia, PA, United States
Duration: Aug 20 2006 → Aug 23 2006

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2006

Conference

Conference: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory: United States
City: Philadelphia, PA
Period: 8/20/06 → 8/23/06

Keywords

  • Hierarchical Classification
  • Hierarchical Modeling
  • Taxonomy Adjustment
  • Text Classification

ASJC Scopus subject areas

  • Software
  • Information Systems
