A balanced ensemble approach to weighting classifiers for text classification

Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Haixun Wang, David W. Cheung, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Scopus citations

Abstract

This paper studies the problem of constructing an effective heterogeneous ensemble classifier for text classification. One major challenge is formulating a good combination function, which combines the decisions of the individual classifiers in the ensemble. We show that classification performance is affected by three weight components, all of which should be included in deriving an effective combination function: (1) global effectiveness, which measures the effectiveness of a member classifier in classifying a set of unseen documents; (2) local effectiveness, which measures the effectiveness of a member classifier in classifying the particular domain of an unseen document; and (3) decision confidence, which describes how confident a classifier is in its decision on a specific unseen document. We propose a new balanced combination function, called Dynamic Classifier Weighting (DCW), that incorporates these three components. The empirical study demonstrates that the new combination function is highly effective for text classification.
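
As a rough illustration of how three such weight components might enter a combination function, the sketch below scores each candidate label by summing, over the member classifiers that voted for it, the product of global effectiveness, local effectiveness, and decision confidence. The abstract does not give the DCW formula, so the multiplicative weighting, the `MemberVote` type, and the `combine` function are assumptions made for illustration only and should not be read as the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MemberVote:
    """One member classifier's decision on a single unseen document (hypothetical type)."""
    label: str          # class predicted by this member
    global_eff: float   # e.g. accuracy over a held-out set of documents
    local_eff: float    # e.g. accuracy on held-out documents similar to this one
    confidence: float   # how confident the member is in this particular decision


def combine(votes: List[MemberVote]) -> Tuple[str, Dict[str, float]]:
    """Weighted vote: each member contributes global * local * confidence.

    The product is an assumed weighting, not the DCW formula from the paper.
    """
    if not votes:
        raise ValueError("at least one member vote is required")
    scores: Dict[str, float] = {}
    for v in votes:
        weight = v.global_eff * v.local_eff * v.confidence
        scores[v.label] = scores.get(v.label, 0.0) + weight
    # Pick the label with the largest accumulated weight.
    best = max(scores, key=lambda label: scores[label])
    return best, scores


# Two weaker members vote "sports"; one member that is strong on this
# document's domain and confident in its decision votes "politics".
example_votes = [
    MemberVote("sports", global_eff=0.80, local_eff=0.50, confidence=0.60),
    MemberVote("sports", global_eff=0.70, local_eff=0.40, confidence=0.50),
    MemberVote("politics", global_eff=0.85, local_eff=0.90, confidence=0.90),
]
winner, label_scores = combine(example_votes)
```

In this example a plain majority vote would pick "sports", whereas the weighted combination favors "politics" because the single dissenting member scores higher on all three components.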

Original language: English (US)
Title of host publication: Proceedings - Sixth International Conference on Data Mining, ICDM 2006
Pages: 869-873
Number of pages: 5
State: Published - Dec 1 2006
Event: 6th International Conference on Data Mining, ICDM 2006 - Hong Kong, China
Duration: Dec 18 2006 to Dec 22 2006

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 6th International Conference on Data Mining, ICDM 2006
Country: China
City: Hong Kong
Period: 12/18/06 to 12/22/06

ASJC Scopus subject areas

  • Engineering(all)
