A selective sampling approach to active feature selection

Huan Liu, Hiroshi Motoda, Lei Yu

Research output: Contribution to journal (Article)

107 Scopus citations

Abstract

Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Traditional feature selection methods resort to random sampling when dealing with data sets that contain a huge number of instances. In this paper, we introduce the concept of active feature selection and investigate a selective sampling approach to active feature selection in a filter model setting. We present a formalism of selective sampling based on data variance and apply it to Relief, a widely used feature selection algorithm. Further, we show how it realizes active feature selection and reduces the number of training instances required, achieving time savings without performance deterioration. We design objective evaluation measures of performance, conduct extensive experiments using both synthetic and benchmark data sets, and observe consistent and significant improvement. We suggest some further work based on our study and experiments.
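To make the idea concrete, the following is a minimal Python sketch, not the authors' implementation. `relief` is a basic two-class Relief weight update (nearest hit and nearest miss per sampled instance). `variance_partition_sample` is a hypothetical sampler: the abstract says only that selective sampling is "based on data variance", so the partitioning scheme (split on the highest-variance feature at the median, then draw one instance per bucket), the function names, and the `bucket_size` parameter are all assumptions made for illustration.

```python
import random


def relief(X, y, sample_idx):
    """Basic two-class Relief: weight features using the sampled row indices."""
    n_features = len(X[0])
    # Per-feature value ranges normalise differences into [0, 1].
    spans = []
    for f in range(n_features):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(X[a][f] - X[b][f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(n_features))

    weights = [0.0] * n_features
    m = len(sample_idx)
    for i in sample_idx:
        hits = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        misses = [j for j in range(len(X)) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot score an instance without both neighbours
        hit = min(hits, key=lambda j: dist(i, j))
        miss = min(misses, key=lambda j: dist(i, j))
        # Reward features that separate classes, penalise those that
        # vary within a class.
        for f in range(n_features):
            weights[f] += (diff(f, i, miss) - diff(f, i, hit)) / m
    return weights


def variance_partition_sample(X, bucket_size, rng):
    """Hypothetical selective sampler (an assumption, see lead-in):
    recursively split at the median of the highest-variance feature,
    then pick one instance from each resulting bucket."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")

    def variance(idx, f):
        vals = [X[i][f] for i in idx]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    def split(idx):
        if len(idx) <= bucket_size:
            return [idx]
        f = max(range(len(X[0])), key=lambda g: variance(idx, g))
        ordered = sorted(idx, key=lambda i: X[i][f])
        mid = len(ordered) // 2
        return split(ordered[:mid]) + split(ordered[mid:])

    return [rng.choice(bucket) for bucket in split(list(range(len(X))))]


# Toy data (illustrative only): feature 0 decides the class, feature 1 is noise.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

sample = variance_partition_sample(X, bucket_size=8, rng=rng)
weights = relief(X, y, sample)
print(len(sample), weights)
```

Passing `rng.sample(range(len(X)), len(sample))` as `sample_idx` instead gives the random-sampling baseline with the same number of instances, which is the comparison the abstract describes.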

Original language: English (US)
Pages (from-to): 49-74
Number of pages: 26
Journal: Artificial Intelligence
Volume: 159
Issue number: 1-2
State: Published - Nov 1 2004

Keywords

  • Dimensionality reduction
  • Feature selection and ranking
  • Learning
  • Sampling

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence