A selective sampling approach to active feature selection

Huan Liu, Hiroshi Motoda, Lei Yu

Research output: Contribution to journal › Article

103 Citations (Scopus)

Abstract

Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Traditional feature selection methods resort to random sampling in dealing with data sets with a huge number of instances. In this paper, we introduce the concept of active feature selection, and investigate a selective sampling approach to active feature selection in a filter model setting. We present a formalism of selective sampling based on data variance, and apply it to a widely used feature selection algorithm Relief. Further, we show how it realizes active feature selection and reduces the required number of training instances to achieve time savings without performance deterioration. We design objective evaluation measures of performance, conduct extensive experiments using both synthetic and benchmark data sets, and observe consistent and significant improvement. We suggest some further work based on our study and experiments.
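
The abstract names the ingredients (variance-based selective sampling feeding Relief) but not the mechanics. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a kd-tree-style partition that repeatedly splits the data at the median of the highest-variance feature, draws one instance from each resulting bucket, and scores features with the basic two-class, single-nearest-neighbor form of Relief. The names `variance_buckets`, `selective_sample`, `relief_weights`, and the `bucket_size` parameter are hypothetical.

```python
import random
from statistics import pvariance


def variance_buckets(X, idx, bucket_size):
    """Recursively split idx at the median of the highest-variance feature."""
    if len(idx) <= bucket_size:
        return [idx]
    n_features = len(X[0])
    # Assumed splitting rule: pick the feature with the largest variance in this subset.
    f = max(range(n_features), key=lambda j: pvariance([X[i][j] for i in idx]))
    order = sorted(idx, key=lambda i: X[i][f])
    mid = len(order) // 2
    return (variance_buckets(X, order[:mid], bucket_size)
            + variance_buckets(X, order[mid:], bucket_size))


def selective_sample(X, bucket_size, rng):
    """Pick one instance at random from each bucket (assumes bucket_size >= 1)."""
    buckets = variance_buckets(X, list(range(len(X))), bucket_size)
    return [rng.choice(b) for b in buckets]


def relief_weights(X, y, sample_idx):
    """Basic Relief: update feature weights using only the sampled instances."""
    n_features = len(X[0])
    lo = [min(row[j] for row in X) for j in range(n_features)]
    hi = [max(row[j] for row in X) for j in range(n_features)]
    # Scale differences by each feature's range; constant features get span 1.
    span = [(hi[j] - lo[j]) or 1.0 for j in range(n_features)]

    def diff(a, b):
        return [abs(X[a][j] - X[b][j]) / span[j] for j in range(n_features)]

    w = [0.0] * n_features
    for i in sample_idx:
        hit, miss = None, None
        hit_d, miss_d = float("inf"), float("inf")
        for k in range(len(X)):
            if k == i:
                continue
            d = sum(diff(i, k))  # Manhattan distance on scaled features
            if y[k] == y[i]:
                if d < hit_d:
                    hit, hit_d = k, d
            elif d < miss_d:
                miss, miss_d = k, d
        if hit is None or miss is None:
            continue  # no same-class or other-class neighbor exists
        dh, dm = diff(i, hit), diff(i, miss)
        for j in range(n_features):
            # Reward features that separate the nearest miss, penalize those that separate the nearest hit.
            w[j] += (dm[j] - dh[j]) / len(sample_idx)
    return w
```

Calling `relief_weights(X, y, selective_sample(X, 8, random.Random(0)))` scores features from roughly one instance per eight, whereas passing a uniformly random index list of the same length gives the random-sampling baseline the abstract contrasts against.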

Original language: English (US)
Pages (from-to): 49-74
Number of pages: 26
Journal: Artificial Intelligence
Volume: 159
Issue number: 1-2
DOI: 10.1016/j.artint.2004.05.009
State: Published - Nov 2004

Keywords

  • Dimensionality reduction
  • Feature selection and ranking
  • Learning
  • Sampling

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics

Cite this

A selective sampling approach to active feature selection. / Liu, Huan; Motoda, Hiroshi; Yu, Lei.

In: Artificial Intelligence, Vol. 159, No. 1-2, 11.2004, p. 49-74.

@article{077658bde6354cc09b32a8e247f1c0be,
title = "A selective sampling approach to active feature selection",
abstract = "Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Traditional feature selection methods resort to random sampling in dealing with data sets with a huge number of instances. In this paper, we introduce the concept of active feature selection, and investigate a selective sampling approach to active feature selection in a filter model setting. We present a formalism of selective sampling based on data variance, and apply it to a widely used feature selection algorithm Relief. Further, we show how it realizes active feature selection and reduces the required number of training instances to achieve time savings without performance deterioration. We design objective evaluation measures of performance, conduct extensive experiments using both synthetic and benchmark data sets, and observe consistent and significant improvement. We suggest some further work based on our study and experiments.",
keywords = "Dimensionality reduction, Feature selection and ranking, Learning, Sampling",
author = "Huan Liu and Hiroshi Motoda and Lei Yu",
year = "2004",
month = "11",
doi = "10.1016/j.artint.2004.05.009",
language = "English (US)",
volume = "159",
pages = "49--74",
journal = "Artificial Intelligence",
issn = "0004-3702",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR
T1 - A selective sampling approach to active feature selection
AU - Liu, Huan
AU - Motoda, Hiroshi
AU - Yu, Lei
PY - 2004/11
Y1 - 2004/11
AB - Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Traditional feature selection methods resort to random sampling in dealing with data sets with a huge number of instances. In this paper, we introduce the concept of active feature selection, and investigate a selective sampling approach to active feature selection in a filter model setting. We present a formalism of selective sampling based on data variance, and apply it to a widely used feature selection algorithm Relief. Further, we show how it realizes active feature selection and reduces the required number of training instances to achieve time savings without performance deterioration. We design objective evaluation measures of performance, conduct extensive experiments using both synthetic and benchmark data sets, and observe consistent and significant improvement. We suggest some further work based on our study and experiments.

KW - Dimensionality reduction
KW - Feature selection and ranking
KW - Learning
KW - Sampling
UR - http://www.scopus.com/inward/record.url?scp=4644347255&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4644347255&partnerID=8YFLogxK
U2 - 10.1016/j.artint.2004.05.009
DO - 10.1016/j.artint.2004.05.009
M3 - Article
AN - SCOPUS:4644347255
VL - 159
SP - 49
EP - 74
JO - Artificial Intelligence
JF - Artificial Intelligence
SN - 0004-3702
IS - 1-2
ER -