Active feature selection using classes

Huan Liu, Lei Yu, Manoranjan Dash, Hiroshi Motoda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Scopus citations

Abstract

Feature selection is frequently used in data pre-processing for data mining. When the training data set is too large, sampling is commonly used to overcome the difficulty. This work investigates the applicability of active sampling in feature selection in a filter model setting. Our objective is to partition data by taking advantage of class information so as to achieve the same or better performance for feature selection with fewer but more relevant instances than random sampling. Two versions of active feature selection that employ class information are proposed and empirically evaluated. In comparison with random sampling, we conduct extensive experiments with benchmark data sets, and analyze reasons why class-based active feature selection works in the way it does. The results will help us deal with large data sets and provide ideas to scale up other feature selection algorithms.
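
The abstract does not spell out either of the two proposed algorithms, so the following is only a minimal sketch of the general idea it describes: partition the training instances by class label and draw the sample from each partition, instead of sampling the whole data set uniformly at random. The function name `class_based_sample`, its parameters, and the equal-fraction-per-class rule are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def class_based_sample(instances, labels, fraction, seed=0):
    """Sample roughly `fraction` of the instances from each class partition.

    Hypothetical helper: the per-class quota rule below is an assumption,
    not the procedure evaluated in the paper.
    """
    rng = random.Random(seed)

    # Partition instance indices by class label.
    partitions = defaultdict(list)
    for idx, label in enumerate(labels):
        partitions[label].append(idx)

    # Draw from every partition so that each class is represented,
    # keeping at least one instance per class.
    chosen = []
    for members in partitions.values():
        k = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, k))

    chosen.sort()  # preserve the original instance order
    return [instances[i] for i in chosen], [labels[i] for i in chosen]


if __name__ == "__main__":
    X = list(range(100))
    y = ["a"] * 90 + ["b"] * 10
    X_s, y_s = class_based_sample(X, y, fraction=0.2)
    print(len(X_s), y_s.count("a"), y_s.count("b"))
```

The reduced sample could then be passed to any filter-model feature selector in place of a uniform random sample of the same size. Whether this preserves feature-selection quality is the empirical question the paper studies.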

Original language: English (US)
Title of host publication: Advances in Knowledge Discovery and Data Mining
Editors: Kyuseok Shim, Kyu-Young Whang, Jongwoo Jeon, Jaideep Srivastava
Publisher: Springer Verlag
Pages: 474-485
Number of pages: 12
ISBN (Electronic): 3540047603, 9783540047605
State: Published - Jan 1 2003
Event: 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2003 - Seoul, Korea, Republic of
Duration: Apr 30 2003 to May 2 2003

Publication series

Name: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Volume: 2637
ISSN (Print): 0302-9743

Conference

Conference: 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2003
Country: Korea, Republic of
City: Seoul
Period: 4/30/03 to 5/2/03

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Liu, H., Yu, L., Dash, M., & Motoda, H. (2003). Active feature selection using classes. In K. Shim, K-Y. Whang, J. Jeon, & J. Srivastava (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 474-485). (Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science); Vol. 2637). Springer Verlag.