Redundancy based feature selection for microarray data

Lei Yu, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

155 Scopus citations

Abstract

In gene expression microarray data analysis, selecting a small number of discriminative genes from thousands of genes is an important problem for accurate classification of diseases or phenotypes. The problem becomes particularly challenging due to the large number of features (genes) and the small sample size. Traditional gene selection methods often select the top-ranked genes according to their individual discriminative power without handling the high degree of redundancy among the genes. Recent research shows that removing redundant genes from the selected ones can achieve a better representation of the characteristics of the targeted phenotypes and lead to improved classification accuracy. Hence, we study in this paper the relationship between feature relevance and redundancy and propose an efficient method that can effectively remove redundant genes. The efficiency and effectiveness of our method in comparison with representative methods have been demonstrated through an empirical study using public microarray data sets.
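The contrast the abstract draws between plain top-ranked selection and redundancy-aware selection can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state which relevance or redundancy measures the paper uses, so the sketch assumes absolute Pearson correlation for both. The function name `select_genes`, the thresholds `min_relevance` and `max_redundancy`, and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_genes(expr, labels, min_relevance=0.5, max_redundancy=0.9):
    """Greedy relevance-then-redundancy filter (illustrative assumption only).

    expr maps a gene name to its expression values, one per sample.
    Relevance is |corr(gene, labels)|; redundancy is |corr(gene, selected gene)|.
    """
    relevance = {g: abs(pearson(v, labels)) for g, v in expr.items()}
    # Visit genes from most to least relevant.
    ranked = sorted(expr, key=lambda g: relevance[g], reverse=True)
    selected = []
    for g in ranked:
        if relevance[g] < min_relevance:
            break  # every remaining gene is even less relevant
        # Keep the gene only if it is not too similar to one already kept.
        if all(abs(pearson(expr[g], expr[s])) < max_redundancy for s in selected):
            selected.append(g)
    return selected


# Toy data (made up): six samples, two classes.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # strongly relevant
    "g2": [2.0, 2.6, 1.6, 6.0, 6.4, 5.6],  # noisy copy of g1, hence redundant
    "g3": [5.0, 4.0, 2.0, 1.0, 3.0, 1.0],  # moderately relevant, distinct from g1
    "g4": [2.0, 3.0, 2.5, 2.5, 3.0, 2.0],  # unrelated to the labels
}

selected = select_genes(expr, labels)
print(selected)
```

On this toy data, ranking by relevance alone would put g1 and g2 on top even though g2 carries almost the same signal as g1. The redundancy check skips g2 and keeps g3 instead.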

Original language: English (US)
Title of host publication: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 737-742
Number of pages: 6
ISBN (Print): 1581138881, 9781581138887
DOIs
State: Published - 2004
Event: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Seattle, WA, United States
Duration: Aug 22 2004 - Aug 25 2004

Publication series

Name: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: Seattle, WA
Period: 8/22/04 - 8/25/04

Keywords

  • Feature redundancy
  • Gene selection
  • Microarray data

ASJC Scopus subject areas

  • Engineering (all)
