Regularized discriminant analysis for high dimensional, low sample size data

Jieping Ye, Tie Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

25 Scopus citations

Abstract

Linear and Quadratic Discriminant Analysis have been used widely in many areas of data mining, machine learning, and bioinformatics. Friedman proposed a compromise between Linear and Quadratic Discriminant Analysis, called Regularized Discriminant Analysis (RDA), which has been shown to be more flexible in dealing with various class distributions. RDA applies regularization through two regularization parameters, which are chosen jointly to maximize classification performance. The optimal pair of parameters is commonly estimated via cross-validation from a set of candidate pairs. This search is computationally prohibitive for high dimensional data, especially when the candidate set is large, which limits the application of RDA to low dimensional data. In this paper, a novel algorithm for RDA is presented for high dimensional data. It can efficiently estimate the optimal regularization parameters from a large set of parameter candidates. Experiments on a variety of datasets confirm the claimed theoretical estimate of the efficiency, and also show that, for a properly chosen pair of regularization parameters, RDA performs favorably in classification in comparison with other existing classification methods.
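To make the two-parameter setup concrete, the following is a minimal NumPy sketch of Friedman's RDA covariance estimate combined with the brute-force cross-validated grid search that the abstract describes as costly. It is not the efficient algorithm proposed in the paper. The function names (rda_fit, rda_predict, cv_select), the parameter names lam and gamma, and the fold count are illustrative assumptions. The sketch also assumes each regularized covariance is invertible, which generally requires gamma > 0 when the dimension exceeds the sample size.

```python
import numpy as np

def rda_fit(X, y, lam, gamma):
    """Estimate class means, regularized covariances, and priors.

    lam blends each class scatter with the pooled scatter
    (lam=0: QDA-like, lam=1: LDA-like); gamma shrinks the result
    toward a scaled identity matrix.
    """
    classes = np.unique(y)
    n, p = X.shape
    means, scatters, counts = [], [], []
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        D = Xc - mu
        means.append(mu)
        scatters.append(D.T @ D)  # unnormalized class scatter matrix
        counts.append(len(Xc))
    S_pooled = sum(scatters)
    covs = []
    for S_k, n_k in zip(scatters, counts):
        # Step 1: blend class scatter with pooled scatter.
        sigma = ((1 - lam) * S_k + lam * S_pooled) / ((1 - lam) * n_k + lam * n)
        # Step 2: shrink toward (trace / p) * identity.
        sigma = (1 - gamma) * sigma + gamma * (np.trace(sigma) / p) * np.eye(p)
        covs.append(sigma)
    return classes, np.array(means), covs, np.array(counts) / n

def rda_predict(X, classes, means, covs, priors):
    """Assign each row of X to the class with the highest Gaussian score."""
    scores = []
    for mu, sigma, prior in zip(means, covs, priors):
        D = X - mu
        _, logdet = np.linalg.slogdet(sigma)
        # Squared Mahalanobis distance of each row to the class mean.
        maha = np.sum(D.T * np.linalg.solve(sigma, D.T), axis=0)
        scores.append(-0.5 * maha - 0.5 * logdet + np.log(prior))
    return classes[np.argmax(np.vstack(scores), axis=0)]

def cv_select(X, y, lams, gammas, n_folds=5, seed=0):
    """Naive grid search: refit the model for every pair and every fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    best = (None, None, -1.0)
    for lam in lams:
        for gamma in gammas:
            correct = 0
            for fold in folds:
                train = np.ones(len(y), dtype=bool)
                train[fold] = False
                model = rda_fit(X[train], y[train], lam, gamma)
                correct += np.sum(rda_predict(X[fold], *model) == y[fold])
            acc = correct / len(y)
            if acc > best[2]:
                best = (lam, gamma, acc)
    return best  # (lam, gamma, cross-validated accuracy)
```

Calling cv_select(X, y, lams, gammas) refits the model once per candidate pair per fold, and each fit forms and solves with a p-by-p matrix. That repeated cost in the dimension p is what makes the naive search expensive for high dimensional data.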

Original language: English (US)
Title of host publication: KDD 2006
Subtitle of host publication: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 454-463
Number of pages: 10
ISBN (Print): 1595933395, 9781595933393
State: Published - 2006
Event: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Philadelphia, PA, United States
Duration: Aug 20 2006 to Aug 23 2006

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2006

Conference

Conference: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: Philadelphia, PA
Period: 8/20/06 to 8/23/06

Keywords

  • Cross-validation
  • Dimensionality reduction
  • Quadratic Discriminant Analysis
  • Regularization

ASJC Scopus subject areas

  • Software
  • Information Systems

