An optimization criterion for generalized discriminant analysis on undersampled problems

Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park

Research output: Contribution to journal › Article

203 Citations (Scopus)

Abstract

An optimization criterion is presented for discriminant analysis. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) through the use of the pseudoinverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of classical LDA. The optimization problem can be solved analytically by applying the Generalized Singular Value Decomposition (GSVD) technique. The pseudoinverse has been suggested and used for undersampled problems in the past, where the data dimension exceeds the number of data points. The criterion proposed in this paper provides a theoretical justification for this procedure. An approximation algorithm for the GSVD-based approach is also presented. It reduces the computational complexity by finding subclusters of each cluster and uses their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7,000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.
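The core idea in the abstract is to replace the inverse of the (possibly singular) scatter matrices in classical LDA with the pseudoinverse. The sketch below is an illustration of that idea only, not the authors' GSVD-based algorithm: it forms the between-class and total scatter matrices and takes discriminant directions from the top eigenvectors of pinv(St) @ Sb, which remains well-defined when the data dimension exceeds the sample size. The function name `pinv_lda` and all variable names are hypothetical.

```python
import numpy as np

def pinv_lda(X, y, k):
    """Simplified pseudoinverse-based discriminant directions.

    X : (n_samples, d) data matrix; d may exceed n_samples.
    y : (n_samples,) class labels.
    k : number of discriminant directions to return.

    Uses pinv(S_t) @ S_b in place of the classical inv(S_w) @ S_b,
    so singular scatter matrices do not cause a failure.
    """
    n, d = X.shape
    mu = X.mean(axis=0)

    # Total scatter S_t = sum over samples of (x - mu)(x - mu)^T.
    St = np.cov(X, rowvar=False, bias=True) * n

    # Between-class scatter S_b = sum_c n_c (mu_c - mu)(mu_c - mu)^T.
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mu).reshape(-1, 1)
        Sb += Xc.shape[0] * diff @ diff.T

    # Top-k eigenvectors of pinv(S_t) @ S_b give the projection matrix.
    vals, vecs = np.linalg.eig(np.linalg.pinv(St) @ Sb)
    order = np.argsort(-vals.real)[:k]
    return vecs[:, order].real
```

This naive version costs O(d^2) storage for the scatter matrices; the GSVD formulation in the paper avoids forming them explicitly, which is what makes the exact method practical at thousands of dimensions.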

Original language: English (US)
Pages (from-to): 982-994
Number of pages: 13
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 26
Issue number: 8
DOI: 10.1109/TPAMI.2004.37
State: Published - Aug 2004
Externally published: Yes

Fingerprint

Generalized Singular Value Decomposition
Singular Value Decomposition
Discriminant Analysis
Approximation Algorithms
Pseudo-inverse
Optimization
Optimization Problem
Computational Complexity
Decomposition Techniques
Exact Algorithms
Scatter
Centroid
Justification
Exceed
Sample Size
Experiments

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition

Cite this

An optimization criterion for generalized discriminant analysis on undersampled problems. / Ye, Jieping; Janardan, Ravi; Park, Cheong Hee; Park, Haesun.

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 8, 08.2004, p. 982-994.


@article{d12a617ce3af45baaf0f722ec904d5e9,
title = "An optimization criterion for generalized discriminant analysis on undersampled problems",
author = "Jieping Ye and Ravi Janardan and Park, {Cheong Hee} and Haesun Park",
year = "2004",
month = aug,
doi = "10.1109/TPAMI.2004.37",
language = "English (US)",
volume = "26",
pages = "982--994",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "8",

}

TY - JOUR

T1 - An optimization criterion for generalized discriminant analysis on undersampled problems

AU - Ye, Jieping

AU - Janardan, Ravi

AU - Park, Cheong Hee

AU - Park, Haesun

PY - 2004/8

Y1 - 2004/8


UR - http://www.scopus.com/inward/record.url?scp=3242767684&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=3242767684&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2004.37

DO - 10.1109/TPAMI.2004.37

M3 - Article

C2 - 15641729

AN - SCOPUS:3242767684

VL - 26

SP - 982

EP - 994

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 8

ER -