Efficient model selection for regularized linear discriminant analysis

Jieping Ye, Tao Xiong, Qi Li, Ravi Janardan, Jinbo Bi, Vladimir Cherkassky, Chandra Kambhamettu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

Classical Linear Discriminant Analysis (LDA) is not applicable to small sample size problems because the scatter matrices involved are singular. Regularized LDA (RLDA) provides a simple strategy for overcoming the singularity problem by adding a regularization term, whose value is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates at approximately the same cost as running RLDA a small number of times, thus facilitating efficient model selection for RLDA. An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is also presented. More specifically, RLDA is shown to approach ULDA as the regularization value tends to zero; that is, RLDA without any regularization is equivalent to ULDA. It is further shown that ULDA maps all data points from the same class to a common point, under a mild condition that has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA compares favorably in classification with ULDA, as well as with other existing LDA-based algorithms and Support Vector Machines (SVMs).
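To make the model-selection setting concrete, the following is a minimal sketch, in Python with NumPy/SciPy, of RLDA and the brute-force candidate search the abstract describes. Everything here is an illustrative assumption rather than the authors' implementation: the helper names (`rlda_transform`, `nearest_centroid_accuracy`), the synthetic data, and the four-point candidate grid are hypothetical, and the naive loop pays one full eigen-solve per candidate, which is exactly the per-candidate cost the paper's efficient algorithm avoids.

```python
# Minimal, hypothetical sketch of RLDA with brute-force model selection.
# All names here (rlda_transform, nearest_centroid_accuracy, the candidate
# grid, the synthetic data) are illustrative assumptions, not the paper's code.
import numpy as np
from scipy.linalg import eigh


def scatter_matrices(X, y):
    """Between-class scatter S_b and total scatter S_t."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    centered = X - mean
    return Sb, centered.T @ centered


def rlda_transform(X, y, lam, k):
    """Top-k solutions of the generalized eigenproblem S_b g = mu (S_t + lam I) g.
    For lam > 0, S_t + lam*I is nonsingular even when n < d (the small sample
    size case); as lam tends to zero the solution approaches ULDA, per the paper."""
    Sb, St = scatter_matrices(X, y)
    d = X.shape[1]
    # eigh solves the symmetric-definite generalized problem; eigenvalues are
    # returned in ascending order, so the last k columns are the top directions.
    _, G = eigh(Sb, St + lam * np.eye(d))
    return G[:, -k:]


def nearest_centroid_accuracy(Xtr, ytr, Xte, yte, G):
    """Holdout accuracy of nearest-class-centroid classification in G^T x."""
    Ztr, Zte = Xtr @ G, Xte @ G
    classes = np.unique(ytr)
    centroids = np.stack([Ztr[ytr == c].mean(axis=0) for c in classes])
    dists = ((Zte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return (classes[dists.argmin(axis=1)] == yte).mean()


# Synthetic high-dimensional data: n << d, so the scatter matrices are singular.
rng = np.random.default_rng(0)
d, n_per_class, n_classes = 200, 15, 3
means = rng.normal(scale=2.0, size=(n_classes, d))
X = np.vstack([m + rng.normal(size=(n_per_class, d)) for m in means])
y = np.repeat(np.arange(n_classes), n_per_class)
tr = rng.permutation(len(y))[: len(y) // 2]
te = np.setdiff1d(np.arange(len(y)), tr)

# Naive model selection: one full RLDA solve per candidate. The paper's
# algorithm computes the transformations for the whole candidate set at
# roughly the cost of a few such solves.
for lam in [1e-3, 1e-1, 1.0, 10.0]:
    G = rlda_transform(X[tr], y[tr], lam, n_classes - 1)
    acc = nearest_centroid_accuracy(X[tr], y[tr], X[te], y[te], G)
    print(f"lambda = {lam:g}: holdout accuracy = {acc:.2f}")
```

With a realistically large candidate set (hundreds of values), this loop's cost grows linearly in the number of candidates, which is what makes the paper's near-constant-cost alternative attractive.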

Original language: English (US)
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 532-539
Number of pages: 8
ISBN (Print): 1595934332, 9781595934338
DOI: 10.1145/1183614.1183691
State: Published - 2006
Event: 15th ACM Conference on Information and Knowledge Management, CIKM 2006 - Arlington, VA, United States
Duration: Nov 6, 2006 - Nov 11, 2006

Keywords

  • Dimension reduction
  • Linear discriminant analysis
  • Model selection
  • Regularization

ASJC Scopus subject areas

  • Business, Management and Accounting (all)

Cite this

Ye, J., Xiong, T., Li, Q., Janardan, R., Bi, J., Cherkassky, V., & Kambhamettu, C. (2006). Efficient model selection for regularized linear discriminant analysis. In International Conference on Information and Knowledge Management, Proceedings (pp. 532-539). https://doi.org/10.1145/1183614.1183691

@inproceedings{950702d018834d0aaf312f3180d238cd,
title = "Efficient model selection for regularized linear discriminant analysis",
abstract = "Classical Linear Discriminant Analysis (LDA) is not applicable to small sample size problems because the scatter matrices involved are singular. Regularized LDA (RLDA) provides a simple strategy for overcoming the singularity problem by adding a regularization term, whose value is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates at approximately the same cost as running RLDA a small number of times, thus facilitating efficient model selection for RLDA. An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is also presented. More specifically, RLDA is shown to approach ULDA as the regularization value tends to zero; that is, RLDA without any regularization is equivalent to ULDA. It is further shown that ULDA maps all data points from the same class to a common point, under a mild condition that has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA compares favorably in classification with ULDA, as well as with other existing LDA-based algorithms and Support Vector Machines (SVMs).",
keywords = "Dimension reduction, Linear discriminant analysis, Model selection, Regularization",
author = "Jieping Ye and Tao Xiong and Qi Li and Ravi Janardan and Jinbo Bi and Vladimir Cherkassky and Chandra Kambhamettu",
year = "2006",
doi = "10.1145/1183614.1183691",
language = "English (US)",
isbn = "1595934332",
pages = "532--539",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}
