Efficient model selection for regularized linear discriminant analysis

Jieping Ye; Tao Xiong; Qi Li; Ravi Janardan; Jinbo Bi; Vladimir Cherkassky; Chandra Kambhamettu

doi:10.1145/1183614.1183691

Computing and Augmented Intelligence, School of (IAFSE-SCAI)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

49 Scopus citations

Abstract

Classical Linear Discriminant Analysis (LDA) is not applicable for small sample size problems due to the singularity of the scatter matrices involved. Regularized LDA (RLDA) provides a simple strategy to overcome the singularity problem by applying a regularization term, which is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates, with approximately the same cost as running RLDA a small number of times. Thus it facilitates efficient model selection for RLDA.

An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is presented. More specifically, RLDA is shown to approach ULDA when the regularization value tends to zero. That is, RLDA without any regularization is equivalent to ULDA. It can be further shown that ULDA maps all data points from the same class to a common point, under a mild condition which has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA performs favorably in classification, in comparison with ULDA, as well as other existing LDA-based algorithms and Support Vector Machines (SVM).
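The idea of evaluating many regularization candidates for roughly the price of a few RLDA runs can be illustrated with a small NumPy sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes RLDA is posed as the generalized eigenproblem S_b g = mu (S_t + lambda I) g, with S_t the total and S_b the between-class scatter matrix, and it simply reuses one thin SVD of the centered data for every candidate lambda. The names `scatter_factors` and `rlda_path`, and the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def scatter_factors(X, y):
    """Return (Ht, Hb) such that S_t = Ht @ Ht.T and S_b = Hb @ Hb.T."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Ht = (X - mean).T  # d x n, centered samples as columns
    Hb = np.stack(
        [np.sqrt(np.sum(y == c)) * (X[y == c].mean(axis=0) - mean)
         for c in np.unique(y)],
        axis=1,
    )  # d x k, weighted class-mean deviations as columns
    return Ht, Hb


def rlda_path(X, y, lambdas):
    """Solve S_b g = mu (S_t + lam I) g for each lam, sharing one SVD.

    Assumed formulation (not taken from the source). Returns a dict
    mapping lam -> (G, mu), where the columns of G are the discriminant
    directions and mu holds the matching generalized eigenvalues.
    """
    Ht, Hb = scatter_factors(X, y)
    # The expensive step, done once: thin SVD of the centered data.
    U, s, _ = np.linalg.svd(Ht, full_matrices=False)
    keep = s > 1e-10 * s.max()  # drop numerically zero directions
    U, s = U[:, keep], s[keep]
    B = U.T @ Hb  # between-class factor in the reduced basis
    q = Hb.shape[1] - 1  # at most k - 1 useful directions
    path = {}
    for lam in lambdas:
        # Only a small (rank x k) SVD is needed per candidate.
        w = 1.0 / np.sqrt(s**2 + lam)
        P, sig, _ = np.linalg.svd(w[:, None] * B, full_matrices=False)
        G = U @ (w[:, None] * P[:, :q])
        path[lam] = (G, sig[:q] ** 2)
    return path


# Synthetic small-sample example: 18 points, 40 features, 3 classes.
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1, 2], 6)
X_demo = rng.normal(size=(18, 40)) + 0.5 * y_demo[:, None]
demo_path = rlda_path(X_demo, y_demo, [1e-9, 0.1, 1.0, 10.0])
```

In a cross-validation loop one would call `rlda_path` once per fold and score every candidate from the returned dictionary, rather than refitting for each lambda. In this more-features-than-samples setting the eigenvalues for the tiny lambda come out very close to 1, meaning the within-class scatter of the projected training data almost vanishes, which is consistent with the class-collapse behaviour the abstract attributes to ULDA.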

Original language	English (US)
Title of host publication	Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006
Pages	532-539
Number of pages	8
DOIs	https://doi.org/10.1145/1183614.1183691
State	Published - 2006
Event	15th ACM Conference on Information and Knowledge Management, CIKM 2006 - Arlington, VA, United States Duration: Nov 6 2006 → Nov 11 2006

Publication series

Name	International Conference on Information and Knowledge Management, Proceedings

Other

Other	15th ACM Conference on Information and Knowledge Management, CIKM 2006
Country/Territory	United States
City	Arlington, VA
Period	11/6/06 → 11/11/06

Keywords

Dimension reduction
Linear discriminant analysis
Model selection
Regularization

ASJC Scopus subject areas

General Decision Sciences
General Business, Management and Accounting

Access to Document

10.1145/1183614.1183691

Cite this

Ye, J., Xiong, T., Li, Q., Janardan, R., Bi, J., Cherkassky, V., & Kambhamettu, C. (2006). Efficient model selection for regularized linear discriminant analysis. In Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006 (pp. 532-539). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/1183614.1183691

Efficient model selection for regularized linear discriminant analysis. / Ye, Jieping; Xiong, Tao; Li, Qi et al.
Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006. 2006. p. 532-539 (International Conference on Information and Knowledge Management, Proceedings).

Ye, J, Xiong, T, Li, Q, Janardan, R, Bi, J, Cherkassky, V & Kambhamettu, C 2006, Efficient model selection for regularized linear discriminant analysis. in Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006. International Conference on Information and Knowledge Management, Proceedings, pp. 532-539, 15th ACM Conference on Information and Knowledge Management, CIKM 2006, Arlington, VA, United States, 11/6/06. https://doi.org/10.1145/1183614.1183691

@inproceedings{950702d018834d0aaf312f3180d238cd,

title = "Efficient model selection for regularized linear discriminant analysis",

abstract = "Classical Linear Discriminant Analysis (LDA) is not applicable for small sample size problems due to the singularity of the scatter matrices involved. Regularized LDA (RLDA) provides a simple strategy to overcome the singularity problem by applying a regularization term, which is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates, with approximately the same cost as running RLDA a small number of times. Thus it facilitates efficient model selection for RLDA. An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is presented. More specifically, RLDA is shown to approach ULDA when the regularization value tends to zero. That is, RLDA without any regularization is equivalent to ULDA. It can be further shown that ULDA maps all data points from the same class to a common point, under a mild condition which has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA performs favorably in classification, in comparison with ULDA, as well as other existing LDA-based algorithms and Support Vector Machines (SVM).",

keywords = "Dimension reduction, Linear discriminant analysis, Model selection, Regularization",

author = "Jieping Ye and Tao Xiong and Qi Li and Ravi Janardan and Jinbo Bi and Vladimir Cherkassky and Chandra Kambhamettu",

year = "2006",

doi = "10.1145/1183614.1183691",

language = "English (US)",

isbn = "1595934332",

series = "International Conference on Information and Knowledge Management, Proceedings",

pages = "532--539",

booktitle = "Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006",

}

TY - GEN

T1 - Efficient model selection for regularized linear discriminant analysis

AU - Ye, Jieping

AU - Xiong, Tao

AU - Li, Qi

AU - Janardan, Ravi

AU - Bi, Jinbo

AU - Cherkassky, Vladimir

AU - Kambhamettu, Chandra

PY - 2006

Y1 - 2006

N2 - Classical Linear Discriminant Analysis (LDA) is not applicable for small sample size problems due to the singularity of the scatter matrices involved. Regularized LDA (RLDA) provides a simple strategy to overcome the singularity problem by applying a regularization term, which is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates, with approximately the same cost as running RLDA a small number of times. Thus it facilitates efficient model selection for RLDA. An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is presented. More specifically, RLDA is shown to approach ULDA when the regularization value tends to zero. That is, RLDA without any regularization is equivalent to ULDA. It can be further shown that ULDA maps all data points from the same class to a common point, under a mild condition which has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA performs favorably in classification, in comparison with ULDA, as well as other existing LDA-based algorithms and Support Vector Machines (SVM).

AB - Classical Linear Discriminant Analysis (LDA) is not applicable for small sample size problems due to the singularity of the scatter matrices involved. Regularized LDA (RLDA) provides a simple strategy to overcome the singularity problem by applying a regularization term, which is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates, with approximately the same cost as running RLDA a small number of times. Thus it facilitates efficient model selection for RLDA. An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification, is presented. More specifically, RLDA is shown to approach ULDA when the regularization value tends to zero. That is, RLDA without any regularization is equivalent to ULDA. It can be further shown that ULDA maps all data points from the same class to a common point, under a mild condition which has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. The theoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA performs favorably in classification, in comparison with ULDA, as well as other existing LDA-based algorithms and Support Vector Machines (SVM).

KW - Dimension reduction

KW - Linear discriminant analysis

KW - Model selection

KW - Regularization

UR - http://www.scopus.com/inward/record.url?scp=34547621415&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34547621415&partnerID=8YFLogxK

U2 - 10.1145/1183614.1183691

DO - 10.1145/1183614.1183691

M3 - Conference contribution

AN - SCOPUS:34547621415

SN - 1595934332

SN - 9781595934338

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 532

EP - 539

BT - Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006

T2 - 15th ACM Conference on Information and Knowledge Management, CIKM 2006

Y2 - 6 November 2006 through 11 November 2006

ER -
