Discriminative K-means for clustering

Jieping Ye, Zheng Zhao, Mingrui Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

101 Citations (Scopus)

Abstract

We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and clustering. Empirical results have shown its favorable performance in comparison with several other popular clustering algorithms. However, the inherent relationship between subspace selection and clustering in this framework is not well understood, due to the iterative nature of the algorithm. We show in this paper that this iterative subspace selection and clustering is equivalent to kernel K-means with a specific kernel Gram matrix. This provides significant and new insights into the nature of this subspace selection procedure. Based on this equivalence relationship, we propose the Discriminative K-means (DisKmeans) algorithm for simultaneous LDA subspace selection and clustering, as well as an automatic parameter estimation procedure. We also present the nonlinear extension of DisKmeans using kernels. We show that the learning of the kernel matrix over a convex set of pre-specified kernel matrices can be incorporated into the clustering formulation. The connection between DisKmeans and several other clustering algorithms is also analyzed. The presented theories and algorithms are evaluated through experiments on a collection of benchmark data sets.
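The abstract's central claim is that iterating LDA subspace selection with K-means reduces to a single run of kernel K-means on a particular Gram matrix. The minimal Python sketch below illustrates that reading; it is not the authors' implementation, and the regularized Gram matrix G_lambda = X^T (X X^T + lambda*I)^(-1) X used here is an assumption chosen for illustration, not a formula taken verbatim from the paper.

# Sketch: kernel K-means on a regularized Gram matrix built from the data,
# in place of alternating LDA subspace selection and K-means.
# The specific Gram matrix is an assumption for illustration only.
import numpy as np

def diskmeans_like_gram(X, lam=1.0):
    """X: d x n data matrix (columns are samples). Returns an n x n Gram matrix."""
    d = X.shape[0]
    # G_lambda = X^T (X X^T + lam * I)^{-1} X  (assumed form, see lead-in note)
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(d), X)

def kernel_kmeans(K, n_clusters, n_iter=100, seed=0):
    """Plain kernel K-means on an n x n Gram matrix K."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(n_clusters, size=n)
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.empty((n, n_clusters))
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():                # re-seed an empty cluster
                mask[rng.integers(n)] = True
            m = mask.sum()
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/|c|) sum_j K_ij + (1/|c|^2) sum_{j,l} K_jl
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Usage: cluster 200 random 10-dimensional points (columns) into 3 groups.
X = np.random.default_rng(0).normal(size=(10, 200))
labels = kernel_kmeans(diskmeans_like_gram(X, lam=1.0), n_clusters=3)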

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
ISBN (Print): 160560352X, 9781605603520
State: Published - 2009
Event: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada
Duration: Dec 3 2007 - Dec 6 2007

Other

Other: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Country: Canada
City: Vancouver, BC
Period: 12/3/07 - 12/6/07

Fingerprint

  • Discriminant analysis
  • Clustering algorithms
  • Parameter estimation
  • Experiments

ASJC Scopus subject areas

  • Information Systems

Cite this

Ye, J., Zhao, Z., & Wu, M. (2009). Discriminative K-means for clustering. In Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference.
