Computational efficient Variational Bayesian Gaussian Mixture Models via Coreset

Min Zhang, Yinlin Fu, Kevin M. Bennett, Teresa Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The Variational Bayesian Gaussian Mixture Model (VBGMM) is a popular clustering algorithm with reliable performance. However, model fitting is slow, especially on large-scale data, because it uses the whole dataset. To address this issue, in this paper we propose a new algorithm termed weighted VBGMM via Coreset. Specifically, a new coreset construction method is first proposed to sample the data, and the sample is then used to fit the model. To evaluate the algorithm, two groups of datasets are used: 1) six rat kidney image datasets and 2) three human kidney image datasets. The results show that the proposed algorithm is much faster (∼20 times) than classic VBGMM while maintaining similar performance on the whole dataset.
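The abstract does not say how the coreset is built, so the following is only a minimal sketch of the general idea of sampling a small weighted subset to stand in for the full dataset. It assumes a generic importance-sampling scheme (distance to the data mean mixed with uniform sampling), which is a swapped-in technique and not the authors' construction. The function name `lightweight_coreset` and its parameters are hypothetical.

```python
import random


def lightweight_coreset(points, m, seed=0):
    """Draw m weighted points by importance sampling (illustrative only).

    Assumption: sampling probability mixes a uniform term with the squared
    distance to the data mean, q(x) = 1/(2n) + d(x, mean)^2 / (2 * sum d^2).
    Each sampled point gets weight 1 / (m * q(x)), so the weighted sample
    represents the full dataset in expectation.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Mean of the full dataset, one coordinate at a time.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]

    # Squared Euclidean distance of every point to the mean.
    sq_dists = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dists)

    if total == 0:
        # All points coincide: fall back to uniform sampling.
        probs = [1.0 / n] * n
    else:
        probs = [0.5 / n + 0.5 * d / total for d in sq_dists]

    # Sample indices with replacement according to probs.
    idx = rng.choices(range(n), weights=probs, k=m)
    sample = [points[i] for i in idx]
    weights = [1.0 / (m * probs[i]) for i in idx]
    return sample, weights


if __name__ == "__main__":
    data = [(float(i), float(i % 7)) for i in range(1000)]
    subset, w = lightweight_coreset(data, m=50)
    print(len(subset), round(sum(w), 1))
```

The returned weights would then be passed to a weighted model-fitting routine (not shown), so that the model is fit on `m` points instead of all `n`.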

Original language: English (US)
Title of host publication: IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509034406
DOI: https://doi.org/10.1109/CITS.2016.7546405
State: Published - Aug 16 2016
Event: 2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016 - Kunming, China
Duration: Jul 6 2016 to Jul 8 2016

Other

Other: 2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016
Country: China
City: Kunming
Period: 7/6/16 to 7/8/16

Keywords

  • coreset
  • Variational Bayesian Gaussian Mixture Model (VBGMM)

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems

Cite this

Zhang, M., Fu, Y., Bennett, K. M., & Wu, T. (2016). Computational efficient Variational Bayesian Gaussian Mixture Models via Coreset. In IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems [7546405] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CITS.2016.7546405

@inproceedings{82be9b83852d41a2bcdf0c7a994775ed,
title = "Computational efficient Variational Bayesian Gaussian Mixture Models via Coreset",
abstract = "The Variational Bayesian Gaussian Mixture Model (VBGMM) is a popular clustering algorithm with reliable performance. However, model fitting is slow, especially on large-scale data, because it uses the whole dataset. To address this issue, in this paper we propose a new algorithm termed weighted VBGMM via Coreset. Specifically, a new coreset construction method is first proposed to sample the data, and the sample is then used to fit the model. To evaluate the algorithm, two groups of datasets are used: 1) six rat kidney image datasets and 2) three human kidney image datasets. The results show that the proposed algorithm is much faster (∼20 times) than classic VBGMM while maintaining similar performance on the whole dataset.",
keywords = "coreset, Variational Bayesian Gaussian Mixture Model (VBGMM)",
author = "Zhang, Min and Fu, Yinlin and Bennett, Kevin M. and Wu, Teresa",
year = "2016",
month = "8",
day = "16",
doi = "10.1109/CITS.2016.7546405",
language = "English (US)",
booktitle = "IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Computational efficient Variational Bayesian Gaussian Mixture Models via Coreset

AU - Zhang, Min

AU - Fu, Yinlin

AU - Bennett, Kevin M.

AU - Wu, Teresa

PY - 2016/8/16

Y1 - 2016/8/16

N2 - The Variational Bayesian Gaussian Mixture Model (VBGMM) is a popular clustering algorithm with reliable performance. However, model fitting is slow, especially on large-scale data, because it uses the whole dataset. To address this issue, in this paper we propose a new algorithm termed weighted VBGMM via Coreset. Specifically, a new coreset construction method is first proposed to sample the data, and the sample is then used to fit the model. To evaluate the algorithm, two groups of datasets are used: 1) six rat kidney image datasets and 2) three human kidney image datasets. The results show that the proposed algorithm is much faster (∼20 times) than classic VBGMM while maintaining similar performance on the whole dataset.

KW - coreset

KW - Variational Bayesian Gaussian Mixture Model (VBGMM)

UR - http://www.scopus.com/inward/record.url?scp=84987642689&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84987642689&partnerID=8YFLogxK

U2 - 10.1109/CITS.2016.7546405

DO - 10.1109/CITS.2016.7546405

M3 - Conference contribution

AN - SCOPUS:84987642689

BT - IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems

PB - Institute of Electrical and Electronics Engineers Inc.

ER -