Dampster-shafer evidence theory based multi-characteristics fusion for clustering evaluation

Shihong Yue, Teresa Wu, Yamin Wang, Kai Zhang, Weixia Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Clustering is a widely used unsupervised learning method for grouping data with similar characteristics. The performance of a clustering method can generally be evaluated through validity indices. However, most validity indices are designed for specific algorithms and specific structures of the data space. Moreover, these indices consist of a few within- and between-cluster distance functions, and their applicability relies heavily on combining these functions correctly. In this research, we first summarize three common characteristics of any clustering evaluation: (1) the clustering outcome can be evaluated by a group of validity indices if some efficient validity indices are available, (2) the clustering outcome can be measured by an independent intra-cluster distance function, and (3) the clustering outcome can be measured by neighborhood-based functions. Considering the complementary and unstable nature of these evaluations, we then apply Dempster-Shafer (D-S) evidence theory to fuse the three characteristics into a new index, termed fused Multiple Characteristic Indices (fMCI). The fMCI can generally evaluate the outcomes of arbitrary clustering methods on data spaces with more complex structures. We conduct a number of experiments to demonstrate that the fMCI is applicable to different clustering algorithms on different datasets and that it achieves more accurate and robust clustering evaluation than existing indices.
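The fusion step named in the abstract can be illustrated with Dempster's rule of combination. The following is only a minimal sketch: the abstract does not say how the paper builds its mass functions, so the frame of discernment (candidate numbers of clusters 2, 3 and 4), the three evidence sources, and every mass value below are hypothetical placeholders, and the function name `combine` is likewise an assumption.

```python
from functools import reduce
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; collect the conflict mass.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize so the combined masses sum to 1 again.
    return {h: w / norm for h, w in combined.items()}


# Hypothetical frame of discernment: candidate numbers of clusters.
THETA = frozenset({2, 3, 4})

# Hypothetical masses standing in for the three characteristics:
# a group of validity indices, an intra-cluster distance function,
# and a neighborhood-based function. Mass on THETA expresses ignorance.
m_indices = {frozenset({3}): 0.6, frozenset({2, 3}): 0.3, THETA: 0.1}
m_intra = {frozenset({3}): 0.5, frozenset({4}): 0.3, THETA: 0.2}
m_neighbor = {frozenset({3}): 0.4, frozenset({2}): 0.2, THETA: 0.4}

# Dempster's rule is associative, so the sources can be folded pairwise.
fused = reduce(combine, [m_indices, m_intra, m_neighbor])
best = max(fused, key=fused.get)

for hypothesis, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

With these placeholder values the three sources mostly agree, so the fused mass concentrates on the hypothesis that there are three clusters. An actual fused index would need mass functions derived from real index values, which the abstract does not specify.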

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 499-519
Number of pages: 21
Volume: 6401 LNAI
DOIs: https://doi.org/10.1007/978-3-642-16248-0_70
State: Published - 2010
Event: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010 - Beijing, China
Duration: Oct 15 2010 to Oct 17 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6401 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010
Country: China
City: Beijing
Period: 10/15/10 to 10/17/10

Fingerprint

Evidence Theory
Fusion
Fusion reactions
Clustering
Validity Index
Evaluation
Unsupervised learning
Electric fuses
Clustering algorithms
Distance Function
Clustering Methods
Evaluate
Unsupervised Learning
Complex Structure
Clustering Algorithm
Experiments
Correctness
Unstable

Keywords

  • clustering algorithm
  • Dempster-Shafer evidence theory
  • data structure
  • validity index

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Yue, S., Wu, T., Wang, Y., Zhang, K., & Liu, W. (2010). Dampster-shafer evidence theory based multi-characteristics fusion for clustering evaluation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6401 LNAI, pp. 499-519). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6401 LNAI). https://doi.org/10.1007/978-3-642-16248-0_70

@inproceedings{da86130b5390435f983a6d62b621517c,
title = "Dampster-shafer evidence theory based multi-characteristics fusion for clustering evaluation",
keywords = "clustering algorithm, Dempster-Shafer evidence theory, data structure, validity index",
author = "Shihong Yue and Teresa Wu and Yamin Wang and Kai Zhang and Weixia Liu",
year = "2010",
doi = "10.1007/978-3-642-16248-0_70",
language = "English (US)",
isbn = "3642162479",
volume = "6401 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "499--519",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Dampster-shafer evidence theory based multi-characteristics fusion for clustering evaluation

AU - Yue, Shihong

AU - Wu, Teresa

AU - Wang, Yamin

AU - Zhang, Kai

AU - Liu, Weixia

PY - 2010

Y1 - 2010

KW - clustering algorithm

KW - Dempster-Shafer evidence theory

KW - data structure

KW - Validity index

UR - http://www.scopus.com/inward/record.url?scp=78349277289&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78349277289&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-16248-0_70

DO - 10.1007/978-3-642-16248-0_70

M3 - Conference contribution

AN - SCOPUS:78349277289

SN - 3642162479

SN - 9783642162473

VL - 6401 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 499

EP - 519

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -