Size Matters: A Comparative Analysis of Community Detection Algorithms

Paul Wagenseller, Feng Wang, Weili Wu

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Understanding the community structure of social media is critical due to its broad applications such as friend recommendations, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension, community size. The anthropologist Dunbar has suggested that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of comparing algorithms by orthogonally integrating community size as a new dimension into existing structural metrics, so that community quality in the social media context can be evaluated consistently and holistically. We design a heuristic clique-based algorithm which controls the size and overlap of communities with adjustable parameters, and we evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities by size into four classes called close friend, casual friend, acquaintance, and just-a-face, and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and internal density of the communities in each class. We discover that communities in different classes exhibit diverse structural qualities and that many existing community detection algorithms tend to output extremely large communities.
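The size-based evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract names the four classes but not their size boundaries, so the cut-offs in `SIZE_CLASSES` (5, 15, 50, 150) are assumptions, as are the function names, the undirected adjacency-dict graph representation, and the toy graph. Conductance here uses one common definition, cut edges divided by the smaller of the two volumes; the paper may use a different variant.

```python
from itertools import combinations

# Hypothetical upper bounds for the four classes. The abstract does not state
# the cut-offs; these values are assumptions chosen for illustration only.
SIZE_CLASSES = [
    ("close friend", 5),
    ("casual friend", 15),
    ("acquaintance", 50),
    ("just-a-face", 150),
]


def size_class(community):
    """Return the first class whose upper bound fits the community size."""
    n = len(community)
    for name, upper in SIZE_CLASSES:
        if n <= upper:
            return name
    return "oversized"  # larger than Dunbar's number of 150


def internal_density(adj, community):
    """Fraction of possible member pairs that are joined by an edge."""
    n = len(community)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in combinations(community, 2) if v in adj[u])
    return internal / (n * (n - 1) / 2)


def conductance(adj, community):
    """Cut edges divided by min(volume inside, volume outside)."""
    members = set(community)
    cut = sum(1 for u in members for v in adj[u] if v not in members)
    vol_in = sum(len(adj[u]) for u in members)
    vol_out = sum(len(adj[u]) for u in adj if u not in members)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Toy undirected graph: two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
triangle = {0, 1, 2}

label = size_class(triangle)
density = internal_density(adj, triangle)
phi = conductance(adj, triangle)
```

Grouping the per-community scores by `size_class` and averaging within each group gives a per-class comparison in the spirit of the one the abstract describes.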

Original language: English (US)
Journal: IEEE Transactions on Computational Social Systems
DOI: 10.1109/TCSS.2018.2875626
State: Accepted/In press - Jan 1 2018

Fingerprint

Community Detection
Comparative Analysis
community
Social Media
social media
community size
Conductance
Modularity
Metric
Community
User Modeling
Community Structure
Personalization
Transitivity
Clique
twitter
personalization
Control Algorithm
Divides
Overlap

Keywords

  • Clique
  • community detection
  • Detection algorithms
  • Dunbar's number
  • Network topology
  • overlapping community
  • Size measurement
  • Topology
  • Twitter

ASJC Scopus subject areas

  • Modeling and Simulation
  • Social Sciences (miscellaneous)
  • Human-Computer Interaction

Cite this

Size Matters: A Comparative Analysis of Community Detection Algorithms. / Wagenseller, Paul; Wang, Feng; Wu, Weili.

In: IEEE Transactions on Computational Social Systems, 01.01.2018.

@article{0bcf85b5a3824d32ac067e7fe16b3b7c,
title = "Size Matters: A Comparative Analysis of Community Detection Algorithms",
abstract = "Understanding the community structure of social media is critical due to its broad applications such as friend recommendations, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension, community size. Recently, the anthropologist Dunbar suggests that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of algorithm comparison by orthogonally integrating community size as a new dimension into existing structural metrics for consistently and holistically evaluating the community quality in the social media context. We design a heuristic clique-based algorithm which controls the size and overlap of communities with adjustable parameters and evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities based on their size into four classes called a close friend, a casual friend, acquaintance, and just-a-face, and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and the internal density of communities in each class. We discover that communities in different classes exhibit diverse structural qualities and many existing community detection algorithms tend to output extremely large communities.",
keywords = "Clique, community detection, Detection algorithms, Dunbar's number, Network topology, overlapping community, Size measurement, Topology, Twitter",
author = "Paul Wagenseller and Feng Wang and Weili Wu",
year = "2018",
month = "1",
day = "1",
doi = "10.1109/TCSS.2018.2875626",
language = "English (US)",
journal = "IEEE Transactions on Computational Social Systems",
issn = "2329-924X",
publisher = "IEEE Systems, Man, and Cybernetics Society",
}

TY - JOUR

T1 - Size Matters

T2 - A Comparative Analysis of Community Detection Algorithms

AU - Wagenseller, Paul

AU - Wang, Feng

AU - Wu, Weili

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Understanding the community structure of social media is critical due to its broad applications such as friend recommendations, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension, community size. Recently, the anthropologist Dunbar suggests that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of algorithm comparison by orthogonally integrating community size as a new dimension into existing structural metrics for consistently and holistically evaluating the community quality in the social media context. We design a heuristic clique-based algorithm which controls the size and overlap of communities with adjustable parameters and evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities based on their size into four classes called a close friend, a casual friend, acquaintance, and just-a-face, and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and the internal density of communities in each class. We discover that communities in different classes exhibit diverse structural qualities and many existing community detection algorithms tend to output extremely large communities.

KW - Clique

KW - community detection

KW - Detection algorithms

KW - Dunbar's number

KW - Network topology

KW - overlapping community

KW - Size measurement

KW - Topology

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=85055869325&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055869325&partnerID=8YFLogxK

U2 - 10.1109/TCSS.2018.2875626

DO - 10.1109/TCSS.2018.2875626

M3 - Article

AN - SCOPUS:85055869325

JO - IEEE Transactions on Computational Social Systems

JF - IEEE Transactions on Computational Social Systems

SN - 2329-924X

ER -