Size Matters: A Comparative Analysis of Community Detection Algorithms

Paul Wagenseller, Feng Wang, Weili Wu

Research output: Contribution to journal › Article

9 Scopus citations

Abstract

Understanding the community structure of social media is critical because of its broad applications, such as friend recommendation, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance, and functional metrics such as ground truth, to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension: community size. The anthropologist Dunbar has suggested that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of comparing algorithms by orthogonally integrating community size as a new dimension into existing structural metrics, so that community quality in the social media context can be evaluated consistently and holistically. We design a heuristic clique-based algorithm that controls the size and overlap of communities with adjustable parameters, and evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities by size into four classes, called close friend, casual friend, acquaintance, and just-a-face, and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and internal density of the communities in each class. We discover that communities in different classes exhibit diverse structural qualities and that many existing community detection algorithms tend to output extremely large communities.
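The size-based evaluation described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The class cut-offs (5, 15, 50, 150) are an assumption loosely based on Dunbar's layers, since the abstract does not state them; the "oversized" label and all function names are hypothetical. Conductance is taken here as cut edges divided by the total degree of the community's nodes, which is one common definition. Only two of the six metrics named in the abstract are shown.

```python
from collections import defaultdict

# Hypothetical upper size bounds per class (assumption: the abstract names
# the four classes but does not give their cut-offs).
SIZE_CLASSES = [
    ("close friend", 5),
    ("casual friend", 15),
    ("acquaintance", 50),
    ("just-a-face", 150),
]


def size_class(community):
    """Return the size class of a community (a collection of node ids)."""
    n = len(set(community))
    for name, upper in SIZE_CLASSES:
        if n <= upper:
            return name
    return "oversized"  # hypothetical label for communities above 150 nodes


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def internal_density(adj, community):
    """Internal edges divided by the number of possible node pairs."""
    nodes = set(community)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal_edges = sum(len(adj[u] & nodes) for u in nodes) // 2
    return internal_edges / (n * (n - 1) / 2)


def conductance(adj, community):
    """Edges leaving the community divided by the total degree of its nodes."""
    nodes = set(community)
    cut = sum(len(adj[u] - nodes) for u in nodes)
    volume = sum(len(adj[u]) for u in nodes)
    return cut / volume if volume else 0.0


def metrics_by_class(adj, communities):
    """Group per-community metrics under each community's size class."""
    grouped = defaultdict(list)
    for community in communities:
        grouped[size_class(community)].append({
            "size": len(set(community)),
            "internal_density": internal_density(adj, community),
            "conductance": conductance(adj, community),
        })
    return dict(grouped)


# Toy graph: a triangle a-b-c with one extra edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = build_adjacency(edges)
report = metrics_by_class(adj, [{"a", "b", "c"}])
```

In this toy graph the community {a, b, c} falls into the smallest class under the assumed cut-offs; it is a full triangle, and one of the seven edge endpoints attached to its nodes leads outside it. Keeping metrics grouped per class, rather than averaged over all communities, is what lets size act as a separate dimension of comparison.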

Original language: English (US)
Journal: IEEE Transactions on Computational Social Systems
State: Accepted/In press - Jan 1 2018

Keywords

  • Clique
  • Community detection
  • Detection algorithms
  • Dunbar's number
  • Network topology
  • Overlapping community
  • Size measurement
  • Topology
  • Twitter

ASJC Scopus subject areas

  • Modeling and Simulation
  • Social Sciences (miscellaneous)
  • Human-Computer Interaction
