TY - JOUR
T1 - Size Matters
T2 - A Comparative Analysis of Community Detection Algorithms
AU - Wagenseller, Paul
AU - Wang, Feng
AU - Wu, Weili
N1 - Funding Information:
Manuscript received January 13, 2018; revised April 28, 2018 and July 22, 2018; accepted September 27, 2018. Date of publication October 30, 2018; date of current version December 3, 2018. This work was supported by the National Science Foundation under Grant 1737861. (Corresponding author: Feng Wang.) P. Wagenseller, III, and F. Wang are with the School of Mathematical and Natural Sciences, Arizona State University, Glendale, AZ 85306 USA (e-mail: paul.wagenseller@asu.edu; fwang25@asu.edu).
Publisher Copyright:
© 2014 IEEE.
PY - 2018/12
Y1 - 2018/12
N2 - Understanding the community structure of social media is critical because of its broad applications, such as friend recommendations, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension: community size. The anthropologist Dunbar has suggested that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of comparing algorithms by orthogonally integrating community size as a new dimension into existing structural metrics, so that community quality in the social media context can be evaluated consistently and holistically. We design a heuristic clique-based algorithm that controls the size and overlap of communities with adjustable parameters, and we evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities by size into four classes (close friend, casual friend, acquaintance, and just-a-face) and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and internal density of the communities in each class. We find that communities in different classes exhibit diverse structural qualities and that many existing community detection algorithms tend to output extremely large communities.
AB - Understanding the community structure of social media is critical because of its broad applications, such as friend recommendations, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance and functional metrics such as ground truth to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension: community size. The anthropologist Dunbar has suggested that the size of a stable community in social media should be limited to 150, referred to as Dunbar's number. In this paper, we propose a systematic way of comparing algorithms by orthogonally integrating community size as a new dimension into existing structural metrics, so that community quality in the social media context can be evaluated consistently and holistically. We design a heuristic clique-based algorithm that controls the size and overlap of communities with adjustable parameters, and we evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities by size into four classes (close friend, casual friend, acquaintance, and just-a-face) and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and internal density of the communities in each class. We find that communities in different classes exhibit diverse structural qualities and that many existing community detection algorithms tend to output extremely large communities.
KW - Clique
KW - Community detection
KW - Dunbar's number
KW - Overlapping community
UR - http://www.scopus.com/inward/record.url?scp=85055869325&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055869325&partnerID=8YFLogxK
U2 - 10.1109/TCSS.2018.2875626
DO - 10.1109/TCSS.2018.2875626
M3 - Article
AN - SCOPUS:85055869325
SN - 2329-924X
VL - 5
SP - 951
EP - 960
JO - IEEE Transactions on Computational Social Systems
JF - IEEE Transactions on Computational Social Systems
IS - 4
M1 - 8515044
ER -