TY - JOUR
T1 - Community extraction for social networks
AU - Zhao, Yunpeng
AU - Levina, Elizaveta
AU - Zhu, Ji
PY - 2011/5/3
Y1 - 2011/5/3
AB - Analysis of networks and in particular discovering communities within networks has been a focus of recent work in several fields and has diverse applications. Most community detection methods focus on partitioning the entire network into communities, with the expectation of many ties within communities and few ties between. However, many networks contain nodes that do not fit in with any of the communities, and forcing every node into a community can distort results. Here we propose a new framework that extracts one community at a time, allowing for arbitrary structure in the remainder of the network, which can include weakly connected nodes. The main idea is that the strength of a community should depend on ties between its members and ties to the outside world, but not on ties between nonmembers. The proposed extraction criterion has a natural probabilistic interpretation in a wide class of models and performs well on simulated and real networks. For the case of the block model, we establish asymptotic consistency of estimated node labels and propose a hypothesis test for determining the number of communities.
UR - http://www.scopus.com/inward/record.url?scp=79956327022&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79956327022&partnerID=8YFLogxK
U2 - 10.1073/pnas.1006642108
DO - 10.1073/pnas.1006642108
M3 - Article
C2 - 21502538
AN - SCOPUS:79956327022
SN - 0027-8424
VL - 108
SP - 7321
EP - 7326
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 18
ER -