TY - GEN
T1 - A novel visual analytics approach for clustering large-scale social data
AU - Wang, Zhangye
AU - Chen, Chang
AU - Zhou, Juanxia
AU - Liao, Jiyuan
AU - Chen, Wei
AU - Maciejewski, Ross
PY - 2013
Y1 - 2013
N2 - Social data refers to data individuals create that is knowingly and voluntarily shared by them and is an exciting avenue into gaining insight into interpersonal behaviors and interaction. However, such data is large, heterogeneous and often incomplete, properties that make the analysis of such data extremely challenging. One common method of exploring such data is through cluster analysis, which can enable analysts to find groups of related users, behaviors and interactions. This paper presents a novel visual analysis approach for detecting clusters within large-scale social networks by utilizing a divide-analyze-recombine scheme that sequentially performs data partitioning, subset clustering and result recombination within an integrated visual interface. A case study on a microblog messaging dataset (with 4.8 million users) is used to demonstrate the feasibility of this approach, and comparisons are also provided to illustrate the performance benefits of this approach with respect to existing solutions.
AB - Social data refers to data individuals create that is knowingly and voluntarily shared by them and is an exciting avenue into gaining insight into interpersonal behaviors and interaction. However, such data is large, heterogeneous and often incomplete, properties that make the analysis of such data extremely challenging. One common method of exploring such data is through cluster analysis, which can enable analysts to find groups of related users, behaviors and interactions. This paper presents a novel visual analysis approach for detecting clusters within large-scale social networks by utilizing a divide-analyze-recombine scheme that sequentially performs data partitioning, subset clustering and result recombination within an integrated visual interface. A case study on a microblog messaging dataset (with 4.8 million users) is used to demonstrate the feasibility of this approach, and comparisons are also provided to illustrate the performance benefits of this approach with respect to existing solutions.
KW - Cluster Analysis
KW - Divide and Recombine
KW - K-means
KW - Visual Analysis
UR - http://www.scopus.com/inward/record.url?scp=84893326502&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893326502&partnerID=8YFLogxK
U2 - 10.1109/BigData.2013.6691718
DO - 10.1109/BigData.2013.6691718
M3 - Conference contribution
AN - SCOPUS:84893326502
SN - 9781479912926
T3 - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
SP - 79
EP - 86
BT - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
PB - IEEE Computer Society
T2 - 2013 IEEE International Conference on Big Data, Big Data 2013
Y2 - 6 October 2013 through 9 October 2013
ER -