TY - GEN

T1 - PaCK

T2 - 9th SIAM International Conference on Data Mining 2009, SDM 2009

AU - He, Jingrui

AU - Tong, Hanghang

AU - Papadimitriou, Spiros

AU - Eliassi-Rad, Tina

AU - Faloutsos, Christos

AU - Carbonell, Jaime

PY - 2009/12/1

Y1 - 2009/12/1

N2 - Given an author-paper-conference graph, how can we automatically find groups of authors, papers, and conferences, respectively? Existing work either (1) requires fine-tuning of several parameters or (2) applies only to bipartite graphs (e.g., an author-paper graph or a paper-conference graph). To address this problem, we propose PaCK for clustering such k-partite graphs. By optimizing an information-theoretic criterion, PaCK searches for the best number of clusters for each type of object and generates the corresponding clustering. The unique feature of PaCK over existing methods for clustering k-partite graphs lies in its parameter-free nature. Furthermore, it can be easily generalized to cases where certain connectivity relations are expressed as tensors, e.g., time-evolving data. The proposed algorithm is scalable in the sense that it is linear with respect to the total number of edges in the graphs. We present theoretical analysis as well as experimental evaluations to demonstrate both its effectiveness and efficiency.

UR - http://www.scopus.com/inward/record.url?scp=73449135163&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=73449135163&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:73449135163

SN - 9781615671090

T3 - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics

SP - 1278

EP - 1287

BT - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics 133

Y2 - 30 April 2009 through 2 May 2009

ER -