Center-piece subgraphs: Problem definition and fast solutions

Hanghang Tong; Christos Faloutsos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing and many more. Connection subgraphs is an important first step, handling the case of Q=2 query nodes. Then, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about 6:1 speedup.

Original language	English (US)
Title of host publication	KDD 2006
Subtitle of host publication	Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages	404-413
Number of pages	10
State	Published - 2006
Externally published	Yes
Event	KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Philadelphia, PA, United States Duration: Aug 20 2006 → Aug 23 2006

Publication series

Name	Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume	2006

Conference

Conference	KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory	United States
City	Philadelphia, PA
Period	8/20/06 → 8/23/06

Keywords

Center-piece subgraph
Goodness score
K_softAND

ASJC Scopus subject areas

Software
Information Systems

Cite this

Center-piece subgraphs: Problem definition and fast solutions. / Tong, Hanghang; Faloutsos, Christos.
KDD 2006: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2006. p. 404-413 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Vol. 2006).

Tong, H & Faloutsos, C 2006, Center-piece subgraphs: Problem definition and fast solutions. in KDD 2006: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 2006, pp. 404-413, KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, United States, 8/20/06.

@inproceedings{680c4d506b32476ebbc9b258e9c6953d,
  title = "Center-piece subgraphs: Problem definition and fast solutions",
  abstract = "Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing and many more. Connection subgraphs is an important first step, handling the case of Q=2 query nodes. Then, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about 6:1 speedup.",
  keywords = "Center-piece subgraph, Goodness score, K_softAND",
  author = "Hanghang Tong and Christos Faloutsos",
  year = "2006",
  language = "English (US)",
  isbn = "1595933395",
  series = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
  pages = "404--413",
  booktitle = "KDD 2006",
  note = "KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining ; Conference date: 20-08-2006 Through 23-08-2006",
}

TY  - GEN
T1  - Center-piece subgraphs: Problem definition and fast solutions
T2  - KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
AU  - Tong, Hanghang
AU  - Faloutsos, Christos
PY  - 2006
Y1  - 2006
AB  - Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing and many more. Connection subgraphs is an important first step, handling the case of Q=2 query nodes. Then, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about 6:1 speedup.
KW  - Center-piece subgraph
KW  - Goodness score
KW  - K_softAND
UR  - http://www.scopus.com/inward/record.url?scp=33749577591&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33749577591&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:33749577591
SN  - 1595933395
SN  - 9781595933393
T3  - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP  - 404
EP  - 413
BT  - KDD 2006
Y2  - 20 August 2006 through 23 August 2006
ER  -
