Center-piece subgraphs

Problem definition and fast solutions

Hanghang Tong, Christos Faloutsos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

177 Citations (Scopus)

Abstract

Given Q nodes in a social network (say, an authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most, of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing, and many more. Connection subgraphs are an important first step, handling the case of Q=2 query nodes: there, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about a 6:1 speedup.
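
The query family described in the abstract ('OR', 'AND', and 'softAND' in-between) can be illustrated with a small sketch. The abstract does not say how nodes are scored, so everything below is an assumption made for illustration: proximity is taken to be a random-walk-with-restart score, the per-query scores are combined as if they were independent probabilities, and the names `rwr_scores`, `at_least_k`, and `center_piece` are hypothetical. With k=1 the combination behaves like 'OR', with k=Q like 'AND', and values in between like a soft 'AND'.

```python
def rwr_scores(adj, source, restart=0.5, iters=100):
    """Assumed proximity measure: random walk with restart from `source`.

    `adj` maps each node to a list of its neighbours (undirected graph).
    """
    r = {n: 1.0 if n == source else 0.0 for n in adj}
    for _ in range(iters):
        # With probability `restart` the walker jumps back to the source.
        nxt = {n: (restart if n == source else 0.0) for n in adj}
        for u, neighbours in adj.items():
            if not neighbours:
                continue
            share = (1.0 - restart) * r[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        r = nxt
    return r


def at_least_k(probs, k):
    """Probability that at least k of the independent events occur."""
    dp = [1.0] + [0.0] * len(probs)  # dp[j] = P(exactly j events so far)
    for p in probs:
        for j in range(len(probs), 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp[k:])


def center_piece(adj, queries, k):
    """Return the non-query node with the best combined score."""
    per_query = [rwr_scores(adj, q) for q in queries]
    candidates = [n for n in adj if n not in queries]
    return max(candidates, key=lambda n: at_least_k([r[n] for r in per_query], k))


# Toy graph: "c" links all three query nodes; "e" hangs off "a" only.
toy = {
    "a": ["c", "e"],
    "b": ["c"],
    "d": ["c"],
    "c": ["a", "b", "d"],
    "e": ["a"],
}
best_and = center_piece(toy, ["a", "b", "d"], k=3)  # 'AND'-style query
best_or = center_piece(toy, ["a", "b", "d"], k=1)   # 'OR'-style query
```

In the toy graph, node "c" is adjacent to every query node, so it is the natural center-piece under either setting of k; on larger graphs the choice of k would be expected to change which nodes rank highest.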

Original language: English (US)
Title of host publication: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 404-413
Number of pages: 10
Volume: 2006
State: Published - 2006
Externally published: Yes
Event: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Philadelphia, PA, United States
Duration: Aug 20 2006 → Aug 23 2006

Other

Other: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: Philadelphia, PA
Period: 8/20/06 → 8/23/06

Keywords

  • Center-piece subgraph
  • Goodness score
  • K_softAND

ASJC Scopus subject areas

  • Information Systems

Cite this

Tong, H., & Faloutsos, C. (2006). Center-piece subgraphs: Problem definition and fast solutions. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. 2006, pp. 404-413)

@inproceedings{680c4d506b32476ebbc9b258e9c6953d,
title = "Center-piece subgraphs: Problem definition and fast solutions",
abstract = "Given Q nodes in a social network (say, an authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most, of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing, and many more. Connection subgraphs are an important first step, handling the case of Q=2 query nodes: there, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about a 6:1 speedup.",
keywords = "Center-piece subgraph, Goodness score, K_softAND",
author = "Hanghang Tong and Christos Faloutsos",
year = "2006",
language = "English (US)",
isbn = "1595933395",
volume = "2006",
pages = "404--413",
booktitle = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

}

TY - GEN

T1 - Center-piece subgraphs

T2 - Problem definition and fast solutions

AU - Tong, Hanghang

AU - Faloutsos, Christos

PY - 2006

Y1 - 2006

N2 - Given Q nodes in a social network (say, an authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most, of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing, and many more. Connection subgraphs are an important first step, handling the case of Q=2 query nodes: there, the connection subgraph algorithm finds the b intermediate nodes that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieves good accuracy for about a 6:1 speedup.

KW - Center-piece subgraph

KW - Goodness score

KW - K_softAND

UR - http://www.scopus.com/inward/record.url?scp=33749577591&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33749577591&partnerID=8YFLogxK

M3 - Conference contribution

SN - 1595933395

SN - 9781595933393

VL - 2006

SP - 404

EP - 413

BT - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

ER -