Mining connection pathways for marked nodes in large graphs

Leman Akoglu, Jilles Vreeken, Hanghang Tong, Duen Horng Chau, Nikolaj Tatti, Christos Faloutsos

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? Or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only a few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the fewest bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce Dot2Dot, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that Dot2Dot correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.
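
To make the "few bits" intuition concrete, here is a minimal sketch. It is not the encoding used in the paper, which the abstract does not spell out. It assumes a hypothetical cost model in which each step along a path costs log2(degree) bits, i.e. the cost of naming one neighbour of the current node. The toy graph, the node names, and the function `path_bits` are all illustrative assumptions.

```python
import math

def path_bits(adj, path):
    """Bits needed to describe `path` under an assumed cost model:
    at each node u we pay log2(deg(u)) bits to name the next neighbour."""
    bits = 0.0
    for u, v in zip(path, path[1:]):
        if v not in adj[u]:
            raise ValueError(f"{u}-{v} is not an edge")
        bits += math.log2(len(adj[u]))
    return bits

# Toy undirected graph: marked nodes "s" and "t" are linked both through
# a quiet node "x" (degree 2) and through a hub "h" (degree 8).
adj = {
    "s": ["x", "h"],
    "t": ["x", "h"],
    "x": ["s", "t"],
    "h": ["s", "t"] + [f"n{i}" for i in range(6)],
}
for i in range(6):
    adj[f"n{i}"] = ["h"]

via_x = path_bits(adj, ["s", "x", "t"])    # log2(2) + log2(2)
via_hub = path_bits(adj, ["s", "h", "t"])  # log2(2) + log2(8)
print(via_x, via_hub)
```

Under this assumed model both routes have two hops, yet the route through the hub is costlier to describe because there are more outgoing edges to choose from. That is the sense in which a pathway can be short but not simple.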

Original language: English (US)
Title of host publication: SIAM International Conference on Data Mining 2013, SMD 2013
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 37-45
Number of pages: 9
ISBN (Print): 9781627487245
State: Published - 2013
Externally published: Yes
Event: 13th SIAM International Conference on Data Mining, SMD 2013 - Austin, United States
Duration: May 2 2013 to May 4 2013

Other

Other: 13th SIAM International Conference on Data Mining, SMD 2013
Country: United States
City: Austin
Period: 5/2/13 to 5/4/13

Fingerprint

Mining
Pathway
Graph in graph theory
Vertex of a graph
Computational complexity
Partitioning
Path
Experimentation
Arc of a curve
Efficient Algorithms
NP-complete problem

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Signal Processing
  • Software

Cite this

Akoglu, L., Vreeken, J., Tong, H., Chau, D. H., Tatti, N., & Faloutsos, C. (2013). Mining connection pathways for marked nodes in large graphs. In SIAM International Conference on Data Mining 2013, SMD 2013 (pp. 37-45). Society for Industrial and Applied Mathematics Publications.

Mining connection pathways for marked nodes in large graphs. / Akoglu, Leman; Vreeken, Jilles; Tong, Hanghang; Chau, Duen Horng; Tatti, Nikolaj; Faloutsos, Christos.

SIAM International Conference on Data Mining 2013, SMD 2013. Society for Industrial and Applied Mathematics Publications, 2013. p. 37-45.

Akoglu, L, Vreeken, J, Tong, H, Chau, DH, Tatti, N & Faloutsos, C 2013, Mining connection pathways for marked nodes in large graphs. in SIAM International Conference on Data Mining 2013, SMD 2013. Society for Industrial and Applied Mathematics Publications, pp. 37-45, 13th SIAM International Conference on Data Mining, SMD 2013, Austin, United States, 5/2/13.
Akoglu L, Vreeken J, Tong H, Chau DH, Tatti N, Faloutsos C. Mining connection pathways for marked nodes in large graphs. In SIAM International Conference on Data Mining 2013, SMD 2013. Society for Industrial and Applied Mathematics Publications. 2013. p. 37-45
Akoglu, Leman ; Vreeken, Jilles ; Tong, Hanghang ; Chau, Duen Horng ; Tatti, Nikolaj ; Faloutsos, Christos. / Mining connection pathways for marked nodes in large graphs. SIAM International Conference on Data Mining 2013, SMD 2013. Society for Industrial and Applied Mathematics Publications, 2013. pp. 37-45
@inproceedings{058cf0ba23cf4d9eae200a1acb71600f,
title = "Mining connection pathways for marked nodes in large graphs",
abstract = "Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? Or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only a few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the fewest bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce Dot2Dot, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that Dot2Dot correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.",
author = "Leman Akoglu and Jilles Vreeken and Hanghang Tong and Chau, {Duen Horng} and Nikolaj Tatti and Christos Faloutsos",
year = "2013",
language = "English (US)",
isbn = "9781627487245",
pages = "37--45",
booktitle = "SIAM International Conference on Data Mining 2013, SMD 2013",
publisher = "Society for Industrial and Applied Mathematics Publications",

}

TY - GEN

T1 - Mining connection pathways for marked nodes in large graphs

AU - Akoglu, Leman

AU - Vreeken, Jilles

AU - Tong, Hanghang

AU - Chau, Duen Horng

AU - Tatti, Nikolaj

AU - Faloutsos, Christos

PY - 2013

Y1 - 2013

N2 - Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? Or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only a few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the fewest bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce Dot2Dot, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that Dot2Dot correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.

AB - Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? Or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only a few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the fewest bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce Dot2Dot, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that Dot2Dot correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.

UR - http://www.scopus.com/inward/record.url?scp=84960472274&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960472274&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781627487245

SP - 37

EP - 45

BT - SIAM International Conference on Data Mining 2013, SMD 2013

PB - Society for Industrial and Applied Mathematics Publications

ER -