Restreaming graph partitioning

Simple versatile algorithms for advanced balancing

Joel Nishimura, Johan Ugander

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

55 Citations (Scopus)

Abstract

Partitioning large graphs is difficult, especially when performed in the limited models of computation afforded to modern large-scale computing systems. In this work we introduce restreaming graph partitioning and develop algorithms that scale similarly to streaming partitioning algorithms yet empirically perform as well as fully offline algorithms. In streaming partitioning, graphs are partitioned serially in a single pass. Restreaming partitioning is motivated by scenarios where approximately the same dataset is routinely streamed, making it possible to transform streaming partitioning algorithms into an iterative procedure. This combination of simplicity and powerful performance allows restreaming algorithms to be easily adapted to efficiently tackle more challenging partitioning objectives. In particular, we consider the problem of stratified graph partitioning, where each of many node attribute strata is balanced simultaneously. As such, stratified partitioning is well suited for the study of network effects on social networks, where it is desirable to isolate disjoint dense subgraphs with representative user demographics. To demonstrate, we partition a large social network such that each partition exhibits the same degree distribution in the original graph, a novel achievement for non-regular graphs. As part of our results, we also observe a fundamental difference in the ease with which social graphs are partitioned when compared to web graphs. Namely, the modular structure of web graphs appears to motivate full offline optimization, whereas the locally dense structure of social graphs precludes significant gains from global manipulations.
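The idea of turning a one-pass streaming partitioner into an iterative procedure can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify a scoring rule, so the sketch assumes a linear-deterministic-greedy style score (neighbours already in a part, discounted by how full that part is). The names `restream_partition`, `cut_size`, `two_cliques`, and the `slack` parameter are hypothetical.

```python
from itertools import combinations


def restream_partition(adjacency, k, passes=5, slack=1.1):
    """Greedy streaming partitioner, re-run over the same stream several times.

    adjacency: dict mapping node -> set of neighbour nodes (undirected).
    k: number of parts. slack: capacity multiplier (assumed parameter).
    """
    nodes = list(adjacency)            # fixed stream order, reused every pass
    capacity = slack * len(nodes) / k
    assignment = {}                    # node -> part; persists across passes

    for _ in range(passes):
        sizes = [0] * k                # part sizes restart from zero each pass
        for v in nodes:
            # Count neighbours per part. Nodes already seen in this pass carry
            # their new part; the rest still carry the previous pass's part.
            counts = [0] * k
            for u in adjacency[v]:
                if u in assignment:
                    counts[assignment[u]] += 1
            # Assumed score: neighbour count discounted by part fullness,
            # with ties broken in favour of the smaller part.
            best = max(
                range(k),
                key=lambda i: (counts[i] * (1 - sizes[i] / capacity), -sizes[i]),
            )
            assignment[v] = best
            sizes[best] += 1
    return assignment


def cut_size(adjacency, assignment):
    """Number of undirected edges whose endpoints land in different parts."""
    return sum(
        1
        for v, nbrs in adjacency.items()
        for u in nbrs
        if v < u and assignment[v] != assignment[u]
    )


def two_cliques(m):
    """Toy graph: two m-cliques joined by a single bridge edge."""
    adjacency = {v: set() for v in range(2 * m)}
    for block in (range(m), range(m, 2 * m)):
        for a, b in combinations(block, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    adjacency[m - 1].add(m)
    adjacency[m].add(m - 1)
    return adjacency


if __name__ == "__main__":
    graph = two_cliques(5)
    one_pass = restream_partition(graph, k=2, passes=1)
    restreamed = restream_partition(graph, k=2, passes=5)
    print("cut after 1 pass:", cut_size(graph, one_pass))
    print("cut after 5 passes:", cut_size(graph, restreamed))
```

With `passes=1` this reduces to ordinary single-pass streaming. On this toy input the first pass misplaces the bridge node because its own clique has not yet been seen, and later passes can correct that using the assignments left over from the previous pass. Behaviour on real graphs depends on the stream order and on the scoring rule, both of which are assumptions here.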

Original language: English (US)
Title of host publication: KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 1106-1114
Number of pages: 9
Volume: Part F128815
ISBN (Electronic): 9781450321747
DOI: https://doi.org/10.1145/2487575.2487696
State: Published - Aug 11 2013
Externally published: Yes
Event: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013 - Chicago, United States
Duration: Aug 11 2013 to Aug 14 2013

Other

Other: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013
Country: United States
City: Chicago
Period: 8/11/13 to 8/14/13

Keywords

  • Balanced partitioning
  • Graph clustering
  • Multi-constraint balance
  • Social networks
  • Stratified partitioning

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Nishimura, J., & Ugander, J. (2013). Restreaming graph partitioning: Simple versatile algorithms for advanced balancing. In KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. Part F128815, pp. 1106-1114). [2487696] Association for Computing Machinery. https://doi.org/10.1145/2487575.2487696

@inproceedings{8096effec8b24cc2ba807b3a3a3857d1,
title = "Restreaming graph partitioning: Simple versatile algorithms for advanced balancing",
abstract = "Partitioning large graphs is difficult, especially when performed in the limited models of computation afforded to modern large-scale computing systems. In this work we introduce restreaming graph partitioning and develop algorithms that scale similarly to streaming partitioning algorithms yet empirically perform as well as fully offline algorithms. In streaming partitioning, graphs are partitioned serially in a single pass. Restreaming partitioning is motivated by scenarios where approximately the same dataset is routinely streamed, making it possible to transform streaming partitioning algorithms into an iterative procedure. This combination of simplicity and powerful performance allows restreaming algorithms to be easily adapted to efficiently tackle more challenging partitioning objectives. In particular, we consider the problem of stratified graph partitioning, where each of many node attribute strata is balanced simultaneously. As such, stratified partitioning is well suited for the study of network effects on social networks, where it is desirable to isolate disjoint dense subgraphs with representative user demographics. To demonstrate, we partition a large social network such that each partition exhibits the same degree distribution in the original graph, a novel achievement for non-regular graphs. As part of our results, we also observe a fundamental difference in the ease with which social graphs are partitioned when compared to web graphs. Namely, the modular structure of web graphs appears to motivate full offline optimization, whereas the locally dense structure of social graphs precludes significant gains from global manipulations.",
keywords = "Balanced partitioning, Graph clustering, Multi-constraint balance, Social networks, Stratified partitioning",
author = "Joel Nishimura and Johan Ugander",
year = "2013",
month = "8",
day = "11",
doi = "10.1145/2487575.2487696",
language = "English (US)",
volume = "Part F128815",
pages = "1106--1114",
booktitle = "KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Restreaming graph partitioning

T2 - Simple versatile algorithms for advanced balancing

AU - Nishimura, Joel

AU - Ugander, Johan

PY - 2013/8/11

Y1 - 2013/8/11

N2 - Partitioning large graphs is difficult, especially when performed in the limited models of computation afforded to modern large-scale computing systems. In this work we introduce restreaming graph partitioning and develop algorithms that scale similarly to streaming partitioning algorithms yet empirically perform as well as fully offline algorithms. In streaming partitioning, graphs are partitioned serially in a single pass. Restreaming partitioning is motivated by scenarios where approximately the same dataset is routinely streamed, making it possible to transform streaming partitioning algorithms into an iterative procedure. This combination of simplicity and powerful performance allows restreaming algorithms to be easily adapted to efficiently tackle more challenging partitioning objectives. In particular, we consider the problem of stratified graph partitioning, where each of many node attribute strata is balanced simultaneously. As such, stratified partitioning is well suited for the study of network effects on social networks, where it is desirable to isolate disjoint dense subgraphs with representative user demographics. To demonstrate, we partition a large social network such that each partition exhibits the same degree distribution in the original graph, a novel achievement for non-regular graphs. As part of our results, we also observe a fundamental difference in the ease with which social graphs are partitioned when compared to web graphs. Namely, the modular structure of web graphs appears to motivate full offline optimization, whereas the locally dense structure of social graphs precludes significant gains from global manipulations.

KW - Balanced partitioning

KW - Graph clustering

KW - Multi-constraint balance

KW - Social networks

KW - Stratified partitioning

UR - http://www.scopus.com/inward/record.url?scp=85021199540&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021199540&partnerID=8YFLogxK

U2 - 10.1145/2487575.2487696

DO - 10.1145/2487575.2487696

M3 - Conference contribution

VL - Part F128815

SP - 1106

EP - 1114

BT - KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

ER -