Random walk with restart

Fast solutions and applications

Hanghang Tong, Christos Faloutsos, Jia Yu Pan

Research output: Contribution to journal › Article

187 Citations (Scopus)

Abstract

How closely related are two nodes in a graph? How to compute this score quickly, on huge, disk-resident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been successfully used in numerous settings, such as automatic captioning of images, generalizations to the "connection subgraphs", personalized PageRank, and many more. However, the straightforward implementations of RWR do not scale for large graphs, requiring either quadratic space and cubic pre-computation time, or slow response time on queries. We propose fast solutions to this problem. The heart of our approach is to exploit two important properties shared by many real graphs: (a) linear correlations and (b) block-wise, community-like structure. We exploit the linearity by using low-rank matrix approximation, and the community structure by graph partitioning, followed by the Sherman-Morrison lemma for matrix inversion. Experimental results on the Corel image and the DBLP datasets demonstrate that our proposed methods achieve significant savings over the straightforward implementations: they can save several orders of magnitude in pre-computation and storage cost, and they achieve up to 150× speed-up with 90%+ quality preservation.
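
As a rough illustration of the "slow response time on queries" baseline mentioned in the abstract, the sketch below computes RWR scores for a single query node by iterating the common fixed-point form r = c W r + (1 - c) e. It is not the authors' fast method. The function name rwr_scores, the column normalization of W, the value c = 0.9 (used here as the weight on the walk step, so 1 - c is the restart weight), and the six-node toy graph are all assumptions made for this example.

```python
def rwr_scores(adj, seed, c=0.9, tol=1e-12, max_iter=10_000):
    """Iterate r = c * W r + (1 - c) * e for one seed node (illustrative only)."""
    n = len(adj)
    # Assumption: W is the column-normalized weighted adjacency matrix.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # e is the indicator vector of the seed (query) node.
    e = [1.0 if i == seed else 0.0 for i in range(n)]
    r = e[:]
    for _ in range(max_iter):
        nxt = [c * sum(w[i][j] * r[j] for j in range(n)) + (1 - c) * e[i]
               for i in range(n)]
        # Stop once no score changes by more than tol between iterations.
        if max(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3,
# a small stand-in for the block-wise, community-like structure in the abstract.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
scores = rwr_scores(adj, seed=0)
```

Every query repeats the full iteration, which is the per-query cost that the pre-computation approach described in the abstract (low-rank approximation combined with graph partitioning) is meant to reduce.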

Original language: English (US)
Pages (from-to): 327-346
Number of pages: 20
Journal: Knowledge and Information Systems
Volume: 14
Issue number: 3
DOI: 10.1007/s10115-007-0094-2
State: Published - Mar 2008
Externally published: Yes

Keywords

  • Graph Mining
  • Random walk with restart
  • Relevance score

ASJC Scopus subject areas

  • Information Systems

Cite this

Random walk with restart: Fast solutions and applications. / Tong, Hanghang; Faloutsos, Christos; Pan, Jia Yu.

In: Knowledge and Information Systems, Vol. 14, No. 3, 03.2008, p. 327-346.

Tong, Hanghang; Faloutsos, Christos; Pan, Jia Yu. / Random walk with restart: Fast solutions and applications. In: Knowledge and Information Systems. 2008; Vol. 14, No. 3. pp. 327-346.

@article{c105b126e3a94aa2844952ca6cff2d83,
title = "Random walk with restart: Fast solutions and applications",
keywords = "Graph Mining, Random walk with restart, Relevance score",
author = "Hanghang Tong and Christos Faloutsos and Pan, {Jia Yu}",
year = "2008",
month = "3",
doi = "10.1007/s10115-007-0094-2",
language = "English (US)",
volume = "14",
pages = "327--346",
journal = "Knowledge and Information Systems",
issn = "0219-1377",
publisher = "Springer London",
number = "3",

}

TY - JOUR

T1 - Random walk with restart

T2 - Fast solutions and applications

AU - Tong, Hanghang

AU - Faloutsos, Christos

AU - Pan, Jia Yu

PY - 2008/3

Y1 - 2008/3

KW - Graph Mining

KW - Random walk with restart

KW - Relevance score

UR - http://www.scopus.com/inward/record.url?scp=41149096059&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=41149096059&partnerID=8YFLogxK

U2 - 10.1007/s10115-007-0094-2

DO - 10.1007/s10115-007-0094-2

M3 - Article

VL - 14

SP - 327

EP - 346

JO - Knowledge and Information Systems

JF - Knowledge and Information Systems

SN - 0219-1377

IS - 3

ER -