Multi-scale spectral decomposition of massive graphs

Si Si; Donghyuk Shin; Inderjit S. Dhillon; Beresford N. Parlett

Research output: Contribution to journal › Conference article › peer-review

Abstract

Computing the k dominant eigenvalues and eigenvectors of massive graphs is a key operation in numerous machine learning applications; however, popular solvers suffer from slow convergence, especially when k is reasonably large. In this paper, we propose and analyze a novel multi-scale spectral decomposition method (MSEIGS), which first clusters the graph into smaller clusters whose spectral decomposition can be computed efficiently and independently. We show theoretically as well as empirically that the union of all clusters' subspaces has significant overlap with the dominant subspace of the original graph, provided that the graph is clustered appropriately. Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph. We further use hierarchical clustering to speed up the computation and adopt a fast early termination strategy to compute quality approximations. Our method outperforms widely used solvers in terms of convergence speed and approximation quality. Furthermore, our method is naturally parallelizable and exhibits significant speedups in shared-memory parallel settings. For example, on a graph with more than 82 million nodes and 3.6 billion edges, MSEIGS takes less than 3 hours on a single-core machine while Randomized SVD takes more than 6 hours, to obtain a similar approximation of the top-50 eigenvectors. Using 16 cores, we can reduce this time to less than 40 minutes.
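
The core idea in the abstract (eigenvectors of the clusters as an initialization for a solver on the full graph) can be illustrated with a small NumPy sketch. This is not the authors' MSEIGS implementation: it assumes the cluster labels are already known, uses a single level of clustering with no early termination, and swaps in plain subspace iteration with a Rayleigh-Ritz step where the paper uses block Lanczos. All function names (`planted_partition`, `cluster_init`, `refine`, `subspace_overlap`, `demo`) and the toy graph parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def planted_partition(sizes, p_in, p_out, rng):
    """Toy symmetric 0/1 adjacency matrix with dense clusters (assumed test graph)."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float), labels


def dominant_eigvecs(A, k):
    """Reference answer: eigenvectors of the k largest-magnitude eigenvalues."""
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(-np.abs(vals))[:k]
    return vecs[:, order]


def cluster_init(A, labels, k):
    """Eigendecompose each within-cluster block independently, zero-pad the
    eigenvectors to full length, and keep the k with the largest |eigenvalue|."""
    n = A.shape[0]
    candidates = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        vals, vecs = np.linalg.eigh(A[np.ix_(idx, idx)])
        for j in range(idx.size):
            v = np.zeros(n)
            v[idx] = vecs[:, j]
            candidates.append((abs(vals[j]), v))
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return np.column_stack([v for _, v in candidates[:k]])


def refine(A, V0, iters=20):
    """Subspace iteration plus Rayleigh-Ritz (a stand-in for block Lanczos)."""
    Q, _ = np.linalg.qr(V0)
    for _ in range(iters):
        Q, _ = np.linalg.qr(A @ Q)
    vals, S = np.linalg.eigh(Q.T @ A @ Q)
    order = np.argsort(-np.abs(vals))
    return vals[order], Q @ S[:, order]


def subspace_overlap(U, V):
    """Mean squared cosine of the principal angles between span(U) and span(V);
    1.0 means the two subspaces coincide."""
    Qu, _ = np.linalg.qr(U)
    Qv, _ = np.linalg.qr(V)
    cosines = np.linalg.svd(Qu.T @ Qv, compute_uv=False)
    return float(np.mean(cosines ** 2))


def demo(seed=0, k=3):
    rng = np.random.default_rng(seed)
    A, labels = planted_partition([20, 20, 20], 0.8, 0.02, rng)
    U = dominant_eigvecs(A, k)                      # exact dominant subspace
    V0 = cluster_init(A, labels, k)                 # cluster-based start
    R0 = rng.standard_normal((A.shape[0], k))       # random start, for contrast
    _, V = refine(A, V0)
    return subspace_overlap(U, V0), subspace_overlap(U, R0), subspace_overlap(U, V)


if __name__ == "__main__":
    init_ov, rand_ov, refined_ov = demo()
    print("overlap of cluster-based start:", init_ov)
    print("overlap of random start:       ", rand_ov)
    print("overlap after refinement:      ", refined_ov)
```

On a well-clustered toy graph like this one, the cluster-based starting block should already lie much closer to the dominant subspace than a random block, which is the property the abstract relies on to speed up convergence.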

Original language	English (US)
Pages (from-to)	2798-2806
Number of pages	9
Journal	Advances in Neural Information Processing Systems
Volume	4
Issue number	January
State	Published - 2014
Externally published	Yes
Event	28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration	Dec 8 2014 → Dec 13 2014

ASJC Scopus subject areas

Computer Networks and Communications
Information Systems
Signal Processing

Cite this

@article{9085e78a2c724db7b8c6f15ff4557f14,

title = "Multi-scale spectral decomposition of massive graphs",

abstract = "Computing the k dominant eigenvalues and eigenvectors of massive graphs is a key operation in numerous machine learning applications; however, popular solvers suffer from slow convergence, especially when k is reasonably large. In this paper, we propose and analyze a novel multi-scale spectral decomposition method (MSEIGS), which first clusters the graph into smaller clusters whose spectral decomposition can be computed efficiently and independently. We show theoretically as well as empirically that the union of all cluster's subspaces has significant overlap with the dominant subspace of the original graph, provided that the graph is clustered appropriately. Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph. We further use hierarchical clustering to speed up the computation and adopt a fast early termination strategy to compute quality approximations. Our method outperforms widely used solvers in terms of convergence speed and approximation quality. Furthermore, our method is naturally parallelizable and exhibits significant speedups in shared-memory parallel settings. For example, on a graph with more than 82 million nodes and 3.6 billion edges, MSEIGS takes less than 3 hours on a single-core machine while Randomized SVD takes more than 6 hours, to obtain a similar approximation of the top-50 eigenvectors. Using 16 cores, we can reduce this time to less than 40 minutes.",

author = "Si, Si and Shin, Donghyuk and Dhillon, {Inderjit S.} and Parlett, {Beresford N.}",

year = "2014",

language = "English (US)",

volume = "4",

pages = "2798--2806",

journal = "Advances in Neural Information Processing Systems",

issn = "1049-5258",

number = "January",

note = "28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 ; Conference date: 08-12-2014 Through 13-12-2014",

}

TY - JOUR

T1 - Multi-scale spectral decomposition of massive graphs

AU - Si, Si

AU - Shin, Donghyuk

AU - Dhillon, Inderjit S.

AU - Parlett, Beresford N.

PY - 2014

Y1 - 2014

N2 - Computing the k dominant eigenvalues and eigenvectors of massive graphs is a key operation in numerous machine learning applications; however, popular solvers suffer from slow convergence, especially when k is reasonably large. In this paper, we propose and analyze a novel multi-scale spectral decomposition method (MSEIGS), which first clusters the graph into smaller clusters whose spectral decomposition can be computed efficiently and independently. We show theoretically as well as empirically that the union of all cluster's subspaces has significant overlap with the dominant subspace of the original graph, provided that the graph is clustered appropriately. Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph. We further use hierarchical clustering to speed up the computation and adopt a fast early termination strategy to compute quality approximations. Our method outperforms widely used solvers in terms of convergence speed and approximation quality. Furthermore, our method is naturally parallelizable and exhibits significant speedups in shared-memory parallel settings. For example, on a graph with more than 82 million nodes and 3.6 billion edges, MSEIGS takes less than 3 hours on a single-core machine while Randomized SVD takes more than 6 hours, to obtain a similar approximation of the top-50 eigenvectors. Using 16 cores, we can reduce this time to less than 40 minutes.

AB - Computing the k dominant eigenvalues and eigenvectors of massive graphs is a key operation in numerous machine learning applications; however, popular solvers suffer from slow convergence, especially when k is reasonably large. In this paper, we propose and analyze a novel multi-scale spectral decomposition method (MSEIGS), which first clusters the graph into smaller clusters whose spectral decomposition can be computed efficiently and independently. We show theoretically as well as empirically that the union of all cluster's subspaces has significant overlap with the dominant subspace of the original graph, provided that the graph is clustered appropriately. Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph. We further use hierarchical clustering to speed up the computation and adopt a fast early termination strategy to compute quality approximations. Our method outperforms widely used solvers in terms of convergence speed and approximation quality. Furthermore, our method is naturally parallelizable and exhibits significant speedups in shared-memory parallel settings. For example, on a graph with more than 82 million nodes and 3.6 billion edges, MSEIGS takes less than 3 hours on a single-core machine while Randomized SVD takes more than 6 hours, to obtain a similar approximation of the top-50 eigenvectors. Using 16 cores, we can reduce this time to less than 40 minutes.

UR - http://www.scopus.com/inward/record.url?scp=84937842514&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937842514&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84937842514

SN - 1049-5258

VL - 4

SP - 2798

EP - 2806

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

IS - January

T2 - 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014

Y2 - 8 December 2014 through 13 December 2014

ER -
