Abstract

Visually analyzing citation networks poses challenges to many fields of data mining research. How can we summarize a large citation graph according to the user's interest? In particular, how can we illustrate the impact of a highly influential paper through the summarization? Can we maintain the sensory node-link graph structure while revealing the flow-based influence patterns and preserving good readability? State-of-the-art influence maximization algorithms can detect the most influential node in a citation network, but fail to summarize a graph structure to account for its influence. On the other hand, existing graph summarization methods fold large graphs into clustered views, but cannot reveal the hidden influence patterns underneath the citation network. In this paper, we first formally define the Influence Graph Summarization (IGS) problem on citation networks. Second, we propose a matrix decomposition based algorithm pipeline to solve the IGS problem. Our method can not only highlight the flow-based influence patterns, but can also be easily extended to support rich attribute information. A prototype system called VEGAS implementing this pipeline is also developed. Third, we present a theoretical analysis of our main algorithm, which is equivalent to kernel k-means clustering. It can be proved that the matrix decomposition based algorithm can approximate the objective of the proposed IGS problem. Last, we conduct comprehensive experiments with real-world citation networks to compare the proposed algorithm with classical graph summarization methods. Evaluation results demonstrate that our method significantly outperforms the previous ones in optimizing both the quantitative IGS objective and the quality of the visual summarizations.

Original language: English (US)
Article number: 7152908
Pages (from-to): 3417-3431
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 27
Issue number: 12
DOI: 10.1109/TKDE.2015.2453957
State: Published - Dec 1 2015

Fingerprint

Pipelines
Decomposition
Data mining
Experiments

Keywords

  • citation network
  • Influence summarization
  • visualization

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Computer Science Applications

Cite this

VEGAS: Visual influEnce GrAph Summarization on Citation Networks. / Shi, Lei; Tong, Hanghang; Tang, Jie; Lin, Chuang.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 27, No. 12, 7152908, 01.12.2015, p. 3417-3431.

Research output: Contribution to journal › Article

@article{49cc5e5dcaca480b8250fa891077d20a,
title = "VEGAS: Visual influEnce GrAph Summarization on Citation Networks",
abstract = "Visually analyzing citation networks poses challenges to many fields of data mining research. How can we summarize a large citation graph according to the user's interest? In particular, how can we illustrate the impact of a highly influential paper through the summarization? Can we maintain the sensory node-link graph structure while revealing the flow-based influence patterns and preserving good readability? State-of-the-art influence maximization algorithms can detect the most influential node in a citation network, but fail to summarize a graph structure to account for its influence. On the other hand, existing graph summarization methods fold large graphs into clustered views, but cannot reveal the hidden influence patterns underneath the citation network. In this paper, we first formally define the Influence Graph Summarization (IGS) problem on citation networks. Second, we propose a matrix decomposition based algorithm pipeline to solve the IGS problem. Our method can not only highlight the flow-based influence patterns, but can also be easily extended to support rich attribute information. A prototype system called VEGAS implementing this pipeline is also developed. Third, we present a theoretical analysis of our main algorithm, which is equivalent to kernel k-means clustering. It can be proved that the matrix decomposition based algorithm can approximate the objective of the proposed IGS problem. Last, we conduct comprehensive experiments with real-world citation networks to compare the proposed algorithm with classical graph summarization methods. Evaluation results demonstrate that our method significantly outperforms the previous ones in optimizing both the quantitative IGS objective and the quality of the visual summarizations.",
keywords = "citation network, Influence summarization, visualization",
author = "Lei Shi and Hanghang Tong and Jie Tang and Chuang Lin",
year = "2015",
month = "12",
day = "1",
doi = "10.1109/TKDE.2015.2453957",
language = "English (US)",
volume = "27",
pages = "3417--3431",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "12",
}

TY - JOUR

T1 - VEGAS

T2 - Visual influEnce GrAph Summarization on Citation Networks

AU - Shi, Lei

AU - Tong, Hanghang

AU - Tang, Jie

AU - Lin, Chuang

PY - 2015/12/1

Y1 - 2015/12/1

AB - Visually analyzing citation networks poses challenges to many fields of data mining research. How can we summarize a large citation graph according to the user's interest? In particular, how can we illustrate the impact of a highly influential paper through the summarization? Can we maintain the sensory node-link graph structure while revealing the flow-based influence patterns and preserving good readability? State-of-the-art influence maximization algorithms can detect the most influential node in a citation network, but fail to summarize a graph structure to account for its influence. On the other hand, existing graph summarization methods fold large graphs into clustered views, but cannot reveal the hidden influence patterns underneath the citation network. In this paper, we first formally define the Influence Graph Summarization (IGS) problem on citation networks. Second, we propose a matrix decomposition based algorithm pipeline to solve the IGS problem. Our method can not only highlight the flow-based influence patterns, but can also be easily extended to support rich attribute information. A prototype system called VEGAS implementing this pipeline is also developed. Third, we present a theoretical analysis of our main algorithm, which is equivalent to kernel k-means clustering. It can be proved that the matrix decomposition based algorithm can approximate the objective of the proposed IGS problem. Last, we conduct comprehensive experiments with real-world citation networks to compare the proposed algorithm with classical graph summarization methods. Evaluation results demonstrate that our method significantly outperforms the previous ones in optimizing both the quantitative IGS objective and the quality of the visual summarizations.

KW - citation network

KW - Influence summarization

KW - visualization

UR - http://www.scopus.com/inward/record.url?scp=84959462491&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959462491&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2015.2453957

DO - 10.1109/TKDE.2015.2453957

M3 - Article

AN - SCOPUS:84959462491

VL - 27

SP - 3417

EP - 3431

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 12

M1 - 7152908

ER -