Abstract

Attributed subgraph matching is a powerful tool for explorative mining of large attributed networks. In many applications (e.g., network science of teams, intelligence analysis, finance informatics), the user might not know exactly what s/he is looking for and thus must constantly revise the initial query graph based on what s/he finds in the current matching results. A major bottleneck in such an interactive matching scenario is efficiency, as simply rerunning the matching algorithm on the revised query graph is computationally prohibitive. In this paper, we propose a family of effective and efficient algorithms (FIRST) to support interactive attributed subgraph matching. There are two key ideas behind the proposed methods. The first is to recast the attributed subgraph matching problem as a cross-network node similarity problem, whose major computation lies in solving a Sylvester equation for the query graph and the underlying data graph. The second is to explore the smoothness between the initial and revised queries, which allows us to solve the new/updated Sylvester equation incrementally, without re-solving it from scratch. Experimental results show that our method (1) achieves up to 16x speed-up on networks with 6M+ nodes; (2) preserves more than 90% accuracy compared with existing methods; and (3) scales linearly with respect to the size of the data graph.
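To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the FIRST algorithm: the abstract does not give the equation or the update rules, so the sketch assumes a generic Sylvester-type (Stein) equation X = alpha * A_q X A_d + H, where A_q and A_d are row-normalized adjacency matrices of the query and data graphs and H scores attribute agreement. The names `solve`, `residual`, `alpha`, and the toy graphs are all hypothetical. The "incremental" step is illustrated only by warm-starting the iteration from the previous solution after the query is revised.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(adj):
    """Scale each row to sum to 1 (rows with no edges stay all-zero)."""
    return [[v / (sum(row) or 1.0) for v in row] for row in adj]


def solve(a_q, a_d, h, alpha=0.5, x0=None, tol=1e-12, max_iter=10_000):
    """Iterate X <- alpha * A_q X A_d + H until entries change by less than tol.

    Returns the similarity matrix and the number of iterations used.
    """
    x = [row[:] for row in (h if x0 is None else x0)]
    for it in range(1, max_iter + 1):
        prod = matmul(matmul(a_q, x), a_d)
        new = [[alpha * p + b for p, b in zip(pr, hr)] for pr, hr in zip(prod, h)]
        diff = max(abs(n - o) for nr, xr in zip(new, x) for n, o in zip(nr, xr))
        x = new
        if diff < tol:
            return x, it
    return x, max_iter


def residual(a_q, a_d, h, x, alpha=0.5):
    """Largest absolute entry of X - (alpha * A_q X A_d + H)."""
    prod = matmul(matmul(a_q, x), a_d)
    return max(abs(xv - alpha * p - b)
               for xr, pr, hr in zip(x, prod, h)
               for xv, p, b in zip(xr, pr, hr))


# Toy data (assumed for illustration): a 3-node query and a 4-node data graph.
query_labels = ["a", "b", "a"]
data_labels = ["a", "b", "a", "b"]
query_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]                   # path 0-1-2
data_adj = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]  # 4-cycle

# H[i][j] = 1 when query node i and data node j carry the same attribute.
h = [[1.0 if q == d else 0.0 for d in data_labels] for q in query_labels]
a_q, a_d = row_normalize(query_adj), row_normalize(data_adj)

# Initial query: solve from scratch.
x_old, _ = solve(a_q, a_d, h)

# Revised query: the user adds edge 0-2, turning the path into a triangle.
revised_adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
a_q_new = row_normalize(revised_adj)

x_cold, cold_iters = solve(a_q_new, a_d, h)            # re-solve from scratch
x_warm, warm_iters = solve(a_q_new, a_d, h, x0=x_old)  # reuse the old solution

print("iterations from scratch:", cold_iters)
print("iterations warm-started:", warm_iters)
```

Row-normalizing both adjacency matrices keeps the iteration a contraction for any alpha below 1, which is why the simple fixed-point loop converges here. How much a warm start helps depends on how small the query revision is; the paper's reported speed-ups come from its own update scheme, which this sketch does not reproduce.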

Original language: English (US)
Title of host publication: KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 1447-1456
Number of pages: 10
Volume: Part F129685
ISBN (Electronic): 9781450348874
DOIs: https://doi.org/10.1145/3097983.3098040
State: Published - Aug 13 2017
Event: 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017 - Halifax, Canada
Duration: Aug 13 2017 to Aug 17 2017

Other

Other: 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017
Country: Canada
City: Halifax
Period: 8/13/17 to 8/17/17

Fingerprint

Finance

Keywords

  • Cross-network similarity
  • Inexact matching
  • Interactive attributed subgraph matching

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Du, B., Zhang, S., Cao, N., & Tong, H. (2017). FIRST: Fast interactive attributed subgraph matching. In KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. Part F129685, pp. 1447-1456). Association for Computing Machinery. https://doi.org/10.1145/3097983.3098040

FIRST: Fast interactive attributed subgraph matching. / Du, Boxin; Zhang, Si; Cao, Nan; Tong, Hanghang.

KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Vol. Part F129685. Association for Computing Machinery, 2017. p. 1447-1456.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Du, B, Zhang, S, Cao, N & Tong, H 2017, FIRST: Fast interactive attributed subgraph matching. in KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. vol. Part F129685, Association for Computing Machinery, pp. 1447-1456, 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017, Halifax, Canada, 8/13/17. https://doi.org/10.1145/3097983.3098040
Du B, Zhang S, Cao N, Tong H. FIRST: Fast interactive attributed subgraph matching. In KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Vol. Part F129685. Association for Computing Machinery. 2017. p. 1447-1456 https://doi.org/10.1145/3097983.3098040
Du, Boxin; Zhang, Si; Cao, Nan; Tong, Hanghang. / FIRST: Fast interactive attributed subgraph matching. KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Vol. Part F129685. Association for Computing Machinery, 2017. pp. 1447-1456
@inproceedings{363e52059924480e8ab4dcc97313aa84,
title = "FIRST: Fast interactive attributed subgraph matching",
abstract = "Attributed subgraph matching is a powerful tool for explorative mining of large attributed networks. In many applications (e.g., network science of teams, intelligence analysis, finance informatics), the user might not know exactly what s/he is looking for and thus must constantly revise the initial query graph based on what s/he finds in the current matching results. A major bottleneck in such an interactive matching scenario is efficiency, as simply rerunning the matching algorithm on the revised query graph is computationally prohibitive. In this paper, we propose a family of effective and efficient algorithms (FIRST) to support interactive attributed subgraph matching. There are two key ideas behind the proposed methods. The first is to recast the attributed subgraph matching problem as a cross-network node similarity problem, whose major computation lies in solving a Sylvester equation for the query graph and the underlying data graph. The second is to explore the smoothness between the initial and revised queries, which allows us to solve the new/updated Sylvester equation incrementally, without re-solving it from scratch. Experimental results show that our method (1) achieves up to 16x speed-up on networks with 6M+ nodes; (2) preserves more than 90{\%} accuracy compared with existing methods; and (3) scales linearly with respect to the size of the data graph.",
keywords = "Cross-network similarity, Inexact matching, Interactive attributed subgraph matching",
author = "Boxin Du and Si Zhang and Nan Cao and Hanghang Tong",
year = "2017",
month = "8",
day = "13",
doi = "10.1145/3097983.3098040",
language = "English (US)",
volume = "Part F129685",
pages = "1447--1456",
booktitle = "KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
publisher = "Association for Computing Machinery",
}

TY  - GEN
T1  - FIRST
T2  - Fast interactive attributed subgraph matching
AU  - Du, Boxin
AU  - Zhang, Si
AU  - Cao, Nan
AU  - Tong, Hanghang
PY  - 2017/8/13
Y1  - 2017/8/13
AB  - Attributed subgraph matching is a powerful tool for explorative mining of large attributed networks. In many applications (e.g., network science of teams, intelligence analysis, finance informatics), the user might not know exactly what s/he is looking for and thus must constantly revise the initial query graph based on what s/he finds in the current matching results. A major bottleneck in such an interactive matching scenario is efficiency, as simply rerunning the matching algorithm on the revised query graph is computationally prohibitive. In this paper, we propose a family of effective and efficient algorithms (FIRST) to support interactive attributed subgraph matching. There are two key ideas behind the proposed methods. The first is to recast the attributed subgraph matching problem as a cross-network node similarity problem, whose major computation lies in solving a Sylvester equation for the query graph and the underlying data graph. The second is to explore the smoothness between the initial and revised queries, which allows us to solve the new/updated Sylvester equation incrementally, without re-solving it from scratch. Experimental results show that our method (1) achieves up to 16x speed-up on networks with 6M+ nodes; (2) preserves more than 90% accuracy compared with existing methods; and (3) scales linearly with respect to the size of the data graph.
KW  - Cross-network similarity
KW  - Inexact matching
KW  - Interactive attributed subgraph matching
UR  - http://www.scopus.com/inward/record.url?scp=85029046778&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85029046778&partnerID=8YFLogxK
U2  - 10.1145/3097983.3098040
DO  - 10.1145/3097983.3098040
M3  - Conference contribution
AN  - SCOPUS:85029046778
VL  - Part F129685
SP  - 1447
EP  - 1456
BT  - KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB  - Association for Computing Machinery
ER  -