Abstract

Measuring node proximity on large-scale networks is a fundamental building block in many application domains, including computer vision, e-commerce, social networks, software engineering, disaster management, biology, and epidemiology. State-of-the-art methods (e.g., random-walk-based methods) typically assume that the input network is given a priori, with known topology and edge weights. A few recent works go further and infer the optimal edge weights from side information. This paper generalizes the problem along multiple dimensions, aiming to learn optimal networks for node proximity measures. First (optimization scope), our formulation explores a much larger parameter space, so that it can simultaneously infer the optimal network topology and the associated edge weights. This matters because a noisy or missing edge can greatly mislead node proximity measures. Second (optimization granularity), whereas all existing works assume that one common optimal network, whether given as input or learned by the algorithm, exists for all queries, our method optimizes at a much finer granularity and can infer an optimal network specific to a given query. Third (optimization efficiency), we carefully design our algorithms to have complexity linear in the neighborhood size of the user preference set. Extensive empirical evaluations on a diverse set of 10+ real networks show that the proposed algorithms (1) consistently outperform existing methods on all six commonly used metrics; (2) empirically scale sub-linearly to billion-scale networks; and (3) respond in a fraction of a second.
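To make the abstract's point about noisy edges concrete, the sketch below runs a generic random walk with restart, one common random-walk-based proximity measure, on a toy graph. It is not the QUINT algorithm. The function name `rwr_scores`, the restart probability, and the six-node graph are all assumptions made for illustration.

```python
def rwr_scores(edges, n, query, restart=0.15, iters=200):
    """Random walk with restart from `query` on an undirected weighted graph.

    Generic baseline proximity measure (hypothetical helper, not QUINT).
    `edges` is a list of (u, v, weight) tuples over nodes 0..n-1.
    """
    # Build symmetric weighted adjacency maps.
    adj = [dict() for _ in range(n)]
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    total_weight = [sum(nbrs.values()) for nbrs in adj]

    # Power iteration: r <- (1 - restart) * P^T r + restart * e_query
    r = [0.0] * n
    r[query] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        nxt[query] = restart
        for u in range(n):
            if total_weight[u] == 0.0:
                # Isolated node: return its walk mass to the query node.
                nxt[query] += (1.0 - restart) * r[u]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1.0 - restart) * r[u] * w / total_weight[u]
        r = nxt
    return r


# Toy graph (assumed): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
clean_edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
               (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
               (2, 3, 1.0)]
# The same graph with one spurious edge from the query node 0 to node 5.
noisy_edges = clean_edges + [(0, 5, 1.0)]

clean = rwr_scores(clean_edges, n=6, query=0)
noisy = rwr_scores(noisy_edges, n=6, query=0)
print("proximity of node 5 to node 0, clean graph:", round(clean[5], 4))
print("proximity of node 5 to node 0, noisy graph:", round(noisy[5], 4))
```

On the clean graph, node 5 sits in the far triangle and scores lower than node 1. The single spurious edge raises its score, which is the kind of distortion that motivates learning the topology rather than taking it as given.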

Original language: English (US)
Title of host publication: KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 985-994
Number of pages: 10
Volume: 13-17-August-2016
ISBN (Electronic): 9781450342322
DOIs: https://doi.org/10.1145/2939672.2939768
State: Published - Aug 13 2016
Event: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016 - San Francisco, United States
Duration: Aug 13 2016 to Aug 17 2016

Other

Other: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
Country: United States
City: San Francisco
Period: 8/13/16 to 8/17/16

Fingerprint

Topology
Epidemiology
Disasters
Computer vision
Software engineering

Keywords

  • Node proximity
  • Optimal networks

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Li, L., Yao, Y., Tang, J., Fan, W., & Tong, H. (2016). QUINT: On query-specific optimal networks. In KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. 13-17-August-2016, pp. 985-994). Association for Computing Machinery. https://doi.org/10.1145/2939672.2939768

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{ec8348f295814e699121ce57868b78c4,
title = "QUINT: On query-specific optimal networks",
keywords = "Node proximity, Optimal networks",
author = "Liangyue Li and Yuan Yao and Jie Tang and Wei Fan and Hanghang Tong",
year = "2016",
month = "8",
day = "13",
doi = "10.1145/2939672.2939768",
language = "English (US)",
volume = "13-17-August-2016",
pages = "985--994",
booktitle = "KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
publisher = "Association for Computing Machinery",
}

TY - GEN

T1 - QUINT

T2 - On query-specific optimal networks

AU - Li, Liangyue

AU - Yao, Yuan

AU - Tang, Jie

AU - Fan, Wei

AU - Tong, Hanghang

PY - 2016/8/13

Y1 - 2016/8/13

KW - Node proximity

KW - Optimal networks

UR - http://www.scopus.com/inward/record.url?scp=84985004020&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84985004020&partnerID=8YFLogxK

U2 - 10.1145/2939672.2939768

DO - 10.1145/2939672.2939768

M3 - Conference contribution

AN - SCOPUS:84985004020

VL - 13-17-August-2016

SP - 985

EP - 994

BT - KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

ER -