Abstract

Ranking on large-scale graphs plays a fundamental role in many high-impact application domains, ranging from information retrieval, recommender systems, and sports team management to biology and neuroscience. PageRank, together with many of its random walk based variants, has become one of the most well-known and widely used ranking algorithms, due to its mathematical elegance and its superior performance across a variety of application domains. Important as it is, the state of the art lacks an intuitive way to explain the ranking results of PageRank (or its variants), e.g., why it considers the returned top-k webpages to be the most important ones in the entire graph, or why it ranks actor John higher than actor Smith in terms of relevance w.r.t. a particular movie. To answer these questions, this paper proposes a paradigm shift for PageRank, from identifying which nodes are most important to understanding why the ranking algorithm gives a particular ranking result. We formally define the PageRank auditing problem, whose central idea is to identify a set of key graph elements (e.g., edges, nodes, subgraphs) with the highest influence on the ranking results. We formulate it as an optimization problem and propose a family of effective and scalable algorithms (Aurora) to solve it. Our algorithms measure the influence of graph elements and incrementally select influential elements according to their gradients over the ranking results. We perform extensive empirical evaluations on real-world datasets, which demonstrate that the proposed methods (Aurora) provide intuitive explanations and scale linearly.
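
The auditing idea in the abstract (score each graph element by how much it affects the ranking, then keep the most influential ones) can be illustrated with a small sketch. This is not the Aurora algorithm: the abstract describes a gradient-based influence measure, whereas the code below substitutes a brute-force leave-one-out comparison on a toy graph. The function names (`pagerank`, `ranking_score`, `top_influential_edges`), the squared L2 norm used to summarize a ranking vector, and the damping factor of 0.85 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # directed edge (source, target)


def pagerank(n: int, edges: List[Edge], damping: float = 0.85,
             iters: int = 100) -> List[float]:
    """Power-iteration PageRank with uniform teleportation."""
    out: Dict[int, List[int]] = {u: [] for u in range(n)}
    for u, v in edges:
        out[u].append(v)
    r = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        for u in range(n):
            if out[u]:
                share = damping * r[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling node: spread its mass uniformly over all nodes.
                share = damping * r[u] / n
                for v in range(n):
                    nxt[v] += share
        r = nxt
    return r


def ranking_score(r: List[float]) -> float:
    """Summarize a ranking vector as its squared L2 norm (an assumption)."""
    return sum(x * x for x in r)


def top_influential_edges(n: int, edges: List[Edge],
                          k: int) -> List[Tuple[float, Edge]]:
    """Leave-one-out influence: score change when a single edge is removed.

    A brute-force stand-in for a gradient-based measure; it reruns PageRank
    once per edge, so it only suits small graphs.
    """
    base = ranking_score(pagerank(n, edges))
    influence = []
    for i, e in enumerate(edges):
        rest = edges[:i] + edges[i + 1:]
        influence.append((abs(base - ranking_score(pagerank(n, rest))), e))
    influence.sort(key=lambda t: t[0], reverse=True)
    return influence[:k]


if __name__ == "__main__":
    toy_edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
    for value, edge in top_influential_edges(4, toy_edges, k=2):
        print(edge, value)
```

Rerunning PageRank once per edge costs far more than the linear scalability the abstract reports, which is one reason a gradient-based influence measure is attractive on large graphs.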

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Yang Song, Bing Liu, Kisung Lee, Naoki Abe, Calton Pu, Mu Qiao, Nesreen Ahmed, Donald Kossmann, Jeffrey Saltz, Jiliang Tang, Jingrui He, Huan Liu, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 713-722
Number of pages: 10
ISBN (Electronic): 9781538650356
DOI: https://doi.org/10.1109/BigData.2018.8622563
State: Published - Jan 22 2019
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: Dec 10 2018 to Dec 13 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018

Conference

Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country: United States
City: Seattle
Period: 12/10/18 to 12/13/18

Keywords

  • explainability
  • Graph mining
  • PageRank

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

Cite this

Kang, J., Wang, M., Cao, N., Xia, Y., Fan, W., & Tong, H. (2019). AURORA: Auditing PageRank on Large Graphs. In Y. Song, B. Liu, K. Lee, N. Abe, C. Pu, M. Qiao, N. Ahmed, D. Kossmann, J. Saltz, J. Tang, J. He, H. Liu, & X. Hu (Eds.), Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018 (pp. 713-722). [8622563] (Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2018.8622563