Abstract

Data analysts commonly utilize statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe’s Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties and statistics, we examine all low-order (≤10) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for graphs with more than ten nodes, generating the entire space of graphs becomes quickly intractable. We use different random graph generation methods to further look into the distribution of graph statistics for higher order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.
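The core observation of the abstract, that graphs can agree on summary statistics while differing in structure, can be illustrated with a minimal sketch. The example below is not the authors' generation method; the two graphs (a 6-cycle and two disjoint triangles) and the helper names `summary_stats`, `count_triangles`, and `is_connected` are assumptions chosen for illustration.

```python
from itertools import combinations


def summary_stats(n, edges):
    """Return (node count, edge count, density, sorted degree sequence)."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    density = 2 * len(edges) / (n * (n - 1))
    return n, len(edges), density, sorted(degree)


def count_triangles(n, edges):
    """Count node triples whose three connecting edges are all present."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )


def is_connected(n, edges):
    """Depth-first search from node 0; connected if every node is reached."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for nb in adj[node] - seen:
            seen.add(nb)
            stack.append(nb)
    return len(seen) == n


# Hypothetical example graphs on 6 nodes (illustrative, not from the paper).
hexagon = [(i, (i + 1) % 6) for i in range(6)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]

# Identical summary statistics ...
print(summary_stats(6, hexagon) == summary_stats(6, two_triangles))
# ... yet different structure.
print(count_triangles(6, hexagon), count_triangles(6, two_triangles))
print(is_connected(6, hexagon), is_connected(6, two_triangles))
```

Both graphs have the same number of nodes and edges, the same density, and the same degree sequence, but only one is connected and only the other contains triangles. A drawing of the two graphs makes the difference immediately visible.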

Original language: English (US)
Title of host publication: Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings
Editors: Therese Biedl, Andreas Kerren
Publisher: Springer Verlag
Pages: 463-477
Number of pages: 15
ISBN (Print): 9783030044138
DOI: https://doi.org/10.1007/978-3-030-04414-5_33
State: Published - Jan 1 2018
Event: 26th International Symposium on Graph Drawing and Network Visualization, GD 2018 - Barcelona, Spain
Duration: Sep 26 2018 - Sep 28 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11282 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 26th International Symposium on Graph Drawing and Network Visualization, GD 2018
Country: Spain
City: Barcelona
Period: 9/26/18 - 9/28/18

Keywords

  • Graph generators
  • Graph mining
  • Graph properties

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Chen, H., Soni, U., Lu, Y., Maciejewski, R., & Kobourov, S. (2018). Same stats, different graphs: (Graph statistics and why we need graph drawings). In T. Biedl, & A. Kerren (Eds.), Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings (pp. 463-477). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11282 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-04414-5_33

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{ff67d8c4d4894500ac7569f88bd38523,
title = "Same stats, different graphs: (Graph statistics and why we need graph drawings)",
abstract = "Data analysts commonly utilize statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe’s Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties and statistics, we examine all low-order (≤10) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for graphs with more than ten nodes, generating the entire space of graphs becomes quickly intractable. We use different random graph generation methods to further look into the distribution of graph statistics for higher order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.",
keywords = "Graph generators, Graph mining, Graph properties",
author = "Hang Chen and Utkarsh Soni and Yafeng Lu and Ross Maciejewski and Stephen Kobourov",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-04414-5_33",
language = "English (US)",
isbn = "9783030044138",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "463--477",
editor = "Therese Biedl and Andreas Kerren",
booktitle = "Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings",

}

TY - GEN
T1 - Same stats, different graphs
T2 - (Graph statistics and why we need graph drawings)
AU - Chen, Hang
AU - Soni, Utkarsh
AU - Lu, Yafeng
AU - Maciejewski, Ross
AU - Kobourov, Stephen
PY - 2018/1/1
Y1 - 2018/1/1
AB - Data analysts commonly utilize statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe’s Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties and statistics, we examine all low-order (≤10) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for graphs with more than ten nodes, generating the entire space of graphs becomes quickly intractable. We use different random graph generation methods to further look into the distribution of graph statistics for higher order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.
KW - Graph generators
KW - Graph mining
KW - Graph properties
UR - http://www.scopus.com/inward/record.url?scp=85059065454&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059065454&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-04414-5_33
DO - 10.1007/978-3-030-04414-5_33
M3 - Conference contribution
SN - 9783030044138
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 463
EP - 477
BT - Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings
A2 - Biedl, Therese
A2 - Kerren, Andreas
PB - Springer Verlag
ER -