### Abstract

Data analysts commonly use statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe’s Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties and statistics, we examine all low-order (≤10) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for graphs with more than ten nodes, generating the entire space of graphs quickly becomes intractable. We use different random graph generation methods to examine the distribution of graph statistics for higher-order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.
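The core observation of the abstract, that graphs can agree on summary statistics yet differ in structure, can be illustrated with a toy example. The sketch below is not the authors' generation method; it only compares two hand-picked six-node graphs (a hexagon and two disjoint triangles), and every function name in it is hypothetical and chosen for illustration. It uses only the Python standard library.

```python
from collections import deque


def make_graph(n, edges):
    """Build an undirected graph as an adjacency dict over nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def summary_stats(adj):
    """Common summary statistics: node count, edge count, degrees, density."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    degrees = sorted(len(nb) for nb in adj.values())
    density = 2 * m / (n * (n - 1))
    return {"nodes": n, "edges": m, "degree_sequence": degrees, "density": density}


def count_components(adj):
    """Count connected components with breadth-first search."""
    seen = set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return components


def count_triangles(adj):
    """Count triangles, visiting each one once via the ordering u < v < w."""
    return sum(
        1
        for u in adj
        for v in adj[u]
        for w in adj[v]
        if u < v < w and w in adj[u]
    )


# Two 2-regular graphs on six nodes (illustrative choice, not from the paper).
hexagon = make_graph(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
two_triangles = make_graph(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

if __name__ == "__main__":
    print(summary_stats(hexagon) == summary_stats(two_triangles))
    print(count_components(hexagon), count_components(two_triangles))
    print(count_triangles(hexagon), count_triangles(two_triangles))
```

Both graphs have the same node count, edge count, degree sequence, and density, yet one is connected and triangle-free while the other is disconnected and consists entirely of triangles. A drawing of either graph makes the difference obvious at a glance.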

Original language | English (US) |
---|---|
Title of host publication | Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings |
Editors | Therese Biedl, Andreas Kerren |
Publisher | Springer Verlag |
Pages | 463-477 |
Number of pages | 15 |
ISBN (Print) | 9783030044138 |
DOIs | https://doi.org/10.1007/978-3-030-04414-5_33 |
State | Published - Jan 1 2018 |
Event | 26th International Symposium on Graph Drawing and Network Visualization, GD 2018 - Barcelona, Spain; Duration: Sep 26 2018 → Sep 28 2018 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11282 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 26th International Symposium on Graph Drawing and Network Visualization, GD 2018 |
---|---|
Country | Spain |
City | Barcelona |
Period | 9/26/18 → 9/28/18 |

### Keywords

- Graph generators
- Graph mining
- Graph properties

### ASJC Scopus subject areas

- Theoretical Computer Science
- Computer Science (all)

### Cite this

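Chen, H., Soni, U., Lu, Y., Maciejewski, R., & Kobourov, S. (2018). Same stats, different graphs: (Graph statistics and why we need graph drawings). In T. Biedl, & A. Kerren (Eds.), *Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings* (pp. 463-477). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11282 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-04414-5_33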

**Same stats, different graphs: (Graph statistics and why we need graph drawings).** / Chen, Hang; Soni, Utkarsh; Lu, Yafeng; Maciejewski, Ross; Kobourov, Stephen.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings.* Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11282 LNCS, Springer Verlag, pp. 463-477, 26th International Symposium on Graph Drawing and Network Visualization, GD 2018, Barcelona, Spain, 9/26/18. https://doi.org/10.1007/978-3-030-04414-5_33

TY - GEN

T1 - Same stats, different graphs

T2 - (Graph statistics and why we need graph drawings)

AU - Chen, Hang

AU - Soni, Utkarsh

AU - Lu, Yafeng

AU - Maciejewski, Ross

AU - Kobourov, Stephen

PY - 2018/1/1

Y1 - 2018/1/1

KW - Graph generators

KW - Graph mining

KW - Graph properties

UR - http://www.scopus.com/inward/record.url?scp=85059065454&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059065454&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-04414-5_33

DO - 10.1007/978-3-030-04414-5_33

M3 - Conference contribution

SN - 9783030044138

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 463

EP - 477

BT - Graph Drawing and Network Visualization - 26th International Symposium, GD 2018, Proceedings

A2 - Biedl, Therese

A2 - Kerren, Andreas

PB - Springer Verlag

ER -