### Abstract

Topological data analysis is becoming a popular way to study high-dimensional feature spaces without any contextual clues or assumptions. This paper concerns one popular topological feature: the number of d-dimensional holes in the dataset, also known as the Betti-d number. The persistence of the Betti numbers over various scales is encoded in a persistence diagram (PD), which records the birth and death times of these holes as the scale varies. A common way to compare PDs is by point-to-point matching, which is given by the n-Wasserstein metric. A major drawback of this approach is that the correspondence between points must be solved before the distance can be computed; for n points, the complexity grows as O(n³). Instead, we propose an entirely new framework built on Riemannian geometry that models PDs as 2D probability density functions, represented in the square-root framework on a Hilbert sphere. The resulting space is much more intuitive, with closed-form expressions for common operations. The distance metric is 1) correspondence-free and 2) independent of the number of points in the dataset. The complexity of computing the distance between PDs now grows as O(K²) for a K × K discretization of [0, 1]². This also enables the use of existing machinery from differential geometry for statistical analysis of PDs, such as computing means and geodesics and performing classification. We report results competitive with the Wasserstein metric at a much lower computational load, indicating the favorable properties of the proposed approach.
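The pipeline the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the grid size `K`, and the function names `pd_to_sqrt_density`, `sphere_distance`, and `geodesic` are all assumptions made for illustration. The sketch relies only on the standard facts that the square root of a discrete density has unit norm, so it lies on a unit sphere, and that the geodesic distance on a unit sphere is the arc cosine of the inner product.

```python
import numpy as np

def pd_to_sqrt_density(points, K=50, sigma=0.05):
    """Map a persistence diagram to a square-root density on a K x K grid.

    points: (n, 2) array of (birth, death) pairs, assumed scaled to [0, 1].
    sigma:  Gaussian kernel bandwidth (a hypothetical smoothing choice).
    """
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    if len(points) == 0:
        raise ValueError("persistence diagram must contain at least one point")
    centres = (np.arange(K) + 0.5) / K              # cell centres in [0, 1]
    gx, gy = np.meshgrid(centres, centres, indexing="ij")
    density = np.zeros((K, K))
    for b, d in points:                             # one Gaussian bump per point
        density += np.exp(-((gx - b) ** 2 + (gy - d) ** 2) / (2 * sigma ** 2))
    density /= density.sum()                        # discrete pdf: entries sum to 1
    return np.sqrt(density)                         # squared entries sum to 1

def sphere_distance(psi1, psi2):
    """Geodesic (arc-length) distance between two points on the unit sphere."""
    inner = np.clip(np.sum(psi1 * psi2), -1.0, 1.0)  # one pass over K*K cells
    return float(np.arccos(inner))

def geodesic(psi1, psi2, t):
    """Point at fraction t in [0, 1] along the great-circle arc psi1 -> psi2."""
    theta = sphere_distance(psi1, psi2)
    if theta < 1e-12:                               # coincident points
        return psi1.copy()
    return (np.sin((1 - t) * theta) * psi1 + np.sin(t * theta) * psi2) / np.sin(theta)

# Two toy diagrams with different numbers of points (made-up values).
pd_a = np.array([[0.1, 0.4], [0.2, 0.9]])
pd_b = np.array([[0.1, 0.5], [0.3, 0.8], [0.6, 0.7]])

psi_a = pd_to_sqrt_density(pd_a)
psi_b = pd_to_sqrt_density(pd_b)
dist = sphere_distance(psi_a, psi_b)
midpoint = geodesic(psi_a, psi_b, 0.5)
```

In this sketch no matching between the points of the two diagrams is ever computed, and the distance step touches only the K × K grid, so its cost does not depend on how many points each diagram holds. Building each density does depend on the number of points, but that is done once per diagram.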

Original language | English (US) |
---|---|
Title of host publication | Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016 |
Publisher | IEEE Computer Society |
Pages | 1023-1031 |
Number of pages | 9 |
ISBN (Electronic) | 9781467388504 |
DOIs | https://doi.org/10.1109/CVPRW.2016.132 |
State | Published - Dec 16 2016 |
Event | 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016 - Las Vegas, United States |
Duration | Jun 26 2016 → Jul 1 2016 |

### Other

Other | 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016 |
---|---|
Country | United States |
City | Las Vegas |
Period | 6/26/16 → 7/1/16 |

### ASJC Scopus subject areas

- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering

### Cite this

Anirudh, R., Venkataraman, V., Ramamurthy, K. N., & Turaga, P. (2016). A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams. In *Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016* (pp. 1023-1031). [7789622] IEEE Computer Society. https://doi.org/10.1109/CVPRW.2016.132

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams

AU - Anirudh, Rushil

AU - Venkataraman, Vinay

AU - Ramamurthy, Karthikeyan Natesan

AU - Turaga, Pavan

PY - 2016/12/16

Y1 - 2016/12/16

UR - http://www.scopus.com/inward/record.url?scp=85010208249&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85010208249&partnerID=8YFLogxK

U2 - 10.1109/CVPRW.2016.132

DO - 10.1109/CVPRW.2016.132

M3 - Conference contribution

SP - 1023

EP - 1031

BT - Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016

PB - IEEE Computer Society

ER -