Statistical Topological Data Analysis using Persistence Landscapes
Authors: Peter Bubenik
JMLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show how a number of standard statistical tests can be used for statistical inference using this summary. We also prove that this summary is stable and that it can be used to provide lower bounds for the bottleneck and Wasserstein distances. In Section 4.1, "we sample 200 points from the uniform distribution on the union of two annuli. We then calculate the corresponding persistence landscape in degree one using the Vietoris-Rips complex. We repeat this 100 times and calculate the mean persistence landscape." In Section 4.2, "We sample this field on a 100 by 100 grid, and calculate the persistence landscape of the sublevel set... We calculate the mean persistence landscapes in degrees 0 and 1 from 100 samples." In Section 4.3, "Here we combine persistence landscapes and statistical inference to discriminate between iid samples of 1000 points from a torus and a sphere... We then calculate the persistence landscape of this filtered simplicial complex for 100 samples and plot the mean landscapes." |
| Researcher Affiliation | Academia | Peter Bubenik EMAIL Department of Mathematics Cleveland State University Cleveland, OH 44115-2214, USA |
| Pseudocode | No | The paper describes mathematical concepts and theoretical results with proofs. It does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The persistent homologies in this section were calculated using javaPlex (Tausz et al., 2011) and Perseus by Nanda (2013). Another publicly available alternative is Dionysus by Morozov (2012). In Section 4.2 we use Matlab code courtesy of Eliran Subag that implements an algorithm from Wood and Chan (1994). These are third-party tools used by the authors, not open-source code for the methodology presented in this paper. |
| Open Datasets | No | The paper uses generated data for its examples. For instance, in Section 4.1, "we sample 200 points from the uniform distribution on the union of two annuli." In Section 4.2, "we consider a stationary Gaussian random field on [0, 1]^2... We sample this field on a 100 by 100 grid." In Section 4.3, "we sample 1000 points from a torus and a sphere." No concrete access information for publicly available datasets is provided; instead, the authors describe how they generated their own data for the experiments. |
| Dataset Splits | No | The paper does not describe traditional training/test/validation dataset splits. Instead, it describes generating multiple samples or repetitions for statistical analysis, such as repeating a sampling process 100 times, but these are not dataset splits in the conventional machine learning sense for reproduction of specific data partitions. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The persistent homologies in this section were calculated using javaPlex (Tausz et al., 2011) and Perseus by Nanda (2013). Another publicly available alternative is Dionysus by Morozov (2012). In Section 4.2 we use Matlab code courtesy of Eliran Subag that implements an algorithm from Wood and Chan (1994). While software names are mentioned, no specific version numbers are provided for javaPlex, Perseus, Dionysus, or Matlab. |
| Experiment Setup | Yes | In Section 4.1, "We sample 200 points from the uniform distribution on the union of two annuli." In Section 4.2, "We sample this field on a 100 by 100 grid... We calculate the mean persistence landscapes... from 100 samples." In Section 4.3, "we sample 1000 points... we construct a filtered simplicial complex as follows. First we triangulate the underlying space using the Coxeter-Freudenthal-Kuhn triangulation, starting with a cubical grid with sides of length 1/2. Next we smooth our data using a triangular kernel with bandwidth 0.9. We evaluate this kernel density estimator at the vertices of our simplicial complex... calculate the persistence landscape of this filtered simplicial complex for 100 samples." The paper also mentions using "the permutation test with 10,000 repetitions". |
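
Since the paper ships no code for its own methodology, the sketch below illustrates how the mean landscape and permutation test steps quoted in the table might be reproduced. It is not the author's implementation. It assumes persistence diagrams (birth, death pairs) have already been computed by a separate tool, and it evaluates landscapes on a uniform grid. The function names `landscape` and `permutation_test`, the parameter `k_max`, and the choice of the L2 distance between group mean landscapes as the test statistic are all assumptions made for illustration.

```python
import numpy as np


def landscape(diagram, grid, k_max=3):
    """Evaluate the first k_max landscape functions of a diagram on a grid.

    diagram: iterable of (birth, death) pairs (assumed precomputed).
    Returns an array of shape (k_max, len(grid)).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]
    # Tent function of each pair: max(0, min(t - b, d - t)).
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births,
                                       deaths - grid[None, :]))
    # The k-th landscape at t is the k-th largest tent value at t.
    tents = -np.sort(-tents, axis=0)
    out = np.zeros((k_max, grid.size))
    k = min(k_max, tents.shape[0])
    out[:k] = tents[:k]
    return out


def mean_landscape(landscapes):
    """Pointwise average of a collection of equally shaped landscapes."""
    return np.mean(np.stack(landscapes), axis=0)


def permutation_test(group_a, group_b, grid, n_perm=10_000, seed=0):
    """Two-sample permutation test on discretized landscapes.

    Test statistic (an assumption of this sketch): approximate L2
    distance between the two group mean landscapes.
    """
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    grid = np.asarray(grid, dtype=float)
    dt = grid[1] - grid[0]  # uniform grid spacing is assumed
    rng = np.random.default_rng(seed)

    def stat(x, y):
        diff = x.mean(axis=0) - y.mean(axis=0)
        return np.sqrt(np.sum(diff ** 2) * dt)

    observed = stat(group_a, group_b)
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if stat(pooled[perm[:n_a]], pooled[perm[n_a:]]) >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

To mirror the quoted setup, one would call `landscape` on each of the 100 sampled diagrams, pass the results to `mean_landscape`, and run `permutation_test` with `n_perm=10_000` on the two groups of landscapes. The grid range and resolution are free choices that the table does not specify.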