reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Statistical Analysis and Parameter Selection for Mapper

Authors: Mathieu Carrière, Bertrand Michel, Steve Oudot

JMLR 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we provide few examples of parameter selections and confidence regions (which are unions of squares in the extended persistence diagrams) obtained with bottleneck bootstrap. ... We show in Figure 5 various Mappers ... and 85 percent confidence regions computed on various data sets.
Researcher Affiliation	Academia	Mathieu Carrière (EMAIL), Inria Saclay, 91120 Palaiseau, France; Bertrand Michel (EMAIL), LMJL UMR 6629, Ecole Centrale Nantes, 44322 Nantes, France; Steve Oudot (EMAIL), Inria Saclay, 91120 Palaiseau, France
Pseudocode	No	No pseudocode or algorithm block is explicitly present in the paper. The methodology is described through textual explanations and mathematical formulations.
Open Source Code	Yes	The code we used is available in the Gudhi open source library (see Carrière (2017)).
Open Datasets	Yes	The first data set comes from the Miller-Reaven diabetes study that contains 145 observations of patients suffering or not from diabete. Observations were mapped into R^5 by computing various medical features. Data can be obtained in the locfit R-package. The second data set is an instance of the 16,384-dimensional COIL data set of Nene et al. (1996). We computed the Mapper of an ant shape and a human shape from Chen et al. (2009) embedded in R^3.
Dataset Splits	No	The paper mentions 'N = 100 subsamplings' and 'bootstrapping data 100 times' for statistical analysis and confidence region estimation, but does not provide specific train/test/validation dataset splits for model training or evaluation in the traditional machine learning sense.
Hardware Specification	No	The paper does not provide specific hardware details (such as GPU/CPU models, memory, or processor types) used for running the experiments.
Software Dependencies	No	The paper mentions using 'the Gudhi open source library (see Carrière (2017))' but does not specify a version number for this or any other software dependency.
Experiment Setup	Yes	All δ parameters and resolutions were computed with Equation (9) (the δ parameters were also averaged over N = 100 subsamplings with β = 0.001), and all gains were set to 40%.