reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Differentially Private Topological Data Analysis

Authors: Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the performance of our privatized persistence diagrams through simulations as well as on a real data set tracking human movement.
Researcher Affiliation	Academia	Taegyu Kang EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA Sehwan Kim EMAIL Department of Population Medicine Harvard Medical School/Harvard Pilgrim Health Care Institute Boston, MA, 02215, USA Jinwon Sohn EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA Jordan Awan EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA
Pseudocode	Yes	Algorithm 1 MCMC implementation of the exponential mechanism Input: P0(D), P1(D), and a positive integer M Initialization: P(0) 0,DP, P(0) 1,DP Unif(Pers M) for i = 1, 2, . . . , do P 0 = P(t 1) 0,DP + N(0, σ2I2), P 1 = P(t 1) 1,DP + N(0, σ2I2) P = (P 0, P 1) p = min 0, ϵ 2 u D(P(t 1)) u D(P ) U Unif(0, 1) if log U p then P(t) 0,DP = P 0, P(t) 1,DP = P 1 else P(t) 0,DP = P(t 1) 0,DP , P(t) 1,DP = P(t 1) 1,DP end if end for
Open Source Code	Yes	The R code is available at https://github.com/jwsohn612/DPTDA.
Open Datasets	Yes	Data is provided at http://bertrand.michel.perso.math.cnrs.fr/Enseignements/TDA/Tuto-Part1.html
Dataset Splits	No	The paper describes simulation studies where
Hardware Specification	No	The paper does not provide any specific hardware details used for running its experiments or simulations.
Software Dependencies	No	The paper mentions "The R code is available" but does not specify the version of R or any other software libraries with version numbers.
Experiment Setup	Yes	For the exponential mechanism, the default parameters are ϵ = 1, m = 0.2, n = 4000, and M = 5. These hyper-parameters were chosen by preliminary simulations. To sample from the exponential mechanism, we use a Markov chain Monte Carlo algorithm, speciﬁed in Appendix C.1. We choose the last iterate out of T = 10000 Monte Carlo diagrams as the reporting privatized diagram to be used for analysis . In Section 6, the parameters are: "We set the size of the sampling space M = 5, the resolution of the L1-DTM m = 0.05, the privacy budget ϵ = 1." and "We ... carry out 50000 iterations in the MCMC procedure in our mechanism"