Differentially Private Topological Data Analysis
Authors: Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan
JMLR 2024 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the performance of our privatized persistence diagrams through simulations as well as on a real data set tracking human movement. |
| Researcher Affiliation | Academia | Taegyu Kang EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA Sehwan Kim EMAIL Department of Population Medicine Harvard Medical School/Harvard Pilgrim Health Care Institute Boston, MA, 02215, USA Jinwon Sohn EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA Jordan Awan EMAIL Department of Statistics Purdue University West Lafayette, IN 47907, USA |
| Pseudocode | Yes | Algorithm 1 MCMC implementation of the exponential mechanism Input: P0(D), P1(D), and a positive integer M Initialization: P(0) 0,DP, P(0) 1,DP Unif(Pers M) for i = 1, 2, . . . , do P 0 = P(t 1) 0,DP + N(0, σ2I2), P 1 = P(t 1) 1,DP + N(0, σ2I2) P = (P 0, P 1) p = min 0, ϵ 2 u D(P(t 1)) u D(P ) U Unif(0, 1) if log U p then P(t) 0,DP = P 0, P(t) 1,DP = P 1 else P(t) 0,DP = P(t 1) 0,DP , P(t) 1,DP = P(t 1) 1,DP end if end for |
| Open Source Code | Yes | The R code is available at https://github.com/jwsohn612/DPTDA. |
| Open Datasets | Yes | Data is provided at http://bertrand.michel.perso.math.cnrs.fr/Enseignements/TDA/Tuto-Part1.html |
| Dataset Splits | No | The paper describes simulation studies where |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments or simulations. |
| Software Dependencies | No | The paper mentions "The R code is available" but does not specify the version of R or any other software libraries with version numbers. |
| Experiment Setup | Yes | For the exponential mechanism, the default parameters are ϵ = 1, m = 0.2, n = 4000, and M = 5. These hyper-parameters were chosen by preliminary simulations. To sample from the exponential mechanism, we use a Markov chain Monte Carlo algorithm, specified in Appendix C.1. We choose the last iterate out of T = 10000 Monte Carlo diagrams as the reporting privatized diagram to be used for analysis . In Section 6, the parameters are: "We set the size of the sampling space M = 5, the resolution of the L1-DTM m = 0.05, the privacy budget ϵ = 1." and "We ... carry out 50000 iterations in the MCMC procedure in our mechanism" |