Optimizing Noise Distributions for Differential Privacy
Authors: Atefeh Gilani, Juan Felipe Gomez, Shahab Asoodeh, Flavio Calmon, Oliver Kosut, Lalitha Sankar
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results demonstrate that our optimized distributions are consistently better, with significant improvements in (ε, δ)-DP guarantees in the moderate composition regimes, compared to Gaussian and Laplace distributions with the same variance. In this section, we present numerical results evaluating the privacy characteristics of our proposed RDP noise. |
| Researcher Affiliation | Academia | 1School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, USA 2Department of Physics, Harvard University, Cambridge, MA, USA 3Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada 4School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA. Correspondence to: Atefeh Gilani <EMAIL>. |
| Pseudocode | Yes | We propose an algorithm for optimizing noise distributions under (ε, δ)-DP, outlined in Algorithm 1. The paper presents three algorithms: Algorithm 1 (Optimal Noise Distribution for (ε, δ)-DP), Algorithm 2 (Initialize Distribution), and Algorithm 3 (Optimization Algorithm). |
| Open Source Code | Yes | Our implementation is available on GitHub (git, 2025). Rényi DP mechanism design. https://github.com/SankarLab/Renyi-DP-Mechanism-Design, May 2025. |
| Open Datasets | Yes | Breast Cancer Wisconsin (Diagnostic) (Wolberg et al., 1993), Diabetes (scikit-learn developers), and the UCI Heart Disease dataset (Cleveland subset) (Janosi et al., 1988). |
| Dataset Splits | No | To obtain the results in Table 1, we assume that the 5th and 95th percentiles of each feature are privately released. These quantiles are used to rescale each feature by subtracting the 5th percentile and dividing by the difference between the 95th and 5th percentiles. This transformation maps most feature values to the [0, 1] range, and any values outside this range are clipped to 0 or 1. This normalization step ensures that all features are on a consistent scale before noise is added. For each query, we generate 100,000 differentially private outputs by adding noise according to the selected mechanism and compute the mean squared error (MSE) with respect to the true (non-private) value. We average the MSEs across 10 queries, repeat the process for 20 random seeds, and report the mean and standard deviation of the improvement over Gaussian noise. The paper does not provide specific train/test/validation dataset splits for the mentioned datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions its implementation is available on Git Hub but does not specify any software dependencies with version numbers (e.g., programming language, libraries, or frameworks with their versions) in the main text. |
| Experiment Setup | Yes | Algorithm 3 (Optimization Algorithm) inputs: privacy parameter δ, number of compositions Nc, noise scale σ, query sensitivity s, total number of iterations K, initial distribution p0, distribution parameters N, r, , Rényi parameter (α) update time step T, and type (discrete or continuous). The noise parameters are = 0.01, r = 0.9999, and N = 8000 for the left plot, and N = 4000 for the right. The noise parameters are = 0.005, N = 1600, and r = 0.9999. |
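The normalization and MSE-evaluation protocol quoted in the Dataset Splits row can be sketched as below. This is a minimal illustration, not the authors' code: the feature values, the noise scale `sigma`, and the function names are hypothetical, and Gaussian noise stands in for whichever mechanism is selected.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(feature, q_low, q_high):
    """Rescale a feature using the (privately released) 5th and 95th
    percentiles, then clip out-of-range values to [0, 1]."""
    scaled = (feature - q_low) / (q_high - q_low)
    return np.clip(scaled, 0.0, 1.0)

def dp_mean_mse(values, sigma, n_outputs=100_000):
    """Release the mean query n_outputs times with additive Gaussian
    noise of scale sigma, and return the MSE against the true
    (non-private) mean. Calibration of sigma to (eps, delta) is
    omitted here for brevity."""
    true_mean = values.mean()
    private = true_mean + rng.normal(0.0, sigma, size=n_outputs)
    return np.mean((private - true_mean) ** 2)

# Toy feature column standing in for a dataset feature (hypothetical).
feature = rng.normal(50.0, 10.0, size=500)
q5, q95 = np.percentile(feature, [5, 95])
x = normalize(feature, q5, q95)
print(dp_mean_mse(x, sigma=0.1))
```

Averaging this MSE across 10 queries and 20 seeds, for both the optimized and the Gaussian mechanism at equal variance, reproduces the comparison the row describes.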
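The "(ε, δ)-DP guarantees in the moderate composition regimes" that the optimized distributions are compared against come from the Gaussian baseline. A minimal sketch of the standard RDP-to-(ε, δ)-DP conversion for that baseline follows; the Gaussian RDP expression α·s²/(2σ²) and the conversion ε = ρ(α) + log(1/δ)/(α − 1) are textbook results (Mironov, 2017), not the paper's optimized mechanism, and the parameter values below are illustrative.

```python
import numpy as np

def gaussian_rdp(alpha, sigma, s=1.0):
    """Renyi DP of the Gaussian mechanism at order alpha:
    rho(alpha) = alpha * s^2 / (2 * sigma^2), for sensitivity s."""
    return alpha * s**2 / (2.0 * sigma**2)

def rdp_to_dp(delta, sigma, n_comp, s=1.0, alphas=None):
    """Compose n_comp Gaussian releases in RDP (rho values add), then
    convert to an (eps, delta)-DP bound by minimizing
    eps = rho(alpha) + log(1/delta) / (alpha - 1) over alpha > 1."""
    if alphas is None:
        alphas = np.linspace(1.01, 200.0, 2000)
    rho = n_comp * gaussian_rdp(alphas, sigma, s)
    eps = rho + np.log(1.0 / delta) / (alphas - 1.0)
    return eps.min()

# Illustrative setting: 10 compositions, sigma = 5, delta = 1e-5.
print(rdp_to_dp(delta=1e-5, sigma=5.0, n_comp=10))
```

Sweeping `n_comp` with this conversion gives the Gaussian curve against which the optimized distributions' (ε, δ) guarantees are plotted in the paper's moderate-composition regime.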