Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness

Authors: Eli Chien, Pan Li

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide positive answers by proving a convergent Rényi DP bound for non-convex non-smooth losses, where we show that requiring losses to have Hölder continuous gradient is sufficient. We also provide a strictly better privacy bound compared to state-of-the-art results for smooth strongly convex losses. Our analysis relies on the improvement of shifted divergence analysis in multiple aspects, including forward Wasserstein distance tracking, identifying the optimal shifts allocation, and the Hölder reduction lemma. Our results further elucidate the benefit of hidden-state analysis for DP and its applicability. ... Figure 1: (a) Our RDP guarantees for smooth losses over the bounded domain, where the noise variance is the same for all lines. ... (b) The detailed comparison of our privacy bound with Altschuler & Talwar (2022); Ye & Shokri (2022) for smooth strongly convex losses. The setting is the same as (a). We relegate the detailed setting to Appendix A.13. ... Figure 2: (a) Our RDP bound for non-smooth losses with (L, λ)-Hölder continuous gradient, where we empirically estimate the Hölder continuity constant L of a 2-layer Multi-Layer Perceptron (MLP) for each λ. See Appendix A.13 for the detailed setting. ... A.13 NUMERICAL EVALUATIONS
Researcher Affiliation | Academia | Eli Chien & Pan Li, Department of Electrical and Computer Engineering, Georgia Institute of Technology, Georgia, U.S.A. EMAIL
Pseudocode | No | The paper describes algorithms and methods using mathematical notation and textual descriptions, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available.
Open Datasets | Yes | For the experiment shown in Figure 2 (a), our model is a 2-layer MLP with a hidden dimension of 64. The task is classification on the UCI Iris dataset (Fisher, 1988) as a toy example.
Dataset Splits | No | The paper mentions using the UCI Iris dataset and sampling points for Hölder constant estimation, but it does not provide specific details on training/test/validation splits for the model's classification task.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments.
Software Dependencies | No | The paper mentions the 'scipy.optimize.minimize function in scipy library' but does not specify version numbers for Python, SciPy, or any other key software libraries used in the experiments.
Experiment Setup | Yes | We set the learning rate to 0.1, the training epochs to 200, and the gradient clipping norm to 1.0, and use cross-entropy loss with an additional 0.1 multiplicative factor.
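The paper reports empirically estimating the Hölder continuity constant L of the MLP's gradient for each exponent λ (Figure 2a), using `scipy.optimize.minimize`. The exact procedure is not described in this report; the sketch below is an illustrative assumption, not the authors' method: it estimates L by maximizing the Hölder ratio ‖∇f(x) − ∇f(y)‖ / ‖x − y‖^λ over randomly sampled point pairs. The function and parameter names are hypothetical.

```python
import numpy as np

def estimate_holder_constant(grad_fn, dim, lam, n_pairs=1000, scale=1.0, seed=0):
    """Estimate the (L, lam)-Holder constant of grad_fn by sampling random
    point pairs and taking the largest Holder ratio observed.
    NOTE: this sampling scheme is an illustrative assumption, not the
    paper's exact scipy.optimize.minimize-based procedure."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_pairs):
        x = scale * rng.normal(size=dim)
        y = scale * rng.normal(size=dim)
        dist = np.linalg.norm(x - y)
        if dist < 1e-12:
            continue  # skip degenerate pairs
        ratio = np.linalg.norm(grad_fn(x) - grad_fn(y)) / dist**lam
        best = max(best, ratio)
    return best

# Sanity check: f(x) = ||x||^2 has gradient 2x, which is (2, 1)-Holder
# (i.e. 2-Lipschitz), so the estimate should be 2 for lam = 1.
L_hat = estimate_holder_constant(lambda x: 2 * x, dim=5, lam=1.0)
```

For a trained MLP, `grad_fn` would be the per-parameter loss gradient and `scale` would be chosen to cover the region the iterates visit; the estimate is a lower bound on the true constant, since it only maximizes over sampled pairs.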
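The reported hyperparameters (learning rate 0.1, 200 epochs, clipping norm 1.0, cross-entropy scaled by 0.1) can be assembled into a noisy-SGD loop of the kind the paper analyzes. The sketch below is assumption-laden, not the authors' code: it substitutes synthetic 4-feature, 3-class data for UCI Iris, uses a plain NumPy 2-layer MLP with hidden dimension 64, and picks an illustrative noise scale `sigma` (the paper's noise variance is not stated in this report).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 4-feature, 3-class data standing in for UCI Iris (an assumption,
# made so the sketch is self-contained); labels depend on feature 0.
X = rng.normal(size=(120, 4))
y = np.digitize(X[:, 0], [-0.5, 0.5])

# 2-layer MLP with hidden dimension 64, as reported.
W1 = 0.1 * rng.normal(size=(4, 64)); b1 = np.zeros(64)
W2 = 0.1 * rng.normal(size=(64, 3)); b2 = np.zeros(3)
params = [W1, b1, W2, b2]

def loss_and_grads(x, t):
    """Cross-entropy with the reported 0.1 multiplicative factor, plus its
    gradients with respect to all four parameter tensors."""
    h = np.maximum(x @ W1 + b1, 0.0)                # ReLU hidden layer
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max()); p /= p.sum() # stable softmax
    loss = -0.1 * np.log(p[t] + 1e-12)
    dlogits = 0.1 * p; dlogits[t] -= 0.1
    gW2, gb2 = np.outer(h, dlogits), dlogits
    dh = (W2 @ dlogits) * (h > 0)
    gW1, gb1 = np.outer(x, dh), dh
    return loss, [gW1, gb1, gW2, gb2]

def mean_loss():
    return float(np.mean([loss_and_grads(xi, ti)[0] for xi, ti in zip(X, y)]))

lr, epochs, clip, sigma = 0.1, 200, 1.0, 1e-3       # sigma is illustrative

loss_before = mean_loss()
for _ in range(epochs):
    for i in rng.permutation(len(X)):
        _, grads = loss_and_grads(X[i], y[i])
        norm = np.sqrt(sum((g ** 2).sum() for g in grads))
        factor = min(1.0, clip / (norm + 1e-12))    # clip gradient norm to 1.0
        for p_, g in zip(params, grads):            # noisy clipped SGD step
            p_ -= lr * (factor * g + sigma * rng.normal(size=p_.shape))
loss_after = mean_loss()
```

The hidden-state analyses discussed in the paper additionally assume a projection onto a bounded domain after each step, which is omitted here for brevity.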