Private Regression via Data-Dependent Sufficient Statistic Perturbation

Authors: Cecilia Ferrando, Daniel Sheldon

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show experimentally that DD-SSP outperforms the state-of-the-art data-independent SSP method AdaSSP for linear regression, and that for logistic regression tasks DD-SSP achieves better results than the widely used objective perturbation baseline. We also compare DD-SSP with DP-SGD (Abadi et al., 2016), known to achieve excellent performance when hyperparameters are properly fine-tuned. Our results show that the proposed method is competitive with DP-SGD when the privacy cost of hyperparameter tuning is taken into account. In our experiments, we evaluate the effectiveness of DD-SSP on both linear and logistic regression tasks. Figure 3 shows that DD-SSP and AIM-Synth have nearly identical performance and both improve significantly upon AdaSSP on all datasets except ACSIncome, where performance is similar.
Researcher Affiliation | Academia | Cecilia Ferrando (EMAIL), Manning College of Information and Computer Sciences, University of Massachusetts Amherst; Daniel Sheldon (EMAIL), Manning College of Information and Computer Sciences, University of Massachusetts Amherst
Pseudocode | Yes | Algorithm 1 (DD-SSP) outlines how to retrieve the approximate sufficient statistics X^T X and X^T y from marginals privately estimated by AIM. Algorithm 5 (AdaSSP) outlines the AdaSSP method for linear regression (Wang, 2018). Algorithm 6: Generalized Objective Perturbation Mechanism (ObjPert) (Kifer et al., 2012). Algorithm 2: AIM (McKenna et al., 2022). Algorithm 3: Initialize p_t (subroutine of Algorithm 2). Algorithm 4: Budget Annealing (subroutine of Algorithm 2).
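The core idea behind Algorithm 1, recovering entries of X^T X (and analogously X^T y) as expectations under privately estimated pairwise marginals of discretized features, can be illustrated with a minimal sketch. The helper name and the discrete-feature setup are our assumptions for illustration, not the paper's code:

```python
import numpy as np

def second_moment_from_marginal(marginal, vals_i, vals_j):
    """Estimate sum_n x_ni * x_nj from a (privately estimated) pairwise
    marginal, where marginal[a, b] is the estimated count of records
    with (x_i, x_j) = (vals_i[a], vals_j[b]).

    Each entry of X^T X is a weighted sum of marginal cells, so noisy
    marginals released by AIM directly yield noisy sufficient statistics.
    """
    return float(vals_i @ marginal @ vals_j)
```

With exact (noise-free) marginals this recovers the entry of X^T X exactly; with AIM's private marginals it yields the data-dependent approximation used by DD-SSP.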
Open Source Code | Yes | All experiment code is available at https://github.com/ceciliaferrando/DD-SSP.
Open Datasets | Yes | We use the following datasets: Adult (Becker and Kohavi, 1996): the target variable is num-education (number of education years) for linear regression and income>50K for logistic regression. Fire (Ridgeway et al., 2021): the target variable is Priority (of the call). Taxi (Grégoire et al., 2021): the target variable is totalamount (total fare amount). ACS datasets (Ding et al., 2021): data is queried for California (2018) and includes binary classification tasks for PINCP (income above $50k), MIG (mobility), ESR (employment), and PUBCOV (public coverage). ACSIncome is also used for linear regression with the target variable PINCP (income) discretized into 20 bins. The ACS data is sourced from https://github.com/socialfoundations/folktables; all other datasets are sourced from https://github.com/ryan112358/hd-datasets.
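The 20-bin discretization of PINCP for the linear-regression variant of ACSIncome could be implemented along these lines; the paper does not state the binning scheme, so the quantile-based choice and function name here are assumptions:

```python
import numpy as np

def discretize_target(y, n_bins=20):
    """Discretize a continuous target (e.g., PINCP income) into n_bins
    quantile bins, returning integer bin labels in [0, n_bins - 1]."""
    edges = np.quantile(y, np.linspace(0, 1, n_bins + 1))
    labels = np.searchsorted(edges, y, side="right") - 1
    return np.clip(labels, 0, n_bins - 1)
```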
Dataset Splits | Yes | Data is shuffled and split into 1,000 test points and up to 50,000 training points.
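The split described above can be sketched as follows; the function name, seed handling, and NumPy-array interface are our assumptions:

```python
import numpy as np

def train_test_split_dp(X, y, n_test=1000, n_train_max=50_000, seed=0):
    """Shuffle the data, hold out n_test test points, and cap the
    training set at n_train_max points, as described in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    test_idx = idx[:n_test]
    train_idx = idx[n_test:n_test + n_train_max]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```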
Hardware Specification | Yes | All experiments were conducted on an internal cluster equipped with Xeon Gold 6240 CPUs @ 2.60GHz, 192GB RAM, and 240GB of local SSD storage.
Software Dependencies | No | The paper mentions several algorithms and methods, such as AIM (McKenna et al., 2022) and DP-SGD (Abadi et al., 2016), but does not specify the versions of the software or libraries used to implement them (e.g., Python, PyTorch, TensorFlow, or scikit-learn with version numbers).
Experiment Setup | Yes | DP-SGD's hyperparameters are fine-tuned via a grid search over the following values: batch size: [n, 1024, 256]; gradient clipping norm: [0.01, 0.1, 0.2]; number of epochs: [1, 10, 20]; learning rate: [0.001, 0.01, 0.1, 1.0]. AIM training: AIM is trained with a model size of 200MB, a maximum of 1,000 iterations, and a workload of all pairwise marginals. We compare the Mean Squared Error (MSE) of the DP query-based methods DD-SSP and AIM-Synth against the DP baseline AdaSSP and the public baseline for ϵ ∈ {0.05, 0.1, 0.5, 1.0, 2.0}, with fixed δ = 10⁻⁵.
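The DP-SGD search space above can be enumerated with a short sketch; this illustrates the size of the grid (relevant because tuning on private data itself consumes privacy budget), and is not the authors' tuning code:

```python
from itertools import product

n = 50_000  # hypothetical training-set size (full-batch option)
batch_sizes = [n, 1024, 256]
clip_norms = [0.01, 0.1, 0.2]
epochs = [1, 10, 20]
learning_rates = [0.001, 0.01, 0.1, 1.0]

# 3 * 3 * 3 * 4 = 108 candidate configurations, each of which adds to
# the privacy cost if hyperparameters are tuned on the private data.
grid = list(product(batch_sizes, clip_norms, epochs, learning_rates))
```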