Noise-Aware Differentially Private Regression via Meta-Learning
Authors: Ossi Räisä, Stratis Markou, Matthew Ashman, Wessel Bruinsma, Marlon Tobaben, Antti Honkela, Richard Turner
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on synthetic data and a sim-to-real task with real data. We provide the exact experimental details in Appendix E. |
| Researcher Affiliation | Collaboration | Ossi Räisä (University of Helsinki), Stratis Markou (University of Cambridge), Matthew Ashman (University of Cambridge), Wessel P. Bruinsma (Microsoft Research AI for Science), Marlon Tobaben (University of Helsinki), Antti Honkela (University of Helsinki), Richard E. Turner (University of Cambridge) |
| Pseudocode | Yes | Algorithm 1: Meta-training a neural process. Algorithm 2: Meta-testing a neural process. Algorithm 3: DPSetConv; modifications to the original SetConv layer shown in blue. Algorithm 4: Efficient sampling of GP noise on a D-dimensional grid. |
| Open Source Code | Yes | We make our implementation of the DPConvCNP public in the repository https://github.com/cambridge-mlg/dpconvcnp. |
| Open Datasets | Yes | We evaluated the performance of the DPConvCNP in a sim-to-real task, where we train the model on simulated data and test it on the Dobe !Kung dataset [Howell, 2009], also used by Smith et al. [2018], containing age, weight and height measurements of 544 individuals. The Dobe !Kung dataset is publicly available in TensorFlow 2 [Abadi et al., 2016], specifically the TensorFlow Datasets package. |
| Dataset Splits | Yes | Throughout optimisation, we maintain a fixed set of 2,048 tasks generated in the same way, as a validation set. |
| Hardware Specification | Yes | We train the DPConvCNP on a single NVIDIA GeForce RTX 2080 Ti GPU, on a machine with 20 CPU workers. |
| Software Dependencies | No | For all our experiments with the DPConvCNP we use Adam with a learning rate of 3 × 10⁻⁴, setting all other options to the default TensorFlow 2 settings. We use Optuna [Akiba et al., 2019] to perform the BO, and Opacus [Yousefpour et al., 2021] to perform DP-SGD using the PRV privacy accountant. |
| Experiment Setup | Yes | For all our experiments with the DPConvCNP we use Adam with a learning rate of 3 × 10⁻⁴, setting all other options to the default TensorFlow 2 settings. For the DPConvCNP we use 6,553,600 such tasks with a batch size of 16 at training time, which is equivalent to 409,600 gradient update steps. For all our experiments, we initialise the DPSetConv and SetConv lengthscales (which are also used to sample the DP noise) to λ = 0.20, and allow this parameter to be optimised during training. |
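The table above mentions the DPSetConv layer (Algorithm 3) and its lengthscale λ = 0.20. To make the mechanism concrete, here is a minimal NumPy sketch of the general idea behind a DP set convolution: each context point contributes an RBF-weighted mass to a density channel and a clipped value to a data channel on a grid, and Gaussian noise is then added so that no single point dominates. All function names, the clipping bound, and the noise scale are illustrative assumptions, not the paper's actual calibration, which ties the noise to the (ε, δ) budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_set_conv(x_ctx, y_ctx, x_grid, lengthscale=0.2, clip=2.0, noise_std=0.5):
    """Illustrative sketch of a DP set convolution (NOT the paper's exact
    Algorithm 3): RBF-weighted density and data channels on a grid, with
    value clipping and additive Gaussian noise on both channels."""
    # RBF weights between each context point and each grid point
    w = np.exp(-0.5 * (x_grid[None, :] - x_ctx[:, None]) ** 2 / lengthscale ** 2)
    y_clipped = np.clip(y_ctx, -clip, clip)        # bound per-point influence
    density = w.sum(axis=0)                        # density channel
    signal = (y_clipped[:, None] * w).sum(axis=0)  # data channel
    # Gaussian noise masks any single point's contribution
    # (noise_std here is a placeholder; a real DP mechanism calibrates
    # it to the sensitivity and the privacy budget)
    density += noise_std * rng.standard_normal(density.shape)
    signal += noise_std * rng.standard_normal(signal.shape)
    return density, signal

# Toy 1-D regression context set
x_ctx = rng.uniform(-2.0, 2.0, size=16)
y_ctx = np.sin(x_ctx) + 0.1 * rng.standard_normal(16)
x_grid = np.linspace(-2.0, 2.0, 64)

density, signal = dp_set_conv(x_ctx, y_ctx, x_grid)
```

In the paper's pipeline these two noisy channels would then be fed to a CNN decoder; here the sketch stops at the noised grid representation.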