reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mean Estimation with User-level Privacy under Data Heterogeneity

Authors: Rachel Cummings, Vitaly Feldman, Audra McMillan, Kunal Talwar

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We demonstrate asymptotic optimality of our estimator and also prove general lower bounds on the error achievable in our problem.
Researcher Affiliation	Collaboration	Rachel Cummings, Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027; Vitaly Feldman, Apple, Cupertino, CA 95014; Audra McMillan, Apple, Cupertino, CA 95014; Kunal Talwar, Apple, Cupertino, CA 95014
Pseudocode	Yes	Algorithm 1: Non-private Heterogeneous Mean Estimation. Algorithm 2: Private Heterogeneous Mean Estimation.
Open Source Code	No	The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository.
Open Datasets	No	The paper is theoretical and does not describe experiments performed on a specific, publicly available dataset. It discusses data in a general, abstract sense (e.g., 'user data is heterogeneous', 'each user generates multiple data points').
Dataset Splits	No	The paper is theoretical and does not describe experiments that would involve dataset splits for training, validation, or testing.
Hardware Specification	No	The paper is theoretical and does not describe any computational experiments or their hardware specifications.
Software Dependencies	No	The paper is theoretical and does not describe any computational experiments that would require specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not describe any computational experiments or their setup details, such as hyperparameters or training configurations.