Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mean Estimation with User-level Privacy under Data Heterogeneity

Authors: Rachel Cummings, Vitaly Feldman, Audra McMillan, Kunal Talwar

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We demonstrate asymptotic optimality of our estimator and also prove general lower bounds on the error achievable in our problem.
Researcher Affiliation | Collaboration | Rachel Cummings, Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027, EMAIL; Vitaly Feldman, Apple, Cupertino, CA 95014; Audra McMillan, Apple, Cupertino, CA 95014, EMAIL; Kunal Talwar, Apple, Cupertino, CA 95014, EMAIL
Pseudocode | Yes | Algorithm 1 Non-private Heterogeneous Mean Estimation. Algorithm 2 Private Heterogeneous Mean Estimation.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository.
Open Datasets | No | The paper is theoretical and does not describe experiments performed on a specific, publicly available dataset. It discusses data in a general, abstract sense (e.g., 'user data is heterogeneous', 'each user generates multiple data points').
Dataset Splits | No | The paper is theoretical and does not describe experiments that would involve dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not describe any computational experiments or their hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe any computational experiments that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any computational experiments or their setup details, such as hyperparameters or training configurations.