Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Mean Estimation with User-level Privacy under Data Heterogeneity
Authors: Rachel Cummings, Vitaly Feldman, Audra McMillan, Kunal Talwar
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We demonstrate asymptotic optimality of our estimator and also prove general lower bounds on the error achievable in our problem. |
| Researcher Affiliation | Collaboration | Rachel Cummings, Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027; Vitaly Feldman, Apple, Cupertino, CA 95014; Audra McMillan, Apple, Cupertino, CA 95014; Kunal Talwar, Apple, Cupertino, CA 95014 |
| Pseudocode | Yes | Algorithm 1: Non-private Heterogeneous Mean Estimation. Algorithm 2: Private Heterogeneous Mean Estimation. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments performed on a specific, publicly available dataset. It discusses data in a general, abstract sense (e.g., 'user data is heterogeneous', 'each user generates multiple data points'). |
| Dataset Splits | No | The paper is theoretical and does not describe experiments that would involve dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments or their hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any computational experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any computational experiments or their setup details, such as hyperparameters or training configurations. |