Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal
Authors: Deheng Yuan, Tao Guo, Zhongyi Huang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Under certain regularity assumptions, we characterize the minimax optimal rates for all regimes, and identify phase transitions of the optimal rates as the samples per terminal vary from sparse to dense. This fully solves the problem left open by previous works, whose scopes are limited to regimes with either dense samples or a single sample per terminal. To achieve the optimal rates, we design a layered estimation protocol by exploiting protocols for the parametric density estimation problem. We show the optimality of the protocol using information-theoretic methods and strong data processing inequalities, and incorporating the classic balls and bins model. The optimal rates are immediate for various special cases such as density estimation, Gaussian, binary, Poisson and heteroskedastic regression models. To establish our results, we need to prove both the upper and lower bounds for the minimax rate. |
| Researcher Affiliation | Academia | ¹Department of Mathematical Sciences, Tsinghua University, Beijing, China. ²School of Cyber Science and Engineering, Southeast University, Nanjing, China. |
| Pseudocode | No | The paper describes a 'layered estimation protocol' in Section 4.1, but it is presented in descriptive text rather than structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to code repositories. |
| Open Datasets | No | The paper treats density estimation and Gaussian, binary, Poisson and heteroskedastic regression models as special cases of its theoretical framework. These are sample-generating models used for theoretical analysis, not open datasets used in experiments. No concrete access information for any dataset is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets, so no dataset split information is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or specific hardware used for computations. |
| Software Dependencies | No | The paper is theoretical and does not describe any software implementations or list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical derivations and proofs of optimal rates for nonparametric estimation, and therefore does not provide experimental setup details or hyperparameters. |