Testing with Non-identically Distributed Samples
Authors: Shivam Garg, Chirag Pabbaraju, Kirankumar Shiragur, Gregory Valiant
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing framework: Suppose there is a set of distributions over a discrete support of size k, p_1, p_2, ..., p_T, and we obtain c independent draws from each distribution. Suppose the goal is to learn or test a property of the average distribution, p_avg. ... Our first main result is that with just c = 2 samples from each distribution, we recover the full strength of sublinear sample uniformity (and identity) testing. ... The main part of the proof constitutes bounding the variance of the estimator that we construct in order to show that it concentrates around its mean. |
| Researcher Affiliation | Collaboration | Shivam Garg (Microsoft Research); Chirag Pabbaraju (Stanford University); Kirankumar Shiragur (Microsoft Research); Gregory Valiant (Stanford University) |
| Pseudocode | No | The paper describes algorithms for uniformity testing and identity testing using mathematical formulas and prose, such as in Section 3 'Uniformity testing from non-identical samples' and Section 4 'Identity testing from non-identical samples', but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to code repositories or supplementary materials containing code. |
| Open Datasets | No | The paper is theoretical and analyzes properties of distributions. It does not use or refer to any specific datasets, public or otherwise, for empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets. Therefore, there are no mentions of training, testing, or validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithms for property testing. It does not describe any computational experiments or the hardware used to perform them. |
| Software Dependencies | No | The paper is theoretical and does not describe any computational implementations or experiments. Consequently, no specific software dependencies or their version numbers are mentioned. |
| Experiment Setup | No | The paper is purely theoretical, presenting mathematical analysis, proofs, and algorithm design for property testing. It does not include details on experimental setup, hyperparameters, or training configurations. |
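
The sampling framework quoted under Research Type (T distributions, c draws from each, a property of p_avg) can be made concrete with a small sketch. The code below is an illustration only and is not the estimator constructed in the paper: it computes a simple unbiased estimate of the squared L2 norm of p_avg, which equals 1/k exactly when p_avg is uniform over a support of size k. The function name `avg_collision_estimate` and the input layout (a list of T lists, each holding the same number c >= 2 of draws) are assumptions made for this example.

```python
from collections import Counter
from math import comb


def avg_collision_estimate(samples):
    """Unbiased estimate of ||p_avg||_2^2 from non-identical samples.

    Hypothetical helper for illustration; not the paper's estimator.
    samples: list of T lists, each holding c >= 2 draws from one distribution
             (all lists are assumed to have the same length c).
    """
    T = len(samples)
    c = len(samples[0])
    total = Counter()   # pooled symbol counts over all T distributions
    within = 0.0        # estimates sum_i <p_i, p_i> from same-distribution pairs
    sum_sq = 0          # sum over i and symbols x of n_i(x)^2
    for draws in samples:
        counts = Counter(draws)
        total.update(counts)
        # Fraction of colliding pairs among the c draws estimates <p_i, p_i>.
        within += sum(comb(n, 2) for n in counts.values()) / comb(c, 2)
        sum_sq += sum(n * n for n in counts.values())
    # Collisions between draws from different distributions estimate
    # sum over i != j of <p_i, p_j>; each ordered pair (i, j) has c * c draw pairs.
    cross = (sum(n * n for n in total.values()) - sum_sq) / (c * c)
    return (within + cross) / (T * T)


# Two distributions, c = 2 draws each, concentrated on different symbols:
# the average distribution puts mass 1/2 on each symbol.
estimate = avg_collision_estimate([[0, 0], [1, 1]])
```

The within-distribution term is the reason at least two draws per distribution are needed in this sketch: with c = 1 there are no same-distribution pairs, so the terms <p_i, p_i> cannot be estimated this way.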