On Learning Parallel Pancakes with Mostly Uniform Weights
Authors: Ilias Diakonikolas, Daniel Kane, Sushrut Karmalkar, Jasper C.H. Lee, Thanasis Pittas
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary. This work is theoretical in nature and focuses on advancing fundamental knowledge. |
| Researcher Affiliation | Collaboration | 1 Department of Computer Sciences, University of Wisconsin–Madison, Madison, United States; 2 University of California San Diego, San Diego, United States; 3 Microsoft Research, Cambridge, England; 4 Some of this work was done while the author was a postdoctoral researcher at UW–Madison; 5 University of California Davis, Davis, United States. Correspondence to: Thanasis Pittas <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Testing Algorithm (simplified). Input: k, n; Output: Ĥ ∈ {H0, H1}. 1: For m = 1, 2, 3, …, C(log(k) + k) do: 2: draw x_1, …, x_n ~ D; 3: define M̂ := (1/n) Σ_{j=1}^n x_j^{⊗m}; 4: define M := E_{x ~ N(0,I)}[x^{⊗m}]; 5: if there exists a = (j_1, …, j_m) such that \|M̂_a − M_a\| > d^{Cm} λ^m, output H1 and terminate. 6: Return H0. The tester above is a simplified version. It is not fully correct as stated, because we must ensure concentration of the empirical tensor in order to bound the sample complexity. The empirical tensor of the Gaussian N(0, I) is well-concentrated. While the parallel pancakes' tensor might not concentrate well, this happens only when some Gaussian component is centered much farther than √d from the origin; otherwise, every sample from the parallel pancakes has norm O(√d) with high probability, and the empirical tensor is entrywise well-concentrated (e.g., by Hoeffding). This is also a testable condition: with log(k)/w_min samples, we can check whether every component is centered at distance at most O(√d) from the origin. The full version of the algorithm with this additional preliminary check, along with its correctness proof, is provided in Appendix D.1. |
| Open Source Code | No | The paper does not provide any statement about code availability or links to source code repositories. |
| Open Datasets | No | The paper is theoretical and does not describe experiments using specific datasets, thus no information on open datasets is provided. |
| Dataset Splits | No | The paper does not describe any experiments using specific datasets, therefore, it does not provide information on dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments, therefore, it does not provide hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software implementations or dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experiments, therefore, it does not provide details on experimental setup or hyperparameters. |
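The simplified tester quoted in the Pseudocode row compares empirical moment tensors against those of the standard Gaussian, entry by entry, and rejects on a large deviation. The sketch below illustrates only that moment-comparison idea; the function names, the fixed deviation threshold, and the use of Wick's formula for the moments of N(0, I) are illustrative assumptions, not the paper's actual algorithm or thresholds.

```python
import numpy as np
from itertools import product

def gaussian_moment(index):
    """E[x_{j1}*...*x_{jm}] for x ~ N(0, I), via Wick's formula.

    The moment is nonzero only when every coordinate appears an even
    number of times; it then factors into a double factorial (c-1)!!
    per coordinate appearing c times.
    """
    counts = {}
    for j in index:
        counts[j] = counts.get(j, 0) + 1
    val = 1.0
    for c in counts.values():
        if c % 2 == 1:
            return 0.0
        df = 1
        for k in range(c - 1, 0, -2):  # (c-1)!! = (c-1)(c-3)...1
            df *= k
        val *= df
    return val

def moment_tester(samples, max_order, threshold):
    """Entrywise moment comparison against N(0, I).

    For each order m up to max_order and each index a = (j1,...,jm),
    compare the empirical moment with the Gaussian one; return "H1"
    on a deviation above the (illustrative, fixed) threshold.
    """
    n, d = samples.shape
    for m in range(1, max_order + 1):
        for index in product(range(d), repeat=m):
            emp = float(np.mean(np.prod(samples[:, list(index)], axis=1)))
            if abs(emp - gaussian_moment(index)) > threshold:
                return "H1"
    return "H0"

rng = np.random.default_rng(0)
print(moment_tester(rng.standard_normal((20000, 3)), 3, 0.1))
```

Note that this brute-force enumeration of index tuples costs d^m per order and uses a crude fixed threshold, whereas the paper's analysis calibrates the deviation bound (the d^{Cm} λ^m term) and adds the preliminary check that all component means lie within O(√d) of the origin.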