reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Statistical Query Lower Bounds for Tensor PCA

Authors: Rishabh Dudeja, Daniel Hsu

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We provide a sharp analysis of the optimal sample complexity in the Statistical Query (SQ) model and show that SQ algorithms with polynomial query complexity not only fail to solve Tensor PCA in the conjectured hard phase, but also have a strictly sub-optimal sample complexity compared to some polynomial time estimators such as the Richard-Montanari spectral estimator. Our analysis reveals that the optimal sample complexity in the SQ model depends on whether ET1 is symmetric or not. For symmetric, even order tensors, we also isolate a sample size regime in which it is possible to test if ET1 = 0 or ET1 = 0 with polynomially many queries but not estimate ET1. Our proofs rely on the Fourier analytic approach of Feldman et al. (2018) to prove sharp SQ lower bounds.
Researcher Affiliation	Academia	Rishabh Dudeja EMAIL Department of Statistics Columbia University New York, NY 10027, USA Daniel Hsu EMAIL Department of Computer Science Columbia University New York, NY 10027, USA
Pseudocode	No	The paper describes "optimal SQ procedures" in Section 3 and Appendix E, but it does not present them in a structured pseudocode block or explicitly label any section as "Pseudocode" or "Algorithm". The description is in paragraph form.
Open Source Code	No	The paper does not contain any statement about releasing source code for the described methodology, nor does it provide any links to a code repository.
Open Datasets	No	In the Tensor PCA problem introduced by Richard and Montanari (2014), one is given a dataset consisting of n samples T1:n of i.i.d. Gaussian tensors of order k with the promise that ET1 is a rank-1 tensor and ET1 = 1. This describes the theoretical problem setup, not an actual dataset used for experiments. The paper is theoretical and does not conduct experiments on datasets.
Dataset Splits	No	The paper is theoretical and describes a problem setup involving hypothetical 'n samples' rather than actual datasets. Therefore, it does not provide details on training/test/validation dataset splits.
Hardware Specification	No	The paper is theoretical and focuses on mathematical analysis and lower bounds. It does not describe any experimental procedures that would require specific hardware specifications.
Software Dependencies	No	The paper is theoretical and focuses on mathematical proofs and analysis. It does not describe any experimental implementations that would require specific software dependencies or version numbers.
Experiment Setup	No	The paper is theoretical and focuses on deriving lower bounds and optimal SQ procedures. It does not include any empirical experiments or mention specific hyperparameters, training schedules, or other experimental setup details.