Statistical Query Lower Bounds for Tensor PCA
Authors: Rishabh Dudeja, Daniel Hsu
JMLR 2021 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide a sharp analysis of the optimal sample complexity in the Statistical Query (SQ) model and show that SQ algorithms with polynomial query complexity not only fail to solve Tensor PCA in the conjectured hard phase, but also have a strictly sub-optimal sample complexity compared to some polynomial time estimators such as the Richard-Montanari spectral estimator. Our analysis reveals that the optimal sample complexity in the SQ model depends on whether ET1 is symmetric or not. For symmetric, even order tensors, we also isolate a sample size regime in which it is possible to test if ET1 = 0 or ET1 = 0 with polynomially many queries but not estimate ET1. Our proofs rely on the Fourier analytic approach of Feldman et al. (2018) to prove sharp SQ lower bounds. |
| Researcher Affiliation | Academia | Rishabh Dudeja EMAIL Department of Statistics Columbia University New York, NY 10027, USA Daniel Hsu EMAIL Department of Computer Science Columbia University New York, NY 10027, USA |
| Pseudocode | No | The paper describes "optimal SQ procedures" in Section 3 and Appendix E, but it does not present them in a structured pseudocode block or explicitly label any section as "Pseudocode" or "Algorithm". The description is in paragraph form. |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | In the Tensor PCA problem introduced by Richard and Montanari (2014), one is given a dataset consisting of n samples T1:n of i.i.d. Gaussian tensors of order k with the promise that ET1 is a rank-1 tensor and ET1 = 1. This describes the theoretical problem setup, not an actual dataset used for experiments. The paper is theoretical and does not conduct experiments on datasets. |
| Dataset Splits | No | The paper is theoretical and describes a problem setup involving hypothetical 'n samples' rather than actual datasets. Therefore, it does not provide details on training/test/validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical analysis and lower bounds. It does not describe any experimental procedures that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and analysis. It does not describe any experimental implementations that would require specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on deriving lower bounds and optimal SQ procedures. It does not include any empirical experiments or mention specific hyperparameters, training schedules, or other experimental setup details. |