reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Power Iteration for Tensor PCA

Authors: Jiaoyang Huang, Daniel Z. Huang, Qing Yang, Guang Cheng

JMLR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we conduct numerical experiments on synthetic data to demonstrate our distributional results provided in Sections 2.1 and 2.2. We fix the dimension n = 600 and rank k = 3. We plot in Figure 1 our estimators for the strength of signals after normalization (22) and our estimators for the linear functionals of the signals (23) as in Corollary 3. The results are reported over 2000 independent trials where the initialization of our power iteration algorithm u a random vector sampled from the unit sphere in $\mathbb{R}^n$, and the strength of signal $\beta = n^{(k-2)/2} \approx 24.495$.
Researcher Affiliation	Academia	Jiaoyang Huang EMAIL New York University, New York, NY Daniel Z. Huang EMAIL California Institute of Technology, Pasadena, CA Qing Yang EMAIL University of Science and Technology of China, China Guang Cheng EMAIL University of California, Los Angeles & Purdue University, Los Angeles, CA
Pseudocode	No	The paper describes the power iteration algorithm mathematically via equations (2) and (3): "We consider the power iteration algorithm given by the following recursion $u_0 = u$, $u_{t+1} = X[u_t^{\otimes(k-1)}] / \|X[u_t^{\otimes(k-1)}]\|_2$ (2) where $u \in \mathbb{R}^n$ with $\|u\|_2 = 1$ is the initial vector, and $X[v^{\otimes(k-1)}] \in \mathbb{R}^n$ is the vector with i-th entry given by $\langle X, e_i \otimes v^{\otimes(k-1)} \rangle$. The estimators are given by $\hat{v} = u_T$, $\hat{\beta} = \langle X, \hat{v}^{\otimes k} \rangle$ (3) for some large $T$." There are no explicit pseudocode blocks or algorithm listings.
Open Source Code	No	The paper does not contain any statements about releasing code or links to source code repositories.
Open Datasets	No	The paper explicitly states the use of "synthetic data": "In this section, we conduct numerical experiments on synthetic data to demonstrate our distributional results provided in Sections 2.1 and 2.2." It describes how this data is generated (e.g., "We take the signal v a random vector sampled from the unit sphere in $\mathbb{R}^n$"), but does not refer to any pre-existing, publicly available datasets.
Dataset Splits	No	The paper conducts numerical experiments on synthetically generated data and reports results over a number of "independent trials" (e.g., "The results are reported over 2000 independent trials", "Over 1500 independent trials"). There is no mention of dataset splits such as training, validation, or test sets, as the data is generated on-the-fly for each trial.
Hardware Specification	No	The paper does not specify any hardware details (e.g., CPU, GPU models, memory, etc.) used for running the numerical experiments.
Software Dependencies	No	The paper does not provide any specific software versions or libraries used for the numerical studies.
Experiment Setup	Yes	In Section 3, "Numerical Study", the paper details several aspects of its experimental setup: "We fix the dimension n = 600 and rank k = 3." and "We take the signal v a random vector sampled from the unit sphere in $\mathbb{R}^n$". It specifies two initialization methods: "u a random vector sampled from the unit sphere in $\mathbb{R}^n$" for random initialization, and "$u = (v + w)/\|v + w\|_2$" for informative initialization. It also provides specific signal strength values: "$\beta = n^{(k-2)/2} \approx 24.495$", "$\beta = 5$", "$\beta = 10$", and for multi-rank cases "$\beta_1 = 1.2\,n^{(k-2)/2} \approx 29.394$ and $\beta_2 = n^{(k-2)/2} \approx 24.495$", along with the dimensions tested: "$n \in \{200, 300, 400, 500, 600\}$".
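
Since the paper is reported to contain neither pseudocode nor released code, the recursion (2) and estimators (3) quoted in the table can be illustrated with a minimal NumPy sketch for rank k = 3. This is not the authors' implementation. The function names (`signal_strength`, `random_unit_vector`, `make_spiked_tensor`, `power_iteration`) are hypothetical, and the additive noise model (i.i.d. standard Gaussian entries) is an assumption, because the quoted excerpts do not state it. The demo uses n = 50 rather than the paper's n = 600 to keep memory small.

```python
import numpy as np


def signal_strength(n, k, scale=1.0):
    """Return scale * n^((k-2)/2), the signal strength quoted in the setup."""
    return scale * n ** ((k - 2) / 2)


def random_unit_vector(n, rng):
    """Sample uniformly from the unit sphere in R^n by normalizing a Gaussian."""
    x = rng.standard_normal(n)
    return x / np.linalg.norm(x)


def make_spiked_tensor(beta, v, rng):
    """Build an order-3 tensor beta * (v x v x v) + Z.

    ASSUMPTION: Z has i.i.d. standard Gaussian entries. The noise model is
    not given in the excerpts quoted above.
    """
    n = v.shape[0]
    signal = beta * np.einsum("i,j,k->ijk", v, v, v)
    return signal + rng.standard_normal((n, n, n))


def power_iteration(X, u0, num_iters):
    """Run recursion (2) for k = 3 and return the estimators from (3)."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(num_iters):
        # The i-th entry of X[u x u] is <X, e_i x u x u>.
        w = np.einsum("ijk,j,k->i", X, u, u)
        u = w / np.linalg.norm(w)
    # beta_hat = <X, u x u x u> evaluated at the final iterate.
    beta_hat = float(np.einsum("ijk,i,j,k->", X, u, u, u))
    return u, beta_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 50, 3  # illustrative size, smaller than the paper's n = 600
    beta = signal_strength(n, k)
    v = random_unit_vector(n, rng)
    X = make_spiked_tensor(beta, v, rng)
    u0 = random_unit_vector(n, rng)  # random initialization
    v_hat, beta_hat = power_iteration(X, u0, num_iters=50)
    print("beta:", beta, "beta_hat:", beta_hat)
    print("overlap <v_hat, v>:", float(np.dot(v_hat, v)))
```

A dense order-3 tensor at n = 600 holds 600^3 = 216,000,000 entries, roughly 1.7 GB in float64, so a faithful reproduction of the 2000-trial experiment would likely need a more memory-conscious design than this sketch.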