Sparse GCA and Thresholded Gradient Descent
Authors: Sheng Gao, Zongming Ma
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also demonstrate the prowess of the algorithm on a number of synthetic data sets. ... This section reports numerical results on synthetic data sets. ... Table 1 reports the results of the aforementioned simulation study. For all latent dimensions, we observe a significant decrease in estimation error after Algorithm 1 is applied. ... From Figure 1, we observe an approximate linear decay trend at the beginning in all cases, which corresponds to exponential decay in the original scale. Moreover, after sufficiently many iterations, all error curves plateau, which suggests that the performance of the resulting estimators have stabilized. Both phenomena agree well with the theoretical findings in Theorem 7. |
| Researcher Affiliation | Academia | Sheng Gao EMAIL Zongming Ma EMAIL Department of Statistics and Data Science University of Pennsylvania Philadelphia, PA 19104, USA |
| Pseudocode | Yes | Algorithm 1: Thresholded gradient descent for sparse GCA. Input: Covariance matrix estimator Σ̂ and its block-diagonal part Σ̂₀; initialization Â⁰. Tuning parameters: step size η; penalty λ; sparsity level s; number of iterations T. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There is no explicit statement about code release, nor are there any repository links or mentions of code in supplementary materials. |
| Open Datasets | No | This section reports numerical results on synthetic data sets. ... To generate covariance matrices Σ and Σ₀, we use the latent variable model specified in Section 2. ... To generate U₍ᵢ₎ ∈ ℝ^(pᵢ × r), we first randomly select a support of size 5. |
| Dataset Splits | Yes | The procedure for the selection of s is as follows. We first randomly split the data X into five folds of equal sizes. For l = 1, . . . , 5, we use one fold as the test set X_test^(l) and the other four folds combined as the training set X_train^(l). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. It describes numerical studies but omits hardware specifications. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. Only conceptual algorithms and mathematical frameworks are discussed. |
| Experiment Setup | Yes | For tuning parameters in Algorithm 1, we set s = 20, η = 0.001, λ = 0.01, and T = 15000. The tuning parameter for generalized Fantope initialization is set to be n . The truncation parameter for initialization is also set to be s = 20. |
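The loop quoted in the Pseudocode row can be sketched generically. This is a hedged illustration, not the paper's exact method: the gradient of the sparse-GCA loss is abstracted as a user-supplied callable `grad`, and the hard-thresholding step keeps the `s` rows of largest ℓ₂ norm. The names `thresholded_gd` and `grad` are hypothetical.

```python
import numpy as np

def thresholded_gd(grad, A0, eta, s, T):
    """Gradient descent with row-wise hard thresholding (generic sketch).

    grad : callable returning the gradient of the loss at A
    A0   : initialization (p x r array), e.g. from a Fantope-type relaxation
    eta  : step size; s : sparsity level; T : number of iterations
    """
    A = A0.copy()
    for _ in range(T):
        A = A - eta * grad(A)            # plain gradient step
        norms = np.linalg.norm(A, axis=1)
        keep = np.argsort(norms)[-s:]    # indices of the s largest rows
        mask = np.zeros(A.shape[0], dtype=bool)
        mask[keep] = True
        A[~mask] = 0.0                   # hard-threshold all other rows to zero
    return A
```

With the settings quoted above (η = 0.001, s = 20, T = 15000, penalty λ folded into `grad`), this reduces to the shape of the Algorithm 1 loop, up to the exact form of the loss.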
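The five-fold selection of s quoted in the Dataset Splits row can likewise be sketched. Assumptions: `fit` and `score` are placeholders for the paper's estimator and its held-out-fold evaluation criterion, which the excerpt does not specify.

```python
import numpy as np

def select_sparsity(X, candidates, fit, score, seed=0):
    """Choose the sparsity level s by five-fold cross-validation (sketch).

    X               : n x p data matrix, split row-wise into five folds
    candidates      : iterable of candidate values of s
    fit(Xtr, s)     : fits the estimator on the four training folds
    score(est, Xte) : evaluates the fitted estimator on the held-out fold
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), 5)
    best_s, best_err = None, np.inf
    for s in candidates:
        errs = []
        for l in range(5):
            test_idx = folds[l]
            train_idx = np.concatenate([folds[k] for k in range(5) if k != l])
            errs.append(score(fit(X[train_idx], s), X[test_idx]))
        if np.mean(errs) < best_err:
            best_s, best_err = s, np.mean(errs)
    return best_s
```

The candidate s minimizing the average test error across the five splits is returned, matching the train/test rotation described in the quoted procedure.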