Proximal Distance Algorithms: Theory and Practice
Authors: Kevin L. Keys, Hua Zhou, Kenneth Lange
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical examples demonstrate the utility of proximal distance algorithms in various high-dimensional settings, including a) linear programming, b) constrained least squares, c) projection to the closest kinship matrix, d) projection onto a second-order cone constraint, e) calculation of Horn's copositive matrix index, f) linear complementarity programming, and g) sparse principal components analysis. The proximal distance algorithm in each case is competitive or superior in speed to traditional methods such as the interior point method and the alternating direction method of multipliers (ADMM). |
| Researcher Affiliation | Academia | Kevin L. Keys EMAIL Department of Medicine University of California, San Francisco CA 94158, USA; Hua Zhou EMAIL Department of Biostatistics University of California, Los Angeles CA 90095-1772, USA; Kenneth Lange EMAIL Departments of Biomathematics, Human Genetics, and Statistics University of California, Los Angeles CA 90095-1766, USA |
| Pseudocode | Yes | ALGORITHM 1: A typical proximal distance algorithm |
| Open Source Code | Yes | Source code for the numerical examples can be found at https://github.com/klkeys/proxdist. |
| Open Datasets | Yes | The breast cancer data from PMA provide the data matrix X. The data consist of p = 19,672 RNA measurements on n = 89 patients. The two figures show computation times and the proportion of variance explained (PVE) by the p × q loading matrix U. For sparse PCA, PVE is defined as tr(Xqᵀ Xq)/tr(Xᵀ X), where Xq = XU(UᵀU)⁻¹Uᵀ (Shen and Huang, 2008). When the loading vectors of U are orthogonal, this criterion reduces to the familiar definition tr(Uᵀ Xᵀ XU)/tr(Xᵀ X) of PVE for ordinary PCA. The proximal distance algorithm enforces either matrix-wise or column-wise sparsity. In contrast, SPC enforces only column-wise sparsity via the constraint ‖ui‖₁ ≤ c for each column ui of U. We take c = 8. The number of nonzeroes per loading vector output by SPC dictates the sparsity level for the column-wise version of the proximal distance algorithm. Summing these counts across all columns dictates the sparsity level for the matrix version of the proximal distance algorithm. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. While it mentions using data like "breast cancer data from PMA" or generating data with "standard normal deviates", it does not specify how this data is partitioned for experiments (e.g., percentages for train/test/validation sets, or specific methodology for splits). |
| Hardware Specification | No | The paper mentions CPU times for various algorithms and problem sizes, indicating that computational experiments were performed. However, it does not provide any specific details about the hardware used (e.g., CPU models, GPU models, memory, or cloud instances) on which these timings were measured. |
| Software Dependencies | No | All of our examples are coded in the Julia programming language. Whenever possible, competing software was run in the Julia environment via the Julia module MathProgBase (Dunning et al., 2017; Lubin and Dunning, 2015). The sparse PCA problem relies on the software of Witten et al. (Witten et al., 2009), which is coded in R. The paper mentions various software and programming languages (Julia, MathProgBase, R, SCS, Gurobi, Ipopt, Mosek) but generally lacks specific version numbers for these dependencies. |
| Experiment Setup | Yes | Convergence is tested at iteration k by the two criteria \|f(x_k) − f(x_{k−1})\| ≤ ϵ₁[\|f(x_{k−1})\| + 1] and dist(x_k, C) ≤ ϵ₂, where ϵ₁ = 10⁻⁶ and ϵ₂ = 10⁻⁴ are typical values. For dense problems, the proximal distance algorithms start the penalty constant ρ at 1 and double it every 100 iterations. For sparse problems the proximal distance algorithms update ρ by a factor of 1.5 every 50 iterations. For the proximal distance algorithm, we start ρ at 1 and multiply it by 1.5 every 200 iterations. In algorithm PD1 we set ρ_k = max{1.2ᵏ, 2²²}. In the accelerated versions PD2 and PD3 we started ρ at 1 and multiplied it by 5 every 100 iterations. Dense problems converge quickly and accurately when we set ρ₀ = 1 and double ρ every 100 iterations. Sparse problems require a greater range and faster updates of ρ, so we set ρ₀ = 0.01 and then multiply ρ by 2.5 every 10 iterations. The choice ρ_k = 1.2ᵏ converges in 600 to 700 iterations from random starting points and reliably yields objective values below 10⁻⁵ for Horn matrices. |
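The Open Datasets row defines PVE as tr(Xqᵀ Xq)/tr(Xᵀ X) with Xq = XU(UᵀU)⁻¹Uᵀ. A minimal NumPy sketch of that criterion follows; the function name, toy data, and shapes are illustrative assumptions, not taken from the paper's `proxdist` code:

```python
import numpy as np

def pve(X, U):
    """Proportion of variance explained by a loading matrix U (Shen and Huang, 2008):
    PVE = tr(Xq' Xq) / tr(X' X), where Xq = X U (U'U)^{-1} U'."""
    # Solve (U'U) B = U' instead of forming an explicit inverse, for stability.
    B = np.linalg.solve(U.T @ U, U.T)      # shape (q, p)
    Xq = X @ (U @ B)                       # rows of X projected onto span(U)
    return np.trace(Xq.T @ Xq) / np.trace(X.T @ X)

# Toy check: for orthonormal loadings the criterion reduces to the ordinary
# PCA definition tr(U' X'X U) / tr(X'X), as the row above states.
rng = np.random.default_rng(0)
X = rng.standard_normal((89, 50))                  # stand-in for the n x p data matrix
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))  # orthonormal p x q loadings
ordinary = np.trace(U.T @ X.T @ X @ U) / np.trace(X.T @ X)
```

Solving the small q × q system keeps the cost linear in p, which matters at the paper's scale of p ≈ 19,672 transcripts.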