On Perturbed Proximal Gradient Algorithms

Authors: Yves F. Atchadé, Gersende Fort, Eric Moulines

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To support our findings, we discuss the inference of a sparse generalized linear model with random effect and the problem of learning the edge structure and parameters of sparse undirected graphical models. In Section 4, these results are illustrated with the problem of estimating high-dimensional discrete graphical models. In Section 5, we consider a high-dimensional random effect logistic regression model. We report in this section some simulation results showing the performance of the stochastic proximal gradient algorithm. The relative errors are presented in Figure 1 and suggest that, when measured as a function of the resources used, Solver 1 and Solver 2 have roughly the same convergence rate.
Researcher Affiliation | Academia | Yves F. Atchadé (EMAIL), University of Michigan, 1085 South University, Ann Arbor, 48109, MI, United States; Gersende Fort (EMAIL), LTCI, CNRS, Telecom ParisTech, Université Paris-Saclay, 46, rue Barrault, 75013 Paris, France; Eric Moulines (EMAIL), CMAP, INRIA XPOP, École Polytechnique, 91128 Palaiseau, France
Pseudocode | Yes | Algorithm 1 (Proximal gradient algorithm): given θ_n, compute θ_{n+1} = Prox_{γ_{n+1},g}(θ_n − γ_{n+1} ∇f(θ_n)). Algorithm 2 (Perturbed proximal gradient algorithm): let θ_0 ∈ Θ be the initial solution and {γ_n, n ∈ ℕ} be a sequence of positive step sizes. For n ≥ 1, given (θ_0, ..., θ_n), construct an approximation H_{n+1} of ∇f(θ_n) and compute θ_{n+1} = Prox_{γ_{n+1},g}(θ_n − γ_{n+1} H_{n+1}).
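The perturbed proximal gradient update quoted above can be sketched in a few lines. The sketch below assumes g is an ℓ1 penalty λ‖·‖₁ (so the proximal map is soft-thresholding) and takes the gradient approximation H_{n+1} as a caller-supplied function; both choices are illustrative and do not reproduce the paper's full Monte Carlo construction of H_{n+1}.

```python
import numpy as np

def prox_l1(x, t):
    # Proximal map of t * ||.||_1: componentwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def perturbed_prox_gradient(theta0, grad_approx, lam, step, n_iter):
    """Algorithm 2 sketch: theta_{n+1} = prox_{gamma_{n+1}, lam*||.||_1}(theta_n - gamma_{n+1} * H_{n+1}).

    grad_approx(theta, n) returns H_{n+1}, an approximation of grad f(theta_n);
    step(n) returns the step size gamma_n.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    for n in range(n_iter):
        gamma = step(n + 1)             # step size gamma_{n+1}
        H = grad_approx(theta, n + 1)   # approximation H_{n+1} of grad f(theta_n)
        theta = prox_l1(theta - gamma * H, gamma * lam)
    return theta
```

With an exact gradient (H_{n+1} = ∇f(θ_n)) this reduces to Algorithm 1; plugging in a Monte Carlo average of unbiased gradient samples gives the stochastic variant studied in the paper.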
Open Source Code | No | The paper mentions the 'BMN package' as a third-party tool used for comparison but does not provide any information or links to source code specifically for the methodology described by the authors.
Open Datasets | No | We generate the true matrix θ_true such that it has on average p non-zero elements below the diagonal, which are simulated from a uniform distribution on (−4, −1) ∪ (1, 4). We generate the N × p covariates matrix X column-wise, by sampling a stationary ℝ^N-valued autoregressive model with parameter ρ = 0.8 and Gaussian noise √(1 − ρ²) N_N(0, I). We generate the vector of regressors β_true from the uniform distribution on [1, 5] and randomly set 98% of the coefficients to zero.
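The generation recipe quoted in this row can be sketched directly. The sizes below (N = 250, p = 50) and the fixed seed are illustrative only; the actual experiments also use p = 100, 200 for the graphical model and p = 1000 for the regression vector.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative fixed seed, not from the paper
N, p, rho = 250, 50, 0.8

# True interaction matrix: on average p non-zero entries below the
# diagonal, drawn from the uniform distribution on (-4, -1) U (1, 4).
theta_true = np.zeros((p, p))
rows, cols = np.tril_indices(p, k=-1)
nonzero = rng.random(rows.size) < 2.0 / (p - 1)   # E[#nonzero] = p(p-1)/2 * 2/(p-1) = p
magnitudes = rng.uniform(1.0, 4.0, size=nonzero.sum())
signs = rng.choice([-1.0, 1.0], size=nonzero.sum())
theta_true[rows[nonzero], cols[nonzero]] = signs * magnitudes

# Covariates: columns of X follow a stationary AR(1) process across columns
# with parameter rho and innovation sqrt(1 - rho^2) * N_N(0, I), so each
# column is marginally N(0, I).
X = np.empty((N, p))
X[:, 0] = rng.standard_normal(N)
for j in range(1, p):
    X[:, j] = rho * X[:, j - 1] + np.sqrt(1.0 - rho**2) * rng.standard_normal(N)

# Regressors: uniform on [1, 5], with 98% of coefficients reset to zero.
beta_true = rng.uniform(1.0, 5.0, size=p)
beta_true[rng.random(p) < 0.98] = 0.0
```

The innovation scale √(1 − ρ²) keeps the marginal variance of each column equal to one, so the overall signal strength does not drift with the column index.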
Dataset Splits | No | The paper describes generating synthetic data for its experiments and mentions the total size of the datasets (N = 250, p ∈ {50, 100, 200} in Section 4; N = 500, p = 1000, q = 5 in Section 5.2). However, it does not specify any explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details for running its experiments.
Software Dependencies | No | The paper mentions the 'BMN package' in the context of comparing algorithms but does not specify its version or any other software dependencies with version numbers used for the authors' own implementation.
Experiment Setup | Yes | We use M = 20, B_0(x) = x, N = 250 and p ∈ {50, 100, 200}. By trial and error we set the regularization parameter to λ = 2.5√(log(p)/n) for all the simulations. We compare the following two versions of Algorithm 2: (1) Solver 1, with a fixed Monte Carlo batch size m_n = 500 and decreasing step size γ_n = 25/(n+1); (2) Solver 2, with an increasing Monte Carlo batch size m_n = 500 + n^1.2 and fixed step size γ_n = 25. We run Solver 2 for N_iter = 5p iterations, where p ∈ {50, 100, 200} is as above. We test the algorithms with N = 500, p = 1,000 and q = 5. The variance of the random effect is set to σ² = 0.1. We use the Lasso penalty (α = 1 in (19)) with λ = 30. We compare the Monte Carlo proximal gradient algorithm (i) with fixed batch size: γ_n = 0.01/√n and m_n = 275 (Algo 1); γ_n = 0.5/n and m_n = 275 (Algo 2); and (ii) with increasing batch size: γ_n = γ = 0.005, m_n = 200 + n (Algo 3); γ_n = γ = 0.001, m_n = 200 + n (Algo 4); and γ_n = 0.05/√n and m_n = 270 + n (Algo 5). Each algorithm is run for 150 iterations.
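The two step-size/batch-size regimes compared in the graphical-model experiment can be encoded as simple schedule functions. The helper names below are hypothetical, and only Solver 1 and Solver 2 from Section 4 are shown; the five Section 5 variants follow the same pattern with different constants.

```python
def solver1_schedule(n):
    """Solver 1: fixed Monte Carlo batch size m_n = 500, decreasing step gamma_n = 25/(n+1)."""
    return 500, 25.0 / (n + 1)

def solver2_schedule(n):
    """Solver 2: increasing batch size m_n = 500 + n**1.2, fixed step gamma_n = 25."""
    return 500 + int(n ** 1.2), 25.0
```

The trade-off mirrors the discussion in this row: Solver 1 keeps the per-iteration Monte Carlo cost constant and damps gradient noise through a vanishing step size, while Solver 2 pays a growing sampling cost to keep a constant step size; Figure 1 reportedly shows the two choices giving roughly the same convergence rate per unit of computational resource.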