Non asymptotic analysis of Adaptive stochastic gradient algorithms and applications

Authors: Antoine Godichon-Baggioni, Pierre Tarrago

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "6 Simulation study: In this simulation study, we consider the following scenarios: Stochastic Newton Algorithm: ... Adagrad: ... 6.1 Linear model: ... generate 50 datasets of size n = 10^5. ... In Figures 1 and 2, we analyze the evolution of the quadratic mean error of the estimators ... 6.2 Logistic regression: ... In Figures 3 and 4, we analyze the evolution of the quadratic mean error of the estimates as a function of the sample size n."
Researcher Affiliation | Academia | Antoine Godichon-Baggioni (EMAIL), Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne Université; Pierre Tarrago (EMAIL), Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne Université
Pseudocode | No | "Then, an adaptive stochastic gradient algorithm is defined recursively for all n ≥ 0 by θ_{n+1} = θ_n − γ_{n+1} A_n ∇_h g(X_{n+1}, θ_n), where θ_0 is arbitrarily chosen... The stochastic Newton algorithm is defined recursively for all n ≥ 0 by (Boyer & Godichon-Baggioni, 2020) θ_{n+1} = θ_n + γ_{n+1} S_n^{-1} (Y_{n+1} − X_{n+1}^T θ_n) X_{n+1} (6)"
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include links to a code repository.
Open Datasets | No | "6.1 Linear model: We consider the linear model: Y = X^T θ + ϵ, where X ∼ N(0, diag(1, ..., d)) and ϵ ∼ N(0, 1). ... 6.2 Logistic regression: We now consider the logistic regression case: Y | X ∼ B(π(θ^T X)), where X ∼ N(0, diag(1, ..., d)) and π(x) = e^x / (1 + e^x)." The paper describes how the data is generated, but does not provide access information or a citation for a publicly available dataset instance.
Dataset Splits | No | "In the following experiments, we set d = 10 and generate 50 datasets of size n = 10^5." The paper states the total size of the generated datasets but does not specify any training, validation, or test splits.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or cloud resources used for running the experiments.
Software Dependencies | No | "In practice, we generate a sample of size 10^7 and approximate the minimizer using the R function glmnet." Only the R function glmnet is mentioned, without a specific version number.
Experiment Setup | Yes | "Stochastic Newton Algorithm: We set c_γ = 1 and initialize A_0 = (1/10) I_d to stabilize the algorithm during the first iterations. Additionally, and again for stabilization purposes, as suggested in Boyer & Godichon-Baggioni (2020), we use a modified step size, taking γ_n = c_γ / (n + 20)^γ. We consider: the choice of γ: γ = 0.66 or γ = 0.75; the use of truncation or not, with c_β = 1 and β = γ − 1/2, while employing the Frobenius norm. Adagrad: We set c_γ = 1 and initialize A_0 = I_d. For stabilization purposes, as suggested in Boyer & Godichon-Baggioni (2020), we use a modified step size, taking γ_n = c_γ / (n + 20)^γ. We consider: the choice of γ: γ = 0.5 or γ = 0.75; the use of truncation or not, with c_β = 1, λ_0 = 1, β = 0.25 (resp. 0.125) and λ = 0.385 (resp. 0.25) if γ = 0.75 (resp. if γ = 0.5). ... In the following experiments, we set d = 10 and generate 50 datasets of size n = 10^5. Moreover, we consider random initializations θ_0 = θ + U, where U ∼ N(0, I_d). ... In addition, we set σ = 0.1 and denote by θ the minimizer of G_σ ... Moreover, we generate 50 datasets of size n = 10^5 and we consider random initializations θ_0 = θ + U, where U ∼ N(0, I_d)."
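The generative models quoted in the Open Datasets row can be sketched as follows. This is a minimal illustration under the stated distributions; all variable names and the sample sizes are our own choices, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 1000
theta = rng.normal(size=d)

# X ~ N(0, diag(1, ..., d)): independent Gaussian coordinates with variances 1, ..., d.
X = rng.normal(size=(n, d)) * np.sqrt(np.arange(1, d + 1))

# Linear model (Section 6.1): Y = X^T theta + eps, with eps ~ N(0, 1).
Y_linear = X @ theta + rng.normal(size=n)

# Logistic model (Section 6.2): Y | X ~ Bernoulli(pi(theta^T X)),
# with pi(x) = e^x / (1 + e^x) the logistic function.
p = 1.0 / (1.0 + np.exp(-(X @ theta)))
Y_logistic = rng.binomial(1, p)
```

Each of the 50 replicated datasets in the experiments corresponds to one such draw of (X, Y) with a fresh random seed.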
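The stochastic Newton recursion (6) quoted in the Pseudocode row can be sketched on the linear model as follows. This is our own illustration under stated assumptions: we maintain the inverse of the summed Hessian estimate with a Sherman-Morrison rank-one update and rescale it to approximate the averaged Hessian inverse; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 20000
theta_star = rng.normal(size=d)

# Linear model of Section 6.1: X ~ N(0, diag(1, ..., d)), eps ~ N(0, 1).
X = rng.normal(size=(n, d)) * np.sqrt(np.arange(1, d + 1))
Y = X @ theta_star + rng.normal(size=n)

# Recursion (6): theta <- theta + gamma_{n+1} S_n^{-1} (Y - X^T theta) X,
# where S_n estimates the Hessian.  We track the inverse of the *sum*
# S = S_0 + sum_i x_i x_i^T and rescale by (i + 1) to mimic the average.
theta = np.zeros(d)
S_inv = 10.0 * np.eye(d)  # inverse of the stabilizing initialization S_0 = (1/10) I_d
for i in range(n):
    x, y = X[i], Y[i]
    # Sherman-Morrison update of S_inv for the rank-one change S <- S + x x^T.
    Sx = S_inv @ x
    S_inv -= np.outer(Sx, Sx) / (1.0 + x @ Sx)
    # Modified step size gamma_n = c_gamma / (n + 20)^gamma with c_gamma = 1, gamma = 0.75.
    gamma = 1.0 / (i + 21) ** 0.75
    theta += gamma * (i + 1) * (S_inv @ x) * (y - x @ theta)
```

The rescaling by (i + 1) keeps the preconditioner of constant order while the step size γ_n alone controls the decay, matching the structure of the quoted recursion.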
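The modified step size and an Adagrad-style diagonal preconditioner from the Experiment Setup row can be sketched as follows. This is a simplified illustration: the averaging of squared gradients, the function names, and the omission of the truncation variant are our assumptions, not the paper's specification.

```python
import numpy as np

def step_size(n, c_gamma=1.0, gamma=0.75):
    """Modified step size gamma_n = c_gamma / (n + 20)^gamma (stabilized denominator)."""
    return c_gamma / (n + 20) ** gamma

def adagrad_linear(X, Y, gamma=0.75):
    """Adagrad-style recursion for the linear model, initialized with A_0 = I_d."""
    d = X.shape[1]
    theta = np.zeros(d)
    grad_sq = np.ones(d)  # running sum of squared gradients; ones encode A_0 = I_d
    for i, (x, y) in enumerate(zip(X, Y)):
        g = -(y - x @ theta) * x  # gradient of the squared loss (y - x^T theta)^2 / 2
        grad_sq += g * g
        # Diagonal preconditioner: inverse square root of the averaged squared gradients.
        theta -= step_size(i, gamma=gamma) * g / np.sqrt(grad_sq / (i + 2))
    return theta

rng = np.random.default_rng(0)
d, n = 10, 20000
theta_star = rng.normal(size=d)
X = rng.normal(size=(n, d)) * np.sqrt(np.arange(1, d + 1))
Y = X @ theta_star + rng.normal(size=n)
theta_hat = adagrad_linear(X, Y)
```

The constant 20 in the denominator keeps the first steps small, which is the stabilization role attributed to the modified step size in the quoted setup.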
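The figures quoted in the Research Type row track the quadratic mean error of the estimates across the 50 generated datasets; a minimal sketch of that metric (the function name and toy inputs are ours):

```python
import numpy as np

def quadratic_mean_error(estimates, theta_star):
    """Mean of ||theta_hat - theta_star||^2 across datasets.

    estimates: array of shape (n_datasets, d), one estimate per generated dataset.
    """
    diffs = estimates - theta_star
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Toy check with two "datasets" in dimension 3.
theta_star = np.zeros(3)
estimates = np.array([[0.1, 0.0, 0.0],
                      [0.0, 0.2, 0.0]])
err = quadratic_mean_error(estimates, theta_star)  # (0.01 + 0.04) / 2 = 0.025
```

Plotting this quantity against the sample size n, one curve per algorithm variant, reproduces the structure of the evaluation described for Figures 1-4.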