Nonlinearly Preconditioned Gradient Methods under Generalized Smoothness
Authors: Konstantinos Oikonomidis, Jan Quan, Emanuel Laude, Panagiotis Patrinos
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present some simple experiments that display the behavior of the proposed method on problems beyond traditional Lipschitzian assumptions. The code for reproducing the experiments is publicly available. [...] Figure 2. Minimizing (1/4) x^4 using (2). [...] Figure 3. Nonconvex phase retrieval. [...] Figure 4. Simple NN training. |
| Researcher Affiliation | Collaboration | 1Department of Electrical Engineering (ESAT-STADIUS), KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium 2Leuven.AI-KU Leuven Institute for AI, 3000 Leuven, Belgium 3Proxima Fusion GmbH, Floßergasse 2, 81369 Munich, Germany. Correspondence to: Konstantinos Oikonomidis <EMAIL>. |
| Pseudocode | No | The paper describes the main iteration in equation (2): x^{k+1} = T_{γ,λ}(x^k) := x^k − γ ∇φ*(λ ∇f(x^k)), but does not provide a separate, structured pseudocode block for the algorithm in the main text. |
| Open Source Code | Yes | The code for reproducing the experiments is publicly available at https://github.com/JanQ/nonlinearly-preconditioned-gradient |
| Open Datasets | Yes | In this experiment we consider training a simple four-layer fully connected network with layer dimensions [28 × 28, 128, 64, 32, 32, 10] and ReLU activation functions on a subset of the MNIST dataset (Deng, 2012), using the cross-entropy loss. |
| Dataset Splits | No | The paper mentions using 'a subset (m = 600) of the dataset' for neural network training but does not specify how this subset is further divided into training, validation, or test splits. No explicit percentages, counts, or references to standard splits for reproduction are provided. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | For the isotropic case of (2) we take γ = 5/3 and λ = 1/100, while for the anisotropic one γ = 1/5 and λ = 1/14. [...] We compare the methods generated by φ1(x) = cosh(|x|) − 1, φ2(x) = −|x| − ln(1 − |x|) and the gradient clipping method (Zhang et al., 2020b), which can also be considered an instance of (2) through Example 1.7, for various choices of the stepsizes and the clipping parameters. The results are presented in Figure 4. It can be seen that different combinations of γ and λ lead to different behaviors for the compared methods. |
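The quoted iteration (2) is simple enough to sketch directly. Below is a minimal, hypothetical Python reproduction of the Figure 2 setup (minimizing (1/4) x^4), assuming the iteration x^{k+1} = x^k − γ ∇φ*(λ ∇f(x^k)) and the isotropic parameters γ = 5/3, λ = 1/100 quoted above; the dual maps are derived by standard convex-conjugate calculus from the stated reference functions, so for φ1(x) = cosh(|x|) − 1 one gets ∇φ1*(y) = asinh(y), and for φ2(x) = −|x| − ln(1 − |x|) one gets the soft-clipping map ∇φ2*(y) = y / (1 + |y|). Function and variable names here are illustrative, not from the paper's code.

```python
import math

def npgm(grad_f, dual_map, x0, gamma, lam, iters):
    """Nonlinearly preconditioned gradient method, iteration (2):
    x^{k+1} = x^k - gamma * dphi*(lam * grad_f(x^k))."""
    x = x0
    for _ in range(iters):
        x = x - gamma * dual_map(lam * grad_f(x))
    return x

# Objective f(x) = (1/4) x^4, so grad f(x) = x^3.
grad_f = lambda x: x**3

# phi1(x) = cosh(|x|) - 1  =>  grad phi1*(y) = asinh(y)
dphi1_star = math.asinh
# phi2(x) = -|x| - ln(1 - |x|)  =>  grad phi2*(y) = y / (1 + |y|)  (soft clipping)
dphi2_star = lambda y: y / (1 + abs(y))

# Isotropic parameters quoted in the table: gamma = 5/3, lambda = 1/100.
x1 = npgm(grad_f, dphi1_star, x0=10.0, gamma=5 / 3, lam=1 / 100, iters=200)
x2 = npgm(grad_f, dphi2_star, x0=10.0, gamma=5 / 3, lam=1 / 100, iters=200)
```

Because both dual maps grow sublinearly (logarithmically for asinh, bounded for the soft clip), the update stays stable even from a far-out initialization where ∇f is huge, which is the point of the generalized-smoothness analysis.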