Variance Reduction of Stochastic Hypergradient Estimation by Mixed Fixed-Point Iteration

Authors: Naoyuki Terashita, Satoshi Hara

TMLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The paper states that "empirical evaluations on synthetic and real-world tasks verify our theoretical results and superior variance reduction over existing methods." It includes a dedicated 'Experiments' section (Section 5), with subsections 'Effect of Mixing Rate' and 'Comparison with Existing Approaches', covering tasks such as hyperparameter optimization, influence estimation, and meta-learning. |
| Researcher Affiliation | Collaboration | Naoyuki Terashita is affiliated with Hitachi, Ltd. (industry); Satoshi Hara is affiliated with the University of Electro-Communications (academia). This mix of industry and academic affiliations indicates a collaboration. |
| Pseudocode | Yes | The paper includes a section titled 'F Python Implementation of Mixed FP-KM', which provides a Python code block (Figure 7) explicitly implementing the Mixed FP-KM algorithm. |
| Open Source Code | Yes | The code is available at https://github.com/hitachi-rd-cv/mixed-fp. |
| Open Datasets | Yes | The paper explicitly mentions and cites several well-known public datasets: the Adult Income dataset (Becker & Kohavi, 1996), Fashion-MNIST (Xiao et al., 2017), and the California Housing dataset (Pace & Barry, 1997). |
| Dataset Splits | Yes | Table 2 (experiment settings for the real-world tasks) explicitly lists 'ntrain' and 'nval' values for each dataset used: Adult Income (5,000 train, 5,000 val), Fashion-MNIST (5,000 train, 5,000 val), and California Housing (5,000 train, 5,000 val). Additionally, Section E.1 states: 'In addition to the training and validation splits used in Section 5.2, we introduce a separate test set of 5,000 samples to evaluate the final model performance after the outer optimization.' |
| Hardware Specification | No | The paper provides no specific details about the hardware (e.g., CPU or GPU models, memory) used for the experiments. It discusses computational cost on a 'wall-clock basis' but does not specify the hardware. |
| Software Dependencies | No | The paper mentions the Adam optimizer and, implicitly through its Python implementation, PyTorch, but it does not specify version numbers for these or any other software components used in the experiments. |
| Experiment Setup | Yes | Section D.2.2 'Influence Estimation' states: 'Any inner-problem optimization was performed using the Adam optimizer with a learning rate of 0.01. To rule out the effect incurred by inexact x(λ), for any task, we used the full-batch inner loss to compute gradients for Adam and ran 1,000 epochs to ensure the convergence.' It also details grid-search ranges for hyperparameters. Section E.1 'Settings' further specifies: 'We configure the bilevel optimization with 100 outer optimization steps using SGD with a learning rate of 20.0, and 100 inner optimization steps per outer iteration using Adam with a learning rate of 0.01.' |
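The Mixed FP-KM algorithm assessed above builds on the Krasnosel'skii–Mann (KM) mixed fixed-point scheme, which averages the current iterate with the operator output under a mixing rate. The sketch below is a generic, hypothetical illustration of that scheme on a linear contraction, not the paper's Figure 7 implementation; the function name `mixed_fp_km`, the mixing rate `beta`, and the toy operator are all assumptions.

```python
import numpy as np

def mixed_fp_km(T, x0, beta=0.5, n_iters=100):
    """Generic Krasnosel'skii-Mann (KM) mixed fixed-point iteration.

    Updates x_{k+1} = (1 - beta) * x_k + beta * T(x_k), where `beta` is
    the mixing rate. Hypothetical sketch, not the paper's Figure 7 code.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = (1.0 - beta) * x + beta * T(x)
    return x

# Example: T(x) = A x + b is a contraction (spectral radius of A < 1),
# so the iteration converges to the fixed point solving (I - A) x = b.
A = np.array([[0.5, 0.1], [0.0, 0.4]])
b = np.array([1.0, 2.0])
x_star = mixed_fp_km(lambda x: A @ x + b, np.zeros(2), beta=0.7, n_iters=200)
```

For beta = 1 this reduces to the plain fixed-point iteration; smaller mixing rates damp each update, which is the mechanism the paper analyzes for variance reduction of stochastic hypergradient estimates.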
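The reported schedule (100 outer SGD steps at learning rate 20.0, 100 inner Adam steps at learning rate 0.01) can be mirrored on a toy bilevel problem. The sketch below is a hypothetical illustration only: the quadratic losses, the 0.01 scaling of the outer loss (chosen so that SGD with the large reported learning rate stays stable on this toy), and the exact implicit hypergradient (dx/dλ = 1 here) are all assumptions, in place of the paper's stochastic hypergradient estimator.

```python
import math

def adam_minimize(grad, x0, lr=0.01, n_steps=100, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar objective with Adam, given its gradient function."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, n_steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy bilevel problem (hypothetical, for illustration only):
#   inner:  x(lam) = argmin_x (x - lam)^2   ->  x(lam) = lam
#   outer:  minimize 0.01 * (x(lam) - 1)^2 over lam
lam, x = 0.0, 0.0
for _ in range(100):                                   # 100 outer SGD steps, lr 20.0
    x = adam_minimize(lambda z: 2.0 * (z - lam), x)    # 100 inner Adam steps, lr 0.01
    hypergrad = 0.02 * (x - 1.0)                       # exact implicit gradient (dx/dlam = 1)
    lam -= 20.0 * hypergrad
```

With warm-started inner solves, the outer iterate contracts toward the optimum lam = 1 up to the inner-approximation error, illustrating why the paper's settings pair many cheap inner steps with each outer update.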