Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis
Authors: Youcheng Niu, Jinming Xu, Ying Sun, Yan Huang, Li Chai
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present two numerical experiments to test the performance of the proposed LoPA algorithm and verify the theoretical findings. Specifically, the first experiment considers a 10-class classification task, while the second focuses on hyperparameter optimization in ℓ2-regularized binary logistic regression for a two-class classification task. [...] The experiment results for loss, training accuracy, and testing accuracy are presented in Figures 1 and 2. |
| Researcher Affiliation | Academia | College of Control Science and Engineering, Zhejiang University, China; School of Electrical Engineering and Computer Science, The Pennsylvania State University, USA |
| Pseudocode | Yes | Algorithm 1: LoPA |
| Open Source Code | No | The paper does not provide a concrete link to source code, an explicit statement of code release, or mention of code in supplementary materials for the methodology described. |
| Open Datasets | Yes | We employ MNIST datasets to train m personalized classifiers for a 10-class classification task. [...] We conduct the experiment across various datasets including MNIST (784 features, 12000 samples for digits 0 and 1), covtype (54 features, 90000 samples for the Lodgepole and Ponderosa pine classes), and cifar10 (3072 features, 6000 samples for the dog and horse classes). |
| Dataset Splits | No | The paper describes how samples are distributed among nodes (e.g., 'each node having 14000 samples', 'validation and training sets for each node are randomly assigned with a uniform number of samples') but does not provide specific train/test/validation split percentages or absolute counts for the overall dataset or for individual nodes, which are necessary for full reproducibility of data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., library names like PyTorch 1.9, or solver versions). |
| Experiment Setup | Yes | The step-sizes are set as α = 0.01, β = 0.01, λ = 0.008, γ = 0.4, τ = 0.4 both for LoPA-LG and LoPA-GT. [...] mini-batch sizes are set to 50 for all algorithms. [...] The first layer of the classifier contains 28 neurons, while the second layer contains 10 neurons. |
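For anyone attempting a replication, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration assuming the reported values apply uniformly to both LoPA variants; the variable names and the `describe` helper are our own, not from the paper.

```python
# Hyperparameters as reported in the paper's experiment setup.
# Names are illustrative; only the numeric values come from the source.
LOPA_CONFIG = {
    "alpha": 0.01,       # step-size α
    "beta": 0.01,        # step-size β
    "lambda": 0.008,     # step-size λ
    "gamma": 0.4,        # parameter γ
    "tau": 0.4,          # parameter τ
    "batch_size": 50,    # mini-batch size, shared by all algorithms
    "layer1_units": 28,  # first classifier layer
    "layer2_units": 10,  # second classifier layer (10-class task)
}

def describe(cfg: dict) -> str:
    """Render the config as a single comma-separated summary line."""
    return ", ".join(f"{key}={value}" for key, value in cfg.items())
```

A replication script could log `describe(LOPA_CONFIG)` alongside results so that runs for LoPA-LG and LoPA-GT are traceable to the exact settings used.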