Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity

Authors: Boao Kong, Shuchen Zhu, Songtao Lu, Xinmeng Huang, Kun Yuan

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present experiments to validate the influence on convergence of communication topologies, data heterogeneity, the moving-average parameter θ, and the finite-difference parameter δ. We also compare D-SOBA to other decentralized SBO algorithms and show the benefit of the finite-difference Hessian/Jacobian-vector approximation of D-SOBA-FO in high-dimensional settings.
Researcher Affiliation | Academia | Boao Kong, Center for Data Science, Peking University, Beijing, China; Shuchen Zhu, Center for Data Science, Peking University, Beijing, China; Songtao Lu, Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR; Xinmeng Huang, Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, USA; Kun Yuan, Center for Machine Learning Research, Peking University, and AI for Science Institute, Beijing, China
Pseudocode | Yes | Algorithm 1: D-SOBA-SO and D-SOBA-FO
Open Source Code | No | The paper does not contain any explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | Here we consider a data hyper-cleaning problem (Shaban et al., 2019) for the Fashion-MNIST dataset (Xiao et al., 2017) ... Consider a meta-learning problem (Finn et al., 2017)... The dataset is miniImageNet (Vinyals et al., 2016), which is generated from ImageNet (Russakovsky et al., 2015) and consists of 100 classes, with each class containing 600 images of size 84 x 84.
Dataset Splits | Yes | Then we split the 60,000 training images into a training set and a validation set, which contain 50,000 images and 10,000 images respectively. ... We split these classes into 64 for training, 16 for validation, and 20 for testing. Each training and validation class splits its data by a Dirichlet distribution with parameter α = 0.1 (Lin et al., 2021).
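The Dirichlet-based heterogeneous split described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name, node count, and seed are assumptions; only the Dirichlet parameter α = 0.1 comes from the paper.

```python
import numpy as np

def dirichlet_split(labels, num_nodes, alpha=0.1, seed=0):
    """Partition sample indices across nodes, drawing per-class node
    proportions from Dirichlet(alpha), in the style of Lin et al. (2021)."""
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(num_nodes)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Smaller alpha -> more skewed per-class proportions, i.e. more
        # heterogeneous local datasets.
        proportions = rng.dirichlet(alpha * np.ones(num_nodes))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for node, part in enumerate(np.split(idx, cuts)):
            node_indices[node].extend(part.tolist())
    return node_indices

# Toy data: 10 classes with 100 samples each, split across 8 nodes.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_split(labels, num_nodes=8)
```

Every sample index is assigned to exactly one node; with α = 0.1, most nodes see only a few of the classes.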
Hardware Specification | No | The paper mentions running experiments but does not provide specific details about the hardware used (e.g., CPU, GPU models, or memory specifications).
Software Dependencies | No | In high-dimensional bilevel optimization, storing and computing the second-order matrices ∇²₁₂ g_i(x^(t), y^(t)) and ∇²₂₂ g_i(x^(t), y^(t)) is highly expensive... The Hessian/Jacobian-vector product is computed via PyTorch's forward-mode automatic differentiation in D-SOBA-SO, which is significantly faster than first computing the full Hessian/Jacobian matrix and then multiplying it by the vector.
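The finite-difference Hessian-vector approximation that lets D-SOBA-FO avoid forming second-order matrices can be sketched as below. This is a minimal illustrative version on a toy quadratic, not the paper's implementation; the helper name `fd_hvp` and the test function are assumptions, while the finite-difference parameter δ mirrors the paper's δ.

```python
import numpy as np

def fd_hvp(grad_fn, y, v, delta=1e-3):
    """Approximate the Hessian-vector product (∇²g(y)) v by a symmetric
    finite difference of gradients, never materializing the Hessian."""
    return (grad_fn(y + delta * v) - grad_fn(y - delta * v)) / (2.0 * delta)

# Toy quadratic g(y) = 0.5 * y^T A y, so grad g(y) = A y and the Hessian is A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad_g = lambda y: A @ y
y = np.array([1.0, -1.0])
v = np.array([0.5, 2.0])
approx = fd_hvp(grad_g, y, v)   # two gradient calls, no Hessian storage
exact = A @ v
```

For a quadratic the symmetric difference is exact up to floating-point error; for general g the error is O(δ²), which is why the paper treats δ as a tunable parameter.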
Experiment Setup | Yes | The step-sizes α_t, β_t, γ_t are initialized as 0.1 and multiplied by 0.8 per 1,000 iterations, while θ_t is set to 0.2. ... For all algorithms, the step-sizes are set to 0.1 and the batch sizes are set to 200. Meanwhile, the moving-average parameters θ_t of D-SOBA-SO, D-SOBA-FO and MA-DSBO are set to 0.8. ... We use a four-layer CNN with four convolution blocks, in which each block sequentially consists of a 3 x 3 convolution with 32 filters, batch normalization, ReLU activation, and 2 x 2 max pooling. ... The batch size is set to 32. The parameters of the last linear layer are set to be task-specific and the other parameters are global. We choose the step-size 0.002 for upper-level iterations and 0.01 for lower-level iterations for all algorithms. The step-size of auxiliary variables in D-SOBA-FO is 0.01 and the term δ_t in D-SOBA-FO is 0.001.
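The multiplicative step-size decay quoted above (initialize at 0.1, multiply by 0.8 every 1,000 iterations) can be written as a closed-form schedule. This is a sketch of the stated rule only; the function name is an assumption.

```python
def step_size(t, init=0.1, factor=0.8, period=1000):
    """Step size at iteration t under the quoted schedule:
    init * factor ** (number of completed 1,000-iteration periods)."""
    return init * factor ** (t // period)
```

So iterations 0-999 use 0.1, iterations 1000-1999 use 0.08, and so on.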