DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity

Authors: Zhen Qin, Zhuqing Liu, Songtao Lu, Yingbin Liang, Jia (Kevin) Liu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct numerical experiments to verify our theoretical results for DUET. Due to the lack of existing algorithms for solving decentralized bilevel optimization problems without the LLSC assumption, we compare the convergence performance of DUET and DSGT. 1) A Pedagogical Example: We first verify the convergence results under the ULSC and non-LLSC cases using five-agent communication networks... As shown in Figs. 1(a) and 1(b), the gradients of x and y reach zero when using our DUET algorithm... 2) Decentralized Meta-learning Problems with Real-World Data: Next, we evaluate our DUET algorithm on decentralized meta-learning problems with heterogeneous datasets. Following the experimental settings in Qiu et al. (2023), we use the MNIST dataset to train m personalized classifiers... In Fig. 2(a), the DUET algorithm demonstrates superior performance by achieving the highest testing accuracy, along with fast convergence.
Researcher Affiliation | Collaboration | Zhen Qin, Zhuqing Liu, Songtao Lu, Yingbin Liang, Jia Liu. Affiliations: Department of Electrical and Computer Engineering, The Ohio State University; Department of Computer Science and Engineering, University of North Texas; Department of Computer Science and Engineering, The Chinese University of Hong Kong. This work was completed while S. Lu was a senior research scientist at IBM Research in the U.S.
Pseudocode | Yes | Algorithm 1: The DUET Algorithm at Each Agent i. Set parameter pair (x_{i,0}, y_{i,0}, v_{i,0}) = (x_0, y_0, v_0). for t = 1, ..., T do: Update local models (x_{i,t+1}, y_{i,t+1}, v_{i,t+1}) as in Eqs. (5); Compute the local estimators (d_x^{i,t}, d_y^{i,t}, d_v^{i,t}) as in Eq. (6); Track global gradients (h_x^{i,t}, h_y^{i,t}, h_v^{i,t}) as in Eq. (7); end for.
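The quoted loop can be rendered as a structural sketch. Since Eqs. (5)-(7) are not reproduced in this report, the mixing matrix `W`, the step size, and the `local_grads` oracle below are placeholder assumptions following the generic gradient-tracking pattern, not the paper's exact updates.

```python
import numpy as np

def duet_skeleton(W, T, step, x0, y0, v0, local_grads):
    """Structural sketch of Algorithm 1 run at all m agents jointly.

    W           : (m, m) doubly stochastic mixing matrix (assumption)
    local_grads : callable returning per-agent stochastic estimators
                  (d_x, d_y, d_v); stand-in for Eq. (6)
    The model and tracker updates below follow the generic
    gradient-tracking pattern, not the paper's exact Eqs. (5) and (7).
    """
    m = W.shape[0]
    # Every agent starts from the same triple (x_0, y_0, v_0).
    x, y, v = (np.tile(z, (m, 1)) for z in (x0, y0, v0))
    gx, gy, gv = local_grads(x, y, v)             # local estimators
    hx, hy, hv = gx.copy(), gy.copy(), gv.copy()  # global-gradient trackers
    for _ in range(T):
        # Update local models: neighbor averaging plus a tracked-gradient step.
        x = W @ x - step * hx
        y = W @ y - step * hy
        v = W @ v - step * hv
        # Recompute the local estimators at the new iterates.
        gx_new, gy_new, gv_new = local_grads(x, y, v)
        # Gradient tracking: mix trackers, add the estimator increment.
        hx = W @ hx + gx_new - gx
        hy = W @ hy + gy_new - gy
        hv = W @ hv + gv_new - gv
        gx, gy, gv = gx_new, gy_new, gv_new
    return x, y, v
```

On a toy quadratic objective with a full-averaging mixing matrix, the iterates of this skeleton contract geometrically toward the minimizer, which matches the gradient-tracking intuition behind the algorithm.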
Open Source Code | No | The paper does not contain an explicit statement about releasing code, a link to a code repository, or mention of code in supplementary materials.
Open Datasets | Yes | 2) Decentralized Meta-learning Problems with Real-World Data: Next, we evaluate our DUET algorithm on decentralized meta-learning problems with heterogeneous datasets... we use the MNIST dataset to train m personalized classifiers. E.3 Decentralized Hyperparameter Optimization with Real-World Data: ... we use the Fashion MNIST dataset, which consists of images of clothing categories and serves as an alternative to the classic MNIST dataset.
Dataset Splits | Yes | In the scenario with i.i.d. data, all agents have access to the same global dataset. We compare our proposed algorithms with the baseline DSGD... In contrast, in the non-i.i.d. data scenario, we adopt a data partitioning strategy where each agent accesses data consisting of 95% from two specific labels and 5% randomly selected from other labels... For each agent, the dataset is split into training, validation, and testing sets, each containing 5000 samples.
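The quoted non-i.i.d. partitioning (95% from two agent-specific labels, 5% from the rest, then a 5000/5000/5000 train/validation/test split) can be sketched as follows. The label-to-agent assignment and all helper names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def partition_non_iid(labels, num_agents=5, samples_per_agent=15000,
                      major_frac=0.95, num_labels=10, seed=0):
    """Sketch of the non-i.i.d. split described above: each agent draws
    95% of its samples from two dedicated labels and 5% uniformly from
    the remaining labels, then splits them into equal train/val/test
    thirds. The round-robin label assignment is an assumption."""
    rng = np.random.default_rng(seed)
    idx_by_label = {c: np.where(labels == c)[0] for c in range(num_labels)}
    agents = []
    for i in range(num_agents):
        # Assumed assignment: agent i owns labels 2i and 2i+1 (mod num_labels).
        major = [(2 * i) % num_labels, (2 * i + 1) % num_labels]
        n_major = int(major_frac * samples_per_agent)
        n_minor = samples_per_agent - n_major
        major_pool = np.concatenate([idx_by_label[c] for c in major])
        minor_pool = np.concatenate(
            [idx_by_label[c] for c in range(num_labels) if c not in major])
        chosen = np.concatenate([
            rng.choice(major_pool, n_major, replace=True),
            rng.choice(minor_pool, n_minor, replace=True)])
        rng.shuffle(chosen)
        # Equal train / validation / test split (5000 each by default).
        n = samples_per_agent // 3
        agents.append((chosen[:n], chosen[n:2 * n], chosen[2 * n:3 * n]))
    return agents
```

Sampling is done with replacement so the sketch works even when a label's pool is smaller than an agent's quota; the paper's exact sampling scheme is not stated in the quote.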
Hardware Specification | Yes | All numerical experiments were conducted on a MacBook Pro equipped with an Apple M3 Pro chip, featuring a 12-core CPU with 6 performance cores and 6 efficiency cores, and 36 GB of memory.
Software Dependencies | No | The paper mentions implementing algorithms but does not specify any software libraries or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For the DUET algorithm, we set the LL learning rate to 0.005, with p = 1/10 and τ = 1/40. For the DSGT algorithm, the LL learning rate is set to 0.001, with p = 1/15 and τ = 1/40. ... γ = 0.1 denotes the regularization coefficient...
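For quick reference when attempting a reproduction, the quoted hyperparameters can be collected in a small configuration sketch; the key names (`ll_lr`, `p`, `tau`) are illustrative labels, not identifiers from the paper.

```python
# Hyperparameters quoted above; key names are illustrative assumptions.
DUET_CFG = {"ll_lr": 0.005, "p": 1 / 10, "tau": 1 / 40}
DSGT_CFG = {"ll_lr": 0.001, "p": 1 / 15, "tau": 1 / 40}
GAMMA = 0.1  # regularization coefficient from the paper's setup
```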