An Asynchronous Bundle Method for Distributed Learning Problems

Authors: Daniel Cederberg, Xuyang Wu, Stephen Boyd, Mikael Johansson

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The practical advantages of our method are illustrated through numerical experiments on classification problems of varying complexities and scales. ... Implementation details and numerical experiments are presented in Sections 5 and 6, respectively. ... We conduct experiments on three datasets (mnist8m/infimnist, epsilon, rcv1) from the LIBSVM library (Chang & Lin, 2011) and on the SVHN dataset (Netzer et al., 2011)."
Researcher Affiliation | Academia | Daniel Cederberg, Stanford University, USA; Xuyang Wu, SUSTech, China; Stephen Boyd, Stanford University, USA; Mikael Johansson, KTH, Sweden
Pseudocode | Yes | "A summary of the algorithm we propose is given in Algorithm 1."
Open Source Code | Yes | "The code is written in Python using MPI4PY (Dalcin & Fang, 2021) and is available at https://github.com/dance858/Asynchronous-bundle-method."
Open Datasets | Yes | "We conduct experiments on three datasets (mnist8m/infimnist, epsilon, rcv1) from the LIBSVM library (Chang & Lin, 2011) and on the SVHN dataset (Netzer et al., 2011)."
Dataset Splits | No | "The data is distributed evenly among the workers. The dataset mnist8m corresponds to a multiclass problem with 10 different labels. To use it for binary classification, we select data corresponding to the digits 7 and 9 and discard the rest."
Hardware Specification | No | "All methods are evaluated on a workstation using 10 cores."
Software Dependencies | No | "The code is written in Python using MPI4PY (Dalcin & Fang, 2021) ... To evaluate the objective value and the gradients we use PyTorch (Paszke et al., 2019) ..."
Experiment Setup | Yes | "For ABM we use bundle size m = 10, master problem tolerance δ = 10^-7, and adaptive smoothness estimation. DAve-RPG has two hyperparameters: the step size γ and the number of inner prox-steps p. We use step size γ = 1/L_average, where L_average is the average smoothness parameter of the workers, and p = 1 inner prox-steps (as in Mishchenko et al. (2018)). For PIAG with delay-tracking we implemented the first adaptive step size strategy described in Wu et al. (2022)."