An Asynchronous Bundle Method for Distributed Learning Problems
Authors: Daniel Cederberg, Xuyang Wu, Stephen Boyd, Mikael Johansson
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The practical advantages of our method are illustrated through numerical experiments on classification problems of varying complexities and scales. ... Implementation details and numerical experiments are presented in Sections 5 and 6, respectively. ... We conduct experiments on three datasets (mnist8m/infimnist, epsilon, rcv1) from the LIBSVM library (Chang & Lin, 2011) and on the SVHN dataset (Netzer et al., 2011). |
| Researcher Affiliation | Academia | Daniel Cederberg Stanford University, USA Xuyang Wu SUSTech, China Stephen Boyd Stanford University, USA Mikael Johansson KTH, Sweden |
| Pseudocode | Yes | A summary of the algorithm we propose is given in Algorithm 1. |
| Open Source Code | Yes | The code is written in Python using MPI4PY (Dalcin & Fang, 2021) and is available at https://github.com/dance858/Asynchronous-bundle-method. |
| Open Datasets | Yes | We conduct experiments on three datasets (mnist8m/infimnist, epsilon, rcv1) from the LIBSVM library (Chang & Lin, 2011) and on the SVHN dataset (Netzer et al., 2011). |
| Dataset Splits | No | The data is distributed evenly among the workers. The dataset mnist8m corresponds to a multiclass problem with 10 different labels. To use it for binary classification, we select data corresponding to the digits 7 and 9 and discard the rest. |
| Hardware Specification | No | All methods are evaluated on a workstation using 10 cores. |
| Software Dependencies | No | The code is written in Python using MPI4PY (Dalcin & Fang, 2021) ... To evaluate the objective value and the gradients we use PyTorch (Paszke et al., 2019)... |
| Experiment Setup | Yes | For ABM we use bundle size m = 10, master problem tolerance δ = 10^-7, and adaptive smoothness estimation. DAve-RPG has two hyperparameters: the step size γ and the number of inner prox-steps p. We use step size γ = 1/Laverage where Laverage is the average smoothness parameter of the workers, and p = 1 inner prox-steps (as in Mishchenko et al. (2018)). For PIAG with delay-tracking we implemented the first adaptive step size strategy described in Wu et al. (2022). |
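The hyperparameters quoted above can be collected into a small configuration sketch. This is a hypothetical illustration only: the variable names and the helper function below are assumptions for clarity and do not come from the authors' released code; only the numeric values (m = 10, δ = 10⁻⁷, γ = 1/L_average, p = 1) are taken from the paper.

```python
# Hypothetical sketch of the experiment setup described in the table.
# Names are illustrative assumptions; values are from the paper.

abm_config = {
    "bundle_size": 10,           # m = 10
    "master_tol": 1e-7,          # master problem tolerance delta = 10^-7
    "adaptive_smoothness": True, # adaptive smoothness estimation
}

dave_rpg_config = {
    "inner_prox_steps": 1,       # p = 1, as in Mishchenko et al. (2018)
}

def dave_rpg_step_size(worker_smoothness):
    """Step size gamma = 1 / L_average, where L_average is the average
    smoothness parameter across the workers (DAve-RPG setting above)."""
    L_average = sum(worker_smoothness) / len(worker_smoothness)
    return 1.0 / L_average

# Example: three workers with smoothness parameters 2, 4, and 6
gamma = dave_rpg_step_size([2.0, 4.0, 6.0])  # L_average = 4.0, gamma = 0.25
```

The PIAG baseline instead uses the first adaptive step size strategy of Wu et al. (2022), so no fixed γ appears for it here.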