Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization
Authors: Yuki Takezawa, Xiaowen Jiang, Anton Rodomanov, Sebastian U Stich
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that SPDO significantly outperforms existing methods. ... 7. Numerical Evaluation Experimental Setup: We used MNIST (Lecun et al., 1998) and logistic loss with L2 regularization. ... Results: Fig. 1(a) indicates that Accelerated-SPDO can achieve the best communication and computational complexities. |
| Researcher Affiliation | Academia | ¹Kyoto University, ²OIST, ³CISPA Helmholtz Center for Information Security, ⁴Saarland University. Correspondence to: Yuki Takezawa <EMAIL>. |
| Pseudocode | Yes | We show the pseudo-code in Alg. 1. ... Algorithm 1 Proximal Decentralized Optimization Method (PDO) ... Algorithm 2 Multiple Gossip Averaging ... Algorithm 3 Stabilized Proximal Decentralized Optimization Method (SPDO) ... Algorithm 4 Accelerated Stabilized Proximal Decentralized Optimization Method (Accelerated-SPDO) ... Algorithm 5 Fast Gossip Averaging |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Experimental Setup: We used MNIST (Lecun et al., 1998) and logistic loss with L2 regularization. |
| Dataset Splits | No | The paper mentions using the MNIST dataset and distributing data to nodes using Dirichlet distribution but does not specify the train/test/validation splits (e.g., percentages or counts) for the dataset itself. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions using 'gradient descent' as a method but does not specify any software libraries, frameworks, or their version numbers used for implementation. |
| Experiment Setup | Yes | We set the coefficient of L2 regularization to 0 and 0.01 and the number of nodes n to 25. ... We set the number of communications to 2000 for all methods and tuned other hyperparameters, e.g., λ, M, and η, to minimize the norm of the last gradient. See Sec. H for a more detailed setting. ... Table 3. Experimental setups. Note that for Inexact Accelerated SONATA, δ is the coefficient of the additional L2 regularization in the subproblem, and it is different from Definition 1. [Table 3 lists specific η, λ, M, β, T, and δ values/ranges for each algorithm]. |
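The paper reports distributing the MNIST data to n = 25 nodes using a Dirichlet distribution, but the table above notes that no concentration parameter or split procedure is given. The sketch below shows one common way such a class-wise Dirichlet partition is implemented in the decentralized-learning literature; the function name, the choice alpha = 0.1, and the toy labels standing in for MNIST digits are illustrative assumptions, not details from the paper.

```python
import numpy as np

def dirichlet_partition(labels, n_nodes=25, alpha=0.1, seed=0):
    """Split sample indices across nodes with a per-class Dirichlet prior,
    producing heterogeneous (non-IID) local datasets."""
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(n_nodes)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class-c samples assigned to each node.
        props = rng.dirichlet(alpha * np.ones(n_nodes))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for node, part in enumerate(np.split(idx, cuts)):
            node_indices[node].extend(part.tolist())
    return [np.array(ix) for ix in node_indices]

# Toy labels standing in for the 10 MNIST digit classes.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_nodes=25, alpha=0.1)
```

Smaller alpha values concentrate each class on fewer nodes (more heterogeneity); larger values approach a uniform IID split.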