Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization

Authors: Yuki Takezawa, Xiaowen Jiang, Anton Rodomanov, Sebastian U. Stich

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that SPDO significantly outperforms existing methods." ... "7. Numerical Evaluation. Experimental Setup: We used MNIST (Lecun et al., 1998) and logistic loss with L2 regularization." ... "Results: Fig. 1(a) indicates that Accelerated-SPDO can achieve the best communication and computational complexities."
Researcher Affiliation | Academia | 1 Kyoto University, 2 OIST, 3 CISPA Helmholtz Center for Information Security, 4 Saarland University. Correspondence to: Yuki Takezawa <EMAIL>.
Pseudocode | Yes | "We show the pseudo-code in Alg. 1." Algorithms: Alg. 1 Proximal Decentralized Optimization Method (PDO); Alg. 2 Multiple Gossip Averaging; Alg. 3 Stabilized Proximal Decentralized Optimization Method (SPDO); Alg. 4 Accelerated Stabilized Proximal Decentralized Optimization Method (Accelerated-SPDO); Alg. 5 Fast Gossip Averaging.
Open Source Code | No | The paper contains no explicit statement about releasing source code and provides no link to a code repository.
Open Datasets | Yes | "Experimental Setup: We used MNIST (Lecun et al., 1998) and logistic loss with L2 regularization."
Dataset Splits | No | The paper mentions distributing the MNIST data across nodes via a Dirichlet distribution but does not specify train/test/validation splits (e.g., percentages or sample counts).
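The Dirichlet-based non-IID partitioning mentioned above is a standard scheme in decentralized learning: for each class, a Dirichlet(α) draw decides what fraction of that class each node receives. A minimal sketch follows; the function name, concentration parameter `alpha`, and toy labels are illustrative assumptions, not details from the paper.

```python
import numpy as np

def dirichlet_partition(labels, n_nodes, alpha, seed=0):
    """Split sample indices across nodes, skewing per-node class
    proportions by a Dirichlet(alpha) draw (smaller alpha = more skew)."""
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(n_nodes)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of this class assigned to each node
        props = rng.dirichlet(alpha * np.ones(n_nodes))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for node, part in enumerate(np.split(idx, cuts)):
            node_indices[node].extend(part.tolist())
    return node_indices

# Example: 25 nodes as in the paper's setup, with toy labels
labels = np.repeat(np.arange(10), 100)   # 1000 samples, 10 classes
parts = dirichlet_partition(labels, n_nodes=25, alpha=0.1)
```

With small `alpha` (e.g., 0.1) each node ends up dominated by a few classes, which is how heterogeneous data is typically simulated on MNIST.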
Hardware Specification | No | The paper does not report the hardware used for the experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions using gradient descent but does not name any software libraries, frameworks, or version numbers used for the implementation.
Experiment Setup | Yes | "We set the coefficient of L2 regularization to 0 and 0.01 and the number of nodes n to 25." ... "We set the number of communications to 2000 for all methods and tuned other hyperparameters, e.g., λ, M, and η, to minimize the norm of the last gradient. See Sec. H for a more detailed setting." ... Table 3 ("Experimental setups") lists specific η, λ, M, β, T, and δ values/ranges for each algorithm; for Inexact Accelerated SONATA, δ is the coefficient of the additional L2 regularization in the subproblem, distinct from Definition 1.
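The objective described above, logistic loss with L2 regularization minimized to drive the gradient norm down, can be sketched as follows. This is a toy binary problem standing in for the paper's MNIST setup; the function names, step size, iteration count, and synthetic data are assumptions for illustration only.

```python
import numpy as np

def logistic_grad(w, X, y, reg):
    """Gradient of the L2-regularized logistic loss
    (1/n) * sum_i log(1 + exp(-y_i x_i^T w)) + (reg/2) ||w||^2,  y_i in {-1, +1}."""
    z = y * (X @ w)
    s = 0.5 * (1.0 - np.tanh(z / 2.0))   # sigmoid(-z), numerically stable
    return -(X.T @ (y * s)) / len(y) + reg * w

def gradient_descent(X, y, reg=0.01, eta=2.0, steps=1000):
    """Plain gradient descent; step size and iteration count are illustrative."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= eta * logistic_grad(w, X, y, reg)
    return w

# Synthetic linearly generated labels (not the paper's data)
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X @ rng.standard_normal(5))
w = gradient_descent(X, y, reg=0.01)
grad_norm = np.linalg.norm(logistic_grad(w, X, y, 0.01))
```

With reg = 0.01 (one of the two coefficients reported in the paper) the objective is strongly convex, so the final gradient norm, the tuning criterion quoted above, shrinks linearly with the number of steps.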