Decentralized Asynchronous Optimization with DADAO allows Decoupling and Acceleration

Authors: Adel Nabli, Edouard Oyallon

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we study the behavior of our method in a standard experimental setting (e.g., see Kovalev et al. (2021a); Even et al. (2021a)). In order to compare to methods using the gradient of the Fenchel conjugate (Even et al., 2021a) in our experiments, we restrict ourselves to a situation where it is easily computable. Thus, we perform the empirical risk minimization for the decentralized linear regression task given by $\sum_{j=1}^{m} (a_{ij}^\top x - c_{ij})^2$ (9), where $a_{ij} \in \mathbb{R}^d$ and $c_{ij} \in \mathbb{R}$ correspond to the m local data points stored at node i. We follow a protocol similar to Kovalev et al. (2021a): we generate n independent synthetic datasets with the make_regression function of scikit-learn (Pedregosa et al., 2011), each worker storing m = 100 data points. We recall that the metrics of interest are the total number of local gradient steps and the total number of individual messages exchanged (i.e., the number of edges that fired) to reach an ϵ-precision. We systematically used the proposed hyper-parameters of each reference paper for our implementation without any specific fine-tuning.
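The local objective in Eq. (9) is a plain least-squares sum, so a node's gradient step is straightforward to write down. A minimal NumPy sketch of one node's objective and gradient follows; the dimensions and random data are our own illustrative choices, not values from the paper (only m = 100 is taken from the protocol):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 100  # feature dimension (illustrative) and m = 100 local points per node

# Hypothetical local data for one node i: rows a_ij in R^d, targets c_ij in R.
A_i = rng.standard_normal((m, d))
c_i = rng.standard_normal(m)

def f_i(x):
    """Local objective sum_{j=1}^m (a_ij^T x - c_ij)^2 from Eq. (9)."""
    r = A_i @ x - c_i
    return r @ r

def grad_f_i(x):
    """Gradient 2 * A_i^T (A_i x - c_i), used for the local gradient steps."""
    return 2.0 * A_i.T @ (A_i @ x - c_i)

# Sanity check: central finite difference along the first coordinate.
x = rng.standard_normal(d)
eps = 1e-6
e0 = np.eye(d)[0]
g = grad_f_i(x)
num = (f_i(x + eps * e0) - f_i(x - eps * e0)) / (2 * eps)
```

Since f_i is quadratic, the central difference `num` agrees with `g[0]` up to floating-point rounding, which makes this a cheap correctness check for any local-step implementation.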
Researcher Affiliation | Academia | 1 Sorbonne University, CNRS, ISIR, Paris, France; 2 Mila, Concordia University, Montréal, Canada
Pseudocode | Yes | We summarize in Alg. 1 the algorithmic block corresponding to our implementation. See Appendix H for more details. ... Algorithm 1: This algorithm block describes our implementation on each local machine. ... Algorithm 2: Pseudo-code of our implementation of DADAO on a single machine.
Open Source Code | Yes | All our experiments are reproducible, using PyTorch (Paszke et al., 2019), our code being online at https://github.com/AdelNabli/DADAO/.
Open Datasets | Yes | Thus, we perform the empirical risk minimization for the decentralized linear regression task given by $\sum_{j=1}^{m} (a_{ij}^\top x - c_{ij})^2$ (9), where $a_{ij} \in \mathbb{R}^d$ and $c_{ij} \in \mathbb{R}$ correspond to the m local data points stored at node i. We follow a protocol similar to Kovalev et al. (2021a): we generate n independent synthetic datasets with the make_regression function of scikit-learn (Pedregosa et al., 2011), each worker storing m = 100 data points.
Dataset Splits | No | The paper states it generates 'n independent synthetic datasets with the make_regression functions of scikit-learn (Pedregosa et al., 2011), each worker storing m = 100 data points.' However, it does not specify any explicit training, validation, or test splits for these datasets.
Hardware Specification | No | The acknowledgment 'This work was granted access to the HPC/AI resources of IDRIS under the allocation AD011013743 made by GENCI' is too general and does not name specific hardware (e.g., GPU or CPU models, memory).
Software Dependencies | No | All our experiments are reproducible, using PyTorch (Paszke et al., 2019), our code being online at https://github.com/AdelNabli/DADAO/. ... we generate n independent synthetic datasets with the make_regression function of scikit-learn (Pedregosa et al., 2011)... While PyTorch and scikit-learn are mentioned, their specific version numbers are not provided.
Experiment Setup | No | 'We systematically used the proposed hyper-parameters of each reference paper for our implementation without any specific fine-tuning.' This indicates which hyper-parameters were used, but the paper does not explicitly list the specific values for the DADAO algorithm in its own text, instead deferring to the reference papers.