Federated Automatic Differentiation

Authors: Keith Rush, Zachary Charles, Zachary Garrett

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply the resulting methods to a variety of benchmark FL tasks. We find that the resulting method can exhibit significantly better convergence properties than static FedOpt, and its ability to dynamically adjust the hyperparameters over time plays a crucial role in this behavior. We now present an empirical exploration of federated hypergradient descent as applied to server optimization hyperparameters.
Researcher Affiliation | Industry | Keith Rush (Google Research, Seattle, WA, USA); Zachary Charles (Google Research, Seattle, WA, USA); Zachary Garrett (Google Research, Seattle, WA, USA)
Pseudocode | Yes | Algorithm 1: FedOpt ... Algorithm 2: FedOpt with hypergradient descent.
Open Source Code | Yes | Since the initial version of this work appeared, we have developed an open-source library that implements federated AD in JAX. See (Rush et al., 2024) for details. ... Since the initial version of this work, we have developed a high-performance implementation of federated AD in JAX; see (Rush et al., 2024).
Open Datasets | Yes | We used four data sets: CIFAR-100 (Krizhevsky, 2009), EMNIST (Cohen et al., 2017), Shakespeare (McMahan et al., 2017), and Stack Overflow (Authors, 2019). ... We use the Synthetic(α, β) task proposed by Li et al. (2020).
Dataset Splits | Yes | Throughout, we compare the accuracy of the learned model on the test split of each of the data sets above. ... We sample 50 clients uniformly at random at each communication round. ... We focused on sampling clients for both purposes from a single population.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. It mentions federated systems generally, but no particular GPU/CPU models or infrastructure.
Software Dependencies | No | The development of ML-focused AD frameworks, such as JAX (Bradbury et al., 2018), TensorFlow (Abadi et al., 2016), and PyTorch (Paszke et al., 2019), has accelerated this... ... we have developed an open-source library that implements federated AD in JAX. ... We instantiated these parameterizations with hand-differentiated pairs of functions representing server optimizer updates implemented in TensorFlow. No specific version numbers for these software components are provided.
Experiment Setup | Yes | We use E = 1 epochs of mini-batch SGD in ClientUpdate throughout, and use the same batch sizes for each task as in (Charles et al., 2021). We set the per-client weights pk to the number of examples in each client's data set (example-weighting). We sample 50 clients uniformly at random at each communication round. ... For random initialization, learning rates were chosen via a log-uniform distribution from the range (10^-3, 10). Momentum values were chosen uniformly from the range (0, 1). For default server optimizer settings, we used a learning rate of 1.0 and a momentum value of 0.9. ... Throughout, our hypergradient descent optimizer is SGD with a learning rate of 0.01. ... Last, we use Adam (Kingma and Ba, 2015) in our hypergradient descent step (rather than SGD), with learning rate 0.01.
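The FedOpt round quoted in the pseudocode row (Algorithm 1, with example-weighted client deltas and a server optimizer) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the least-squares client loss, SGD as the server optimizer, and the names `client_update` and `fedopt_round` are all assumptions made for the sketch.

```python
import numpy as np

def client_update(model, data, lr=0.1, epochs=1):
    """One client's local mini-batch SGD (E = 1 epoch, as in the setup).

    A least-squares loss stands in for the paper's benchmark tasks.
    Returns the model delta that the client reports to the server.
    """
    X, y = data
    w = model.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w - model

def fedopt_round(model, clients, server_lr=1.0):
    """One FedOpt round with SGD as the server optimizer.

    Client deltas are averaged with example weights p_k = |D_k| and
    applied as a pseudo-gradient step with the server learning rate.
    """
    weights = np.array([len(y) for _, y in clients], dtype=float)
    deltas = np.stack([client_update(model, c) for c in clients])
    avg_delta = (weights[:, None] * deltas).sum(axis=0) / weights.sum()
    return model + server_lr * avg_delta
```

With the default server learning rate of 1.0 this reduces to federated averaging of the one-epoch local models; other server optimizers (e.g. momentum 0.9, as in the default settings) would replace the plain scaled update in the last line.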
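The hypergradient-descent variant (Algorithm 2, with SGD at learning rate 0.01 on a server hyperparameter, as in the experiment setup) can be sketched similarly. The paper computes the hypergradient exactly via federated AD; this self-contained sketch substitutes a central finite difference purely for illustration, and the least-squares clients and the names `round_loss` and `hypergradient_step` are likewise assumptions.

```python
import numpy as np

def round_loss(model, eta, clients, client_lr=0.1):
    """Post-round loss as a function of the server learning rate eta.

    Each client takes one local SGD step on a least-squares loss (an
    illustrative stand-in for the paper's benchmark tasks); the server
    applies the example-weighted average delta scaled by eta.
    """
    deltas, weights = [], []
    for X, y in clients:
        w = model - client_lr * X.T @ (X @ model - y) / len(y)
        deltas.append(w - model)
        weights.append(float(len(y)))
    weights = np.array(weights)
    avg = (weights[:, None] * np.stack(deltas)).sum(axis=0) / weights.sum()
    new_model = model + eta * avg
    return sum(float(np.mean((X @ new_model - y) ** 2)) for X, y in clients)

def hypergradient_step(model, eta, clients, hyper_lr=0.01, eps=1e-4):
    """One hypergradient-descent step on eta (SGD with lr 0.01).

    The derivative of the round loss with respect to eta is estimated
    by central finite differences here; federated AD would compute it
    exactly by differentiating through the round.
    """
    g = (round_loss(model, eta + eps, clients)
         - round_loss(model, eta - eps, clients)) / (2 * eps)
    return eta - hyper_lr * g
```

Repeating this step each communication round is what lets the server learning rate adjust dynamically over training, which the assessment above identifies as the key driver of the improved convergence over static FedOpt.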