CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization
Authors: Kilian Pfeiffer, Martin Rapp, Ramin Khalili, Joerg Henkel
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Partial quantization of NN models results in hardware-specific gains in execution time and memory. Hence, our evaluation follows a hybrid approach, where we profile on-device training loops on real hardware and take the profiling information to perform simulations of distributed systems. This allows for the evaluation of large systems with hundreds or thousands of devices. |
| Researcher Affiliation | Collaboration | Kilian Pfeiffer (Karlsruhe Institute of Technology), Martin Rapp (Karlsruhe Institute of Technology), Ramin Khalili (Huawei Research Center Munich), Jörg Henkel (Karlsruhe Institute of Technology) |
| Pseudocode | Yes | Algorithm 1 Each Selected Device c (Client) in Each Round ... Algorithm 2 FL Server (Synchronization and Aggregation) |
| Open Source Code | Yes | The code is available at https://github.com/k1l1/CoCoFL. |
| Open Datasets | Yes | For each experiment, we distribute the data from the datasets CIFAR10/100 (Krizhevsky & Hinton, 2009), FEMNIST (Cohen et al., 2017), CINIC10 (Darlow et al., 2018), XChest (Wang et al., 2017), IMDB (Maas et al., 2011), and Shakespeare (Caldas et al., 2019) to devices in C. |
| Dataset Splits | No | The paper describes how data is distributed among clients (iid, non-iid, rc-non-iid) and how devices are grouped by capability (strong, medium, weak: e.g., data is 'randomly distributed to all devices' for iid, the 'number of samples per class varies between devices' for non-iid, 'medium devices has 2/3 of the computational and memory resources', 'weak devices has 1/3'). However, it does not explicitly provide train/validation/test splits for the datasets themselves, file names or URLs for custom splits, or details on stratified splitting or cross-validation, beyond how data is partitioned across federated learning clients. |
| Hardware Specification | Yes | We employ two different hardware platforms to factor out potential microarchitecture-dependent peculiarities w.r.t. quantization or freezing: x64 AMD Ryzen 7 and a Raspberry Pi with an ARMv8 CPU. |
| Software Dependencies | Yes | We implement the presented training scheme in PyTorch 1.10 (Paszke et al., 2019), which supports int8 quantization. |
| Experiment Setup | Yes | We train with the optimizer SGD with an initial learning rate η of 0.1. For a fair comparison we do not use momentum, as FjORD is incompatible with a stateful optimizer. The remaining NN-specific hyperparameters, learning rate decay, and weight decay are given in Table 1. ... A mini-batch size of 32 is used for all experiments. |
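The Pseudocode row references Algorithm 2, the server-side synchronization and aggregation step under partial freezing: each client uploads parameters only for the layers it actually trained, and the server averages each layer over the contributing clients. The following is a minimal pure-Python sketch of that idea, not the paper's implementation; the function name `aggregate`, the dict-of-lists parameter layout, and the unweighted mean (the paper may weight clients, e.g., by sample count) are all assumptions for illustration.

```python
def aggregate(global_params, client_uploads):
    """Average each layer over the clients that submitted an update for it.

    global_params: dict layer_name -> list[float], the current global model.
    client_uploads: list of dicts layer_name -> list[float]; each dict contains
        only the layers that client trained (its frozen layers are omitted).
    Layers no client trained keep their previous global values.
    """
    new_params = {}
    for layer, values in global_params.items():
        updates = [u[layer] for u in client_uploads if layer in u]
        if not updates:  # layer was frozen on every selected client
            new_params[layer] = list(values)
            continue
        # Element-wise (unweighted) mean over the contributing clients.
        new_params[layer] = [sum(col) / len(updates) for col in zip(*updates)]
    return new_params


if __name__ == "__main__":
    global_model = {"conv1": [0.0, 0.0], "fc": [1.0, 1.0]}
    uploads = [
        {"conv1": [1.0, 2.0], "fc": [3.0, 3.0]},  # strong device: trained everything
        {"fc": [5.0, 1.0]},                        # weak device: conv1 stayed frozen
    ]
    print(aggregate(global_model, uploads))
    # → {'conv1': [1.0, 2.0], 'fc': [4.0, 2.0]}
```

Averaging per layer (rather than over whole models) is what lets heterogeneous devices contribute: a weak device that freezes its early layers still improves the layers it can afford to train.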