DICE: Data Influence Cascade in Decentralized Learning

Authors: Tongtian Zhu, Wenhao Li, Can Wang, Fengxiang He

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "This section presents the experimental results, with implementation details outlined in Appendix D.1. We evaluate the alignment between one-hop DICE-GT (see Definition 2) and its first-order approximation, one-hop DICE-E (see Proposition 1). ... Anomaly Detection: DICE identifies malicious neighbors, referred to as anomalies, by evaluating their proximal influence... Influence Cascade: The topological dependency of DICE-E in our theory reveals the power asymmetries..."
Researcher Affiliation | Academia | Tongtian Zhu, Wenhao Li & Can Wang (Zhejiang University); Fengxiang He (University of Edinburgh)
Pseudocode | Yes | Algorithm 1: Decentralized Learning with Flexible Gossip and Optimization
Require: graph G = (V, E), initial parameters {θ^0_k}_{k ∈ V}, optimizer O_k, number of communication rounds T, and mixing matrix distributions W_t (∀ t ∈ [T])
1: for t = 1 to T do, in parallel for all participants k ∈ V
2:   Local Update:
3:     Sample z^t_k ∼ D_k and update parameters with optimizer O_k: θ^{t+1/2}_k ← O_k(θ^t_k, z^t_k)
4:   Gossip Averaging:
5:     Send θ^{t+1/2}_k to {l | W_{l,k} > 0} and receive θ^{t+1/2}_j from {j | W_{k,j} > 0}
6:     Sample W^t ∼ W_t and perform gossip averaging: θ^{t+1}_k ← Σ_{j ∈ N_in(k)} W^t_{k,j} θ^{t+1/2}_j
7: end for
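The Adapt-Then-Communicate loop of Algorithm 1 can be sketched in plain Python/NumPy. This is a minimal illustration, not the authors' code: the ring topology, the uniform 1/3 mixing weights, the scalar least-squares local losses, and all function names are assumptions made for the demo.

```python
import numpy as np

def ring_mixing_matrix(k: int) -> np.ndarray:
    """Doubly stochastic mixing matrix for a ring: weight 1/3 to self and each neighbor."""
    W = np.zeros((k, k))
    for i in range(k):
        W[i, [i, (i - 1) % k, (i + 1) % k]] = 1.0 / 3.0
    return W

def decentralized_sgd(data, rounds=50, lr=0.1, seed=0):
    """Decentralized SGD (Adapt-Then-Communicate) on a scalar toy problem.

    Node k holds samples z from its local distribution D_k and minimizes
    mean((theta - z)^2) locally, then gossip-averages with its ring neighbors.
    """
    rng = np.random.default_rng(seed)
    k = len(data)
    W = ring_mixing_matrix(k)
    theta = np.zeros(k)  # theta[i]: current parameter of node i
    for _ in range(rounds):
        # Local update ("adapt"): sample a minibatch, take one SGD step.
        half = np.empty(k)  # theta^{t+1/2}
        for i in range(k):
            z = rng.choice(data[i], size=8)
            grad = 2.0 * (theta[i] - z).mean()  # d/dtheta of mean((theta - z)^2)
            half[i] = theta[i] - lr * grad
        # Gossip averaging ("communicate"): theta_k <- sum_j W[k, j] * half[j];
        # entries with W[k, j] = 0 contribute nothing, so this matches the
        # in-neighbor sum of Algorithm 1.
        theta = W @ half
    return theta
```

On this toy problem each node's local minimizer is its own data mean, and gossip averaging pulls all nodes toward a consensus near the global mean; with a constant learning rate they settle in a small neighborhood of it rather than exactly on it.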
Open Source Code | No | "Project page is available at DICE. ... The code will be made publicly available."
Open Datasets | Yes | "We employ the vanilla mini-batch Adapt-Then-Communicate version of Decentralized SGD ((Lopes & Sayed, 2008), see Algorithm 1) with commonly used network topologies (Ying et al., 2021) to train three-layer MLPs (Rumelhart et al., 1986), three-layer CNNs (LeCun et al., 1998), and ResNet-18 (He et al., 2016) on subsets of MNIST (LeCun et al., 1998), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015)."
Dataset Splits | No | "Each node uses a 512-sample subset of CIFAR-10. Models are trained for 5 epochs with a batch size of 128 and a learning rate of 0.1." (from Figure 4 caption). The paper specifies per-node data subsets but not overall train/test/validation splits for the datasets.
Hardware Specification | Yes | "The experiments are conducted on a computing facility equipped with 80 GB NVIDIA A100 GPUs."
Software Dependencies | No | "We employ the vanilla mini-batch Adapt-Then-Communicate version of Decentralized SGD ((Lopes & Sayed, 2008), see Algorithm 1) with commonly used network topologies (Ying et al., 2021) to train three-layer MLPs (Rumelhart et al., 1986), three-layer CNNs (LeCun et al., 1998), and ResNet-18 (He et al., 2016) on subsets of MNIST (LeCun et al., 1998), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015)." The paper names algorithms, models, and datasets but does not specify software libraries or version numbers.
Experiment Setup | Yes | "The number of participants (one GPU as a participant) is set to 16 and 32, with each participant holding 512 samples. For sensitivity analysis, we evaluate the stability of results under hyperparameter adjustments: the local batch size is varied over 16, 64, and 128 per participant, and the learning rate is set to 0.1 and 0.01 without decay."
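The reported sweep amounts to a small configuration grid. A hypothetical sketch in Python (the field names are illustrative; the paper does not publish a configuration file):

```python
from itertools import product

# Grid mirroring the reported setup: 16 or 32 participants,
# batch sizes 16/64/128, learning rates 0.1/0.01, no decay,
# 512 samples held by each participant.
n_participants = [16, 32]
batch_sizes = [16, 64, 128]
learning_rates = [0.1, 0.01]

configs = [
    {"nodes": n, "batch_size": b, "lr": lr, "samples_per_node": 512, "lr_decay": None}
    for n, b, lr in product(n_participants, batch_sizes, learning_rates)
]
print(len(configs))  # 2 * 3 * 2 = 12 runs
```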