Optimal Distributed Training With Co-Adaptive Data Parallelism in Heterogeneous Environments

Authors: Lifang Chen, Zhichao Chen, Liqi Yan, Yanyu Cheng, Fangli Guan, Pan Li

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the ImageNet100 dataset demonstrate that C-ADP achieves fast convergence in heterogeneous distributed training environments. Compared to Distributed Data Parallel (DDP) and DeepSpeed, C-ADP achieves 21.6x and 26.3x improvements in FLOPS, respectively, and reduces training time by about 72% and 47%, respectively.
Researcher Affiliation | Academia | Lifang Chen, Zhichao Chen, Liqi Yan, Yanyu Cheng, Fangli Guan and Pan Li, Hangzhou Dianzi University. EMAIL, chenzhichao EMAIL, EMAIL, EMAIL, EMAIL, EMAIL. Corresponding authors: Pan Li, Liqi Yan.
Pseudocode | Yes | Algorithm 1: Our Data Parallel Scheduling Algorithm
Input: S, n, ρ, µ^(0), µ_decay, λ^(0), ε, η^(0), η_decay, ψ, ζ
Output: x̂
1:  Let x ← (S/n)·1, k ← 0, µ ← µ^(0), λ ← λ^(0), η ← η^(0)
2:  while k < ψ do
3:      i ← 0
4:      while i < ζ do
5:          Compute gradient ∇_x L_ρ(x^(k,i), λ^(k), µ^(k))
6:          Update x^(k,i+1) ← x^(k,i) − η^(k) ∇_x L_ρ(x^(k,i), λ^(k), µ^(k))
7:          if ‖∇L‖_2 ≤ ε then
8:              break
9:          end if
10:         i ← i + 1
11:     end while
12:     Update λ^(k+1) ← λ^(k) + ρ(Σ_j x_j^(k+1,0) − S)
13:     Update µ^(k+1) ← µ_decay · µ^(k)
14:     if ‖∇L‖_2 ≤ ε then
15:         break
16:     end if
17:     k ← k + 1
18: end while
19: x* ← x
20: x̂ ← Round(x*)
21: return x̂
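The pseudocode above follows a standard augmented-Lagrangian pattern: an inner gradient-descent loop on x, an outer loop that updates the multiplier λ against the constraint Σ_j x_j = S and decays the penalty weight µ, and a final rounding step. A minimal runnable sketch of that control flow is below. The objective L_ρ itself is not reproduced in this excerpt, so a placeholder quadratic cost is used; the function name `schedule_data_parallel` and the cost term are illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

def schedule_data_parallel(S, n, rho=1.0, mu0=1.0, mu_decay=0.9,
                           lam0=0.0, eps=1e-4, eta0=0.01, eta_decay=0.99,
                           psi=100, zeta=50):
    """Sketch of Algorithm 1's loop structure with a placeholder
    quadratic objective (assumption): L_rho = mu*||x||^2 + lam*(sum(x)-S)
    + (rho/2)*(sum(x)-S)^2, enforcing the total-workload constraint."""
    x = np.full(n, S / n)            # step 1: uniform initial split
    lam, mu, eta = lam0, mu0, eta0

    def grad_L(x, lam, mu):
        # Gradient of the placeholder augmented Lagrangian above.
        return 2.0 * mu * x + lam + rho * (x.sum() - S)

    for k in range(psi):                       # outer loop (<= psi iters)
        for i in range(zeta):                  # inner descent (<= zeta iters)
            g = grad_L(x, lam, mu)
            if np.linalg.norm(g) <= eps:       # ||grad L||_2 <= eps
                break
            x = x - eta * g                    # gradient step on x
        lam = lam + rho * (x.sum() - S)        # dual (multiplier) update
        mu = mu_decay * mu                     # decay penalty weight
        eta = eta_decay * eta                  # decay step size
        if np.linalg.norm(grad_L(x, lam, mu)) <= eps:
            break
    return np.round(x).astype(int)             # final step: Round(x*)
```

With a symmetric cost like this, the sketch recovers an even split (e.g. S=100 samples over n=4 workers yields 25 each); the paper's real L_ρ would weight workers by their heterogeneous throughput.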
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository for the described methodology.
Open Datasets | Yes | The experiments were carried out using the ImageNet100 dataset, a subset of the ImageNet dataset.
Dataset Splits | Yes | ImageNet100 consists of 100 classes, with 100,000 training samples and 10,000 test samples, making it a balanced and diverse dataset for evaluating model performance.
Hardware Specification | Yes | Ranks 1-4 use RTX 3090 GPUs with 24GB VRAM. Ranks 5-6 are 4-core CPUs with 8GB memory. Rank 7 is a 16-core CPU with 32GB memory, and rank 8 is an 8-core CPU with 16GB memory.
Software Dependencies | No | The paper mentions PyTorch as the framework used by DDP but does not specify a version number for PyTorch or any other software dependencies used in its own implementation.
Experiment Setup | Yes | For all GPUs, the batch size is set to 64 to ensure stable training within the GPU's memory capacity given the model's high computational complexity. For all CPUs, the batch size is set to 4 in light of their limited computing power. All settings are designed to keep operations within each device's memory and computational limits. The maximum number of epochs is 50, and the initial learning rate is 0.01; to improve convergence, the learning rate is halved after every 10 epochs.
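The learning-rate policy described above (initial rate 0.01, halved every 10 epochs) is a plain step schedule and can be sketched as below; the function is illustrative, not taken from the paper. In PyTorch the same policy corresponds to `torch.optim.lr_scheduler.StepLR` with `step_size=10`, `gamma=0.5`.

```python
def learning_rate(epoch, lr0=0.01, drop_every=10, factor=0.5):
    """Step schedule matching the reported setup: start at lr0 and
    multiply by `factor` after every `drop_every` epochs (0-indexed)."""
    return lr0 * factor ** (epoch // drop_every)
```

For example, epochs 0-9 train at 0.01, epochs 10-19 at 0.005, and so on down through the 50-epoch budget.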