Accelerated training through iterative gradient propagation along the residual path

Authors: Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through an extensive empirical study on a large selection of tasks and models, we evaluate Highway-BP and show that major speedups can be achieved with minimal performance degradation."
Researcher Affiliation | Collaboration | Erwan Fagnou (1), Paul Caillon (1), Blaise Delattre (1,2) & Alexandre Allauzen (1,3); (1) Miles Team, LAMSADE, Université Paris Dauphine-PSL, Paris, France; (2) Foxstream, Vaulx-en-Velin, France; (3) ESPCI PSL, Paris, France
Pseudocode | Yes | "We use Hillis and Steele's parallel algorithm (Hillis & Steele, 1986) in our experiments, and we indicate a pseudocode of this algorithm adapted to our needs in Appendix B (Algorithm 1)." ... Appendix B: "Pseudocode of the Parallel Prefix Scan Algorithm for CumSumProd", Algorithm 1 (Parallel CumSumProd).
Open Source Code | No | The paper does not state that code is publicly available, and it provides no repository link. It mentions "We leave its practical implementation for training large models in a distributed setting for future work." and "Still, we believe the prefix scan algorithm could be much more optimized, using a custom CUDA kernel for instance.", indicating future work rather than a current release.
Open Datasets | Yes | CIFAR10: "The CIFAR10 (Krizhevsky, 2009) dataset contains 50k images with 10 classes." CIFAR10 pixel-level: "In Long Range Arena (Tay et al., 2021), CIFAR10 images are flattened as sequences of 3-dimensional vectors." ImageNet32: "The ImageNet32 (Chrabaszcz et al., 2017) dataset contains 1.3M images with 1000 classes." Wikitext103: "Wikitext103 is a dataset containing texts extracted from Wikipedia." MNLI: "The Multi-Genre Natural Language Inference dataset (Williams et al., 2018) is a task from the GLUE benchmark (Wang et al., 2018)."
Dataset Splits | No | The paper uses standard datasets (CIFAR10, ImageNet32, Wikitext103, MNLI) that come with predefined splits, but it does not state the specific training/validation/test split percentages or sample counts it used. For example, it says ImageNet32 is processed "the same way as CIFAR10", which implies standard splits without specifying them.
Hardware Specification | Yes | "All experiments were conducted on single GPUs, either Nvidia A100, A40, or RTX A6000."
Software Dependencies | No | The paper mentions the Adam optimizer (Kingma & Ba, 2014), its AdamW variant (Loshchilov & Hutter, 2017), and a cosine learning rate scheduler, but it does not specify versions of the software frameworks (e.g., PyTorch, TensorFlow) or of the Python libraries used.
Experiment Setup | Yes | Table 4: "Hyperparameters used in the deep models experiments." Table 5: "Hyperparameters used in the RNN experiments." "Most models are trained using the Adam optimizer (Kingma & Ba, 2014). In case of weight decay, we use the AdamW variation (Loshchilov & Hutter, 2017). We also use a cosine learning rate scheduler to decrease the learning rate to a tenth of its initial value. Additionally, the first 10% of the training is performed with a linear warmup."
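The Pseudocode row cites a Hillis-and-Steele-style parallel prefix scan for a "CumSumProd" operation, i.e. the linear recurrence y[t] = a[t]*y[t-1] + b[t] that arises when accumulating gradients along a residual path. The sketch below is our reconstruction under that assumption, not the paper's Appendix B pseudocode: the function name, NumPy layout, and affine-map combination rule are ours.

```python
import numpy as np

def parallel_cumsumprod(a, b):
    """Hillis-Steele inclusive scan for y[t] = a[t]*y[t-1] + b[t], y[-1] = 0.

    Each element is an affine map x -> a*x + b; composing the map at i with
    the map at i - 2**d gives (a_i * a_{i-d}, a_i * b_{i-d} + b_i).
    After ceil(log2(n)) doubling steps, b[t] holds y[t].
    """
    a = np.asarray(a, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    n = len(a)
    d = 1
    while d < n:
        # Snapshot the left neighbours before overwriting them.
        a_prev = a[:-d].copy()
        b_prev = b[:-d].copy()
        # Compose: update b first, since it uses the not-yet-updated a.
        b[d:] = a[d:] * b_prev + b[d:]
        a[d:] = a[d:] * a_prev
        d *= 2
    return b
```

On parallel hardware every doubling step has O(1) depth, so the scan replaces the O(n) sequential recurrence with O(log n) steps, which is the source of the backpropagation speedup the row describes.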
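The Experiment Setup row describes a cosine learning rate schedule that decays to a tenth of the initial value, with a linear warmup over the first 10% of training. A minimal sketch of that schedule follows; the function name and default arguments are our assumptions, and the actual per-experiment hyperparameters are in Tables 4 and 5.

```python
import math

def lr_at(step, total_steps, peak_lr, warmup_frac=0.1, final_ratio=0.1):
    """Linear warmup for the first `warmup_frac` of training, then cosine
    decay from `peak_lr` down to `final_ratio * peak_lr`."""
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        # Linear ramp from ~0 up to the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    # Progress through the cosine phase, in [0, 1].
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    min_lr = final_ratio * peak_lr
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * t))
```

For example, with total_steps=1000 and peak_lr=1e-3, the rate ramps linearly over the first 100 steps, peaks at 1e-3, and ends at 1e-4.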