Exploring the potential of Direct Feedback Alignment for Continual Learning

Authors: Sara Folchini, Viplove Arora, Sebastian Goldt

TMLR 2025

Reproducibility Variable — Result — LLM Response
Research Type — Experimental. We train fully-connected networks on several continual learning benchmarks using DFA and compare its performance to vanilla backpropagation, random features, and other continual learning algorithms. We empirically show that DFA is competitive at continual learning with vanilla back-propagation and other baselines, such as random features (RF) and Elastic Weight Consolidation (EWC).
Researcher Affiliation — Academia. Sara Folchini (EMAIL), Viplove Arora (EMAIL), and Sebastian Goldt (EMAIL), International School for Advanced Studies (SISSA), Trieste, Italy.
Pseudocode — No. The paper describes the mathematical equations for the BP and DFA weight updates (e.g., Equations 1-4) but does not present them within a clearly labeled 'Pseudocode' or 'Algorithm' block.
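Although the paper gives the update rules only as equations, the standard DFA rule it builds on (a fixed random feedback matrix per hidden layer carrying the output error straight back, in the style of Nøkland, 2016) can be sketched in a few lines. The layer sizes, learning rate behavior, and synthetic sample below are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-hidden-layer network; the shapes are toy values, not the paper's.
n_in, n_h, n_out, lr = 8, 16, 4, 0.01
W1 = rng.normal(0, np.sqrt(1 / n_in), (n_h, n_in))
W2 = rng.normal(0, np.sqrt(1 / n_h), (n_h, n_h))
W3 = rng.normal(0, np.sqrt(1 / n_h), (n_out, n_h))

# Fixed random feedback matrices: one per hidden layer, projecting the
# output error directly back to that layer. They are never trained.
B1 = rng.normal(0, 0.1, (n_h, n_out))
B2 = rng.normal(0, 0.1, (n_h, n_out))

relu = lambda a: np.maximum(a, 0.0)

def dfa_step(x, y):
    """One DFA update on a single (x, y) pair; y is a one-hot target."""
    global W1, W2, W3
    # Forward pass: ReLU hidden layers, softmax output, cross-entropy loss.
    a1 = W1 @ x;  h1 = relu(a1)
    a2 = W2 @ h1; h2 = relu(a2)
    logits = W3 @ h2
    p = np.exp(logits - logits.max()); p /= p.sum()
    e = p - y                        # output error (softmax + CE gradient)
    # DFA: each hidden layer receives the error through its own fixed B,
    # instead of the transposed forward weights backprop would use.
    d2 = (B2 @ e) * (a2 > 0)
    d1 = (B1 @ e) * (a1 > 0)
    W3 -= lr * np.outer(e, h2)
    W2 -= lr * np.outer(d2, h1)
    W1 -= lr * np.outer(d1, x)
    return -np.log(p[y.argmax()])    # cross-entropy loss on this sample

x = rng.normal(size=n_in)
y = np.eye(n_out)[1]
losses = [dfa_step(x, y) for _ in range(50)]
```

The only difference from backprop here is the backward pass: replacing `B2 @ e` and `B1 @ e` with products of the transposed forward weights recovers the usual gradient.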
Open Source Code — No. The paper does not include any explicit statement about code availability, links to repositories, or mentions of code in supplementary materials for the methodology described.
Open Datasets — Yes. We report results on the Fashion-MNIST (FMNIST) dataset (Xiao et al., 2017), the CIFAR10 dataset (Krizhevsky, 2009), and the MNIST dataset (Deng, 2012).
Dataset Splits — Yes. Split FMNIST (sFMNIST) and split CIFAR10, where we split the original dataset of 10 classes into five smaller datasets with two disjoint classes each. The resulting smaller datasets have very different statistical characteristics, so a model trained sequentially on them needs to incrementally learn new information with dramatically different feature representations.
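The split-task construction described in this row is straightforward to reproduce. Below is a minimal sketch that partitions a 10-class label set into five disjoint 2-class tasks; the function name, toy labels, and consecutive class pairing are assumptions, not the authors' code:

```python
import numpy as np

def make_split_tasks(labels, classes_per_task=2):
    """Partition a multi-class dataset into disjoint class-pair tasks,
    returning (class_pair, example_indices) for each task."""
    classes = np.unique(labels)                    # e.g. 0..9
    assert len(classes) % classes_per_task == 0
    tasks = []
    for i in range(0, len(classes), classes_per_task):
        pair = classes[i:i + classes_per_task]     # e.g. (0, 1), (2, 3), ...
        idx = np.flatnonzero(np.isin(labels, pair))
        tasks.append((tuple(pair), idx))
    return tasks

# Toy stand-in for FMNIST/CIFAR10 labels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
tasks = make_split_tasks(labels)
print(len(tasks))  # prints 5
```

A model is then trained on the five tasks sequentially, which is exactly the regime where statistics differ sharply between tasks and catastrophic forgetting can occur.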
Hardware Specification — No. The paper does not provide specific hardware details such as GPU models, CPU models, or cloud computing specifications used to run the experiments.
Software Dependencies — No. The paper does not explicitly mention software dependencies with specific version numbers, such as programming languages or deep learning frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup — Yes. In our experiments, we use 3-layer fully-connected networks with 1000 neurons in each hidden layer. We train the networks for a maximum of 1000 epochs (the impact of this choice is expanded on in Appendix F) and apply early stopping, halting training as soon as the network exceeds 99% training accuracy. All layers are initialized using the Xavier uniform initialization (Glorot & Bengio, 2010). We choose a logistic activation function in the output layer and ReLU in the other layers. The loss function is cross-entropy. For DFA, we use a learning rate of 0.01 and a feedback-matrix variance optimized in the range between the orders of 1e-8 and 1. For backpropagation ... a Dropout layer (Srivastava et al., 2014) after each layer ... (a rate of 0.2 in the first layer and 0.5 in the other layers, excluding the output layer). We perform a grid-search optimization for the learning rate in the range between 1e-2 and 1e-4. For random features ... we use a learning rate of 1e-2. For elastic weight consolidation (EWC) ... we chose a learning rate of 1e-3 and an importance of 1000; lambda is set to 0.4 by default ...
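The quoted architecture can be condensed into a short sketch. The snippet below mirrors the stated setup (3-layer fully-connected network, 1000 hidden units per layer, Xavier uniform initialization, ReLU hidden layers, logistic output, early stopping at 99% training accuracy or 1000 epochs) with an assumed 784-dimensional input; it is an illustration of the protocol, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_uniform(fan_in, fan_out):
    """Xavier/Glorot uniform init: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_out, fan_in))

# 3-layer fully-connected net with 1000 units per hidden layer, as stated.
# The 784-dimensional input is an assumption (it matches MNIST/FMNIST).
dims = [784, 1000, 1000, 10]
weights = [xavier_uniform(i, o) for i, o in zip(dims[:-1], dims[1:])]

def forward(x):
    """ReLU in the hidden layers, logistic (sigmoid) activation at the output."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return 1.0 / (1.0 + np.exp(-(weights[-1] @ h)))

# Stopping criterion from the quoted protocol: halt at >99% training
# accuracy, or after at most 1000 epochs.
max_epochs, target_acc = 1000, 0.99
```

The training loop itself would then differ only in the backward pass: transposed forward weights for BP, fixed random feedback matrices for DFA, or a frozen first layer for random features.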