Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset

Authors: Alexandre Galashov, Michalis Titsias, András György, Clare Lyle, Razvan Pascanu, Yee Whye Teh, Maneesh Sahani

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings." |
| Researcher Affiliation | Collaboration | Alexandre Galashov (Gatsby Unit, UCL; Google DeepMind), Michalis K. Titsias (Google DeepMind), András György (Google DeepMind), Clare Lyle (Google DeepMind), Razvan Pascanu (Google DeepMind), Yee Whye Teh (Google DeepMind; University of Oxford), Maneesh Sahani (Gatsby Unit, UCL) |
| Pseudocode | Yes | "Algorithm 1: Soft-Reset algorithm" (a hedged sketch of a soft reset step follows the table) |
| Open Source Code | No | "Unfortunately, due to IP constraints, we cannot release the code for the paper." |
| Open Datasets | Yes | "subset of 10000 images from either CIFAR-10 [32] or MNIST" and the Hopper-v5 and Humanoid-v4 GYM [6] environments (a subset-loading sketch follows the table) |
| Dataset Splits | No | The paper defines metrics such as 'average per-task online accuracy' (Section 5, H.1), which evaluate performance during training, and it describes training regimes (e.g., '400 epochs on a task with a batch size of 128'), but it does not specify a separate validation split (e.g., '10% of data used for validation'). |
| Hardware Specification | Yes | "For each experiment, we used 3 hours on an A100 GPU with 40 GB of memory." |
| Software Dependencies | No | "We ran a SAC [19] agent with default parameters from Brax [15] on the Hopper-v5 and Humanoid-v4 GYM [6] environments." No version numbers for Brax, GYM, SAC, Python, or other libraries are given. |
| Experiment Setup | Yes | "For all the experiments, we run a sweep over the hyperparameters. We select the best hyperparameters based on the smallest cumulative error (the sum of the per-step errors over the whole training run). We then report the mean and the standard deviation across 3 seeds in all the plots. Hyperparameter ranges: the learning rate α used to update parameters is selected, for all methods, from {1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1.0}. The λ_init parameter in L2 Init is selected from {10.0, 1.0, 0.0, 1e-1, ...}. For S&P, the shrink parameter λ is selected from {1.0, 0.99999, ...}, and the perturbation parameter σ from {1e-1, ...}. For Soft Resets, the learning rate for γ_t is selected from {0.5, 0.1, ...}, the constant s from {1.0, 0.95, ...}, and the temperature λ in Eq. (45) from {1.0, 0.1, 0.01}..." (a sweep sketch follows the table) |
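
Since the authors cannot release the code, a minimal sketch of one soft parameter reset step may help orient readers. It assumes the reset interpolates between the current parameters and the initialization, with the mixing weight `gamma` playing the role of the paper's γ_t; the adaptive update of γ_t in Algorithm 1 is omitted, and the names here (`soft_reset`, `sigma0`) are illustrative, not the paper's API.

```python
import numpy as np

def soft_reset(params, init_params, gamma, sigma0=0.01, rng=None):
    """Hypothetical soft parameter reset step.

    gamma = 1 keeps the current parameters unchanged; gamma = 0 is a
    hard reset to a draw from the initialization distribution. The
    noise scale shrinks as gamma -> 1, so the update interpolates
    between "keep" and "reinitialize" rather than just adding noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = sigma0 * np.sqrt(1.0 - gamma**2) * rng.standard_normal(params.shape)
    return gamma * params + (1.0 - gamma) * init_params + noise

# Example: a strong soft reset (gamma = 0.2) pulls drifted weights
# most of the way back toward their initial values.
theta0 = np.zeros(4)                      # initial parameters
theta = np.array([2.0, -1.5, 0.7, 3.1])   # drifted parameters
print(soft_reset(theta, theta0, gamma=0.2))
```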
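The Open Datasets row mentions a 10,000-image subset of CIFAR-10 or MNIST, but the paper does not state how the subset was drawn. A fixed-seed random choice, as below, is one reproducible assumption; `torchvision` is used here purely for convenience and is not named in the paper.

```python
import numpy as np
from torchvision import datasets

# Hypothetical reconstruction of the 10,000-image subset (CIFAR-10
# shown; MNIST is analogous). The sampling scheme is an assumption:
# the paper does not specify how its subset was selected.
train = datasets.CIFAR10(root="./data", train=True, download=True)
rng = np.random.default_rng(0)            # fixed seed for reproducibility
idx = rng.choice(len(train), size=10_000, replace=False)
images = train.data[idx]                  # (10000, 32, 32, 3) uint8
labels = np.asarray(train.targets)[idx]   # (10000,) int labels
```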
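The Experiment Setup row describes a grid sweep in which the best hyperparameters are chosen by smallest cumulative error and results are reported as mean ± standard deviation over 3 seeds. The sketch below shows that selection logic only; `run_experiment` is a dummy stand-in for an actual training run, not anything from the paper.

```python
import numpy as np

def run_experiment(lr, seed):
    """Dummy stand-in for one training run: a real run would train
    the network and return its per-step errors."""
    rng = np.random.default_rng(seed)
    return lr * rng.random(1000)  # placeholder per-step errors

learning_rates = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1.0]
seeds = [0, 1, 2]

best = None
for lr in learning_rates:
    # Cumulative error per seed, then averaged across seeds: the
    # selection criterion described in the Experiment Setup row.
    cum_errors = [float(np.sum(run_experiment(lr, s))) for s in seeds]
    mean, std = np.mean(cum_errors), np.std(cum_errors)
    if best is None or mean < best[0]:
        best = (mean, std, lr)

mean, std, lr = best
print(f"best lr = {lr}: cumulative error {mean:.3f} +/- {std:.3f} (3 seeds)")
```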