Sparsity-Driven Plasticity in Multi-Task Reinforcement Learning
Authors: Aleksandar Todorov, Juan Cardenas-Cartagena, Rafael F. Cunha, Marco Zullich, Matthia Sabatelli
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate these approaches across distinct MTRL architectures (shared backbone, Mixture of Experts, Mixture of Orthogonal Experts) on standardized MTRL benchmarks, comparing against dense baselines and a comprehensive range of alternative plasticity-inducing or regularization methods. Our results demonstrate that both GMP and SET effectively mitigate key indicators of plasticity degradation, such as neuron dormancy and representational collapse. These plasticity improvements often correlate with enhanced multi-task performance, with sparse agents frequently outperforming dense counterparts and achieving competitive results against explicit plasticity interventions. |
| Researcher Affiliation | Academia | Aleksandar Todorov, Juan Cardenas-Cartagena, Rafael F. Cunha, Marco Zullich, Matthia Sabatelli; University of Groningen, Groningen, The Netherlands |
| Pseudocode | No | The paper describes algorithms like Gradual Magnitude Pruning (GMP) and Sparse Evolutionary Training (SET) in text paragraphs (Sections C.4 and C.5) but does not present them as structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The full implementation is available at https://github.com/atodorov284/sparsity_driven_plasticity. |
| Open Datasets | Yes | Environment and Benchmarks: We mostly consider the three multi-task Mini Grid (Chevalier-Boisvert et al., 2023) benchmarks proposed by Hendawy et al. (2024), MT3, MT5, and MT7; the exception is the results presented in Section 4.3, which use the Meta World MT10 benchmark (Yu et al., 2021). |
| Dataset Splits | No | The paper describes using the Mini Grid and Meta World MT10 benchmarks, where tasks are sampled randomly with replacement during training and evaluation runs a fixed number of episodes per task. It does not provide training/validation/test splits of a fixed dataset in the traditional sense. |
| Hardware Specification | No | The paper does not explicitly mention the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. It only refers to the training environment without detailing the computing infrastructure. |
| Software Dependencies | No | The paper mentions using the mushroom_rl library and the Adam optimizer, but it does not specify their exact version numbers. It also refers to the rliable library without a version. |
| Experiment Setup | Yes | Appendix A provides detailed hyperparameters in Table 2 ('Core experimental setup, agent architecture, and algorithm hyperparameters on Mini Grid') and Table 3 ('The hyperparameters and training setup used for MTMH SAC on Meta World MT10'), covering aspects like number of environments, steps per epoch, total timesteps, train frequency, evaluation episodes/frequency, optimizer details (Adam, learning rates), network architecture (Conv2D channels, kernel sizes, activations, hidden sizes), GAE λ, Entropy Term Coefficient, Clipping ε, Epochs for Policy/Critic, Batch Size, Discount Factor, and specific parameters for MoE/MOORE, MTMH SAC, and sparsity methods. |
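Since the paper describes GMP and SET only in prose, the two sparsity methods can be summarized with a generic sketch. The schedule below follows the standard polynomial GMP formulation (Zhu & Gupta, 2017) and the prune-and-regrow step of SET (Mocanu et al., 2018); all function names and the `zeta` parameter are illustrative, not taken from the authors' implementation:

```python
import numpy as np

def gmp_sparsity(step, start_step, end_step, final_sparsity, initial_sparsity=0.0):
    """Polynomial sparsity schedule for Gradual Magnitude Pruning."""
    if step < start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    frac = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - frac) ** 3

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def set_step(weights, mask, zeta=0.3, rng=None):
    """One SET update: drop a fraction zeta of the smallest active
    weights, then regrow the same number at random inactive positions."""
    rng = rng or np.random.default_rng()
    flat_mask = mask.ravel().copy()
    active = np.flatnonzero(flat_mask)
    n_prune = int(zeta * active.size)
    # prune the smallest-magnitude active connections
    mags = np.abs(weights.ravel()[active])
    flat_mask[active[np.argsort(mags)[:n_prune]]] = False
    # regrow at random currently-inactive positions
    inactive = np.flatnonzero(~flat_mask)
    flat_mask[rng.choice(inactive, size=n_prune, replace=False)] = True
    return flat_mask.reshape(mask.shape)
```

Note that the total number of active connections is preserved by `set_step`, which is what keeps the parameter budget constant while the topology evolves during training.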