Heterogeneous Knowledge for Augmented Modular Reinforcement Learning
Authors: Lorenz Wolf, Mirco Musolesi
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results demonstrate the performance and efficiency improvements, also in terms of generalization, which can be achieved by augmenting traditional modular RL with heterogeneous knowledge sources and processing mechanisms. Finally, we examine the safety, robustness, and interpretability issues stemming from the introduction of knowledge heterogeneity. |
| Researcher Affiliation | Academia | Lorenz Wolf (EMAIL), Department of Computer Science & Centre for Artificial Intelligence, University College London; Mirco Musolesi (EMAIL), Department of Computer Science & Centre for Artificial Intelligence, University College London, and Department of Computer Science and Engineering, University of Bologna |
| Pseudocode | Yes | Algorithm 1 Decision-making with an AMRL agent in discrete action spaces. |
| Open Source Code | Yes | The full implementation and code used for the experiments are publicly available: https://github.com/lorenzflow/amrl. |
| Open Datasets | Yes | We use several environments from the Minigrid suite (Chevalier-Boisvert et al., 2018), each presenting distinct challenges: ... For evaluation in continuous action spaces, we use the Fetch environments (Plappert et al., 2018), a set of manipulation tasks performed with a 7-DoF robot arm from the OpenAI Robotics Gym (de Lazcano et al., 2024). |
| Dataset Splits | No | The paper refers to training durations (e.g., 'trained for 300k frames', '1.5 million frames') for experiments in simulator environments, but does not specify explicit train/test/validation splits for static datasets. The environments themselves do not inherently have such splits described within the paper. |
| Hardware Specification | No | Compute Resources. The experiments were run on a CPU. No large amount of memory is required. |
| Software Dependencies | No | The implementations of all agents with discrete action spaces rely on the rl-starter-files repository and torch_ac. For continuous action spaces, SAC is implemented following standard settings from rl-baselines3-zoo (Raffin, 2020) and stable-baselines3 (Raffin et al., 2021). The paper refers to software packages (e.g., torch_ac, stable-baselines3) and repositories but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | No | For all agents below the PPO hyperparameters are set to the default values provided in the rl-starter-files repository. Default hyperparameter settings are used for both PPO and SAC. The paper states that default hyperparameter settings are used or refers to external repositories for these details, rather than providing specific values within the main text. |
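The pseudocode row above refers to the paper's Algorithm 1 (decision-making with an AMRL agent in discrete action spaces). As a rough illustration of the general idea of augmenting an agent with heterogeneous knowledge sources, the sketch below aggregates action scores from a learned policy and a rule-based heuristic. All names, the action set, and the weighted-mixture aggregation rule are illustrative assumptions, not the paper's actual algorithm or API.

```python
# Hypothetical sketch: combining heterogeneous knowledge sources in a
# modular RL agent with discrete actions. The aggregation rule (weighted
# mixture of per-source action scores) is an assumption for illustration.

ACTIONS = ["left", "right", "forward"]

def learned_policy(obs):
    # Placeholder for a trained policy head; uniform scores here.
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}

def rule_based_source(obs):
    # A heuristic knowledge source, e.g. "prefer moving forward".
    return {a: (0.8 if a == "forward" else 0.1) for a in ACTIONS}

def select_action(obs, sources, weights):
    # Weighted mixture of each source's action scores, then argmax.
    scores = {a: 0.0 for a in ACTIONS}
    for source, w in zip(sources, weights):
        for action, score in source(obs).items():
            scores[action] += w * score
    return max(scores, key=scores.get)

action = select_action(None, [learned_policy, rule_based_source], [0.5, 0.5])
print(action)  # "forward" dominates under these illustrative weights
```

In this toy setup the heuristic source tilts the mixture toward `forward` (0.567 vs. 0.217 for the other actions); the paper's actual agents use learned arbitration mechanisms rather than fixed weights.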