Novelty Detection in Reinforcement Learning with World Models
Authors: Geigh Zollicoffer, Kenneth Eaton, Jonathan C Balloch, Julia Kim, Wei Zhou, Robert Wright, Mark Riedl
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method by injecting novelties into MiniGrid (Chevalier-Boisvert et al., 2018), Atari (Machado et al., 2018), and continuous DeepMind Control (DMC) (Tunyasuvunakool et al., 2020) environments. Specifically, we use the NovGrid (Balloch et al., 2022), HackAtari (Delfosse et al., 2024), and Real-World RL Suite (Dulac-Arnold et al., 2020) that provide novelties to their respective base environments. |
| Researcher Affiliation | Academia | 1Department of Mathematics, Georgia Institute of Technology, Atlanta, United States of America 2Georgia Tech Research Institute, Atlanta, United States of America 3Department of Computer Science, Georgia Institute of Technology, Atlanta, United States of America. Correspondence to: Mark Riedl <EMAIL>, Geigh Zollicoffer <EMAIL>. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are present. The methodology is described using equations and narrative text. |
| Open Source Code | No | The paper cites the GitHub repository of a third-party framework (DreamerV2) in Table 9, but it provides no statement or link indicating that the authors' own novelty detection code has been open-sourced. |
| Open Datasets | Yes | We evaluate our method by injecting novelties into MiniGrid (Chevalier-Boisvert et al., 2018), Atari (Machado et al., 2018), and continuous DeepMind Control (DMC) (Tunyasuvunakool et al., 2020) environments. |
| Dataset Splits | No | The paper describes the evaluation protocol where agents are trained in nominal environments and then tested in novel environments for specific numbers of episodes or steps (e.g., 'capturing 300 independent and identically distributed episodes' or 'capturing 50,000 steps within various independent and identically distributed episodes'). However, it does not provide specific training, validation, or test dataset splits in terms of percentages, sample counts, or predefined files for the underlying datasets themselves. |
| Hardware Specification | Yes | All experiments can be sufficiently reproduced utilizing an NVIDIA GeForce GTX 1080 GPU with at least 8 GB of VRAM for environment complexity, an AMD Ryzen 5 5600X 6-Core Processor, and at least 50 MB for files, excluding training data, which is dependent on environment and model hyper-parameters. |
| Software Dependencies | No | The paper mentions using 'DreamerV2' as a world model framework and references its GitHub repository for default training parameters, but it does not provide specific version numbers for DreamerV2 itself or any other software dependencies such as programming languages (e.g., Python), deep learning libraries (e.g., PyTorch, TensorFlow), or CUDA. |
| Experiment Setup | Yes | Appendix F.1 provides a table titled 'Dreamer World Model Training Parameters' which lists various hyperparameters such as 'Dataset size (FIFO)', 'Batch size B', 'Sequence length L', 'Discrete latent dimensions', 'KL loss scale β', 'World model learning rate', 'Imagination horizon H', 'Discount γ', 'Actor learning rate', and 'Critic learning rate' along with their specific numerical values. |
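For concreteness, the hyperparameters named in the paper's Appendix F.1 table could be collected into a single config structure for a reproduction attempt. The sketch below is purely illustrative: the key names mirror the parameter names quoted above, but every numeric value is a placeholder, not the paper's reported setting.

```python
# Sketch of a DreamerV2-style world-model config keyed by the parameter
# names listed in the paper's Appendix F.1. All numeric values below are
# PLACEHOLDERS for illustration only, not the paper's actual settings.
dreamer_config = {
    "dataset_size_fifo": 2_000_000,   # placeholder
    "batch_size_B": 16,               # placeholder
    "sequence_length_L": 50,          # placeholder
    "discrete_latent_dims": 32,       # placeholder
    "kl_loss_scale_beta": 1.0,        # placeholder
    "world_model_lr": 3e-4,           # placeholder
    "imagination_horizon_H": 15,      # placeholder
    "discount_gamma": 0.99,           # placeholder
    "actor_lr": 8e-5,                 # placeholder
    "critic_lr": 8e-5,                # placeholder
}

def validate_config(cfg: dict) -> None:
    """Raise ValueError if any expected hyperparameter is missing or non-positive."""
    required = {
        "dataset_size_fifo", "batch_size_B", "sequence_length_L",
        "discrete_latent_dims", "kl_loss_scale_beta", "world_model_lr",
        "imagination_horizon_H", "discount_gamma", "actor_lr", "critic_lr",
    }
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"missing hyperparameters: {sorted(missing)}")
    for key, value in cfg.items():
        if not value > 0:
            raise ValueError(f"{key} must be positive, got {value}")

validate_config(dreamer_config)
```

A validation step like this catches an incomplete or mistyped config before training starts, which matters for reproduction runs where the parameter list comes from a paper appendix rather than shipped code.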