Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations

Authors: Yupei Yang, Biwei Huang, Fan Feng, Xinyue Wang, Shikui Tu, Lei Xu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the generalization capability of CSR on a number of simulated and well-established datasets, including the CartPole, CoinRun and Atari environments, with detailed descriptions provided in Appendix D.3. For all these benchmarks, we evaluate the POMDP case, where the inputs are high-dimensional observations. Specifically, the evaluation focuses on answering the following key questions: Q1: Can CSR effectively detect and adapt to the two types of environmental changes? Q2: Does the incorporation of causal knowledge enhance the generalization performance? Q3: Is searching for the optimal expansion structure necessary? We compare our approach against several baselines: Dreamer (Hafner et al., 2023), which handles fixed tasks without integrating causal knowledge; AdaRL (Huang et al., 2021), which employs simple scenario-based policy adaptation without space expansion considerations; and the traditional model-free DQN (Mnih et al., 2015) and SPR (Schwarzer et al., 2020). Additionally, for the Atari games, we benchmark against the state-of-the-art method EfficientZero (Ye et al., 2021). All results are averaged over 5 runs; more implementation details can be found in Appendix D.
Researcher Affiliation | Academia | Yupei Yang^1, Biwei Huang^2, Fan Feng^{2,3}, Xinyue Wang^2, Shikui Tu^1, Lei Xu^1. ^1 Shanghai Jiao Tong University, ^2 University of California San Diego, ^3 Mohamed bin Zayed University of Artificial Intelligence. EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Towards Generalizable RL through CSR
Open Source Code | Yes | Code is available at https://github.com/CMACH508/CSR.
Open Datasets | Yes | We evaluate the generalization capability of CSR on a number of simulated and well-established datasets, including the CartPole, CoinRun and Atari environments, with detailed descriptions provided in Appendix D.3. ... To illustrate, we reference the popular CoinRun environment (Cobbe et al., 2019). ... Atari 100K games, which includes 26 games with a budget of 400K environment steps (Kaiser et al., 2019).
Dataset Splits | Yes | For each of these games, we perform experiments across a sequence of four tasks, where each task randomly assigns a (mode, difficulty) pair. We then train these models on the source task and generalize them to downstream target tasks. ... For each task, agents are allowed to collect data over 100 episodes, each consisting of 256 time steps. ... Following Cobbe et al. (2019), we utilize a set of 500 levels as source tasks and generalize the agents to target tasks with higher difficulty levels outside these 500 levels.
Hardware Specification | Yes | All experiments are conducted using an NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions specific software components and algorithms such as Dreamer, Gumbel-Softmax, the Adam optimizer, and REINFORCE gradients, but does not provide version numbers for its dependencies, which would be needed to reproduce the software environment exactly.
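The missing-versions issue above is easy to avoid at experiment time. Below is a minimal sketch of one way to record the exact versions of a run's dependencies alongside its results; the package names (`torch`, `gym`, `numpy`) are assumptions for illustration, since the paper does not state which libraries its implementation uses.

```python
# Sketch: snapshot installed dependency versions for a reproducibility log.
# The package list is hypothetical -- substitute the project's real dependencies.
import importlib.metadata as md


def freeze(packages):
    """Return a {package: version} mapping for each named distribution,
    marking packages that are absent from the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = md.version(name)
        except md.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Emit requirements.txt-style pins that can be stored with the run.
    for pkg, ver in freeze(["torch", "gym", "numpy"]).items():
        print(f"{pkg}=={ver}")
```

Writing this snapshot to a file next to each checkpoint (or logging it with the run metadata) gives later readers the version information the assessment flags as missing.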
Experiment Setup | Yes | Table 4: Architecture and hyperparameters for the simulated environment. Table 5: Hyperparameters of CSR for CartPole, CoinRun and Atari games.