Self-Consistent Model-based Adaptation for Visual Reinforcement Learning

Authors: Xinning Zhou, Chengyang Ying, Yao Feng, Hang Su, Jun Zhu

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple visual generalization benchmarks and real robot data demonstrate that SCMA effectively boosts performance across various distractions and exhibits better sample efficiency.
Researcher Affiliation | Academia | Department of Computer Science & Technology, Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University
Pseudocode | No | The paper describes its methods and derivations using mathematical formulas and textual explanations, but it does not include a clearly labeled pseudocode block or algorithm box.
Open Source Code | No | The paper does not explicitly state that source code is released, nor does it provide a link to a code repository. It mentions 'More details can be found in Appendix C.1.' and 'further training details provided in Appendix C.', but these do not refer to code availability.
Open Datasets | Yes | To measure the effectiveness of SCMA, we follow the settings from the commonly adopted DMControl-GB [Hansen and Wang, 2021; Hansen et al., 2021; Bertoin et al., 2022], DMControl View [Yang et al., 2024], and RL-ViGen [Yuan et al., 2024]. ... Following the official design [Hansen and Wang, 2021], the augmentation-based methods use random overlay with images from Places365 [Zhou et al., 2017].
Dataset Splits | No | The paper mentions using specific environments (DMControl-GB, DMControl View, and RL-ViGen) and discusses pre-training and adaptation phases, but the main text does not specify how data from these environments is split into training, validation, or test sets (e.g., specific percentages or sample counts).
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific libraries with their versions).
Experiment Setup | Yes | Adaptation-based methods will first be pre-trained in the clean environments for 1M timesteps and then adapt to the distracting environments for 0.1M timesteps (0.4M for video hard and 0.5M for RL-ViGen).