Learning View-invariant World Models for Visual Robotic Manipulation

Authors: Jing-Cheng Pang, Nan Tang, Kaiyuan Li, Yuting Tang, Xin-Qiang Cai, Zhen-Yu Zhang, Gang Niu, Masashi Sugiyama, Yang Yu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the effectiveness of ReViWo in various viewpoint-disturbance scenarios, including control under novel camera positions and frequent camera shaking, using the Meta-world and Panda-gym environments. We also conduct experiments on a real-world ALOHA robot. The results demonstrate that ReViWo maintains robust performance under viewpoint disturbance, while baseline methods suffer significant performance degradation. Furthermore, we show that the VIR captures task-relevant state information and remains stable for observations from novel viewpoints, validating the efficacy of the ReViWo approach.
Researcher Affiliation | Collaboration | 1 National Key Laboratory for Novel Software Technology, Nanjing University, China & School of Artificial Intelligence, Nanjing University, China; 2 RIKEN Center for Advanced Intelligence Project, Japan; 3 Polixir.ai, China; 4 The University of Tokyo, Japan
Pseudocode | Yes | Algorithm 1: Representation learning for View-invariant World model (ReViWo)
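Algorithm 1 itself is not reproduced in this report. As a minimal sketch of the cross-view reconstruction objective that view-invariant representation learning of this kind typically optimizes, the toy code below factors an observation into a view-invariant latent and a view-dependent latent, then reconstructs one view from another view's invariant latent. All dimensions and the linear "encoder"/"decoder" stand-ins are illustrative assumptions, not the paper's architecture:

```python
import random

random.seed(0)
OBS_DIM, Z_INV, Z_VIEW = 12, 8, 4  # illustrative sizes, not from the paper

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

# Toy linear stand-ins for the encoder/decoder networks.
E_inv = rand_matrix(Z_INV, OBS_DIM)       # obs -> view-invariant latent
E_view = rand_matrix(Z_VIEW, OBS_DIM)     # obs -> view-dependent latent
D = rand_matrix(OBS_DIM, Z_INV + Z_VIEW)  # latents -> reconstructed obs

def encode(obs):
    return matvec(E_inv, obs), matvec(E_view, obs)

def decode(z_inv, z_view):
    return matvec(D, z_inv + z_view)  # list concatenation of the two latents

def cross_view_loss(obs_a, obs_b):
    """Reconstruct view B from A's invariant latent and B's view latent."""
    z_inv_a, _ = encode(obs_a)
    _, z_view_b = encode(obs_b)
    recon_b = decode(z_inv_a, z_view_b)
    return sum((r - x) ** 2 for r, x in zip(recon_b, obs_b)) / OBS_DIM

# Two synthetic "views" of the same underlying state.
obs_a = [random.random() for _ in range(OBS_DIM)]
obs_b = [random.random() for _ in range(OBS_DIM)]
loss = cross_view_loss(obs_a, obs_b)
```

Minimizing this loss over many view pairs pressures the invariant latent to carry only view-independent (task-relevant) state, since it must support reconstruction across camera poses.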
Open Source Code | No | The paper mentions using "OfflineRL-Kit (Sun, 2023)" but does not provide a direct link or an explicit statement that the authors' own code is open-source or available.
Open Datasets | Yes | Meanwhile, ReViWo is simultaneously trained on the Open X-Embodiment dataset without view labels. We conduct experiments on two robotic manipulation environments: Meta-world (Yu et al., 2019) and Panda-gym (Gallouédec et al., 2021). Integration of Open X-Embodiment data without view labels: in addition to the data with view labels, we also involve multi-view data without view labels from the Open X-Embodiment dataset (O'Neill et al., 2024), which are readily available on the internet.
Dataset Splits | No | The paper describes the data collection process for training the autoencoder and the offline control data, as well as the evaluation scenarios (e.g., various azimuth offsets, camera shaking). However, it does not provide explicit training/validation/test splits (e.g., percentages or absolute counts needed for reproduction); instead, it describes training on the collected data and evaluating under different disturbance conditions.
Hardware Specification | Yes | We use 64 CPU cores (AMD EPYC 9654 @ 2.4 GHz) and 4 GPUs (NVIDIA GeForce RTX 4090) for our experiments.
Software Dependencies | Yes | The software stack employed for our experiments includes Python 3.11 and PyTorch 2.1.0.
Experiment Setup | Yes | The hyper-parameters for implementing ReViWo are presented in Table 4. For all methods, the model is trained with an offline RL algorithm for 25,000 gradient steps and evaluated over 40 episodes.
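The stated protocol (25,000 gradient steps of offline RL, then 40 evaluation episodes) can be sketched as the skeleton below; the policy, dataset, and rollout function are dummy placeholders standing in for the paper's actual components, not its implementation:

```python
GRAD_STEPS = 25_000   # offline RL gradient steps (from the paper's setup)
EVAL_EPISODES = 40    # evaluation episodes per method (from the paper's setup)

def train_offline(policy, dataset, steps):
    """Placeholder offline-RL loop; a real run would apply one gradient
    update per step (e.g., via an offline RL library such as OfflineRL-Kit)."""
    for _ in range(steps):
        pass  # policy.update(sample(dataset)) in a real implementation
    return policy

def evaluate(policy, run_episode, episodes):
    """Report the success rate over a fixed number of rollouts."""
    successes = sum(run_episode(policy) for _ in range(episodes))
    return successes / episodes

# Dummy stand-ins so the protocol's shape is runnable end to end.
policy = train_offline(policy=object(), dataset=[], steps=GRAD_STEPS)
success_rate = evaluate(policy, run_episode=lambda p: 1, episodes=EVAL_EPISODES)
```

Fixing both the gradient-step budget and the episode count across all methods keeps the comparison fair: every baseline receives the same amount of training and the same evaluation sample size.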