GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation
Authors: Hongyin Zhang, Pengxiang Ding, Shangke Lyu, Ying Peng, Donglin Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the state generation and visual manipulation capabilities of GEVRM. To this end, our experiments aim to investigate the following questions: 1) Does GEVRM have strong generalization ability to generate expressive goals in various environments? 2) Does GEVRM exhibit a higher success rate in executing robot tasks than the baselines in various environments? 3) How important are the core components of GEVRM for achieving robust action decisions? |
| Researcher Affiliation | Academia | Hongyin Zhang (1,2), Pengxiang Ding (1,2), Shangke Lyu (2), Ying Peng (2), Donglin Wang (2). 1: Zhejiang University. 2: Westlake University. |
| Pseudocode | Yes | Algorithm 1 GEVRM: Test-time Execution |
| Open Source Code | No | The paper does not contain a clear, affirmative statement about releasing the source code for GEVRM, nor does it provide a specific link to a code repository. It mentions 'open-source video generative models' in the context of baselines but not for its own method. |
| Open Datasets | Yes | We utilized two types of datasets (realistic Bridge (Walke et al., 2023) and simulated CALVIN (Mees et al., 2022b)) to evaluate the generalization of goal generation. |
| Dataset Splits | Yes | We study zero-shot multi-environment training on A, B, and C, and testing on D, varying in table texture, furniture positioning, and color patches. |
| Hardware Specification | No | The paper does not explicitly state the specific hardware (e.g., GPU models, CPU types) used for training or inference of the GEVRM model. It mentions a 'UR5' robotic arm for real-world tasks, but this is the robot being controlled, not the computational hardware. |
| Software Dependencies | No | The paper mentions several models and algorithms like T5, Rectified Flow, DDPM, and ResNet34, along with their respective research papers. However, it does not specify software dependencies with version numbers, such as Python versions or library versions (e.g., PyTorch 1.9). |
| Experiment Setup | Yes | The hyperparameters are shown in Appendix Tab. 8 and Tab. 9. The policy training hyperparameters are shown in Appendix Tab. 10. |