Synthetic Data is Sufficient for Zero-Shot Visual Generalization from Offline Data
Authors: Ahmet H. Güzel, Ilija Bogunovic, Jack Parker-Holder
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on the V-D4RL benchmark (continuous control) and Procgen benchmark (discrete control) demonstrate that our approach consistently reduces the generalization gap and improves performance in unseen environments. |
| Researcher Affiliation | Academia | Ahmet H. Güzel, Ilija Bogunovic, and Jack Parker-Holder are all affiliated with the University College London AI Centre. |
| Pseudocode | No | The paper describes the methodology using text and equations, and includes architectural diagrams (e.g., Figure 2), but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to a code repository, an explicit statement of code release, or mention code in supplementary materials for the methodology described. |
| Open Datasets | Yes | We evaluated our method on two challenging offline RL benchmarks that test generalization capabilities in different domains. Visual D4RL (V-D4RL) (Lu et al., 2023a): a visual-input version of the D4RL benchmark (Fu et al., 2021) focused on continuous control tasks. Offline Procgen (Mediratta et al., 2024): an offline version of the Procgen benchmark (Cobbe et al., 2020), a suite of procedurally generated games targeting discrete control tasks; it tests zero-shot generalization to entirely unseen levels. |
| Dataset Splits | No | The paper refers to datasets from the V-D4RL and Procgen benchmarks and mentions 'training and testing environments', but the main text does not give specific percentages, sample counts, or a methodology for splitting the data into training, validation, or test sets. It defers to the original benchmarks and to the supplementary material for these experimental setup details. |
| Hardware Specification | No | The paper does not contain any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions algorithms used (Dr Q+BC, CQL) and refers to standard settings from original benchmark papers, but it does not provide specific version numbers for any ancillary software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | No | The hyperparameters, network architectures, and other implementation details follow the standard settings provided in the original benchmark papers. For completeness, we provide all hyperparameters and network architecture details in supplementary material. |