Conservative Offline Goal-Conditioned Implicit V-Learning
Authors: Kaiqiang Ke, Qian Lin, Zongkai Liu, Shenghong He, Chao Yu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluations on OGBench, a benchmark for offline GCRL, demonstrate that CGCIVL consistently surpasses state-of-the-art methods across diverse tasks. ... Empirically, experiments on OGBench (Park et al., 2024), a benchmark specifically designed for offline GCRL, demonstrate that our algorithm consistently matches or surpasses state-of-the-art methods across distinct environments with varying configurations. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China 2Shanghai Innovation Institute, Shanghai, China 3Pengcheng Laboratory, Shenzhen, China. Correspondence to: Chao Yu <EMAIL>. |
| Pseudocode | Yes | The pseudocode of CGCIVL is shown in Algorithm 1. ... Algorithm 1 CGCIVL |
| Open Source Code | Yes | Our algorithm implementation is based on the reproduction of HIQL in the open-source OGBench and can be found at https://github.com/kkq2018/CGCIVL.git. |
| Open Datasets | Yes | We evaluate the proposed algorithm on OGBench (Park et al., 2024), a benchmark designed to evaluate algorithms in offline GCRL across diverse tasks and datasets. ... The dataset was collected using a noisy expert policy that navigates the maze by sequentially reaching randomly sampled goals. This dataset is used to evaluate navigation performance. ... medium maze in D4RL (Fu et al., 2020). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits with percentages, sample counts, or references to predefined splits for reproduction. It describes how data is collected and how policies are evaluated, but not how the datasets are formally split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions that the algorithm implementation is based on 'HIQL in the open-source OGbench' but does not specify any particular software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | The detailed hyperparameter settings are shown in Table 2. The training process employed a batch size of 1024, with the policy and value networks designed as MLPs of dimensions (256, 256) and (512, 512, 512), respectively. The GELU activation function was used to ensure smooth gradient flow, while the Adam optimizer, configured with a learning rate of 0.0003, facilitated efficient parameter updates. To further stabilize the training process, the target network smoothing coefficient is set to 0.005. ... In the antmaze-giant-stitch-v0 environment, the algorithm was trained for 2,000,000 steps. In the humanoidmaze-giant-navigate-v0 and humanoidmaze-giant-stitch-v0 environments, the algorithm was trained for 3,000,000 steps. For all other environments, the training steps were set to 1,000,000, consistent with the settings in OGbench. |
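The reported setup can be sketched as a configuration fragment together with the Polyak (soft) target-network update implied by the smoothing coefficient of 0.005. This is a minimal illustration assuming standard conventions; the variable and key names are illustrative and not taken from the released CGCIVL code.

```python
# Hyperparameters as reported in the paper (Table 2); key names are assumptions.
CONFIG = {
    "batch_size": 1024,
    "policy_hidden_dims": (256, 256),
    "value_hidden_dims": (512, 512, 512),
    "activation": "gelu",
    "optimizer": "adam",
    "learning_rate": 3e-4,
    "target_tau": 0.005,  # target-network smoothing coefficient
}

def soft_update(target_params, online_params, tau=CONFIG["target_tau"]):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

# Example: one smoothing step pulls the target slightly toward the online value.
print(soft_update([0.0], [1.0]))  # -> [0.005]
```

With tau = 0.005, the target network tracks the online network as an exponential moving average, which is the usual way such a smoothing coefficient stabilizes value-function training.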