Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
Authors: Jiafei Lyu, Mengbei Yan, Zhongjian Qiao, Runze Liu, Xiaoteng Ma, Deheng Ye, Jing-Wen Yang, Zongqing Lu, Xiu Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate OTDF upon various D4RL (Fu et al., 2020) datasets with different types of dynamics shifts (e.g., gravity shift), given limited target domain data. Empirically, we demonstrate that OTDF achieves superior performance across numerous tasks and with varied source or target domain dataset qualities, often outperforming recent strong baseline methods by a large margin. To ensure that our work is reproducible, our code is available at https://github.com/dmksjfl/OTDF. |
| Researcher Affiliation | Collaboration | 1Tsinghua Shenzhen International Graduate School, Tsinghua University; 2Department of Automation, Tsinghua University; 3Tencent; 4School of Computer Science, Peking University; 5Beijing Academy of Artificial Intelligence. EMAIL, EMAIL |
| Pseudocode | Yes | The abstracted pseudocode of OTDF is presented in Algorithm 1. ... We summarize the pseudocode of OTDF+IQL in Algorithm 2. |
| Open Source Code | Yes | To ensure that our work is reproducible, our code is available at https://github.com/dmksjfl/OTDF. |
| Open Datasets | Yes | We evaluate OTDF upon various D4RL (Fu et al., 2020) datasets with different types of dynamics shifts (e.g., gravity shift), given limited target domain data. |
| Dataset Splits | Yes | To ensure that only a limited budget of target domain data can be accessed, we only collect 5 trajectories for each dataset, which amounts to about 5000 transitions. ... We strictly follow the data budget and pick 2 trajectories from the medium dataset and 3 trajectories from the expert dataset to construct the medium-expert datasets. |
| Hardware Specification | Yes | In Table 8, we list the compute infrastructure that we use to run all of the algorithms. Table 8: Compute infrastructure. CPU: AMD EPYC 7452; GPU: RTX3090 ×8; Memory: 288GB |
| Software Dependencies | No | The paper mentions several software components, including OTT-JAX, IQL, SAC, D4RL, MuJoCo, OpenAI Gym, CVAE, and Adam. However, it does not provide specific version numbers for these components in the main text, except for the 'D4RL -v2' datasets. |
| Experiment Setup | Yes | We run all algorithms for 1M gradient steps across 5 random seeds. ... For most of our experiments, we set β = 0.5. ... We summarize the detailed hyperparameter setup for all baseline methods and OTDF in Table 5. Table 5 includes specific values for learning rate, batch size, discount factor, target update rate, and various algorithm-specific coefficients. |