reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Model-Based Offline Planning

Authors: Arthur Argenson, Gabriel Dulac-Arnold

ICLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show the performance of our algorithm, Model-Based Offline Planning (MBOP) on a series of robotics-inspired tasks, and demonstrate its ability to leverage planning to respect environmental constraints.
Researcher Affiliation	Industry	Arthur Argenson, EMAIL, Google Research; Gabriel Dulac-Arnold, EMAIL, Google Research
Pseudocode	Yes	Algorithm 1 High-Level MBOP-Policy; Algorithm 2 MBOP-Trajopt
Open Source Code	No	The paper does not provide an explicit statement or link for open-sourcing the code for the described methodology. It only mentions that 'Accompanying videos are available here' and 'All non-standard datasets will be available publicly'.
Open Datasets	Yes	We use standard datasets from the RL Unplugged (RLU) (Gulcehre et al., 2020) and D4RL (Fu et al., 2020) papers.
Dataset Splits	Yes	On all datasets, training is performed on 90% of data and 10% is used for validation.
Hardware Specification	Yes	We calculate the average control frequency of MBOP on the RLU Walker task using a single Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz core and an Nvidia 1080TI ... Execution speeds on the RLU Walker task are represented in Table 9. ... on a Tesla P100 using a single core of a Xeon 2200 MHz equivalent processor.
Software Dependencies	No	The paper describes the software components (e.g., neural networks) but does not provide specific version numbers for any libraries, frameworks, or environments used.
Experiment Setup	Yes	The full set of parameters for each experiment can be found in the Appendix Sec. 5.2. ... # FC Layers: 2; Size FC Layers: 500; # Ensemble Networks: 3; Learning Rate: 0.001; Batch Size: 512; # Epochs: 40
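
For readers who want to carry the reported values into their own reproduction attempt, the Dataset Splits and Experiment Setup rows can be sketched in Python as below. This is only an illustration, not the authors' code: the class name `ReportedSetup`, its field names, and the helper `split_train_validation` are hypothetical, and shuffling with a fixed seed is an assumption, since the quoted text gives the 90%/10% ratio but does not say how the split is drawn.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row; field names are illustrative."""
    num_fc_layers: int = 2
    fc_layer_size: int = 500
    num_ensemble_networks: int = 3
    learning_rate: float = 0.001
    batch_size: int = 512
    num_epochs: int = 40


def split_train_validation(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of items and split it into (train, validation).

    The 90/10 ratio comes from the Dataset Splits row. Shuffling and the
    seed are assumptions; the quoted text does not describe them.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    setup = ReportedSetup()
    # Stand-in data: integers instead of real transitions.
    train, validation = split_train_validation(range(1000))
    print(setup)
    print(len(train), len(validation))
```

The row states that the full per-experiment parameters are in the paper's appendix, so the defaults above should be read as the values quoted here rather than as settings that apply to every task.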