Bayesian Methods for Constraint Inference in Reinforcement Learning

Authors: Dimitris Papadimitriou, Usman Anwar, Daniel S. Brown

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that BICRL outperforms pre-existing constraint learning approaches, leading to more accurate constraint inference and consequently safer policies. We carry out simulations in deterministic state space grid world environments to compare our method to the Greedy Iterative Constraint Inference (GICI) method proposed by Scobee & Sastry (2019)." (Table 1: False Positive, False Negative, and Precision classification rates for GICI and BICRL for varying levels of transition dynamics noise; results averaged over 10 runs.)
Researcher Affiliation | Academia | Dimitris Papadimitriou (UC Berkeley), Usman Anwar (University of Cambridge), Daniel S. Brown (University of Utah)
Pseudocode | Yes | Algorithm 1 (BICRL), Algorithm 2 (Active Constraint Learning), Algorithm 3 (BDPR), Algorithm 4 (BCPR), Algorithm 5 (Feature-Based BICRL)
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "Figure 10 shows the floor plan of a single bedroom apartment obtained from the iGibson dataset (Li et al., 2021)."
Dataset Splits | No | The paper discusses using a certain number of expert demonstrations and evaluating generalization to a "new unseen environment", but does not provide specific percentages or sample counts for training, validation, and test splits of any dataset.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or processor types) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | Table 3 of the paper lists the hyperparameters of the Sections 4.1-4.3 simulations, reproduced below:

    Hyperparameter          Sec. 4.1   Sec. 4.2          Sec. 4.3
    # Expert trajectories   100        100               20
    n                       80         80                80
    γ                       0.95       0.95              0.95
    ϵ                       0.0        0.0, 0.01, 0.05   0.0
    β                       1          1                 1
    K                       2000       4000              200
    σ                       1          1                 1
    fr                      50         50                50
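For anyone re-running these experiments, the Table 3 hyperparameters can be collected into a small configuration sketch. The dictionary key names below are our own shorthand transliterations of the table's symbols (e.g. `gamma` for γ, `eps` for ϵ); the paper itself does not define this structure, and we do not interpret what each symbol means beyond the table.

```python
# Hyperparameters transcribed from Table 3 of the paper, keyed by section.
# Key names are our own ASCII shorthand for the table's symbols; the values
# are copied verbatim from the table. eps is a list because Sec. 4.2 sweeps
# over several transition-noise levels.
TABLE3_HYPERPARAMS = {
    "sec_4_1": {"n_expert_trajectories": 100, "n": 80, "gamma": 0.95,
                "eps": [0.0], "beta": 1, "K": 2000, "sigma": 1, "fr": 50},
    "sec_4_2": {"n_expert_trajectories": 100, "n": 80, "gamma": 0.95,
                "eps": [0.0, 0.01, 0.05], "beta": 1, "K": 4000, "sigma": 1,
                "fr": 50},
    "sec_4_3": {"n_expert_trajectories": 20, "n": 80, "gamma": 0.95,
                "eps": [0.0], "beta": 1, "K": 200, "sigma": 1, "fr": 50},
}
```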