Inverse Reinforcement Learning by Estimating Expertise of Demonstrators

Authors: Mark Beliaev, Ramtin Pedarsani

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments in both online and offline IL settings, with simulated and human-generated data, demonstrate IRLEED's adaptability and effectiveness, making it a versatile solution for learning from suboptimal demonstrations. [...] In this section we evaluate how IRLEED performs when learning from suboptimal demonstrations, using experiments in both online and offline IL settings, with simulated and human-generated data."
Researcher Affiliation | Academia | University of California, Santa Barbara (EMAIL, EMAIL)
Pseudocode | No | The paper describes the method using mathematical equations (Eq. 1-6) and prose, outlining the iterative approach and gradient computations (Eq. 8-10). However, there is no clearly labeled 'Algorithm' or 'Pseudocode' block with structured steps.
Open Source Code | Yes | https://github.com/mbeliaev1/IRLEED
Open Datasets | Yes | "human-generated data... dataset B: collected data using adept human players (Kurin et al. 2017)"
Dataset Splits | No | The paper mentions generating data with a certain number of trajectories (e.g., 'collecting 40 trajectories from each policy') and seeds ('using 100 seeds for each setting', '30 seeds for each dataset setting'), but does not explicitly provide percentages or counts for training, validation, or test splits. It describes how the data was generated and used for runs, not how it was partitioned for different phases of model evaluation.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory amounts) used for conducting the experiments.
Software Dependencies | No | The paper mentions utilizing 'codebases provided by the authors to implement ILEED and IQ' and creating 'IRLEED on top of the IQ algorithm', but it does not specify any programming languages or library versions (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experiments.
Experiment Setup | No | The paper defers setup details to an external document: "For further implementation details refer to the Appendix found in our extended version."