Avoiding Negative Side Effects of Autonomous Systems in the Open World

Authors: Sandhya Saisubramanian, Ece Kamar, Shlomo Zilberstein

JAIR 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings. ... We perform extensive evaluation of the different feedback mechanisms for mitigating avoidable and unavoidable NSE. ... Values averaged over 100 trials of planning and execution, along with their standard errors, are reported for the following domains. ... The effectiveness of shaping is evaluated in terms of the average NSE penalty incurred and the expected value of o_P after shaping. |
| Researcher Affiliation | Collaboration | Sandhya Saisubramanian EMAIL School of Electrical Engineering and Computer Science Oregon State University ... Ece Kamar EMAIL Microsoft Research ... Shlomo Zilberstein EMAIL College of Information and Computer Sciences University of Massachusetts Amherst |
| Pseudocode | Yes | Algorithm 1 Slack Estimation (M, N, E) ... Algorithm 2 Environment shaping to mitigate NSE ... Algorithm 3 Diverse modifications(b, Ω, Md, E0) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing its own source code, nor does it include a link to a code repository. It mentions using 'sklearn Python package' but this refers to a third-party library, not the authors' implementation. |
| Open Datasets | No | The paper uses 'Boxpushing' and 'Driving' domains for its experiments, which are described as custom simulation environments. It does not provide specific links, DOIs, repository names, or formal citations for publicly available datasets used in the experiments. References like '(Seuken & Zilberstein, 2007)' and '(Saisubramanian, Kamar, & Zilberstein, 2020a; Wray et al., 2015)' are for related methodological papers, not specific dataset access. |
| Dataset Splits | No | The paper describes generating 'five instances with grid size 15 × 15' for the Boxpushing domain and 'Five test instances are generated with grid size 15 × 15' for the Driving domain. It also states that 'Values averaged over 100 trials of planning and execution' are reported. However, it does not specify any training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits) for these instances or trials. |
| Hardware Specification | No | The paper states: 'The algorithms are implemented in Python and tested on a computer with 16GB of RAM.' This provides information about RAM but lacks specific details such as CPU models, GPU models, or processor types, which are necessary for a comprehensive hardware specification. |
| Software Dependencies | No | The paper mentions: 'Random forest regression from sklearn Python package is used for model learning.' and 'A random forest classifier from the sklearn Python package is used for learning a predictive model.' While it names the 'sklearn Python package' and implicitly 'Python', it does not provide specific version numbers for either of these software dependencies. |
| Experiment Setup | Yes | We tested with β ∈ [0.1, 0.9] since o1 is prioritized in our formulation and report results with β = 0.8 as it achieved the best trade-off in training. ... The slack is computed using Algorithm 1 and γ = 0.95. ... Conservative where the agent explores an action with probability 0.1 or follows its primary policy, moderate where the agent either explores an action with probability 0.5 or follows its primary policy, and radical where the agent predominantly explores with probability 0.9... Pushing the box on a surface type c = 1 results in severe NSE with a penalty of 10, pushing the box on a surface c = 2 results in mild NSE and a penalty of 5... The cost of navigating at a low speed is two and that of high speed is one... We vary δA between 0-25% of V_P(s0 \| E0) and δD between 0-25% of the NSE penalty of the actor's policy in E0. |
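
The three exploration strategies quoted in the Experiment Setup row (conservative, moderate, radical) can be illustrated with a small sketch. This is not the authors' implementation, which is not released: the action set, the function names, and the choice of a uniformly random action when exploring are all assumptions made for illustration. Only the probabilities 0.1, 0.5, and 0.9 come from the quoted text.

```python
import random

# Exploration probabilities quoted in the Experiment Setup row.
EXPLORATION_PROB = {"conservative": 0.1, "moderate": 0.5, "radical": 0.9}

# Hypothetical action set, for illustration only.
ACTIONS = ["up", "down", "left", "right"]


def choose_action(strategy, primary_action, rng):
    """Return (action, explored).

    With the strategy's probability, pick a uniformly random action
    (an assumption); otherwise follow the primary policy's action.
    """
    if rng.random() < EXPLORATION_PROB[strategy]:
        return rng.choice(ACTIONS), True
    return primary_action, False


def exploration_rate(strategy, steps=10_000, seed=0):
    """Empirical fraction of steps on which the agent explored."""
    rng = random.Random(seed)
    explored = sum(choose_action(strategy, "up", rng)[1] for _ in range(steps))
    return explored / steps


if __name__ == "__main__":
    for name in EXPLORATION_PROB:
        print(name, round(exploration_rate(name), 3))
```

Over many steps the empirical exploration rate should land close to the configured probability for each strategy, which is a quick sanity check when re-implementing the setup.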