ProtoX: Explaining a Reinforcement Learning Agent via Prototyping
Authors: Ronilo Ragodos, Tong Wang, Qihang Lin, Xun Zhou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct various experiments to test ProtoX. Results show that ProtoX achieved high fidelity to the original black-box agent while providing meaningful and understandable explanations. |
| Researcher Affiliation | Academia | Ronilo J. Ragodos Department of Business Analytics University of Iowa Iowa City, IA 52242 EMAIL Tong Wang Department of Business Analytics University of Iowa Iowa City, IA 52242 EMAIL Qihang Lin Department of Business Analytics University of Iowa Iowa City, IA 52242 EMAIL Xun Zhou Department of Business Analytics University of Iowa Iowa City, IA 52242 EMAIL |
| Pseudocode | Yes | Pseudocode and further details for our pre-training algorithm can be found in the supplementary material. |
| Open Source Code | Yes | Reproducibility: Our code is available at https://github.com/rrags/ProtoX. |
| Open Datasets | Yes | We use four video-game tasks from OpenAI Gym, namely Pong, Seaquest, and two levels from Super Mario Bros. [45]. |
| Dataset Splits | No | Both ProtoX and ResNet-BC are trained with the behavior cloning algorithm using 30,000 state-action pairs obtained via an expert trained with PPO. ... For each game, we let the agent generate a test set of 10,000 state-action pairs D_test = {(s_i, π(s_i))}_i. |
| Hardware Specification | Yes | All experiments were done on a system with an RTX 3060Ti GPU, AMD Ryzen 7 3700X 8-Core Processor, with 32GB RAM |
| Software Dependencies | No | The paper mentions using 'stable-baselines3' and 'PPO models', but does not specify exact version numbers for these software packages or other key libraries like Python or PyTorch in the main text. |
| Experiment Setup | No | See the Appendix for hyperparameter settings and further details on our experimental design. |
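The data-collection setup quoted under "Dataset Splits" (behavior cloning on 30,000 expert state-action pairs, with fidelity evaluated on a separate 10,000-pair test set) can be sketched as follows. This is a minimal illustration only: `expert_policy` and the randomly generated states are stand-ins for the PPO-trained expert π and actual game frames, which the paper obtains via stable-baselines3.

```python
import numpy as np

def expert_policy(state):
    # Stand-in for the PPO-trained expert policy pi. In the paper's
    # setup this would be a stable-baselines3 PPO model's action
    # prediction on a game observation. (Hypothetical placeholder.)
    return int(np.sum(state) > 0)

def collect_pairs(n, state_dim=4, seed=0):
    """Collect n (state, action) pairs D = {(s_i, pi(s_i))}_i."""
    rng = np.random.default_rng(seed)
    states = rng.normal(size=(n, state_dim))  # stand-in for frames
    actions = np.array([expert_policy(s) for s in states])
    return states, actions

# Paper's quoted sizes: 30,000 pairs for behavior cloning,
# and a separate 10,000-pair test set per game.
train_states, train_actions = collect_pairs(30_000, seed=0)
test_states, test_actions = collect_pairs(10_000, seed=1)
print(train_states.shape, test_states.shape)
```

Note the train and test pairs are drawn independently (different seeds here); the paper's "No" rating reflects that it reports these set sizes but does not specify an explicit train/validation split protocol.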