UTILITY: Utilizing Explainable Reinforcement Learning to Improve Reinforcement Learning
Authors: Shicheng Liu, Minghui Zhu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use MuJoCo experiments to show that our method outperforms state-of-the-art baselines. This section provides experiment results for the proposed framework. |
| Researcher Affiliation | Academia | Shicheng Liu & Minghui Zhu Department of Electrical Engineering Pennsylvania State University University Park, PA 16802, USA EMAIL |
| Pseudocode | Yes | Algorithm 1 Utilizing explainable reinforcement learning to improve reinforcement learning |
| Open Source Code | No | The paper does not contain any explicit statements about code availability, such as a link to a repository or a declaration that the code will be released. |
| Open Datasets | Yes | We test the algorithms on delayed MuJoCo environments (Zheng et al., 2018; Memarian et al., 2021; Oh et al., 2018)... We also conduct experiments on the original MuJoCo environments, which are widely used in RL literature (Xu & Zhu, 2023b; 2024) |
| Dataset Splits | No | The paper mentions "each episode has the length of 100 in our experiments" for the environments, but it does not specify any training/test/validation splits for a collected dataset. Reinforcement learning typically involves continuous interaction with an environment rather than predefined static dataset splits. |
| Hardware Specification | Yes | The code was running on a laptop whose CPU is Intel Core i9-12900K and GPU is NVIDIA RTX 3080. |
| Software Dependencies | No | The paper mentions the operating system as "Windows 10" but does not specify any software libraries, frameworks (like PyTorch, TensorFlow), or other dependencies with version numbers that would be necessary for reproduction. |
| Experiment Setup | Yes | The neural network has two hidden layers where each hidden layer has 64 neurons. The activation functions are respectively ReLU and Tanh. Following (Finn et al., 2017), each episode has the length of 100 in our experiments. We use soft actor-critic (SAC) (Haarnoja et al., 2018) as the baseline RL algorithm. The mean and standard deviation are computed using five random seeds. |
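The Experiment Setup row specifies only the network shape: two hidden layers of 64 neurons, with ReLU after the first and Tanh after the second. A minimal NumPy sketch of such a forward pass is shown below; the input/output dimensions (`obs_dim`, `act_dim`) and the weight initialization are illustrative assumptions, not taken from the paper, and the paper's actual implementation (SAC policy/critic heads, framework, optimizer) is not specified.

```python
import numpy as np

def init_layer(fan_in, fan_out, rng):
    # Small random weights and zero biases (illustrative initialization only).
    return rng.standard_normal((fan_in, fan_out)) * 0.1, np.zeros(fan_out)

def forward(obs, params):
    # Two hidden layers of 64 units, as stated in the Experiment Setup row.
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = np.maximum(obs @ W1 + b1, 0.0)  # first hidden layer: ReLU
    h2 = np.tanh(h1 @ W2 + b2)           # second hidden layer: Tanh
    return h2 @ W3 + b3                  # linear output head (assumed)

rng = np.random.default_rng(0)
obs_dim, act_dim = 17, 6  # hypothetical MuJoCo-like dimensions (assumed)
params = [init_layer(obs_dim, 64, rng),
          init_layer(64, 64, rng),
          init_layer(64, act_dim, rng)]
action = forward(rng.standard_normal(obs_dim), params)
print(action.shape)  # (6,)
```

The sketch only demonstrates the stated layer widths and activation order; training (SAC updates, five random seeds, episode length 100) is outside its scope.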