An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms
Authors: Abulikemu Abuduweili, Changliu Liu
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we implement an optics simulation environment for reinforcement learning based controllers. The environment captures the essence of nonconvexity, nonlinearity, and time-dependent noise inherent in optical systems, offering a more realistic setting. Subsequently, we provide the benchmark results of several reinforcement learning algorithms on the proposed simulation environment. The experimental findings demonstrate the superiority of off-policy reinforcement learning approaches over traditional control algorithms in navigating the intricacies of complex optical control environments. |
| Researcher Affiliation | Academia | Abulikemu Abuduweili EMAIL Robotics Institute, Carnegie Mellon University Changliu Liu EMAIL Robotics Institute, Carnegie Mellon University |
| Pseudocode | No | The paper includes 'Figure 4: Example code of the OPS environment.' which shows an example of how to use the environment, but it does not present any pseudocode or algorithm blocks for the core methodologies (SPGD, PPO, SAC, TD3) or novel contributions in a structured, code-like format. |
| Open Source Code | Yes | The code of the paper is available at https://github.com/Walleclipse/Reinforcement-Learning-Pulse-Stacking. |
| Open Datasets | No | The paper focuses on presenting an 'open and scalable simulator designed for controlling typical optical systems' called OPS. It describes a simulation environment for generating data rather than utilizing or providing access to a pre-existing public dataset. |
| Dataset Splits | No | The paper describes a training procedure for RL agents consisting of 'multiple episodes' and then evaluates 'testing performance of the trained policy'. This is characteristic of reinforcement learning where data is generated through interaction with an environment, rather than splitting a fixed, pre-existing dataset into training, validation, and test sets. |
| Hardware Specification | Yes | Our experiments were conducted on an Ubuntu 18.04 system, with an Nvidia RTX 2080 Ti (12 GB) GPU, Intel Core i9-7900x processors, and 64 GB memory. |
| Software Dependencies | No | The paper states: 'We used the algorithms implemented in stable-baselines-3 (Raffin et al., 2019).' While a specific library is mentioned, a version number for stable-baselines-3 is not provided, nor are versions for other mentioned frameworks such as the OpenAI Gym API or Nonlinear-Optical-Modeling. |
| Experiment Setup | Yes | Detailed information regarding the hyperparameter ranges and the selected values for TD3, SAC, and PPO can be found in tables 4 to 6. |
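The paper's Figure 4 shows example code for using the OPS environment, which follows the OpenAI Gym API. As a minimal sketch of what such a reset/step interface looks like, here is a hypothetical toy environment (all names, the `cos^2` combined-power objective, and the noise model are illustrative assumptions, not the paper's OPS implementation; see the linked repository for the real one). It folds in the paper's key ingredients: a nonconvex reward and time-dependent noise.

```python
import math
import random

class ToyOpticalEnv:
    """Hypothetical Gym-style sketch of an optical control environment.

    Illustrative only: the observation, the cos^2 combined-power reward,
    and the Gaussian phase drift are stand-ins, not the paper's OPS model.
    """

    def __init__(self, n_phases=4, noise_scale=0.01, seed=0):
        self.n_phases = n_phases        # number of controllable phase delays
        self.noise_scale = noise_scale  # per-step phase drift magnitude
        self.rng = random.Random(seed)
        self.phases = None
        self.t = 0

    def _combined_power(self):
        # Nonconvex stand-in objective: product of cos^2 of each
        # residual phase error, maximized when all phases are zero.
        power = 1.0
        for p in self.phases:
            power *= math.cos(p) ** 2
        return power

    def reset(self):
        self.t = 0
        self.phases = [self.rng.uniform(-1.0, 1.0) for _ in range(self.n_phases)]
        return list(self.phases)

    def step(self, action):
        # Apply the controller's phase corrections plus drifting noise.
        self.t += 1
        self.phases = [
            p + a + self.rng.gauss(0.0, self.noise_scale)
            for p, a in zip(self.phases, action)
        ]
        reward = self._combined_power()
        done = self.t >= 100
        return list(self.phases), reward, done, {}
```

An agent (e.g. one of the benchmarked SAC/TD3/PPO policies from stable-baselines-3) would interact with such an environment through exactly this reset/step loop.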
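The traditional control baseline the paper benchmarks against is SPGD (stochastic parallel gradient descent). A generic hedged sketch of the SPGD update on a toy nonconvex objective is below; the gain, perturbation size, and objective are illustrative assumptions, not the paper's settings:

```python
import math
import random

def spgd_maximize(objective, x, gain=0.5, perturb=0.1, steps=200, seed=0):
    """Generic SPGD sketch: perturb all parameters simultaneously,
    estimate the directional derivative from the two-sided objective
    difference, and step along the perturbation to ascend."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Random simultaneous perturbation of every parameter.
        delta = [perturb * rng.choice([-1.0, 1.0]) for _ in x]
        j_plus = objective([xi + di for xi, di in zip(x, delta)])
        j_minus = objective([xi - di for xi, di in zip(x, delta)])
        dj = j_plus - j_minus  # ~ 2 * (delta . grad J)
        x = [xi + gain * dj * di for xi, di in zip(x, delta)]
    return x

def combined_power(phases):
    # Toy nonconvex objective, maximized when all phases are zero.
    out = 1.0
    for p in phases:
        out *= math.cos(p) ** 2
    return out
```

On this toy objective, `spgd_maximize(combined_power, [0.5, -0.4, 0.3])` drives the phases toward zero and improves the combined power over the initial point; the paper's finding is that off-policy RL outperforms this kind of baseline on the harder OPS dynamics.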