An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms

Authors: Abulikemu Abuduweili, Changliu Liu

TMLR 2023

Reproducibility assessment: for each variable, the result followed by the LLM's response.
Research Type: Experimental. In this paper, we implement an optics simulation environment for reinforcement-learning-based controllers. The environment captures the nonconvexity, nonlinearity, and time-dependent noise inherent in optical systems, offering a more realistic setting. We then provide benchmark results for several reinforcement learning algorithms on the proposed simulation environment. The experimental findings demonstrate the superiority of off-policy reinforcement learning approaches over traditional control algorithms in complex optical control environments.
Researcher Affiliation: Academia. Abulikemu Abuduweili (EMAIL), Robotics Institute, Carnegie Mellon University; Changliu Liu (EMAIL), Robotics Institute, Carnegie Mellon University.
Pseudocode: No. The paper includes 'Figure 4: Example code of the OPS environment,' which shows how to use the environment, but it presents no pseudocode or algorithm blocks for the core methods (SPGD, PPO, SAC, TD3) or for its novel contributions.
Open Source Code: Yes. The code is available at https://github.com/Walleclipse/Reinforcement-Learning-Pulse-Stacking.
Open Datasets: No. The paper presents OPS, an 'open and scalable simulator designed for controlling typical optical systems.' It describes a simulation environment that generates data rather than using, or providing access to, a pre-existing public dataset.
Dataset Splits: No. The paper describes training RL agents over 'multiple episodes' and then evaluating the 'testing performance of the trained policy.' This is characteristic of reinforcement learning, where data is generated through interaction with an environment rather than by splitting a fixed, pre-existing dataset into training, validation, and test sets.
Hardware Specification: Yes. 'Our experiments were conducted on an Ubuntu 18.04 system, with an Nvidia RTX 2080 Ti (12 GB) GPU, Intel Core i9-7900x processors, and 64 GB memory.'
Software Dependencies: No. The paper states: 'We used the algorithms implemented in stable-baselines-3 (Raffin et al., 2019).' A specific library is named, but no version number is given for stable-baselines-3, nor for the other mentioned frameworks (the OpenAI Gym API and Nonlinear-Optical-Modeling).
Experiment Setup: Yes. 'Detailed information regarding the hyperparameter ranges and the selected values for TD3, SAC, and PPO can be found in tables 4 to 6.'
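The report above notes that the environment captures nonconvexity, nonlinearity, and time-dependent noise. A minimal sketch of what such an optical combining objective can look like, the power of a coherent sum of fields whose phases drift over time, may help make those claims concrete. This is an illustrative model only, not the paper's OPS simulator; the function names, drift model, and noise levels are all assumptions.

```python
import math
import random


def combined_power(phases, amplitudes=None):
    """Power of a coherent sum |sum_k a_k * exp(i * phi_k)|^2.

    Nonconvex in the phases (many equivalent optima separated by
    saddle regions) and nonlinear, which is the character of optical
    control problems the report describes.
    """
    if amplitudes is None:
        amplitudes = [1.0] * len(phases)
    re = sum(a * math.cos(p) for a, p in zip(amplitudes, phases))
    im = sum(a * math.sin(p) for a, p in zip(amplitudes, phases))
    return re * re + im * im


def noisy_measurement(phases, t, rng, drift=0.01, sigma=0.05):
    """Add a slow time-dependent phase drift plus Gaussian readout
    noise, mimicking the kind of disturbances a real optical system
    (and, per the paper, the OPS environment) exhibits."""
    drifted = [p + drift * math.sin(0.1 * t + k) for k, p in enumerate(phases)]
    return combined_power(drifted) + rng.gauss(0.0, sigma)
```

With all phases aligned at zero, four unit-amplitude channels combine to a power of 16; two channels in antiphase cancel to zero, which is what makes the landscape nonconvex.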
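Since the paper presents no pseudocode for its baselines, a textbook formulation of SPGD (stochastic parallel gradient descent), the traditional model-free optical control algorithm the paper benchmarks against, can serve as a reference point. The hyperparameters and names below are illustrative defaults, not the authors' implementation.

```python
import random


def spgd_maximize(objective, u, iters=200, delta=0.1, gain=1.0, rng=None):
    """Generic SPGD ascent on a scalar objective.

    Each iteration applies a simultaneous random +/-delta perturbation
    to every control channel, measures the objective on both sides, and
    nudges the controls in the direction correlated with improvement.
    """
    rng = rng or random.Random(0)
    for _ in range(iters):
        # Random bipolar perturbation of all channels at once.
        d = [delta if rng.random() < 0.5 else -delta for _ in u]
        j_plus = objective([ui + di for ui, di in zip(u, d)])
        j_minus = objective([ui - di for ui, di in zip(u, d)])
        dj = j_plus - j_minus
        # The product dj * d_i is an unbiased estimate (up to scale)
        # of the gradient component along channel i.
        u = [ui + gain * dj * di for ui, di in zip(u, d)]
    return u
```

On a simple concave objective such as `-||u||^2`, the iterates contract toward the maximizer, which is the behavior the RL baselines in the paper are compared against on the much harder OPS landscape.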
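The Dataset Splits entry notes that RL data comes from interaction rather than from splits of a fixed dataset. A minimal sketch of that pattern is shown below, with a toy scalar environment standing in for OPS and separately seeded batches of episodes playing the role of "train" and "test" data. All class and function names here are hypothetical.

```python
import random


class ToyEnv:
    """Minimal episodic environment stand-in (not the OPS simulator):
    the agent nudges a noisy scalar state toward zero."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        self.state = self.rng.uniform(-1.0, 1.0)
        self.t = 0
        return self.state

    def step(self, action):
        self.state += action + self.rng.gauss(0.0, 0.01)
        self.t += 1
        reward = -abs(self.state)  # closer to zero is better
        done = self.t >= 10
        return self.state, reward, done


def run_episodes(env, policy, n_episodes):
    """Collect returns from fresh rollouts. Training and evaluation use
    separate batches of episodes (ideally with different seeds), not
    splits of a pre-existing dataset."""
    returns = []
    for _ in range(n_episodes):
        s, done, total = env.reset(), False, 0.0
        while not done:
            s, r, done = env.step(policy(s))
            total += r
        returns.append(total)
    return returns
```

Usage mirrors the paper's setup in miniature: train over many episodes on one environment instance, then report testing performance on freshly generated episodes, e.g. `run_episodes(ToyEnv(seed=1), trained_policy, 100)`.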
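The Software Dependencies entry flags that the paper names stable-baselines-3 without a version number. One way a reproduction could record the versions it actually ran with, using only the Python standard library, is sketched below; the package names in the example call are illustrative and should be adjusted to the actual environment.

```python
import importlib.metadata as metadata


def report_versions(packages):
    """Return a mapping from package name to its installed version,
    or None when the package is absent, so the exact dependency set
    can be logged alongside experimental results."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

For example, `report_versions(["stable-baselines3", "gym", "numpy"])` would pin down exactly the version information the reproducibility check found missing.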