Critic Regularized Regression

Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh S. Merel, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Siegel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our algorithm, CRR, on a number of challenging simulated manipulation and locomotion domains. Our results demonstrate that CRR works well even in these challenging settings and that it outperforms previously published approaches, in some cases by a considerable margin.
Researcher Affiliation | Industry | Ziyu Wang, Alexander Novikov, Konrad Zołna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas. DeepMind, London, United Kingdom. Google Brain, Toronto, Canada.
Pseudocode | Yes | Algorithm 1: Critic Regularized Regression
Open Source Code | No | The paper does not explicitly state that the code for CRR is open-source or provide a link to a repository for the methodology described.
Open Datasets | Yes | We experiment with the continuous control tasks introduced in RL Unplugged (RLU) [3]. There are 17 different tasks in RLU: nine tasks from the DeepMind Control suite [34] and seven locomotion tasks. We additionally introduce four robotic manipulation datasets.
Dataset Splits | No | The paper discusses training and evaluating models but does not provide specific dataset split percentages (e.g., train/validation/test) or sample counts for its experiments.
Hardware Specification | No | The paper states 'All simulations are conducted using MuJoCo [35]' but does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) to reproduce the experiment setup.
Experiment Setup | Yes | Full details on the hyper-parameters are given in the appendix.