reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Risk-sensitive control as inference with Rényi divergence

Authors: Kaito Ito, Kenji Kashima

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The behavior of the risk-sensitive soft actor-critic is examined via an experiment.
Researcher Affiliation	Academia	Kaito Ito (The University of Tokyo), Kenji Kashima (Kyoto University)
Pseudocode	No	The paper describes algorithms but does not provide them in a structured pseudocode or algorithm block.
Open Source Code	Yes	The code is available at https://github.com/kaito-1111/risk-sensitive-sac.git.
Open Datasets	Yes	The environment is Pendulum-v1 in OpenAI Gymnasium.
Dataset Splits	No	The paper mentions training and testing but does not provide specific percentages or absolute counts for dataset splits (train/validation/test).
Hardware Specification	Yes	For the training, we used an Ubuntu 20.04 server (GPU: NVIDIA GeForce RTX 2080Ti).
Software Dependencies	No	The implementation of the risk-sensitive SAC (RSAC) algorithm follows the stable-baselines3 [50] version of the SAC algorithm... optimizer Adam [51]. No specific version numbers for these or other software are provided.
Experiment Setup	Yes	Now, we introduce a series of hyperparameters listed in Table 1 shared for both SAC and RSAC algorithms. Table 1: SAC and RSAC Hyperparameters (Parameter: Value). optimizer: Adam [51]; learning rate: 10^-3; discount factor: 0.99; regularization coefficient: 0.1; target smoothing coefficient: 0.005; replay buffer size: 10^5; number of critic networks: 2; number of hidden layers (all networks): 2; number of hidden units per layer: 256; activation function: ReLU
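
The Table 1 values quoted in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the class `Table1Hyperparameters` and the helper `to_sb3_sac_kwargs` are hypothetical names introduced here for illustration. It assumes the learning rate reads as 10^-3 and the replay buffer size as 10^5, and it assumes that the paper's "regularization coefficient" corresponds to a fixed entropy coefficient (`ent_coef`) in the stable-baselines3 `SAC` constructor. The optimizer (Adam) and the activation function (ReLU) are not set explicitly because, as far as I am aware, they are the stable-baselines3 SAC defaults. No version pin is implied, since the paper gives none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table1Hyperparameters:
    """Values transcribed from Table 1 (shared by SAC and RSAC)."""

    learning_rate: float = 1e-3            # assumed reading of "10 3" as 10^-3
    discount_factor: float = 0.99
    regularization_coefficient: float = 0.1
    target_smoothing_coefficient: float = 0.005
    replay_buffer_size: int = 10**5        # assumed reading of "105" as 10^5
    n_critics: int = 2
    n_hidden_layers: int = 2
    hidden_units: int = 256


def to_sb3_sac_kwargs(hp: Table1Hyperparameters) -> dict:
    """Map Table 1 values onto stable-baselines3 SAC keyword names."""
    return {
        "learning_rate": hp.learning_rate,
        "gamma": hp.discount_factor,
        "tau": hp.target_smoothing_coefficient,
        "buffer_size": hp.replay_buffer_size,
        # Assumption: the regularization coefficient is a fixed entropy
        # coefficient; the paper's RSAC objective may use it differently.
        "ent_coef": hp.regularization_coefficient,
        "policy_kwargs": {
            # Two hidden layers of 256 units for all networks.
            "net_arch": [hp.hidden_units] * hp.n_hidden_layers,
            "n_critics": hp.n_critics,
        },
    }


SAC_KWARGS = to_sb3_sac_kwargs(Table1Hyperparameters())
```

With stable-baselines3 installed, a plain SAC baseline could then be built as `SAC("MlpPolicy", "Pendulum-v1", **SAC_KWARGS)`. This covers only the standard SAC side of the comparison; the risk-sensitive variant (RSAC) is in the authors' repository listed under Open Source Code.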