Compositional Policy Learning in Stochastic Control Systems with Formal Guarantees
Authors: Đorđe Žikelić, Mathias Lechner, Abhinav Verma, Krishnendu Chatterjee, Thomas Henzinger
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement a prototype of our approach and evaluate it on a Stochastic Nine Rooms environment. |
| Researcher Affiliation | Academia | Đorđe Žikelić, Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria; Mathias Lechner, Massachusetts Institute of Technology, Cambridge, MA, USA; Abhinav Verma, The Pennsylvania State University, University Park, PA, USA; Krishnendu Chatterjee, Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria; Thomas A. Henzinger, Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria |
| Pseudocode | Yes | The algorithm pseudocode is presented in Algorithm 1. |
| Open Source Code | Yes | Our code is available at https://github.com/mlech26l/neural_martingales |
| Open Datasets | No | The paper evaluates on a 'Stochastic Nine Rooms environment', obtained by injecting stochastic disturbances into the environment of [33], but it provides no link or access information that would make this customized environment publicly available. |
| Dataset Splits | No | The paper describes an RL environment but specifies no explicit train/validation/test splits for data or evaluation. |
| Hardware Specification | No | The paper gives no details about the hardware used, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using 'proximal policy optimization (PPO) [50]' but provides no version numbers for any software dependencies, such as Python, the machine learning framework (e.g., PyTorch, TensorFlow), or the PPO implementation. |
| Experiment Setup | No | The paper states that PPO was used to initialize policy parameters but does not report specific hyperparameters (e.g., learning rate, batch size, number of epochs) or other training configuration details. |