Contrastive Active Inference

Authors: Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt

NeurIPS 2021

Each entry below lists a reproducibility variable, its result, and the LLM response that supports it.
Research Type: Experimental
LLM Response: "In this section, we compare the contrastive AIF method to likelihood-based AIF and MBRL in high-dimensional image-based settings. Our experimentation aims to answer the following questions: (i) is it possible to achieve high-dimensional goals with AIF-based methods? (ii) what is the difference in performance between RL-based and AIF-based methods? (iii) does contrastive AIF perform better than likelihood-based AIF? (iv) in what contexts are contrastive methods more desirable than likelihood-based methods? (v) are AIF-based methods resilient to variations in the environment background?"

Researcher Affiliation: Academia
LLM Response: "Pietro Mazzaglia, IDLab, Ghent University, EMAIL; Tim Verbelen, IDLab, Ghent University, EMAIL; Bart Dhoedt, IDLab, Ghent University, EMAIL"

Pseudocode: Yes
LLM Response: "The training routine, which alternates updates to the models with data collection, is shown in Algorithm 1."

Open Source Code: No
LLM Response: The paper mentions external resources such as gym-minigrid and the DeepMind Control Suite, but does not provide a link or an explicit statement for its own source code.

Open Datasets: Yes
LLM Response: "We performed experiments on the Empty 6×6 and the Empty 8×8 environments from the MiniGrid suite [8]... We performed continuous-control experiments on the Reacher Easy and Hard tasks from the DeepMind Control (DMC) Suite [48] and on Reacher Easy from the Distracting Control Suite [47]."

Dataset Splits: No
LLM Response: The paper describes how data is collected during training episodes and how performance is evaluated on trajectories, but does not specify fixed train/validation/test splits in terms of percentages or counts for reproducibility.

Hardware Specification: No
LLM Response: "Relevant parameterization for the experiments can be found in the next section, while hyperparameters and a detailed description of each network are left to the Appendix."

Software Dependencies: No
LLM Response: "Relevant parameterization for the experiments can be found in the next section, while hyperparameters and a detailed description of each network are left to the Appendix."

Experiment Setup: Yes
LLM Response: "For the 6×6 task, the world model is trained by sampling B = 50 trajectories of length L = 7, while the behavior model is trained by imagining H = 6 steps long trajectories. For the 8×8 task, we increased the length L to 11 and the imagination horizon H to 10. For both tasks, we first collected R = 50 random episodes, to populate the replay buffer, and train for U = 100 steps after collecting a new trajectory. ... For both tasks, the world model is trained by sampling B = 30 trajectories of length L = 30, while the behavior model is trained by imagining H = 10 steps long trajectories. We first collect R = 50 random episodes, to populate the replay buffer, and train for U = 100 steps after every new trajectory."
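The quoted setup describes an alternating collect/train loop of the kind referenced as Algorithm 1. A minimal Python sketch under the MiniGrid 6×6 settings (B = 50, L = 7, H = 6, R = 50, U = 100); all class and function names here are illustrative placeholders, not the authors' implementation:

```python
from collections import deque
import random

B, L, H, R, U = 50, 7, 6, 50, 100  # batch, seq. length, imagination horizon, seed episodes, updates

class ReplayBuffer:
    """Episode store from which fixed-length sub-trajectories are sampled."""

    def __init__(self, capacity=10_000):
        self.episodes = deque(maxlen=capacity)

    def add(self, episode):
        self.episodes.append(episode)

    def sample(self, batch_size, seq_len):
        # Draw batch_size sub-trajectories of length seq_len.
        batch = []
        for _ in range(batch_size):
            ep = random.choice(self.episodes)
            start = random.randrange(max(1, len(ep) - seq_len + 1))
            batch.append(ep[start:start + seq_len])
        return batch

def collect_episode(env_step, policy, length=20):
    # Roll out one fixed-length episode; env_step and policy are stubs here.
    return [env_step(policy(t)) for t in range(length)]

def train(world_model_update, behavior_update, env_step, policy, n_episodes=10):
    buffer = ReplayBuffer()
    for _ in range(R):                      # seed the buffer with R random episodes
        buffer.add(collect_episode(env_step, lambda t: None))
    for _ in range(n_episodes):
        buffer.add(collect_episode(env_step, policy))
        for _ in range(U):                  # U gradient updates per collected episode
            batch = buffer.sample(B, L)     # B sequences of length L
            world_model_update(batch)       # fit the world model on real sequences
            behavior_update(batch, H)       # improve the policy on H-step imagined rollouts
```

The same skeleton covers the DMC tasks by swapping in B = 30, L = 30, H = 10 as quoted above.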
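On question (iii), the "contrastive" variant contrasts latent states against observation embeddings instead of scoring observations under a reconstruction likelihood. As a point of reference only, a standard InfoNCE-style contrastive loss (the paper's exact objective may differ; all names below are illustrative) can be sketched as:

```python
import numpy as np

def info_nce(z_states, z_obs, temperature=1.0):
    """InfoNCE-style contrastive loss between latent states and observation
    embeddings. Matched pairs (row i with row i) are positives; every other
    row in the batch serves as a negative. Both inputs: shape (batch, dim)."""
    # Cosine-similarity logits between every state/observation pair.
    z_states = z_states / np.linalg.norm(z_states, axis=1, keepdims=True)
    z_obs = z_obs / np.linalg.norm(z_obs, axis=1, keepdims=True)
    logits = z_states @ z_obs.T / temperature            # (batch, batch)
    # Log-softmax across each row; keep the diagonal (positive pairs).
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Because this loss only needs similarity scores in embedding space, it avoids reconstructing pixels, which is the usual motivation for preferring contrastive objectives when backgrounds vary (question (v)).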