Maximum Total Correlation Reinforcement Learning
Authors: Bang You, Puze Liu, Huaping Liu, Jan Peters, Oleg Arenz
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our algorithm on simulated robotic control tasks and show that the learned policies induce more periodic and better compressible trajectories that exhibit superior robustness to noise and changes in dynamics compared to baseline methods, while also improving performance in the original tasks. ... 5. Experimental Evaluation |
| Researcher Affiliation | Academia | 1School of Information Engineering, Wuhan University of Technology, Wuhan, China 2Department of Computer Science, Tsinghua University, Beijing, China 3Intelligent Autonomous Systems Lab, Technische Universität Darmstadt, Darmstadt, Germany 4Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany 5Hessian Centre for Artificial Intelligence (Hessian.AI) 6Centre for Cognitive Science (Cog Sci). Correspondence to: Huaping Liu <EMAIL>. |
| Pseudocode | No | The paper describes the proposed algorithm, MTC-RL, and its components, objective functions, and optimization strategies in detail within Sections 3 and 4, and Appendix A. However, it does not present a formal pseudocode block or algorithm box. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/BangYou01/MTC. |
| Open Datasets | Yes | We performed experiments to investigate how our total correlation objective compares to vanilla soft actor-critic (Haarnoja et al., 2018) and the closely related alternative methods RPC (Eysenbach et al., 2021), LZ-SAC (Saanum et al., 2023) and SPAC (Saanum et al., 2023) in terms of performance on the original RL objective (Sec. 5.1 and Sec. 5.4), robustness to noise, dynamics mismatch and spurious correlation (Sec. 5.2), and consistency of the resulting trajectories (Sec. 5.3). ... eight continuous control tasks from the DeepMind Control (DMC) suite (Tassa et al., 2018), ... eight robotic manipulation tasks from the Metaworld benchmark (Yu et al., 2020). ... six image-based DMC tasks from the PlaNet benchmark (Hafner et al., 2019). |
| Dataset Splits | Yes | We initialize the replay buffer with 5000 samples from the initial policy and train all agents for 1 million steps. We evaluate the agent every 20000 steps. ... For each task, the episode length is set to 1000 steps, and the action vector is bounded into [-1, 1]. ... Each run includes 30 evaluation trajectories. ... For each run, we collect 10 evaluation episodes. |
| Hardware Specification | Yes | We performed every experiment on an Intel(R) Xeon(R) E5-2620 CPU with a GeForce GTX 2080 Ti graphics card and used approximately one day for training. |
| Software Dependencies | No | We implement our algorithm on top of the common PyTorch implementation of the SAC algorithm (Yarats et al., 2021). We use the official implementation provided by Saanum et al. (2023) to obtain the results for LZ-SAC, since the official implementation is based on the same codebase as SAC and the hyperparameters have been tuned to achieve good results on DMC tasks. ... The LSTM module is implemented using the common nn.LSTM class provided by PyTorch. ... We measure the compressibility of trajectories using the bzip2 algorithm, which is easily available through the common bz2 Python package. While PyTorch and bzip2 are mentioned, specific version numbers for these software components or Python itself are not provided. |
| Experiment Setup | Yes | We use the default hyperparameters from that implementation unless specified otherwise. Detailed descriptions of the SAC implementation are available in (Yarats et al., 2021). ... Table 2. Hyperparameters used in MTC. ... Table 3. Hyperparameters used in Image-based tasks. ... B.2. Implementation Details |
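The bzip2-based compressibility measure quoted under Software Dependencies can be sketched with Python's standard-library `bz2` module. The serialization (rounding values to a fixed precision before encoding) is an assumption for illustration; the paper's exact encoding of trajectories is not specified here.

```python
import bz2
import math
import random


def compressibility(trajectory, precision=2):
    """Ratio of bzip2-compressed size to raw size for a trajectory.

    trajectory: a sequence of floats (e.g. a flattened action sequence).
    precision: decimals to round to before serializing -- an assumed
    preprocessing step, not taken from the paper. Lower ratios mean the
    trajectory is more compressible (e.g. more periodic).
    """
    raw = ",".join(f"{x:.{precision}f}" for x in trajectory).encode()
    return len(bz2.compress(raw)) / len(raw)


# A periodic trajectory should compress better than a noisy one.
random.seed(0)
periodic = [math.sin(0.1 * t) for t in range(1000)]
noisy = [random.uniform(-1.0, 1.0) for _ in range(1000)]
assert compressibility(periodic) < compressibility(noisy)
```

This matches the intuition the paper tests in Sec. 5.3: policies that induce more periodic trajectories yield smaller compressed sizes under bzip2.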