Ergodic Generative Flows
Authors: Leo Maxime Brunswic, Mateo Clémente, Rui Heng Yang, Adam Sigal, Amir Rasouli, Yinchuan Li
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate IL-EGFs on toy 2D tasks and real-world datasets from NASA on the sphere, using the KL-weak FM loss. Additionally, we conduct toy 2D reinforcement learning experiments with a target reward using the FM loss. [...] We proceed with experiments wherein the state space S is either a flat torus T², or sphere S². [...] On S², we benchmark EGFs on the earth science volcano dataset (NGDC/WDS, 2025). [...] Table 1. Negative log-likelihood scores of the volcano dataset. |
| Researcher Affiliation | Industry | ¹Huawei Technologies Canada, Noah's Ark Laboratories; ²Huawei. Correspondence to: Leo Maxime Brunswic <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Ergodic flow: IL training |
| Open Source Code | No | The paper refers only to a third-party implementation for comparison: 'The Moser Flow is trained using the implementation provided in the authors' GitHub repository (Rozen, 2022)'. No statement or link is provided for the authors' own code for Ergodic Generative Flows (EGFs). |
| Open Datasets | Yes | On S², we benchmark EGFs on the earth science volcano dataset (NGDC/WDS, 2025). [...] NGDC/WDS. Global significant volcanic eruptions database. https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ngdc.mgg.hazards:G10147, 2025. |
| Dataset Splits | No | The paper mentions 'negative log-likelihood on a validation dataset' and 'recalculating the negative log-likelihood on the training dataset' in Appendix D.1, implying the use of training and validation splits. However, it does not provide specific details on how these splits were performed, such as percentages, sample counts, or the methodology used to create them for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100, RTX 2080 Ti), CPU models (e.g., Intel Xeon, AMD Ryzen), or detailed computer specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions 'We use the AdamW optimizer' for training but does not provide version numbers for any software, libraries, or programming languages used in the implementation. |
| Experiment Setup | Yes | The MLPs are tanh (hyperbolic tangent) activated and initialized using orthogonal initialization (Saxe et al., 2013; Hu et al., 2020). We use the AdamW optimizer (Kingma & Ba, 2015; Loshchilov & Hutter, 2019) for training. [...] An EGF on S = T² is built with 16 transformations (8 translations and 2 elements of SL_d(Z) together with their inverses). Their MLPs have 5 hidden layers of width 32 to parameterize f and π. [...] The Moser Flow is trained using the implementation provided in the authors' GitHub repository (Rozen, 2022), with the only modification being the model size set to 32x3. [...] The two core MLPs of EGF are of size 256x5, compared to the 512x6 used by Rozen et al. (2021). The learning rate is 1e-3 with an exponential decay to 1e-5 at 3000 epochs of 25 steps. [...] Learning rate is kept at 0.001 (Figure 8 caption). |
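The quoted setup (tanh-activated MLPs with orthogonal initialization, AdamW, and an exponential learning-rate decay from 1e-3 to 1e-5 over 3000 epochs) is concrete enough to sketch. The following is a minimal PyTorch sketch, not the authors' code: the function names and the input/output dimensions are assumptions for illustration; the layer widths follow the T² configuration quoted above.

```python
# Hypothetical sketch of the training setup quoted from the paper.
# Assumed: PyTorch; names (make_mlp, f_net) and in/out dims are illustrative.
import torch
import torch.nn as nn

def make_mlp(in_dim, hidden_width, n_hidden, out_dim):
    """Tanh-activated MLP with orthogonal weight initialization,
    as described in the Experiment Setup row."""
    layers, prev = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(prev, hidden_width), nn.Tanh()]
        prev = hidden_width
    layers.append(nn.Linear(prev, out_dim))
    mlp = nn.Sequential(*layers)
    for m in mlp.modules():
        if isinstance(m, nn.Linear):
            nn.init.orthogonal_(m.weight)
            nn.init.zeros_(m.bias)
    return mlp

# T^2 configuration: 5 hidden layers of width 32 (output dim assumed).
f_net = make_mlp(in_dim=2, hidden_width=32, n_hidden=5, out_dim=1)

# AdamW with exponential decay from 1e-3 to 1e-5 over 3000 epochs:
# per-epoch factor gamma satisfies 1e-3 * gamma**3000 == 1e-5.
optimizer = torch.optim.AdamW(f_net.parameters(), lr=1e-3)
gamma = (1e-5 / 1e-3) ** (1.0 / 3000)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)
```

Calling `scheduler.step()` once per epoch reproduces the stated decay; after 3000 epochs the learning rate reaches 1e-5.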