Bayesian Experimental Design Via Contrastive Diffusions

Authors: Jacopo Iollo, Christophe Heinkelé, Pierre Alliez, Florence Forbes

ICLR 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments and comparison with state-of-the-art methods show the potential of the approach. [...] 6 NUMERICAL EXPERIMENTS Two sequential density-based (Section 6.2) and data-based (Section 6.3) BOED examples are considered to illustrate that our method extends to the sequential case in both settings. [...] Evaluation metrics and comparison. We refer to our method as CoDiff. In Section 6.2, comparison is provided with other recent approaches [...] We also compare with a non-tempered version of this latter approach (SMC) and with a random baseline, where the observations {y1, ..., yK} are simulated with designs generated randomly. [...] The L2 Wasserstein distance is two orders of magnitude lower, suggesting the higher quality of our measurements. [...] This is confirmed quantitatively in Table 1, which reports reconstruction quality as measured by the structural similarity index measure (SSIM).
Researcher Affiliation | Academia | 1: Université Grenoble Alpes, Inria, CNRS, G-INP, France, EMAIL; 2: Cerema, Endsum-Strasbourg, France, EMAIL; 3: Université Côte d'Azur, Inria, France, EMAIL
Pseudocode | Yes | Algorithm 1: Nested-loop optimization [...] Algorithm 2: Single-loop optimization
Open Source Code | Yes | Our code is implemented in JAX (Bradbury et al., 2020) and uses Flax as a neural network library and Optax as an optimization library (Babuschkin et al., 2020). The code is available at https://github.com/jcopo/ContrastiveDiffusions.
Open Datasets | Yes | For the image prior, we consider a diffusion model trained for generation of the MNIST dataset (LeCun et al., 1998).
Dataset Splits | No | The paper mentions that a diffusion model was trained on the MNIST dataset, but it does not specify any training/test/validation splits for this model. For the image reconstruction experiments, it states that "20 ground truth digit images are randomly selected" for evaluation, but this does not constitute specific, reproducible dataset splits for the model or the experimental setup.
Hardware Specification | Yes | The source example (Section 6.2) can be run locally. It was tested on an Apple M1 Pro chip with 16 GB of memory, but faster running times can be achieved on GPU. The MNIST example (Section 6.3) was run on a single A100 80 GB GPU.
Software Dependencies | No | Our code is implemented in JAX (Bradbury et al., 2020) and uses Flax as a neural network library and Optax as an optimization library (Babuschkin et al., 2020). The paper names the software used (JAX, Flax, Optax) but does not provide version numbers for these dependencies, which are required for reproducibility.
Experiment Setup | Yes | In the notation of the single-loop Algorithm 2, we consider Σ_t^{Y,θ|D_{k-1}}(p^(t), ξ_t) and Σ_t^{θ|D_{k-1}}(q^(t), ξ_t, ρ̂_{t+1}) operators that correspond respectively to the update of batch samples of size N = 200 and M = 200. [...] The Langevin step-size in the DiGS method (Chen et al., 2024) was set to 10^-2. The joint optimization-sampling loop was run for 5000 steps. [...] the time-varying SDE (34) with a noise schedule β(t) = bmin + (bmax - bmin)(t - t0)/(T - t0) (with bmax = 5, bmin = 0.2, t0 = 0, T = 2). The training of the usual score matching was done for 3000 epochs with a batch size of 256, using the Adam optimizer (Kingma and Ba, 2015). We used gradient clipping, and the training was done on a single A100 GPU.