Uncertainty modeling for fine-tuned implicit functions
Authors: Anna Susmelj, Mael Macuglia, Natasa Tagasovska, Reto Sutter, Sebastiano Caprara, Jean-Philippe Thiran, Ender Konukoglu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the efficacy of our approach through a series of experiments, starting with toy examples and progressing to a real-world scenario. Specifically, we train a Convolutional Occupancy Network on synthetic anatomical data and test it on low-resolution MRI segmentations of the lumbar spine. Our results show that Dropsembles achieve the accuracy and calibration levels of deep ensembles but with significantly less computational cost. |
| Researcher Affiliation | Collaboration | Anna Susmelj (ETH AI Center), Mael Macuglia (ETH Zürich), Nataša Tagasovska (Prescient/MLDD, Genentech), Reto Sutter and Sebastiano Caprara (Balgrist University Hospital, USZ), Jean-Philippe Thiran (LTS5, EPFL), Ender Konukoglu (ETH Zürich) |
| Pseudocode | Yes | Algorithm 1 Dropsembles |
| Open Source Code | Yes | Here we introduce Dropsembles, a method that creates ensembles based on the dropout technique. https://github.com/klanita/Dropsembles |
| Open Datasets | Yes | MNIST digit reconstruction In our next experiment, we explore the reconstruction from sparse inputs using the MNIST dataset... ShapeNet We demonstrate the applicability of Dropsembles for modeling uncertainty in SDFs on the commonly used ShapeNet dataset... As the target dataset B, we use a publicly available dataset of MR+CT images from 20 subjects (Cai et al., 2015). |
| Dataset Splits | Yes | Toy experiment We generated two-dimensional datasets A and B with a sinusoidal decision boundary and Gaussian noise... We created 1000 training samples and 500 test samples for dataset A, and 50 training and 500 test samples for dataset B... MNIST digit reconstruction... For dataset A, we utilized all images of the single digit 7 from the MNIST training set... We randomly selected 20 distinct images from the MNIST test split... ShapeNet... For dataset A, we utilized 1,000 randomly selected airplane shapes. For dataset B, we selected 10 unseen airplane shapes... Lumbar spine... We randomly selected 3 subjects for the subsequent fine-tuning and testing. |
| Hardware Specification | Yes | All experiments in this section were performed on an NVIDIA RTX A6000 GPU equipped with 48 GB of memory. The networks were trained with 16-bit mixed precision due to memory constraints. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We used a consistent model architecture across experiments: a straightforward 3-layer MLP with 256 hidden units in each layer. For methods using dropout, a dropout layer (p = 0.3) followed each linear layer... We trained for 800 epochs with a learning rate of 1e-3, and tuned for 600 epochs with a learning rate of 5e-3... The 8-layer MLP occupancy network was trained with cross-entropy loss for 50 epochs on dataset A with a learning rate of 0.005 and a cosine warmup scheduler... The 8-layer DeepSDF MLP network was trained on dataset A for 100 epochs using a clipped L1 loss (with clipping parameter δ = 0.1), a learning rate of 0.001, and a step scheduler... We used a learning rate of 0.01 and trained for 100 epochs with early stopping... A learning rate of 0.001 and a batch size of 32 were used for training... |
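The setup above (a 3-layer MLP with 256 hidden units and a dropout layer with p = 0.3 after each linear layer) and the dropout-based ensembling idea behind Dropsembles can be illustrated with a minimal numpy sketch. This is only a schematic of the mask-sampling step: each ensemble member is defined by a fixed sampled dropout mask, and predictions are aggregated into a mean and variance. All weights, function names, and dimensions here are illustrative assumptions; the authors' actual method also fine-tunes each member on the target dataset, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy random weights for a 3-layer MLP with 256 hidden units,
# mirroring the architecture described in the Experiment Setup row.
# (Illustrative values only, not the trained network from the paper.)
W1 = rng.normal(scale=0.1, size=(2, 256))
W2 = rng.normal(scale=0.1, size=(256, 256))
W3 = rng.normal(scale=0.1, size=(256, 1))

def forward(x, masks, p=0.3):
    """Forward pass with a fixed dropout mask after each hidden linear layer."""
    h = np.maximum((x @ W1) * masks[0] / (1 - p), 0.0)  # linear -> dropout -> ReLU
    h = np.maximum((h @ W2) * masks[1] / (1 - p), 0.0)
    return h @ W3

# Dropsembles-style idea: sample one dropout mask per ensemble member
# (in the full method each masked member would then be fine-tuned),
# then aggregate member predictions for uncertainty estimation.
p = 0.3
x = rng.normal(size=(8, 2))
preds = []
for _ in range(5):
    masks = [rng.random(256) > p, rng.random(256) > p]
    preds.append(forward(x, masks, p))
preds = np.stack(preds)                   # (members, batch, 1)
mean, var = preds.mean(0), preds.var(0)   # predictive mean and uncertainty
```

The variance across members serves as the per-input uncertainty signal; with five members this stays far cheaper than retraining a full deep ensemble from scratch, which matches the computational-cost claim quoted in the Research Type row.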