reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

Authors: Hantao Zhang, Yuhe Liu, Jiancheng Yang, Shouhong Wan, Xinyuan Wang, Wei Peng, Pascal Fua

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Validated on 3D cardiac lesion MRI and lung nodule CT datasets, LeFusion-generated data significantly improves the performance of state-of-the-art segmentation models, including nnUNet and Swin UNETR. Code and model are available at https://github.com/M3DV/LeFusion.
Researcher Affiliation	Academia	(1) Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland; (2) University of Science and Technology of China (USTC), China; (3) Beihang University, China; (4) Stanford University, USA
Pseudocode	No	The paper describes the diffusion model and its training objectives using mathematical equations (Eq. 1-6) and descriptive text, but no explicit pseudocode or algorithm block is provided.
Open Source Code	Yes	Code and model are available at https://github.com/M3DV/LeFusion.
Open Datasets	Yes	LIDC: Multi-Peak Lung Nodule CT. We use LIDC dataset (Armato III et al., 2011)... Emidec: Multi-Class Cardiac Lesion MRI. The Emidec dataset (Lalande et al., 2022)
Dataset Splits	Yes	LIDC: The dataset was divided into an 808-case training set, comprising 2,104 lung nodule ROIs, and a 202-case test set, containing 520 lung nodule ROIs. Additionally, 3,076 normal (N) ROIs were cropped from the 135 healthy patients... Emidec: We split the 67 P cases into 57 for training and 10 for testing.
Hardware Specification	Yes	For the entire experiment, we used 6*A100 (40G) GPUs
Software Dependencies	Yes	Python 3.8 and PyTorch version 2.4.0.
Experiment Setup	Yes	all diffusion models were set to 300 timesteps. For both datasets, we adopted a learning rate of 1e-4 and a batch size of 16. The training process required approximately 30,000 timesteps for the cardiac dataset and 40,000 timesteps for the LIDC lung nodule dataset. For downstream tasks, both Swin UNETR and nnUNet were trained for 200 epochs.
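
The split sizes and hyperparameters quoted in the table can be gathered into one record per dataset, which makes the reported numbers easy to cross-check. The sketch below is illustrative only: the `ReportedSetup` class and its field names are hypothetical and are not taken from the LeFusion repository, and the training-length values are the approximate figures quoted in the Experiment Setup row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table above; class and field names are hypothetical."""
    dataset: str
    diffusion_timesteps: int   # "all diffusion models were set to 300 timesteps"
    learning_rate: float       # 1e-4 for both datasets
    batch_size: int            # 16 for both datasets
    approx_training_steps: int # quoted as "timesteps"; approximate
    train_cases: int
    test_cases: int

    @property
    def total_cases(self) -> int:
        return self.train_cases + self.test_cases

    @property
    def test_fraction(self) -> float:
        return self.test_cases / self.total_cases


# LIDC: 808 training cases, 202 test cases, about 40,000 training steps.
LIDC = ReportedSetup("LIDC", 300, 1e-4, 16, 40_000, 808, 202)
# Emidec: 67 pathological cases split 57 / 10, about 30,000 training steps.
EMIDEC = ReportedSetup("Emidec", 300, 1e-4, 16, 30_000, 57, 10)

for setup in (LIDC, EMIDEC):
    print(f"{setup.dataset}: {setup.total_cases} cases, "
          f"test fraction {setup.test_fraction:.2f}")
```

Under these figures the LIDC split is 808 + 202 = 1,010 cases (a 20% test share), and the Emidec split is 57 + 10 = 67 cases (about 15%), which agrees with the 67 pathological cases stated in the Dataset Splits row.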