reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Enhancing Plaque Segmentation in CCTA with Prompt-based Diffusion Data Augmentation

Authors: Ruan Yizhe, Xuangeng Chu, Ziteng Cui, Yusuke Kurose, Junichi Iho, Yoji Tokunaga, Makoto Horie, Yusaku Hayashi, Keisuke Nishizawa, Yasushi Koyama, Tatsuya Harada

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate PromptLesion on a private CCTA dataset and multi-organ tumor segmentation tasks (kidney, liver, pancreas) using public datasets, achieving superior performance compared to baseline methods. Models trained with our prompt-guided synthetic augmentation significantly improve Dice Similarity Coefficient (DSC) scores for both plaque and tumor segmentation. Extensive evaluations and ablation studies confirm the effectiveness of prompt conditioning.
Researcher Affiliation	Academia	Yizhe Ruan (1,2), Xuangeng Chu (1), Ziteng Cui (1), Yusuke Kurose (1,2), Junichi Iho (3), Yoji Tokunaga (3), Makoto Horie (3), Yusaku Hayashi (3), Keisuke Nishizawa (3), Yasushi Koyama (3,2), Tatsuya Harada (1,2). (1) The University of Tokyo; (2) RIKEN Center for Advanced Intelligence Project; (3) Sakurabashi Watanabe Advanced Healthcare Hospital. EMAIL
Pseudocode	No	The paper describes the methodology using textual explanations and figures, but it does not include a clearly labeled pseudocode block or algorithm.
Open Source Code	Yes	1https://github.com/Ruan Yizhe-77/Prompt Lesion TMLR
Open Datasets	Yes	Public Tumor Datasets. For tumor segmentation, we used the same public datasets as DiffTumor Chen et al. (2024): the Kidney Tumor dataset (KiTS) Heller et al. (2020), the Liver Tumor dataset (LiTS) Bilic et al. (2023), and the Pancreas Tumor dataset (MSD Pancreas) Antonelli et al. (2022).
Dataset Splits	Yes	For plaque-related experiments, we used a private set of 100 cardiac CTA volumes acquired at the Sakurabashi Watanabe Advanced Healthcare Hospital. Following the data split ratio of a prior work Ruan et al. (2025), the cohort was divided into a training set of 70 patients, a validation set of 10, and a test set of 20. The annotation procedure was carried out by cardiac experts.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies	No	Our training pipeline is implemented using the MONAI framework. No specific version number for MONAI or other key software libraries is provided.
Experiment Setup	Yes	The random seed for all experiments was set to 1234 to ensure reproducibility. Our training pipeline is implemented using the MONAI framework. To address class imbalance, we employed MONAI's class-aware cropping methods, specifically RandCropByPosNegLabeld and RandCropByLabelClassesd, to ensure that training patches were sampled with a balanced representation of different lesion types. For data augmentation, we applied only random 90-degree rotations (RandRotate90d with a probability of 0.1). All models are trained for 500 epochs with early stopping on the validation set. We use the DiceCELoss Sudre et al. (2017) for all segmentation training. The proportion of synthetic data is tuned on the validation set; we found that a 1:1 ratio of synthetic to real data achieved the best results without overwhelming the network with possibly redundant synthetic samples.
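
The Dataset Splits and Experiment Setup rows above report a 70/10/20 patient split, a fixed seed of 1234, a 1:1 synthetic-to-real ratio, and DSC as the evaluation metric. The following is a minimal sketch of how those numbers could fit together, written in plain Python rather than MONAI so that it runs without extra dependencies. It is not the authors' code: the function names, the integer patient IDs, the random shuffling of the cohort, and the reading of "1:1" as one synthetic case per real training case are all assumptions made for illustration.

```python
import random

SEED = 1234  # seed reported in the Experiment Setup row


def split_patients(patient_ids, n_train=70, n_val=10, n_test=20, seed=SEED):
    """Shuffle patient IDs with a fixed seed and cut them into train/val/test.

    Assumption: the summary gives only the split sizes, not how patients
    were assigned, so the seeded shuffle here is hypothetical.
    """
    if len(patient_ids) != n_train + n_val + n_test:
        raise ValueError("split sizes must sum to the number of patients")
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def mix_synthetic(real_cases, synthetic_cases, ratio=1.0, seed=SEED):
    """Return the real cases plus round(ratio * len(real_cases)) synthetic ones.

    Assumption: ratio=1.0 is one way to read the reported 1:1
    synthetic-to-real proportion.
    """
    n_synth = round(ratio * len(real_cases))
    if n_synth > len(synthetic_cases):
        raise ValueError("not enough synthetic cases for the requested ratio")
    rng = random.Random(seed)
    return list(real_cases) + rng.sample(list(synthetic_cases), n_synth)


def dice(pred, target):
    """Dice similarity coefficient for two flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    train, val, test = split_patients(list(range(100)))
    synthetic_pool = [f"synthetic_{i}" for i in range(200)]  # hypothetical pool
    training_set = mix_synthetic(train, synthetic_pool)
    print(len(train), len(val), len(test), len(training_set))
```

Using a separate `random.Random(seed)` instance in each function keeps the split and the synthetic sampling reproducible independently of any other random calls in a larger pipeline.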