reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dynamic Entity-Masked Graph Diffusion Model for Histopathology Image Representation Learning

Authors: Zhenfeng Zhuang, Min Cen, Yanfeng Li, Fangyu Zhou, Lequan Yu, Baptiste Magnier, Liansheng Wang

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our model has conducted pretraining experiments on three large histopathological datasets. The advanced predictive performance and interpretability of H-MGDM are clearly evaluated on comprehensive downstream tasks such as classification and survival analysis on six datasets.
Researcher Affiliation	Academia	(1) Department of Computer Science at School of Informatics, Xiamen University, Xiamen, China; (2) School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, Anhui, China; (3) School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China; (4) EuroMov Digital Health in Motion, Univ Montpellier, IMT Mines Ales, Ales, France; (5) Service de Medecine Nucleaire, Centre Hospitalier Universitaire de Nimes, Universite de Montpellier, Nimes, France
Pseudocode	No	The paper describes the methodology using textual explanations and mathematical equations, but it does not contain a clearly labeled pseudocode block or algorithm.
Open Source Code	Yes	Code: https://github.com/centurion-crawler/H-MGDM
Open Datasets	Yes	Datasets: The proposed framework underwent pretraining and classification using three large histopathology datasets: Komura et al. (Komura et al. 2022) (1.6M images, 32 cancer types), the Prostate ANnotation Data Archive (PANDA) dataset (Bulten et al. 2022) (11,000 digitized H&E-stained WSIs, yielding 12.5M images with 6-level Gleason region annotations), and our in-house colorectal cancer data IBD (23M images, with 360K patches annotated into 9 common tissue types). For the survival analysis task, we compare the performances on two public cohorts, TCGA-KIRC (512 cases) and TCGA-ESCA (155 cases), and one privately collected primary-metastatic pathology colorectal cancer dataset, CRC-PM (388 cases).
Dataset Splits	Yes	"Survival Analysis performance across SOTAs features on external public validation data by conducting the 5-fold evaluation procedure with 5 runs. The experimental results of CI are reported by mean ± std." and "Based on the median survival risk output by our model, each cancer cohort is divided into high-risk and low-risk groups."
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions 'Experiments involving H-MGDM and comparative methods are conducted...'
Software Dependencies	No	The paper mentions using 'Adam' for optimization and 'SLIC algorithms' for partitioning, but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA, or their versions).
Experiment Setup	Yes	Experiments involving H-MGDM and comparative methods are conducted with a batch size set to 64 max. SLIC with the initial region number of 500, window size a is 64. Latent downsampling factors f = 2. Also, pre-training is optimized by Adam (Kingma and Ba 2014) with an initial learning rate of 3e-4 and the plateau scheduler with a minimum learning rate of 1e-5 for 250 epochs. The noise timesteps T is 1000, and the sigmoid schedule {β(t)}, t ∈ [0, T], from 1e-7 to 2e-3 is used.
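
The noise schedule quoted under Experiment Setup can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sigmoid_beta_schedule`, the sigmoid input range of [-6, 6] (the `span` parameter), and the rescaling that makes the endpoints land exactly on 1e-7 and 2e-3 are all assumptions. Only T = 1000 and the two endpoint values come from the entry above.

```python
import math


def sigmoid_beta_schedule(timesteps=1000, beta_start=1e-7, beta_end=2e-3, span=6.0):
    """Return `timesteps` noise variances following a sigmoid curve.

    Assumption: the sigmoid is evaluated on [-span, span] and rescaled so the
    first and last values equal beta_start and beta_end. The source only
    states the endpoints and the number of timesteps.
    """
    # Evenly spaced sigmoid inputs from -span to +span.
    xs = [-span + 2.0 * span * i / (timesteps - 1) for i in range(timesteps)]
    sig = [1.0 / (1.0 + math.exp(-x)) for x in xs]
    lo, hi = sig[0], sig[-1]
    # Normalize to [0, 1], then map onto [beta_start, beta_end].
    return [beta_start + (s - lo) / (hi - lo) * (beta_end - beta_start) for s in sig]


betas = sigmoid_beta_schedule()

# Cumulative product of (1 - beta_t): the fraction of signal variance
# retained after t forward diffusion steps.
alpha_bar = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bar.append(running)
```

Under these assumptions the schedule rises slowly at both ends and fastest around the midpoint, so most of the noise is injected in the middle timesteps. The repository linked under Open Source Code is the reference for the schedule actually used.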