Dynamic Entity-Masked Graph Diffusion Model for Histopathology Image Representation Learning

Authors: Zhenfeng Zhuang, Min Cen, Yanfeng Li, Fangyu Zhou, Lequan Yu, Baptiste Magnier, Liansheng Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our model has conducted pretraining experiments on three large histopathological datasets. The advanced predictive performance and interpretability of H-MGDM are clearly evaluated on comprehensive downstream tasks such as classification and survival analysis on six datasets.
Researcher Affiliation | Academia | (1) Department of Computer Science at School of Informatics, Xiamen University, Xiamen, China; (2) School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, Anhui, China; (3) School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China; (4) EuroMov Digital Health in Motion, Univ Montpellier, IMT Mines Alès, Alès, France; (5) Service de Médecine Nucléaire, Centre Hospitalier Universitaire de Nîmes, Université de Montpellier, Nîmes, France
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, but it does not contain a clearly labeled pseudocode block or algorithm.
Open Source Code | Yes | Code: https://github.com/centurion-crawler/H-MGDM
Open Datasets | Yes | Datasets: The proposed framework underwent pretraining and classification using three large extensive histopathology datasets: Komura et al. (Komura et al. 2022) (1.6M images, 32 cancer types), Prostate ANnotation Data Archive (PANDA) dataset (Bulten et al. 2022) (11,000 digitized H&E-stained WSIs, obtaining 12.5M images with 6-level Gleason region annotations), our in-house colorectal cancer data IBD (23M images, with 360K patches annotated into 9 common tissue types). For the survival analysis task, we compare the performances on two public cohorts: TCGA-KIRC (512 cases) and TCGA-ESCA (155 cases) and one privately collected primary-metastatic pathology colorectal cancer dataset CRC-PM (388 cases).
Dataset Splits | Yes | "Survival Analysis performance across SOTAs features on external public validation data by conducting the 5-fold evaluation procedure with 5 runs. The experimental results of CI are reported by mean ± std." and "Based on the median survival risk output by our model, each cancer cohort is divided into high-risk and low-risk groups."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions 'Experiments involving H-MGDM and comparative methods are conducted...'
Software Dependencies | No | The paper mentions using 'Adam' for optimization and 'SLIC algorithms' for partitioning, but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA, or their versions).
Experiment Setup | Yes | Experiments involving H-MGDM and comparative methods are conducted with a batch size set to 64 max. SLIC with the initial region number of 500, window size a is 64. Latent downsampling factor f = 2. Also, pre-training is optimized by Adam (Kingma and Ba 2014) with an initial learning rate of 3e-4 and the plateau scheduler with a minimum learning rate of 1e-5 for 250 epochs. The number of noise timesteps T is 1000, and the sigmoid schedule {β(t)}, t ∈ [0, T], from 1e-7 to 2e-3 is used.
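The Dataset Splits entry mentions a 5-fold procedure with 5 runs, a concordance index (CI) reported as mean ± std, and a median-risk split into high-risk and low-risk groups. The sketch below shows one plausible reading of that protocol in plain Python. It is not taken from the H-MGDM repository: the function names, the per-run reshuffling, the seed handling, and the use of Harrell's pairwise definition of the concordance index are all assumptions made for illustration.

```python
import random
import statistics


def median_risk_split(risks):
    """Label a case 'high' if its predicted risk exceeds the cohort median, else 'low'."""
    cutoff = statistics.median(risks)
    return ["high" if r > cutoff else "low" for r in risks]


def repeated_kfold_indices(n_cases, n_folds=5, n_runs=5, seed=0):
    """Yield (run, fold, train_idx, test_idx) for repeated shuffled k-fold.

    Assumption: each run reshuffles the cases with a different seed.
    """
    for run in range(n_runs):
        order = list(range(n_cases))
        random.Random(seed + run).shuffle(order)
        folds = [order[k::n_folds] for k in range(n_folds)]
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield run, k, train_idx, test_idx


def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when case i had an observed event and a
    shorter survival time than case j. Tied risks count as half-correct.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored cases cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_std(values):
    """Summarise per-fold scores as (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)
```

With the default arguments, `repeated_kfold_indices` produces 25 train/test partitions per cohort; a score such as `concordance_index` would be computed on each test partition and then summarised with `mean_std`.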
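The Experiment Setup entry gives T = 1000 timesteps and a sigmoid schedule for β(t) ranging from 1e-7 to 2e-3, but not its exact functional form. The following is a minimal sketch of one common way to build such a schedule. The `span` parameter (how much of the sigmoid curve is sampled) and the rescaling that makes the first and last values hit the stated endpoints exactly are assumptions, not details confirmed by the paper.

```python
import math


def sigmoid_beta_schedule(num_steps=1000, beta_start=1e-7, beta_end=2e-3, span=6.0):
    """Return num_steps noise variances rising from beta_start to beta_end.

    The values follow a sigmoid sampled on [-span, span] and rescaled to [0, 1].
    Both `span` and the rescaling are illustrative assumptions.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    lo, hi = sigmoid(-span), sigmoid(span)
    betas = []
    for i in range(num_steps):
        x = -span + 2.0 * span * i / (num_steps - 1)
        unit = (sigmoid(x) - lo) / (hi - lo)  # 0 at the first step, 1 at the last
        betas.append(beta_start + unit * (beta_end - beta_start))
    return betas


def cumulative_alphas(betas):
    """Return the running product of (1 - beta_t), the usual alpha-bar sequence."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


betas = sigmoid_beta_schedule()
alpha_bars = cumulative_alphas(betas)
```

The S-shaped curve keeps the noise level very low in the early timesteps and near its maximum in the late ones, concentrating most of the change in the middle of the diffusion process.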