reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Global Context-aware Representation Learning for Spatially Resolved Transcriptomics

Authors: Yunhak Oh, Junseok Lee, Yeongmin Kim, Sangwoo Seo, Namkyeong Lee, Chanyoung Park

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate the superiority of Spotscape in various downstream tasks, including single-slice and multi-slice scenarios.
Researcher Affiliation	Academia	(1) Graduate School of Data Science, KAIST, Daejeon, Republic of Korea; (2) Department of Industrial & Systems Engineering, KAIST, Daejeon, Republic of Korea; (3) Department of Mathematical Sciences, KAIST, Daejeon, Republic of Korea. Correspondence to: Chanyoung Park <EMAIL>.
Pseudocode	Yes	C. Pseudo Code: In this section, we provide pseudocode of Spotscape in Algorithm 1.
Open Source Code	Yes	Our code is available at the following link: https://github.com/yunhak0/Spotscape.
Open Datasets	Yes	We conduct a comprehensive evaluation of Spotscape across five datasets derived from different sequencing technologies. For single-slice experiments, we use the dorsolateral prefrontal cortex (DLPFC) dataset... Additionally, we assess the middle temporal gyrus (MTG) dataset... as well as the Mouse embryo dataset. Lastly, we utilize non-small cell lung cancer (NSCLC) data... In multi-slice experiments, we integrate the four slices from the same patient in the DLPFC dataset... while analyzing the differences between the control and AD groups in the MTG dataset... Lastly, we evaluate heterogeneous alignment using the Mouse embryo dataset... and the Breast Cancer dataset... Further details about data statistics can be found in Table 5 of Appendix A.
Dataset Splits	No	The paper does not explicitly provide training/test/validation dataset splits or specific percentages/counts for the datasets used in its experiments. It mentions using K-means clustering for spatial domain identification, which is an unsupervised task, and describes multi-slice integration/alignment, but no detailed splits for model training or evaluation phases.
Hardware Specification	Yes	All the experiments are conducted on Intel Xeon Gold 6326 CPU and NVIDIA GeForce A6000 (48GB).
Software Dependencies	Yes	Spotscape is implemented in Python 3 (version 3.9.7) using PyTorch 2.1.1 (https://pytorch.org/) with Pytorch Geometric (https://github.com/pyg-team/pytorch_geometric) packages.
Experiment Setup	Yes	To ensure a fair comparison, we conducted a hyperparameter search for both Spotscape and the baseline methods. The best-performing hyperparameters were selected by evaluating the NMI with the first seed. Specifically, for Spotscape, the hyperparameter search was conducted only for the learning rate, with the search space consisting of {0.00001, 0.00005, 0.0001, 0.0005, 0.001}. The remaining hyperparameters were fixed, and the ones used to report the experimental results are listed in Table 7. Table 7 (Hyperparameter settings of Spotscape) lists specific values for λRecon, λSC, λPCL, λSS, GCN encoder dimensions, τ, Top-k, Training epochs, Warm-up epochs, Learning rate, Feature masking rate, and Edge masking rate.
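
The selection procedure quoted in the Experiment Setup row (search only the learning rate, keep the value with the best NMI on the first seed) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: `run_fn` is a hypothetical callable standing in for "train Spotscape with this learning rate and seed, then cluster the embeddings", the seed value 0 is an assumption, and the arithmetic-mean NMI normalization is assumed because the quoted text does not say which variant was used.

```python
import math
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple

# Search space quoted in the Experiment Setup row.
LEARNING_RATES = [0.00001, 0.00005, 0.0001, 0.0005, 0.001]


def nmi(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Normalized mutual information (arithmetic-mean normalization, assumed)."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts: Counter) -> float:
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # I(T; P) = sum over joint cells of p(t, p) * log(p(t, p) / (p(t) * p(p)))
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )
    denom = (entropy(count_true) + entropy(count_pred)) / 2
    # Both labelings consist of a single cluster: treat as perfect agreement.
    return mutual_info / denom if denom > 0 else 1.0


def select_learning_rate(
    run_fn: Callable[[float, int], Sequence[int]],  # hypothetical train-and-cluster step
    labels_true: Sequence[int],
    learning_rates: Sequence[float] = LEARNING_RATES,
    first_seed: int = 0,  # assumed value; the source only says "the first seed"
) -> Tuple[float, Dict[float, float]]:
    """Return the learning rate with the highest first-seed NMI, plus all scores."""
    scores = {lr: nmi(labels_true, run_fn(lr, first_seed)) for lr in learning_rates}
    best_lr = max(scores, key=scores.get)
    return best_lr, scores
```

In use, `run_fn` would wrap whatever training and clustering routine is available, and the returned learning rate would then be reused for the remaining seeds while every other hyperparameter stays fixed at its Table 7 value.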