Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation
Authors: He Li, Haoang Chi, Mingyu Liu, Wanrong Huang, Liyang Xu, Wenjing Yang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the effectiveness of our approach, we conduct simulation experiments and real data experiments. Simulation experiments show that our estimator has a stronger estimation capability than baseline methods. Real data experiments provide a valuable conclusion about the causal effect of conflicts on forest loss in Colombia. The source code is available at this URL. |
| Researcher Affiliation | Academia | 1College of Computer Science and Technology, National University of Defense Technology, Changsha, China 2Intelligent Game and Decision Lab, Academy of Military Science, Beijing, China. Correspondence to: Wenjing Yang <EMAIL>. |
| Pseudocode | No | The paper describes the deep-learning-based realization of their estimators in Section 4.6 and shows a 'Full model architecture' in Figure 2, but it does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at this URL. |
| Open Datasets | Yes | The Global Forest Change Data is an annually updated global dataset capturing forest loss derived from time-series images acquired by the Landsat satellite (Hansen et al., 2013). ... Data is publicly available online from https://glad.earthengine.app/view/global-forest-change. The UCDP Georeferenced Event Dataset is a comprehensive dataset documenting global conflicts, providing details on parameters such as the temporal aspects and geographical coordinates of conflict events (Croicu & Sundberg, 2015). ... Data is publicly available online from https://ucdp.uu.se/downloads/index.html#ged_global. |
| Dataset Splits | No | The paper mentions generating synthetic datasets with different time lengths (32, 48, 64) and using real-world data from 2002 to 2022. However, it does not provide specific details on how these datasets were split into training, validation, or testing sets (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | Yes | All experiments were conducted on an NVIDIA RTX 4090 GPU and an Intel Core i7 14700KF processor. |
| Software Dependencies | No | The paper mentions using "CNNs" and a "Transformer-based neural network", as well as "the torch.nn module in PyTorch to implement linear regression" and "a Python library called EconML". However, it does not provide specific version numbers for any of these software components (e.g., PyTorch version, EconML version, Python version). |
| Experiment Setup | Yes | For the CNNs in Section 4.6.1: epochs = 200, learning rate = 0.001, batch size = 64. For the Transformer-based neural network in Figure 2: 8 blocks in the Transformer encoder, 16 attention heads per layer, 8 layers in the Multi-Layer Perceptron (MLP) decoder, epochs = 300, learning rate = 0.0001. |
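The reported setup can be sketched in PyTorch as follows. This is a minimal illustration of the stated hyperparameters only (8 encoder blocks, 16 heads, an 8-layer MLP decoder, learning rate 1e-4); the embedding width `D_MODEL`, the input/output shapes, and the mean-pooling step are assumptions, since the paper's full architecture (Figure 2) is not reproduced here.

```python
import torch
import torch.nn as nn

D_MODEL = 32          # assumed embedding width (not stated in the review)
N_BLOCKS = 8          # Transformer encoder blocks (from the paper)
N_HEADS = 16          # attention heads per layer (from the paper)
MLP_LAYERS = 8        # layers in the MLP decoder (from the paper)

class EstimatorSketch(nn.Module):
    """Toy stand-in for the paper's Transformer estimator."""
    def __init__(self, d_model=D_MODEL, out_dim=1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=N_HEADS, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=N_BLOCKS)
        mlp = []
        for _ in range(MLP_LAYERS - 1):
            mlp += [nn.Linear(d_model, d_model), nn.ReLU()]
        mlp.append(nn.Linear(d_model, out_dim))   # 8 linear layers in total
        self.decoder = nn.Sequential(*mlp)

    def forward(self, x):                     # x: (batch, seq_len, d_model)
        h = self.encoder(x)                   # contextualized sequence
        return self.decoder(h.mean(dim=1))    # pool over time, then decode

model = EstimatorSketch()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr from the paper
x = torch.randn(4, 10, D_MODEL)               # toy batch: 4 series, length 10
y = model(x)                                  # shape: (4, 1)
```

A real training run would iterate this forward pass for 300 epochs with an appropriate causal-estimation loss, which the review does not specify.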