reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains

Authors: Zhuo He, Shuang Li, Wenze Song, Longhui Yuan, Jian Liang, Han Li, Kun Gai

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate SYNC on several commonly used benchmarks, including two synthetic datasets (Circle, Sine) and five real-world datasets (RMNIST, Portraits, Caltran, Power Supply, ONP). The results of our proposed SYNC, along with various baseline methods, are presented in Table 1. Ablation study: We conduct an ablation study on RMNIST to evaluate the effectiveness of various components in our method, and results are presented in Table 2. Figure 5: (c) and (d) present the test accuracy trajectory of SYNC and various baselines on Circle and Portraits.
Researcher Affiliation	Collaboration	(1) Independent Researcher, China; (2) Kuaishou Technology, China. Correspondence to: Shuang Li <EMAIL>. Based on the email address, Shuang Li is affiliated with Beijing University of Aeronautics and Astronautics (an academic institution). Authors Jian Liang, Han Li, and Kun Gai are affiliated with Kuaishou Technology (an industry company). The other authors are listed as Independent Researchers. Therefore, the paper presents a collaboration between academic and industry affiliations.
Pseudocode	Yes	Appendix D (Algorithm of SYNC): Algorithm 1, Training procedure for SYNC; Algorithm 2, Testing procedure for SYNC.
Open Source Code	No	The paper does not contain an unambiguous statement or a direct link to a source-code repository indicating that the code for the methodology described in this paper is publicly available.
Open Datasets	Yes	We evaluate SYNC on several commonly used benchmarks, including two synthetic datasets (Circle, Sine) and five real-world datasets (RMNIST, Portraits, Caltran, Power Supply, ONP). Circle (Pesaranghader & Viktor, 2016) contains evolving 30 domains... Sine (Pesaranghader & Viktor, 2016) includes 24 evolving domains... RMNIST (Ghifary et al., 2015) consists of MNIST digits... Portraits (Yao et al., 2022a) is a real-world dataset... Caltran (Hoffman et al., 2014) is a surveillance dataset... Power Supply (Dau et al., 2019) is created by an Italian electricity company... ONP (Fernandes et al., 2015) is collected from the Mashable website...
Dataset Splits	Yes	All domains are split into source domains, intermediate domains and target domains according to the ratio of {1/2 : 1/6 : 1/3}. The intermediate domains are utilized as validation set for model selection. In Appendix E.1: Circle ... (15 source domains, 5 validation domains, and 10 target domains).
Hardware Specification	Yes	All experiments in this work are performed on a single NVIDIA GeForce RTX 4090 GPU with 24GB memory, using the PyTorch packages, and are based on DomainBed (Gulrajani & Lopez-Paz, 2021).
Software Dependencies	No	All experiments in this work are performed on a single NVIDIA GeForce RTX 4090 GPU with 24GB memory, using the PyTorch packages, and are based on DomainBed (Gulrajani & Lopez-Paz, 2021). The paper mentions 'PyTorch packages' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup	Yes	Training details on different datasets are shown in Table 3, where B denotes the batch size, α1 and α2 represent the trade-off hyper-parameter for the loss function L_MI and L_causal, respectively. τ is the mask ratio of the masker and N represents the dimension of the latent space. Table 3: Training details on different datasets. Columns: Dataset, B, Epochs, Optimizer, Learning Rate, α1, α2, τ, N.
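
The Dataset Splits row above quotes a source : intermediate : target ratio of {1/2 : 1/6 : 1/3} over the ordered domains. The following Python sketch illustrates that arithmetic only; it is not the authors' code. The function name `split_domains` and the rounding rule (floor the source and validation counts, give the remainder to the target set) are assumptions made for illustration, since the paper excerpt does not say how non-divisible domain counts are handled.

```python
from fractions import Fraction

# Ratio quoted in the Dataset Splits row: source : intermediate : target.
DEFAULT_RATIOS = (Fraction(1, 2), Fraction(1, 6), Fraction(1, 3))


def split_domains(num_domains, ratios=DEFAULT_RATIOS):
    """Split ordered domain indices into (source, validation, target).

    Hypothetical helper: domains are assumed to be indexed 0..num_domains-1
    in temporal order. Source and validation counts are floored and the
    remainder goes to the target set (an assumption, not from the paper).
    """
    if sum(ratios) != 1:
        raise ValueError("ratios must sum to 1")
    n_source = int(num_domains * ratios[0])
    n_val = int(num_domains * ratios[1])
    indices = list(range(num_domains))
    source = indices[:n_source]
    validation = indices[n_source:n_source + n_val]
    target = indices[n_source + n_val:]
    return source, validation, target


# Circle has 30 domains according to the Open Datasets row.
source, validation, target = split_domains(30)
print(len(source), len(validation), len(target))  # 15 5 10
```

For Circle this reproduces the 15 source, 5 validation, and 10 target domains quoted from Appendix E.1. Splits for the other datasets should be taken from the paper's appendix rather than from this sketch.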