reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Federated Causal Structure Learning with Non-identical Variable Sets

Authors: Yunxia Wang, Fuyuan Cao, Kui Yu, Jiye Liang

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on synthetic, benchmark and real-world datasets demonstrate the effectiveness of our proposed method.
Researcher Affiliation	Academia	(1) School of Computer and Information Technology, Shanxi University, Taiyuan, China; (2) School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China.
Pseudocode	Yes	Algorithm 1: FedCDnv; Algorithm 2: C-Update; Algorithm 3: S-FedG
Open Source Code	No	The paper does not provide concrete access to source code. It only mentions general techniques used or future work on privacy protection, without offering a specific repository link or explicit statement of code release for the described methodology.
Open Datasets	Yes	Datasets. (1) Synthetic data. The underlying DAGs are generated using the Erdős-Rényi (Erdős & Rényi, 1959) graph model with the graph size n. (2) Benchmark data. We use 6 networks ranging from 20 to 111 variables: Child, Alarm, Insurance, Barley, Child3, and Alarm3. (3) Real-world data. We use Sachs (Sachs et al., 2005) to evaluate the performance of the methods, randomly selecting 9 out of 11 variables as those observed by each client.
Dataset Splits	No	The paper describes how synthetic data is generated and mentions using benchmark and real-world datasets (Sachs), but it does not provide specific details on how these datasets are split into training, validation, or test sets. It details parameters for generating synthetic data and for client-specific variable sets, but not data sample splits.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. It discusses communication costs in terms of data matrix sizes but not computational hardware.
Software Dependencies	No	The paper mentions types of tests like "G2 tests of conditional independence" and algorithms like "FCI (Spirtes et al., 2000)" and "RFCI (Colombo et al., 2012)", but it does not specify any software libraries, frameworks, or solvers with version numbers that would be needed to replicate the experiment.
Experiment Setup	Yes	Parameters. For each invocation of FedCDnv, the default settings for each problem instance (set of datasets) are generated using the values shown in Table 4 of Appendix D.1. The default parameters are as follows: G2 tests of conditional independence, the number of clients |C| = 6, the significance level α = 0.05, θ1 = 0.049, θ2 = 0.45, λ = 0.15%, δ = 0.85%, the sample size n_k ∈ [100, n_s] with n_s = 2000.
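
Because no source code is released, the setup quoted in the Open Datasets and Experiment Setup rows can only be approximated. The following is a minimal Python sketch, not the authors' implementation. It samples an Erdős-Rényi DAG and assigns each of the 6 clients a random variable subset and a sample size between 100 and 2000. Several things are assumptions made for illustration: the function names, the dictionary keys, the edge probability of 0.2, the uniform draw of each sample size, and the reuse of the Sachs 9-of-11 variable selection for a synthetic graph.

```python
import random

# Default values quoted in the Experiment Setup row. The dictionary keys are
# hypothetical names chosen for this sketch, not identifiers from the paper.
DEFAULTS = {
    "num_clients": 6,      # |C|
    "alpha": 0.05,         # significance level of the G2 tests
    "min_samples": 100,    # lower bound of n_k
    "max_samples": 2000,   # n_s, upper bound of n_k
}


def random_er_dag(n, edge_prob, rng):
    """Sample an Erdős-Rényi DAG on nodes 0..n-1 as a set of (parent, child).

    Acyclicity is enforced by drawing a random node ordering and only
    allowing edges that point from earlier to later nodes in that ordering.
    """
    order = list(range(n))
    rng.shuffle(order)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_prob:
                edges.add((order[i], order[j]))
    return edges


def make_clients(n_vars, vars_per_client, rng, cfg=DEFAULTS):
    """Give each client a random variable subset and a random sample size.

    Drawing n_k uniformly from [min_samples, max_samples] is an assumption;
    the quoted text only states the range.
    """
    clients = []
    for _ in range(cfg["num_clients"]):
        observed = sorted(rng.sample(range(n_vars), vars_per_client))
        n_k = rng.randint(cfg["min_samples"], cfg["max_samples"])
        clients.append({"variables": observed, "n_samples": n_k})
    return clients


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dag = random_er_dag(n=11, edge_prob=0.2, rng=rng)  # edge_prob is arbitrary
clients = make_clients(n_vars=11, vars_per_client=9, rng=rng)
```

Fixing the seed makes the generated graph and client assignments repeatable, which addresses part of what the Dataset Splits row reports as missing. The values θ1, θ2, λ, and δ are left out because the quoted text does not say what they control.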