Federated Causal Structure Learning with Non-identical Variable Sets
Authors: Yunxia Wang, Fuyuan Cao, Kui Yu, Jiye Liang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on synthetic, benchmark and real-world datasets demonstrate the effectiveness of our proposed method. |
| Researcher Affiliation | Academia | ¹School of Computer and Information Technology, Shanxi University, Taiyuan, China; ²School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China. |
| Pseudocode | Yes | Algorithm 1: FedCDnv; Algorithm 2: C-Update; Algorithm 3: S-FedG |
| Open Source Code | No | The paper does not provide concrete access to source code. It only mentions general techniques used or future work on privacy protection, without offering a specific repository link or explicit statement of code release for the described methodology. |
| Open Datasets | Yes | Datasets. (1) Synthetic data. The underlying DAGs are generated using the Erdős-Rényi (Erdős & Rényi, 1959) graph model with the graph size n. (2) Benchmark data. We use 6 networks ranging from 20 to 111 variables: Child, Alarm, Insurance, Barley, Child3, and Alarm3. (3) Real-world data. We use Sachs (Sachs et al., 2005) to evaluate the performance of the methods, randomly selecting 9 out of 11 variables as those observed by each client. |
| Dataset Splits | No | The paper describes how synthetic data is generated and mentions using benchmark and real-world datasets (Sachs), but it does not provide specific details on how these datasets are split into training, validation, or test sets. It details parameters for generating synthetic data and for client-specific variable sets, but not data sample splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. It discusses communication costs in terms of data matrix sizes but not computational hardware. |
| Software Dependencies | No | The paper mentions types of tests like "G2 tests of conditional independence" and algorithms like "FCI (Spirtes et al., 2000)" and "RFCI (Colombo et al., 2012)", but it does not specify any software libraries, frameworks, or solvers with version numbers that would be needed to replicate the experiment. |
| Experiment Setup | Yes | Parameters. For each invocation of FedCDnv, the default settings for each problem instance (set of datasets) are generated using the values shown in Table 4 of Appendix D.1. The default parameters are as follows: G2 tests of conditional independence, the number of clients \|C\| = 6, the significance level α = 0.05, θ1 = 0.049, θ2 = 0.45, λ = 0.15%, δ = 0.85%, the sample size n_k ∈ [100, n_s] with n_s = 2000. |
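
The synthetic setup summarized in the table (an Erdős-Rényi DAG, several clients each observing a subset of the variables, and per-client sample sizes between 100 and 2000) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code, which the paper does not release. The function names `sample_er_dag` and `assign_clients`, the edge probability `p = 0.2`, the uniform draw of each client's sample size, and the reuse of the "9 of 11 variables" figure (reported for Sachs) on a synthetic graph are all assumptions made for the example.

```python
import random

def sample_er_dag(n, p, rng):
    """Sample a DAG on n nodes (hypothetical helper).

    Draw a random node ordering, then include each forward edge
    (earlier -> later in the ordering) independently with probability p.
    Edges that only point forward in an ordering cannot form a cycle.
    """
    order = list(range(n))
    rng.shuffle(order)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                edges.add((order[i], order[j]))
    return order, edges

def assign_clients(n, num_clients, vars_per_client, n_min, n_max, rng):
    """Give each client a random variable subset and a sample size (hypothetical helper).

    Assumption: the sample size is drawn uniformly from [n_min, n_max];
    the source only states the range, not the distribution.
    """
    clients = []
    for _ in range(num_clients):
        observed = sorted(rng.sample(range(n), vars_per_client))
        n_k = rng.randint(n_min, n_max)  # inclusive on both ends
        clients.append({"observed": observed, "n_k": n_k})
    return clients

rng = random.Random(0)  # fixed seed so the sketch is repeatable
order, edges = sample_er_dag(n=11, p=0.2, rng=rng)  # p is an assumed value
clients = assign_clients(
    n=11, num_clients=6, vars_per_client=9, n_min=100, n_max=2000, rng=rng
)
```

Each entry of `clients` then lists which variable indices that client observes and how many samples it would draw, which is the kind of non-identical variable set configuration the experiments vary.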