reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal Abstraction Learning based on the Semantic Embedding Principle

Authors: Gabriele D’Acunto, Fabio Massimo Zennaro, Yorgos Felekis, Paolo Di Lorenzo

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that the proposed methods succeed on both synthetic and real-world brain data with different degrees of prior information about the structure of CA. Our experiments on synthetic and brain data, across different levels of prior knowledge, confirm good performance of the proposed methods. This section provides the empirical assessment of LinSEPAL-ADMM, LinSEPAL-PG and CLinSEPAL with different degrees of prior knowledge, from full (fp) to partial (pp). We monitor four metrics to evaluate the learned CA $\widehat{V}$: (i) constructiveness, as required by Def. 4.2; (ii) $D_{\mathrm{KL}}(\widehat{V})$, evaluating the alignment between $\varphi^{\widehat{V}}_{\#}(\chi^{\ell})$ and $\chi^{h}$; (iii) the Frobenius distance between the absolute value of $\widehat{V}$ and that of the ground truth $V$, normalized by $\|V\|_F$ to make the settings comparable; (iv) the F1 score computed using the support of the learned CAs and that of $V$ to evaluate structural interventional consistency.
Researcher Affiliation	Academia	¹Department of Information Engineering, Electronics and Telecommunications, Sapienza University, Rome, Italy; ²National Inter-University Consortium for Telecommunications (CNIT), Parma, Italy; ³Department of Informatics, University of Bergen, Bergen, Norway; ⁴Department of Computer Science, University of Warwick, Coventry, UK. Correspondence to: Gabriele D'Acunto <EMAIL>.
Pseudocode	Yes	The LinSEPAL-ADMM algorithm is summarized in Algorithm 1. The LinSEPAL-PG algorithm is summarized in Algorithm 2. The CLinSEPAL method is summarized in Algorithm 3. The full prior version of CLinSEPAL is summarized in Algorithm 4.
Open Source Code	Yes	Code: https://github.com/SPAICOM/calsep
Open Datasets	Yes	We apply CLinSEPAL to resting-state functional magnetic resonance imaging (rs-fMRI) data, using the dataset from (D'Acunto et al., 2024) (refer to the paper for details on the dataset). The data, publicly released as part of the Human Connectome Project (Smith et al., 2013), comprises recordings from 100 healthy adults with a parcellation scheme that divides the brain into 89 regions of interest (ROIs).
Dataset Splits	No	The paper describes generating synthetic data for evaluation: 'For each setting, we instantiate S = 30 ground truth abstractions V, and for each simulation s ∈ [S] we run all the methods R = 50 times, with different initializations.' For the brain data, it describes simulating scenarios based on an existing dataset: 'We simulate a first investigating team of neuroscientists taking zero-mean stationary time series for the left hemisphere of the first adult in the dataset... We generate the data for the second team using a ground truth linear CA B, V ∈ St(45, 14)...'. There is no explicit mention of training, validation, or test dataset splits in the context of fixed datasets.
Hardware Specification	No	No specific hardware details (like GPU models, CPU types, or cloud instance specifications) are mentioned in the paper for running the experiments.
Software Dependencies	No	The paper mentions several software tools: 'In our experiments, we use the conjugate gradient implementation in (Boumal et al., 2014)' (referring to Manopt, a MATLAB toolbox). It also states: 'In our experiments, we use the OSQP (Stellato et al., 2020) implementation available in cvxpy (Diamond & Boyd, 2016).' While these tools are named and cited, no version numbers are provided for any of them.
Experiment Setup	Yes	Appendix K: Metrics and Hyper-parameters. This section provides the definition of the metrics monitored in our empirical assessment in Secs. 6 and 7. Additionally, we report the hyper-parameters configuration for Algorithms 1 to 3 used in the experiments. ... CLinSEPAL: $\rho = 1$, $\tau = 10^{-3}$, $\varepsilon = 0.1$ for the full prior case and $\varepsilon = 0.01$ for the partial prior case, $\tau^{c} = 10^{-3}$, $\tau^{a} = 10^{-4}$, $\tau^{r} = 10^{-4}$. The same hyper-parameters were used in the experiments in Sec. 7; LinSEPAL-ADMM: $\rho = 1$, $\lambda = 1$, $\tau^{a} = 10^{-4}$, $\tau^{r} = 10^{-4}$; LinSEPAL-PG: $\lambda = 1$, $\rho = 1/(2\|\Sigma^{\ell}\|_F^2)$, $\gamma = 0.5$, $\tau^{\mathrm{KL}} = 10^{-4}$, $K = 1000$.
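
Metrics (iii) and (iv) quoted in the Research Type row can be illustrated with a minimal Python sketch. This is not taken from the authors' repository: the function names, the support threshold `tol`, and the convention for two empty supports are assumptions made here for illustration, and the paper's exact definitions are in its Appendix K.

```python
import numpy as np


def normalized_abs_frobenius(V_hat, V):
    """Frobenius distance between |V_hat| and |V|, divided by ||V||_F."""
    return np.linalg.norm(np.abs(V_hat) - np.abs(V), "fro") / np.linalg.norm(V, "fro")


def support_f1(V_hat, V, tol=1e-8):
    """F1 score between the supports (non-zero patterns) of V_hat and V.

    `tol` is an assumed threshold for treating an entry as non-zero.
    """
    pred = np.abs(V_hat) > tol
    true = np.abs(V) > tol
    tp = np.sum(pred & true)    # entries non-zero in both matrices
    fp = np.sum(pred & ~true)   # non-zero only in the learned matrix
    fn = np.sum(~pred & true)   # non-zero only in the ground truth
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty supports count as a perfect match.
    return float(2 * tp / denom) if denom > 0 else 1.0


# Toy example (hypothetical values, not from the paper).
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
V_hat = np.array([[-1.0, 0.0], [0.0, 0.5], [0.2, 0.0]])

print(normalized_abs_frobenius(V_hat, V))
print(support_f1(V_hat, V))
```

Taking absolute values before the Frobenius distance makes the comparison insensitive to sign flips in the learned entries, which is why the first row of the toy `V_hat` contributes no error.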