reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal Discovery with Unobserved Confounding and Non-Gaussian Data

Authors: Y. Samuel Wang, Mathias Drton

JMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We illustrate the effectiveness of our procedure in simulations and an application to an ecology data set.
Researcher Affiliation	Academia	Y. Samuel Wang (EMAIL), Department of Statistics and Data Science, Cornell University, Ithaca, NY 14853, USA; Mathias Drton (EMAIL), Department of Mathematics & Munich Data Science Institute, Technical University of Munich, 85748 Garching bei München, Germany
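Pseudocode	Yes	Algorithm 1 BANG procedure. 1: Input: Data $Y \in \mathbb{R}^{p \times n}$ and $S \in \mathbb{R}^{p \times p}$ which is the (potentially sample) covariance of $Y$. 2: For all $v \in V$, set $\widehat{\mathrm{pa}}(v) = \emptyset$ and $\widehat{\mathrm{sib}}(v) = V \setminus \{v\}$. 3: Set all elements of $D \in \mathbb{R}^{p \times p}$ to be 0 and $l = 1$. 4: while $\max_v \lvert \widehat{\mathrm{sib}}(v) \rvert \geq l$ do. 5: for $v \in V$ do. 6: Prune $\widehat{\mathrm{sib}}(v)$ using Algorithm 2. 7: Certify pseudo-parents of $v$ and update $\widehat{\mathrm{pa}}(v)$, $\widehat{\mathrm{sib}}(v)$, and $D$ using Algorithm 3. 8: end for. 9: if $D$ was updated, reset $l = 1$; else set $l = l + 1$. 10: end while. 11: Remove ancestors which are not parents from $\widehat{\mathrm{pa}}(v)$ for all $v \in V$ using Algorithm 4. 12: Return: $E_{\to} = \{(u, v) : u \in \widehat{\mathrm{pa}}(v)\}$, $E_{\leftrightarrow} = \{\{u, v\} : u \in \widehat{\mathrm{sib}}(v)\}$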
Open Source Code	Yes	Available at https://github.com/ysamwang/ngBap
Open Datasets	Yes	Grace et al. (2016) use a structural equation model to examine the relationships between land productivity and the richness of plant diversity. They consider measurements taken at 1126 plots which are locations across 39 different sites.
Dataset Splits	No	The paper mentions generating synthetic data with varying sample sizes (e.g., "We let n = 500, 1000, 1500" in Section 6.1, and "We let n = 2500, 5000, 7500, 10000, 25000, 50000" in Section 6.2) and the number of replications ("200 replications" or "50 replications"). For the real-world ecology data, it mentions "measurements taken at 1126 plots". However, it does not provide specific train/test/validation splits for any of the datasets, either synthetic or real.
Hardware Specification	No	The acknowledgments section mentions computational resources, stating: "This research was supported in part through the computational resources and staff contributions provided for the Mercury high performance computing cluster at The University of Chicago Booth School of Business which is supported by the Office of the Dean." This names a computing cluster but gives no hardware details such as CPU/GPU models, memory, or other specifications of the machines used to run the experiments.
Software Dependencies	No	The paper mentions several software implementations and packages used for comparison: "For ParcelLiNGAM we use the Matlab implementation available from the author's website; for RCD we use the lingam python package; for FCI+, we use the R package pcalg (Kalisch et al., 2012); and for GBS we use the R package greedyBaps (Nowzohour, 2017)." However, it does not provide version numbers for Matlab or the Python/R packages, which are necessary for reproducible software dependencies.
Experiment Setup	Yes	For FCI+, RCD, and BANG we set the nominal level of each hypothesis test performed to α = .05, .01, .001. For GBS, we allow 100 random restarts, the same number used in the simulations by Nowzohour (2017). For BANG with EL, we set K = 3 for the gamma and lognormal errors (since they are skewed) and let K = 4 for the uniform and T13 (since they are symmetric).
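
The control flow of Algorithm 1 quoted in the Pseudocode row can be sketched in Python as follows. This is a minimal illustration of the outer loop only, not the authors' implementation (which is linked in the Open Source Code row). The function name `bang_outer_loop` and the three callbacks are hypothetical stand-ins for Algorithms 2, 3, and 4, whose statistical tests are not reproduced here. As a simplifying assumption, the matrix D is not stored; the Algorithm 3 stand-in simply returns True whenever it would have updated D.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Set, Tuple

Node = Hashable
Sets = Dict[Node, Set[Node]]


def bang_outer_loop(
    nodes: Iterable[Node],
    prune_siblings: Callable[[Node, Sets, Sets, int], None],    # hypothetical stand-in for Algorithm 2
    certify_parents: Callable[[Node, Sets, Sets, int], bool],   # hypothetical stand-in for Algorithm 3
    remove_ancestors: Callable[[Sets, Sets], None],             # hypothetical stand-in for Algorithm 4
) -> Tuple[Set[Tuple[Node, Node]], Set[FrozenSet[Node]]]:
    """Skeleton of the Algorithm 1 loop; all statistical work lives in the callbacks."""
    nodes = list(nodes)
    # Step 2: no estimated parents yet; every other node is a candidate sibling.
    pa: Sets = {v: set() for v in nodes}
    sib: Sets = {v: set(nodes) - {v} for v in nodes}
    # Step 3: start with conditioning-set size l = 1 (D is tracked only via the flag below).
    l = 1
    # Step 4: continue while some candidate sibling set has at least l elements.
    while max((len(sib[v]) for v in nodes), default=0) >= l:
        updated = False
        for v in nodes:
            prune_siblings(v, pa, sib, l)        # Step 6
            if certify_parents(v, pa, sib, l):   # Step 7; True means "D was updated"
                updated = True
        # Step 9: reset l after any update, otherwise move to larger sets.
        l = 1 if updated else l + 1
    # Step 11: drop ancestors that are not parents.
    remove_ancestors(pa, sib)
    # Step 12: directed edges as ordered pairs, bidirected edges as unordered pairs.
    directed = {(u, v) for v in nodes for u in pa[v]}
    bidirected = {frozenset((u, v)) for v in nodes for u in sib[v]}
    return directed, bidirected
```

The callbacks are expected to mutate `pa` and `sib` in place. Note that the loop terminates only if the Algorithm 3 stand-in eventually stops reporting updates; the sketch itself adds no safeguard against a callback that always returns True.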