reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Gaussian Graphical Modelling Without Independence Assumptions for Uncentered Data

Authors: Bailey Andrew, David R. Westhead, Luisa Cutillo

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We first compare performances on synthetic data, in which the graphs are generated from a Barabasi-Albert (power-law) distribution. We generate means for our synthetic data from a variety of distributions; see Table 1. We also compare them on Erdos-Renyi graphs, achieving similar results, but for conciseness we defer those results to the appendix. Next, we compare performances on the real-world COIL-20 video dataset (Nene, Nayar, and Murase 1996), demonstrating clear improvements. Finally, we show that, without properly taking into account the mean, precision matrix estimation will yield clearly wrong results on transcriptomics datasets such as the mouse embryo stem cell E-MTAB-2805 dataset (Buettner et al. 2015).
Researcher Affiliation	Academia	Bailey Andrew, David R Westhead, Luisa Cutillo University of Leeds EMAIL, EMAIL, EMAIL
Pseudocode	Yes	See Algorithm 1 for a pseudocode presentation of our algorithm.
Open Source Code	Yes	Code https://github.com/BaileyAndrew/Noncentral-KSNormal
Open Datasets	Yes	Next, we compare performances on the real-world COIL-20 video dataset (Nene, Nayar, and Murase 1996), demonstrating clear improvements. Finally, we show that, without properly taking into account the mean, precision matrix estimation will yield clearly wrong results on transcriptomics datasets such as the mouse embryo stem cell E-MTAB-2805 dataset (Buettner et al. 2015).
Dataset Splits	No	The paper describes the datasets used (synthetic, COIL-20, E-MTAB-2805) and discusses preprocessing steps or how correct edges were defined for evaluation, but it does not specify explicit training/test/validation splits with percentages, counts, or a detailed methodology for partitioning the data to reproduce experiments.
Hardware Specification	Yes	All experiments were run on a 2020 MacBook Pro with an M1 chip and 8 GB of RAM.
Software Dependencies	Yes	Our method was implemented using NumPy 1.25.2 (Harris et al. 2020) and SciPy 1.12.0 (Virtanen et al. 2020); for precision matrix routines, we used Greenewald, Zhou, and Hero's reference implementation of TeraLasso (2019) as well as GmGM 0.5.3.
Experiment Setup	Yes	We chose thresholding/regularization parameters such that there would be approximately 144 edges.
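
The Experiment Setup entry says only that thresholding/regularization parameters were chosen to give approximately 144 edges; it does not say how. One plausible way to hit a target edge count, sketched here purely as an illustration, is to keep the k largest-magnitude off-diagonal entries of an estimated precision matrix. The function name `keep_top_k_edges` and the random test matrix are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def keep_top_k_edges(precision, k):
    """Return a boolean adjacency matrix holding the k strongest edges.

    Assumption: edge strength is the absolute value of the off-diagonal
    precision entry. The paper may use a different criterion.
    """
    p = np.asarray(precision, dtype=float)
    n = p.shape[0]
    # Work on the strict upper triangle so each undirected edge is counted once.
    rows, cols = np.triu_indices(n, k=1)
    weights = np.abs(p[rows, cols])
    k = min(k, weights.size)
    adjacency = np.zeros((n, n), dtype=bool)
    if k > 0:
        # Stable sort on negated weights gives a deterministic descending order.
        top = np.argsort(-weights, kind="stable")[:k]
        adjacency[rows[top], cols[top]] = True
        adjacency |= adjacency.T  # mirror to the lower triangle
    return adjacency


# Hypothetical usage: a random symmetric matrix stands in for an estimate.
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 50))
estimate = raw + raw.T
graph = keep_top_k_edges(estimate, 144)
print(graph.sum() // 2)  # number of undirected edges kept
```

Selecting by rank rather than by a fixed cutoff makes the edge count exact whenever at least k candidate pairs exist, which is convenient when comparing methods at equal sparsity. Tuning a regularization penalty instead would reach the target count only approximately, which may be why the entry says "approximately".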