reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Methods for Recovering Conditional Independence Graphs: A Survey

Authors: Harsh Shrivastava, Urszula Chajewska

JAIR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The focus of this paper is to review different methods and recent techniques developed to recover CI graphs. [...] Recovered graph structures for a sub-network of the E. coli consisting of 43 genes and 30 interactions with increasing number of samples. [...] CI graphs from uGLAD model used to analyse a lung cancer data from (Kaggle, 2022). [...] The CI graph recovered by uGLAD for the Infant Mortality 2015 data from CDC [...]. [left] uGLAD graph for archaea at family level in a collection of wastewater processing digesters. [...] [right] A dynamic network inference framework showing three snapshots of the automobile sensor network [...]. [left] Rectangular 10 × 20 latitude-longitude grids of windspeed locations [...]. [right] Graphical representation of latitude (left) and longitude factors [...]. The CI graph was recovered by the L2G algorithm, which is a deep unfolding approach to learn graph topologies, refer to (Pu et al., 2021).
Researcher Affiliation	Industry	Harsh Shrivastava, Urszula Chajewska; Microsoft Research, Redmond, USA
Pseudocode	No	The paper describes methods using mathematical formulations (e.g., Eq. 1, 2, 3, 4, 5) and architectural diagrams (Fig. 2) but does not include any explicitly labeled pseudocode or algorithm blocks. The procedural steps are described in narrative text without structured formatting.
Open Source Code	Yes	Table 1 lists some of the prominent methods for CI graph recovery along with their recommended implementations. This compilation will help the readers choose the right models for their applications. [...] GLAD https://github.com/Harshs27/GLAD (Shrivastava et al., 2020) [...] uGLAD https://github.com/Harshs27/uGLAD (Shrivastava et al., 2022b) [...] TeraLasso https://github.com/kgreenewald/teralasso (Greenewald et al., 2019) [...] SyGlasso https://github.com/ywa136/syglasso (Wang et al., 2020) [...] FASJEM https://github.com/QData/FASJEM (Wang et al., 2017) [...] ADMM https://github.com/tpetaja1/tvgl (Hallac, Park, Boyd, & Leskovec, 2017) [...] Newton-CG(MDMC) Matlab package (Zhang et al., 2018)
Open Datasets	Yes	CI graphs from uGLAD model used to analyse a lung cancer data from (Kaggle, 2022). Kaggle (2022). Lung Cancer. https://www.kaggle.com/datasets/nancyalaswad90/lung-cancer?select=survey+lung+cancer.csv. [...] The CI graph recovered by uGLAD for the Infant Mortality 2015 data from CDC (United States Department of Health and Human Services, Division of Vital Statistics (DVS), 2015). United States Department of Health and Human Services, Division of Vital Statistics (DVS) (2015). Birth Cohort Linked Birth Infant Death Data Files, 2004-2015. Compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program, on CDC WONDER On-line Database. Accessed at https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm.
Dataset Splits	No	The paper describes the number of samples (M) and features (D) used in general terms for various methods (e.g., 'M samples and D features'). For instance, Figure 4 mentions 'M=10' and 'M=100' samples to illustrate the effect of sample size on FDR, TPR, and FPR. However, it does not explicitly provide any details about how datasets are split into training, validation, or test sets for reproducibility.
Hardware Specification	No	The paper describes various computational methods and algorithms, including those efficient for 'large scale problems involving thousands of variables'. However, it does not specify any particular hardware (e.g., GPU/CPU models, memory details) used for conducting experiments or for obtaining the results presented in its figures.
Software Dependencies	No	The paper mentions several software packages and libraries. For example, it notes a 'Graphical Lasso function implementation of python's scikit-learn package (Pedregosa et al., 2011)', a 'Python package' for G-ISTA, and 'R package' for JGL, GGMncv, and MissGlasso. However, it does not provide specific version numbers for any of these software dependencies.
Experiment Setup	No	The paper discusses the theoretical underpinnings and general approaches of various CI graph recovery algorithms, such as the use of regularization parameters (e.g., '$\lambda \|\Theta\|_{1,\text{off}}$'). For the GLAD model, it mentions 'adaptive sequence of penalty hyperparameters'. However, it does not provide specific, concrete experimental setup details like learning rates, batch sizes, number of epochs, or specific optimizer configurations for any of the demonstrations or evaluations presented in the paper.
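
The Dataset Splits row refers to FDR, TPR, and FPR as the metrics used to illustrate the effect of sample size on graph recovery. As a minimal sketch of how these are commonly computed when a recovered CI graph is compared against a known ground-truth graph, the snippet below uses the standard textbook definitions over undirected edge sets. The function name `edge_metrics` and the 4-node example graphs are hypothetical and are not taken from the paper, whose exact conventions may differ.

```python
from itertools import combinations


def edge_metrics(true_edges, pred_edges, num_nodes):
    """Compare two undirected edge sets over nodes 0..num_nodes-1.

    Hypothetical helper for illustration; uses the standard definitions
    FDR = FP / (TP + FP), TPR = TP / (TP + FN), FPR = FP / (FP + TN).
    """
    # Normalise each edge to a sorted tuple so (i, j) and (j, i) match.
    true_set = {tuple(sorted(e)) for e in true_edges}
    pred_set = {tuple(sorted(e)) for e in pred_edges}
    # Every unordered node pair is a potential edge.
    all_pairs = set(combinations(range(num_nodes), 2))

    tp = len(true_set & pred_set)   # edges correctly recovered
    fp = len(pred_set - true_set)   # edges predicted but absent in truth
    fn = len(true_set - pred_set)   # true edges that were missed
    tn = len(all_pairs) - tp - fp - fn

    # Guard against empty denominators (e.g. no predicted edges).
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"FDR": fdr, "TPR": tpr, "FPR": fpr}


# Made-up example: a 4-node chain as ground truth, and a recovered graph
# that finds two of the three true edges plus one spurious edge.
true_edges = [(0, 1), (1, 2), (2, 3)]
pred_edges = [(1, 0), (1, 2), (0, 3)]
metrics = edge_metrics(true_edges, pred_edges, num_nodes=4)
print(metrics)
```

Edges are treated as unordered pairs, which matches the undirected nature of CI graphs; self-loops are not counted as candidate edges.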