Generalized Score Matching for Non-Negative Data

Authors: Shiqing Yu, Mathias Drton, Ali Shojaie

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulation results and applications to RNAseq data are given in Section 7.
Researcher Affiliation | Academia | Shiqing Yu: Department of Statistics, University of Washington, Seattle, WA, U.S.A. Mathias Drton: Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark, and Department of Statistics, University of Washington, Seattle, WA, U.S.A. Ali Shojaie: Department of Biostatistics, University of Washington, Seattle, WA, U.S.A.
Pseudocode | No | The paper mentions 'We use a coordinate-descent method analogous to Algorithm 2 in Lin et al. (2016)' but does not present pseudocode or an algorithm block within its own text.
Open Source Code | No | In our implementation for pairwise interaction models of Section 5.1 (that will become available in an R package), we optimize our loss functions with respect to a symmetric matrix K̂; in the non-centered case the vector η̂ is also included.
Open Datasets | Yes | In this section we apply our regularized generalized h-score matching estimator for truncated non-centered GGMs to RNAseq data also studied in Lin et al. (2016), since the same model is considered therein. The data consists of n = 487 prostate adenocarcinoma samples from The Cancer Genome Atlas (TCGA) data set.
Dataset Splits | No | The paper describes using 'm = 100 variables and n = 80 and n = 1000 samples' for simulations and 'n = 487 prostate adenocarcinoma samples' for RNAseq data, but does not specify how these samples are split into training, validation, or test sets.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU types, or other computing specifications.
Software Dependencies | No | The paper mentions that an 'R package' will be made available for their implementation, but it does not specify any software names with version numbers for R or any other libraries used in their experiments.
Experiment Setup | Yes | In our simulation experiments, we consider m = 100 variables and n = 80 and n = 1000 samples... The amplifier is set based on Theorem 16 to δ = C(n, m) = 1.8647 for truncated GGMs... We choose h(x) = min(x, 3) and use the upper-bound multiplier (high)... and choose the regularization parameter λ so that the estimated graph has exactly m = 333 edges.
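
The choice h(x) = min(x, 3) quoted in the experiment setup can be illustrated with a minimal Python sketch. This is not the authors' implementation (theirs is described as an R package); the function names `h` and `h_prime`, the `cap` parameter, and the sample values are hypothetical and chosen only for illustration. The derivative of h does not exist at x = 3, so the sketch assumes the convention of returning 0 there.

```python
def h(x, cap=3.0):
    """Truncated identity h(x) = min(x, cap) for non-negative x."""
    if x < 0:
        raise ValueError("this sketch only handles non-negative inputs")
    return min(x, cap)


def h_prime(x, cap=3.0):
    """Derivative of h where it exists: 1 below the cap, 0 above it.

    At x == cap the derivative is undefined; returning 0 there is an
    assumption of this sketch, not something stated in the source.
    """
    if x < 0:
        raise ValueError("this sketch only handles non-negative inputs")
    return 1.0 if x < cap else 0.0


# Hypothetical non-negative sample values, for illustration only.
sample = [0.0, 0.5, 2.9, 3.0, 7.2]
h_values = [h(x) for x in sample]
h_prime_values = [h_prime(x) for x in sample]

print(h_values)
print(h_prime_values)
```

The cap keeps h bounded for large observations while h(0) = 0 at the boundary of the non-negative domain, which is the property the setup relies on.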