Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes

Authors: Justin D. Silverman, Kimberly Roche, Zachary C. Holmes, Lawrence A. David, Sayan Mukherjee

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through both simulations and analyses of real datasets using MLN models, we show that our inference schemes are both highly accurate and often 4-5 orders of magnitude faster than MCMC.
Researcher Affiliation | Academia | Justin D. Silverman, College of Information Science and Technology, Department of Statistics, and Institute for Computational and Data Science, Penn State University, University Park, PA 16802, USA; Kimberly Roche, Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708, USA; Zachary C. Holmes, Department of Molecular Genetics and Microbiology, Duke University, Durham, NC 27708, USA; Lawrence A. David, Department of Molecular Genetics and Microbiology and Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA; Sayan Mukherjee, Departments of Statistical Science, Mathematics, Computer Science, Biostatistics & Bioinformatics, Duke University, Durham, NC 27708, USA
Pseudocode | Yes | Algorithm 1: The Collapse-Uncollapse (CU) Sampler for Marginally LTP Models. Data: Y, υ, B, K, A. Result: S samples of the form {Ψ^(s), η^(s)}. Sample {η^(1), ..., η^(S)} ∼ p(η | Y), where p(η | Y) is an LTP; for s in {1, ..., S} do in parallel: sample Ψ^(s) ∼ p(Ψ | η^(s), Y).
Open Source Code | Yes | For inference of Marginally LTP models with multinomial observations and log-ratio link functions, we developed the R package fido (Silverman, 2019). Fido implements the CU sampler with Laplace approximation described above using optimized C++ code. ... Additionally, all code required to reproduce the results of the next two sections, including the alternative implementations of multinomial logistic-normal linear models discussed in Section 5, is available as a GitHub repository at github.com/jsilve24/fido_paper_code.
Open Datasets | Yes | To demonstrate that LA Collapsed (from the R package fido) provides an accurate and efficient means of modeling real microbiome data, we reanalyzed a previously published study comparing microbial composition in the terminal ileum of subjects with CD to healthy controls (Gevers et al., 2014). Sequence count data was obtained from the R package MicrobeDS (github.com/twbattaglia/MicrobeDS). Sequence count data was obtained from the R package fido (github.com/jsilve24/fido).
Dataset Splits | No | For each evaluated triple (N, D, Q), three simulated data-sets were created based on the multinomial logistic-normal linear model with the following specified likelihood: Y_j ∼ Multinomial(n_j, π_j), π_j = ALR_D^{-1}(η_j), η_j ∼ N(ΛX_j, Σ). To allow us to compare to alternative implementations, we randomly subset the data to contain 83 samples.
Hardware Specification | No | All replicates of the simulated count data were supplied to the various implementations independently and the models were fit on identical hardware, allotted 64 GB RAM, 4 cores, and restricted to a 48-hour upper limit on run-time.
Software Dependencies | Yes | All implementations were compiled and run using gcc version 6.2.0, R version 3.4.2, and Intel(R) Math Kernel Library version 2019 where possible.
Experiment Setup | Yes | Prior hyper-parameters were chosen to reflect common default choices, e.g., mean parameters set to zero and covariance parameters set to the identity matrix. The prior degrees-of-freedom parameter ν is defined on the range ν > D; this parameter was chosen as ν = D + 10. To evaluate the impact of using small values for the degree-of-freedom parameter ν in model priors, we set ν = D + 3. A full description of our prior assumptions is given in Appendix I. To evaluate the impact of using small values for the degree-of-freedom parameter ν in model priors, we set ν = D + 2. Details on these kernels as well as the matrix functions Θ are described further in Appendix J.
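
The simulation likelihood quoted in the Dataset Splits row can be illustrated with a short sketch. This is not the authors' simulation code (that lives in their repository); it is a minimal Python/NumPy illustration of drawing counts from a multinomial logistic-normal linear model. The function names `alr_inverse` and `simulate_mln`, the standard-normal draws for X and Λ, the fixed sequencing depth, the identity Σ, and the use of the last part as the ALR reference are all assumptions made for illustration.

```python
import numpy as np


def alr_inverse(eta):
    """Inverse additive log-ratio transform (assumes the last part is the reference).

    eta has shape (D-1, N); the result has shape (D, N) and each column sums to 1.
    """
    # Append a zero row for the reference part, then apply a column-wise softmax.
    full = np.vstack([eta, np.zeros((1, eta.shape[1]))])
    full = full - full.max(axis=0, keepdims=True)  # numerical stability
    expd = np.exp(full)
    return expd / expd.sum(axis=0, keepdims=True)


def simulate_mln(N, D, Q, depth=5000, seed=0):
    """Draw one data set from the MLN linear model (illustrative settings only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(Q, N))        # covariates (assumed standard normal)
    Lam = rng.normal(size=(D - 1, Q))  # regression coefficients Λ (assumed)
    Sigma = np.eye(D - 1)              # covariance Σ (assumed identity)
    chol = np.linalg.cholesky(Sigma)
    # η_j ∼ N(Λ X_j, Σ), drawn for all N columns at once.
    eta = Lam @ X + chol @ rng.normal(size=(D - 1, N))
    # π_j = ALR^{-1}(η_j)
    pi = alr_inverse(eta)
    # Y_j ∼ Multinomial(n_j, π_j) with a fixed depth n_j = depth (assumed).
    Y = np.column_stack([rng.multinomial(depth, pi[:, j]) for j in range(N)])
    return Y, X, eta, pi
```

Calling `simulate_mln(N=20, D=5, Q=3)` returns a D x N count matrix whose columns each sum to `depth`, along with the covariates and latent quantities used to generate it.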