Bayesian Graphical Models for Multivariate Functional Data

Authors: Hongxiao Zhu, Nate Strawn, David B. Dunson

JMLR 2016

Reproducibility Variable Result LLM Response
Research Type Experimental 4. Simulation Study. Three simulation studies were conducted to assess the performance of posterior inference using the Gaussian process graphical models outlined in Section 2.3 and Section 3. Simulation 1 corresponds to the smooth functional data case (without measurement error), and Simulation 2 corresponds to the noisy data case when measurement error is considered. Both simulations are based on a true underlying graph with 6 nodes, demonstrated in Figure 1 (a). In Simulation 3, we show the performance of the proposed Bayesian inference in a p > n case, with the number of nodes p = 60 and the sample size n = 50.
Researcher Affiliation Academia Hongxiao Zhu EMAIL Department of Statistics Virginia Tech, 250 Drillfield Drive (MC 0439) Blacksburg, VA 24061, USA Nate Strawn EMAIL Department of Mathematics and Statistics Georgetown University Washington D.C. 20057, USA David B. Dunson EMAIL Department of Statistical Science Duke University Durham NC 27708, USA
Pseudocode Yes The following MCMC algorithm describes the steps to generate posterior samples based on (8). Algorithm 1. Step 0. Set an initial decomposable graph $G$ and set the prior parameters $c_0$, $\delta$, and $U_C$. Step 1. With probability $1 - q$, propose $\tilde{G}$ by randomly adding or deleting an edge from $G$ (each with probability 0.5) within the space of decomposable graphs; with probability $q$, propose $\tilde{G}$ from a discrete uniform distribution supported on the set of all decomposable graphs. Accept the new $\tilde{G}$ with probability $\min\left\{1, \frac{p(\tilde{G} \mid \{c_i^M\}, c_0^M)\, p(G \mid \tilde{G})}{p(G \mid \{c_i^M\}, c_0^M)\, p(\tilde{G} \mid G)}\right\}$. Repeat Step 1 for a large number of iterations until convergence is achieved. Detailed derivations are available in the online appendix. The above algorithm is a Metropolis-Hastings sampler with a mixture of local and heavier-tailed proposals, also called a small-world sampler. The local move involves randomly adding or deleting one edge based on the current graph, and the global move is achieved through the discrete uniform proposal. Guan et al. (2006) and Guan and Krone (2007) have shown that the small-world sampler leads to much faster convergence especially when the posterior distribution is either multi-modal or spiky. ... Algorithm 2. Step 0. Set initial values for $\{c_i^M\}$, $G$ and set the model parameters $\delta$, $c_0^M$, $U$ and $\Lambda$. Step 1. Conditional on $\{c_i^M\}$, update $G \sim p(G \mid \{c_i^M\}, c_0^M)$ using the small-world sampler as described in Step 1 of Algorithm 1, where $p(G \mid \{c_i^M\}, c_0^M)$ is computed based on (11). Step 2. Given $G$, update $Q_C \sim p(Q_C \mid \{c_i^M\}, G)$, which takes the same form as (6) except that $\delta$ and $U$ are replaced by $\tilde{\delta}$ and $\tilde{U}$ respectively using the formulae in Theorem 2. Step 3. Conditional on $G$ and $Q_C$, update $c_i^M \sim N(\mu_i, V)$, where $V = (\Phi^T \Lambda^{-1} \Phi + Q_C^{-1})^{-1}$ and $\mu_i = V(\Phi^T \Lambda^{-1} y_i + Q_C^{-1} c_0^M)$. Repeat Steps 1-3 for a large number of iterations until convergence is achieved.
Open Source Code No The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets No 5. Analysis of Event-related Potential Data in an Alcoholism Study. We apply the proposed method to event-related potential data from an alcoholism study. Data were initially obtained from 64 electrodes placed on subjects' scalps that captured EEG signals at 256 Hz during a one-second period. The measurements were taken from 122 subjects, of which 77 belonged to the alcoholism group and 45 to the control group. Each subject completed 120 trials. During each trial, the subject was exposed to either a single stimulus (a single picture) or two stimuli (a pair of pictures) shown on a computer monitor. ... The paper mentions using data from an 'alcoholism study' and describes its characteristics, but does not provide a specific link, DOI, repository, or formal citation to access this dataset.
Dataset Splits No 5. Analysis of Event-related Potential Data in an Alcoholism Study. ... Based on the preprocessed ERP curves, we further removed subjects with missing nodes, and balanced the sample size across the two groups, producing multivariate functional data with n = 44 and p = 64 for both the alcoholic and the control group. The paper describes overall sample sizes for groups (n = 44 for alcoholic and control groups) but does not provide specific training/test/validation splits for experimental reproduction.
Hardware Specification Yes The running-time was obtained using a laptop with Intel(R) Core(TM) i5 CPU, M430 with 2.27 GHZ processor and 4GB RAM.
Software Dependencies No We band-pass filtered the EEG signals to extract the α frequency band in the range of 8–12.5 Hz. The filtering was performed by applying the eegfilt function in the EEGLAB toolbox of Matlab. The paper mentions
Experiment Setup Yes We apply Algorithm 1 and set $\delta = 5$ and $U = \hat{Z} \hat{R} \hat{Z}$, where $\hat{Z} = \mathrm{diag}\{\hat{\lambda}_{jk}^{1/2},\ k = 1, \dots, M_j,\ j = 1, \dots, p\}$, $\{\hat{\lambda}_{jk}\}$ are the estimated eigenvalues and $\hat{R}$ is set to be the identity matrix. A total of 5,000 MCMC iterations are performed.
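
The small-world proposal of Algorithm 1 (quoted in the Pseudocode row) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it rests on several assumptions. The target `toy_log_target` and the reference graph `TRUE_GRAPH` are hypothetical stand-ins for the marginal posterior of the graph, which the paper computes from its equation (8). The local move toggles one uniformly chosen node pair instead of choosing between adding and deleting with probability 0.5 each, and the global move enumerates all decomposable graphs, which is only feasible for a handful of nodes. With these simplifications the proposal is symmetric, so the proposal ratio in the acceptance probability cancels.

```python
import itertools
import math
import random
from collections import Counter


def is_decomposable(n_nodes, edges):
    """Return True if the undirected graph is chordal (decomposable)."""
    adj = {v: set() for v in range(n_nodes)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(range(n_nodes))
    while remaining:
        # A simplicial vertex has remaining neighbours that form a clique.
        simplicial = next(
            (v for v in remaining
             if all(b in adj[a]
                    for a, b in itertools.combinations(adj[v] & remaining, 2))),
            None,
        )
        if simplicial is None:
            return False  # no perfect elimination ordering exists
        remaining.remove(simplicial)
    return True


# Hypothetical target: favours graphs close to a fixed path graph 0-1-2-3.
TRUE_GRAPH = frozenset({(0, 1), (1, 2), (2, 3)})


def toy_log_target(graph, beta=2.0):
    """Unnormalised log-probability; a stand-in for log p(G | data)."""
    return -beta * len(graph ^ TRUE_GRAPH)


def small_world_sampler(log_target, n_nodes, n_iter, q=0.1, seed=0):
    """Metropolis-Hastings over decomposable graphs with mixed proposals."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n_nodes), 2))
    # Brute-force enumeration: an assumption that only works for tiny graphs.
    all_graphs = [
        frozenset(es)
        for k in range(len(pairs) + 1)
        for es in itertools.combinations(pairs, k)
        if is_decomposable(n_nodes, es)
    ]
    current = frozenset()  # the empty graph is decomposable
    samples = []
    for _ in range(n_iter):
        if rng.random() < q:
            proposal = rng.choice(all_graphs)         # global (uniform) move
        else:
            proposal = current ^ {rng.choice(pairs)}  # local move: toggle an edge
        # Proposals that leave the decomposable space are rejected outright.
        if is_decomposable(n_nodes, proposal):
            log_ratio = log_target(proposal) - log_target(current)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                current = proposal
        samples.append(current)
    return samples


samples = small_world_sampler(toy_log_target, n_nodes=4, n_iter=5000)
most_visited, visits = Counter(samples).most_common(1)[0]
print(sorted(most_visited), visits / len(samples))
```

With a real posterior, `toy_log_target` would be replaced by the log marginal likelihood of the graph plus the log prior, and the global move would need a practical way to draw decomposable graphs when enumeration is infeasible.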