Mixtures of Gaussian Process Experts with SMC^2

Authors: Teemu Härkönen, Sara Wade, Kody Law, Lassi Roininen

JMLR 2025

Reproducibility (Variable / Result / LLM Response)
Research Type: Experimental
LLM Response: "In Section 6, we provide our construction of the predictive distribution, and results are presented for two illustrative one-dimensional data sets along with one-dimensional motorcycle helmet acceleration, 3D NASA Langley glide-back booster simulation, and 4D Colorado precipitation data sets in Section 7." "We demonstrate the method using multiple simulated and real-life one-dimensional data sets along with two multi-dimensional data sets." "We also show L1 and L2 vector norms between the computed predictive densities and the ground truth densities, and the L1 vector norm between the obtained predictive median and the ground truth median, with predictive log-likelihood results in Tables 2 and 3 for the two synthetic data sets."
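The L1 and L2 discrepancies between predictive and ground-truth densities quoted above can be illustrated with a short sketch. This is a minimal discretized version on a uniform grid; the function name `density_distances`, the grid, and the toy Gaussians are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def density_distances(p, q, dx):
    """Discretized L1 and L2 distances between two densities p and q
    evaluated on a common uniform grid with spacing dx (assumed)."""
    l1 = np.sum(np.abs(p - q)) * dx
    l2 = np.sqrt(np.sum((p - q) ** 2) * dx)
    return l1, l2

# Toy example (illustrative only): two unit-variance Gaussians.
x = np.linspace(-6.0, 6.0, 2001)
dx = x[1] - x[0]
p = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)           # N(0, 1)
q = np.exp(-0.5 * (x - 0.5)**2) / np.sqrt(2 * np.pi)   # N(0.5, 1)
l1, l2 = density_distances(p, q, dx)
print(l1, l2)  # both tend to 0 as the two densities coincide
```

The same discretization applies to the predictive-median comparison, which reduces to an L1 norm between two vectors of medians on the test grid.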
Researcher Affiliation: Collaboration
LLM Response:
- Teemu Härkönen, Department of Electrical Engineering and Automation, Aalto University, Espoo, FI-02150, Finland; School of Engineering Sciences, LUT University, Lappeenranta, Yliopistonkatu 34, FI-53850, Finland
- Sara Wade, School of Mathematics, University of Edinburgh, Edinburgh, EH9 3FD, United Kingdom
- Kody Law, Department of Mathematics, University of Manchester, Manchester, M13 9PL, United Kingdom; KLAI Ltd, London, EC1V 2NX, UK
- Lassi Roininen, School of Engineering Sciences, LUT University, Lappeenranta, Yliopistonkatu 34, FI-53850, Finland
Pseudocode: Yes
LLM Response: "We present a schematic in Figure 5 and pseudo-code in Algorithm 1 for the SMC sampler Υ(Θ^{(0:t)}_{1:M}, A^{(t)} | C, X, Y). In the following section, we combine the above inner SMC sampler with sampling of the gating network parameters Ψ and partition C." Algorithm 1: SMC for fully Bayesian GP estimation. Algorithm 2: PMCMC step for p_t(C, Ψ, Θ | X, Y) ∝ p(Y | X, C, Θ)^{κ(t)} p(C | X, Ψ) π_0(Ψ, Θ). "Pseudocode for the method is shown in Algorithm 3."
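The tempered target in the PMCMC step (likelihood raised to a power κ(t) that increases from 0 to 1) follows the standard likelihood-tempered SMC pattern, which can be sketched generically. This is not the authors' Algorithm 1 or 2; the fixed temperature ladder, single-step random-walk Metropolis mutation, and the toy normal-mean model below are all illustrative assumptions.

```python
import numpy as np

def tempered_smc(log_lik, sample_prior, mutate, n_particles=500,
                 kappas=(0.0, 0.25, 0.5, 0.75, 1.0), rng=None):
    """Generic likelihood-tempered SMC skeleton (illustrative only):
    reweight by the likelihood raised to kappa(t) - kappa(t-1),
    resample, then mutate with an MCMC kernel targeting the
    tempered posterior prior(theta) * likelihood(theta)^kappa(t)."""
    rng = rng or np.random.default_rng(0)
    theta = sample_prior(n_particles, rng)              # particles from the prior
    logw = np.zeros(n_particles)
    for k_prev, k_next in zip(kappas[:-1], kappas[1:]):
        logw += (k_next - k_prev) * log_lik(theta)      # incremental weights
        w = np.exp(logw - logw.max())
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)  # multinomial resampling
        theta = mutate(theta[idx], k_next, rng)          # rejuvenation step
        logw[:] = 0.0                                    # equal weights after resampling
    return theta

# Toy model (assumed): normal mean, prior N(0, 4), 20 unit-variance
# observations with sample mean 1.0.
loglik = lambda th: -0.5 * 20 * (th - 1.0) ** 2
prior = lambda n, rng: rng.normal(0.0, 2.0, n)

def mh_mutate(th, k, rng):
    """One random-walk Metropolis step at inverse temperature k."""
    prop = th + 0.3 * rng.normal(size=th.shape)
    logp = lambda t: -t ** 2 / 8.0 + k * loglik(t)  # log prior + tempered log-lik
    accept = np.log(rng.uniform(size=th.shape)) < logp(prop) - logp(th)
    return np.where(accept, prop, th)

post = tempered_smc(loglik, prior, mh_mutate)
print(round(post.mean(), 2))  # close to the conjugate posterior mean ~0.99
```

In the paper's SMC^2 setting this outer loop additionally wraps an inner SMC sampler over the GP hyperparameters Θ; the sketch above only shows the tempering-and-mutation backbone.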
Open Source Code: No
LLM Response: The paper does not contain any explicit statement about releasing source code, nor a link to a code repository for the methodology described.
Open Datasets: Yes
LLM Response: "The Colorado precipitation data set has been collected by the Colorado Climate Center and is available online." "The Langley glide-back booster is a rocket booster designed by NASA to glide back down through the atmosphere instead of crashing into the ocean. A more detailed description of the booster and the related data set is provided in Gramacy and Lee (2008)." "We also consider the motorcycle data set studied in Silverman (1985), which shows obvious heteroskedastic noise and non-stationarity, with areas exhibiting different behaviours."
Dataset Splits: No
LLM Response: The paper states, "The data set consists of 3167 data points of which we use the subset of N = 1900 data points," but it does not specify explicit training, validation, or test splits for the experimental evaluation, nor does it describe a cross-validation setup.
Hardware Specification: Yes
LLM Response: "The right panel shows median wall-clock time of the two methods as a function of the likelihood evaluations on a fixed computer architecture with 64 physical central processing unit cores of an AMD Ryzen Threadripper 3990X. Note that this architecture limits the parallelism of both methods equally."
Software Dependencies: No
LLM Response: The paper mentions "the treed GP R package (Gramacy, 2007)" but does not provide version numbers for any software libraries or tools used in the implementation of its own methodology.
Experiment Setup: Yes
LLM Response: "For IS-MoE and SMC^2-MoE, we employ an upper bound of K = 7 experts and consider different choices of the Dirichlet concentration parameter α = 0.1, 1, K/2. In all cases, the number of outer particles J, inner particles M, and time steps T are chosen so that MJT^2 is similar to the number of IS samples, in order to have similar run times for IS and SMC^2 and fair comparisons based on a fixed computational budget." "The prior distributions used for the one-dimensional and higher-dimensional data sets are presented in Table 1." "We use η = 0.9, which delivers good performance for our problem."
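The fixed-budget matching quoted above can be made concrete with a back-of-the-envelope count: if at outer step t the inner sampler has run t steps with M particles for each of J outer particles, the total likelihood-evaluation cost is on the order of MJT^2. The function name `smc2_budget` and the concrete M, J, T values below are hypothetical, chosen only to illustrate the accounting.

```python
def smc2_budget(M, J, T):
    """Rough likelihood-evaluation count for SMC^2 when the inner
    sampler at outer step t has run t steps with M particles for each
    of J outer particles: sum over t of M*J*t = M*J*T*(T+1)/2, O(MJT^2)."""
    return M * J * T * (T + 1) // 2

# Hypothetical settings: the IS sample count would be matched to this.
M, J, T = 32, 64, 25
n_is_samples = smc2_budget(M, J, T)
print(n_is_samples)  # → 665600
```

Under this reading, halving T cuts the budget roughly fourfold, which is why T enters the matching quadratically while M and J enter linearly.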