Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Jump Gaussian Process Model for Estimating Piecewise Continuous Regression Functions

Author: Chiwoo Park

JMLR 2022

Reproducibility Variables
Research Type: Experimental
LLM response: "Several advantages of using the proposed approach over the conventional GP and piecewise GP modeling approaches are shown by various simulated experiments and real data studies. Keywords: Piecewise Regression, Jump Regression, Gaussian Process Regression, Local Data Selection, Local Data Partitioning"
Researcher Affiliation: Academia
LLM response: "Chiwoo Park, Department of Industrial and Manufacturing Engineering, Florida State University, 2525 Pottsdamer St., Tallahassee, FL 32310-6046, USA."
Pseudocode: Yes
LLM response: "Algorithm 1: Jump Gaussian Process Regression by the Classification EM (JGP-CEM) ... Algorithm 2: Jump Gaussian Process Regression with Variational EM (JGP-VEM)"
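The JGP-CEM algorithm named above builds on the generic Classification EM (CEM) scheme, in which the usual E-step is replaced by a hard assignment of each data point to the component it currently fits best. The sketch below is not the paper's JGP-CEM (which classifies local data around a test point under GP likelihoods); it only illustrates the underlying CEM idea on a hypothetical two-mean 1-D example:

```python
import numpy as np

def classification_em(y, n_iter=20, seed=0):
    """Generic two-component Classification EM (CEM) on 1-D data.

    C-step: hard-assign each point to the nearer component mean
    (instead of computing soft E-step responsibilities).
    M-step: refit each component mean from its assigned points.
    """
    rng = np.random.default_rng(seed)
    mu = rng.choice(y, size=2, replace=False)  # initialize means at two data points
    z = np.zeros(len(y), dtype=int)
    for _ in range(n_iter):
        # C-step: hard classification of every point
        z = (np.abs(y - mu[1]) < np.abs(y - mu[0])).astype(int)
        # M-step: refit each mean from its currently assigned points
        for k in (0, 1):
            if np.any(z == k):
                mu[k] = y[z == k].mean()
    return mu, z

# Synthetic data: two well-separated clusters around -3 and +3
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-3, 0.3, 50), rng.normal(3, 0.3, 50)])
mu, z = classification_em(y)
```

In JGP-CEM the hard assignment instead decides, for each local training point, whether it lies on the same side of the jump as the test location, with GP hyperparameters refitted in the M-step.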
Open Source Code: No
LLM response: The paper does not provide an explicit statement about releasing code for the described methodology or a direct link to a code repository. The provided links refer to the paper's license and attribution, not source code.
Open Datasets: No
LLM response: The paper mentions three datasets: "The first dataset contains 10,153 corrosion sensor readings...", "The second dataset contains a compressive sensing (CS) result of nanoparticle agglomerates.", and "The third dataset contains 52 carbon nanotube yields...". While it cites prior work in relation to these datasets, it does not provide concrete access information (e.g., links, DOIs, specific repositories) for the datasets themselves in the present paper.
Dataset Splits: Yes
LLM response: "We randomly split the dataset into 90% training dataset and 10% test dataset. ... Since the dataset size is quite small, we use a leave-one-out cross-validation scheme instead of the conventional train and test data split."
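The two evaluation protocols quoted above can be sketched as follows. The dataset size N = 1000 is hypothetical; the 52-sample count comes from the carbon nanotube dataset mentioned in the Open Datasets row:

```python
import numpy as np

rng = np.random.default_rng(42)

# Random 90% / 10% train-test split (N is a hypothetical dataset size)
N = 1000
idx = rng.permutation(N)
n_train = int(0.9 * N)
train_idx, test_idx = idx[:n_train], idx[n_train:]

# Leave-one-out cross-validation folds for the small 52-sample dataset:
# each fold holds out exactly one observation and trains on the other 51
N_small = 52
loo_folds = [(np.delete(np.arange(N_small), i), np.array([i]))
             for i in range(N_small)]
```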
Hardware Specification: No
LLM response: The paper does not provide specific hardware details such as CPU/GPU models, processor types, or detailed computer specifications used for running experiments.
Software Dependencies: No
LLM response: The paper states: "We use the R implementations of laGP, TGP and dynaTree from the authors." However, it does not provide specific version numbers for R or for the mentioned implementations (laGP, TGP, dynaTree).
Experiment Setup: Yes
LLM response: "For a model fit, we use the n-nearest neighborhood as local data, where the value of n varies over {25, 35, 50}. We varied the noise σ² over {1, 4, 9}, which correspond to signal-to-noise ratios of 14 decibels (dB), 8 dB and 4.4 dB respectively. ... We varied the training data size N over {600, 1200, 2500}. ... We varied the input dimension d from 2 to 10 and the training data size N over {1000, 3000, 5000, 10000, 20000, 30000}, while we fixed Nt = 150 and chose the local neighborhood size n = 25, as suggested from our simulation study in Section 3."
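The quoted SNR figures are consistent with SNR(dB) = 10·log10(signal variance / σ²) under the assumption (not stated in this excerpt) of a signal variance of about 25, since that value reproduces all three reported numbers. A quick check:

```python
import math

SIGNAL_VAR = 25.0  # assumed signal variance; chosen because it reproduces the reported SNRs

# Noise variances from the quoted experiment setup
snrs_db = {nv: 10 * math.log10(SIGNAL_VAR / nv) for nv in (1.0, 4.0, 9.0)}
for nv, snr in snrs_db.items():
    print(f"sigma^2 = {nv:g} -> SNR = {snr:.1f} dB")
```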