Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Jump Gaussian Process Model for Estimating Piecewise Continuous Regression Functions

Author: Chiwoo Park

JMLR 2022

Reproducibility Variables
Research Type: Experimental
LLM response: "Several advantages of using the proposed approach over the conventional GP and piecewise GP modeling approaches are shown by various simulated experiments and real data studies. Keywords: Piecewise Regression, Jump Regression, Gaussian Process Regression, Local Data Selection, Local Data Partitioning"
Researcher Affiliation: Academia
LLM response: "Chiwoo Park, Department of Industrial and Manufacturing Engineering, Florida State University, 2525 Pottsdamer St., Tallahassee, FL 32310-6046, USA."
Pseudocode: Yes
LLM response: "Algorithm 1: Jump Gaussian Process Regression by the Classification EM (JGP-CEM) ... Algorithm 2: Jump Gaussian Process Regression with Variational EM (JGP-VEM)"
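The JGP-CEM algorithm named above builds on the generic Classification EM (CEM) scheme, in which the usual E-step is replaced by a hard assignment of each data point to the component it currently fits best. The sketch below is not the paper's JGP-CEM (which classifies local data around a test point under GP likelihoods); it only illustrates the underlying CEM idea on a hypothetical two-mean 1-D example:

```python
import numpy as np

def classification_em(y, n_iter=20, seed=0):
    """Generic two-component Classification EM (CEM) on 1-D data.

    C-step: hard-assign each point to the nearer component mean
    (instead of computing soft E-step responsibilities).
    M-step: refit each component mean from its assigned points.
    """
    rng = np.random.default_rng(seed)
    mu = rng.choice(y, size=2, replace=False)  # initialize means at two data points
    z = np.zeros(len(y), dtype=int)
    for _ in range(n_iter):
        # C-step: hard classification of every point
        z = (np.abs(y - mu[1]) < np.abs(y - mu[0])).astype(int)
        # M-step: refit each mean from its currently assigned points
        for k in (0, 1):
            if np.any(z == k):
                mu[k] = y[z == k].mean()
    return mu, z

# Synthetic data: two well-separated clusters around -3 and +3
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-3, 0.3, 50), rng.normal(3, 0.3, 50)])
mu, z = classification_em(y)
```

In JGP-CEM the hard assignment instead decides, for each local training point, whether it lies on the same side of the jump as the test location, with GP hyperparameters refitted in the M-step.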
Open Source Code: No
LLM response: The paper does not provide an explicit statement about releasing code for the described methodology or a direct link to a code repository. The provided links refer to the paper's license and attribution, not source code.
Open Datasets: No
LLM response: The paper mentions three datasets: "The first dataset contains 10,153 corrosion sensor readings...", "The second dataset contains a compressive sensing (CS) result of nanoparticle agglomerates.", and "The third dataset contains 52 carbon nanotube yields...". While it cites prior work in relation to these datasets, it does not provide concrete access information (e.g., links, DOIs, specific repositories) for the datasets themselves in the present paper.
Dataset Splits: Yes
LLM response: "We randomly split the dataset into 90% training dataset and 10% test dataset. ... Since the dataset size is quite small, we use a leave-one-out cross-validation scheme instead of the conventional train and test data split."
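The two evaluation protocols quoted above can be sketched as follows. The dataset size N = 1000 is hypothetical; the 52-sample count comes from the carbon nanotube dataset mentioned in the Open Datasets row:

```python
import numpy as np

rng = np.random.default_rng(42)

# Random 90% / 10% train-test split (N is a hypothetical dataset size)
N = 1000
idx = rng.permutation(N)
n_train = int(0.9 * N)
train_idx, test_idx = idx[:n_train], idx[n_train:]

# Leave-one-out cross-validation folds for the small 52-sample dataset:
# each fold holds out exactly one observation and trains on the other 51
N_small = 52
loo_folds = [(np.delete(np.arange(N_small), i), np.array([i]))
             for i in range(N_small)]
```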
Hardware Specification: No
LLM response: The paper does not provide specific hardware details such as CPU/GPU models, processor types, or detailed computer specifications used for running experiments.
Software Dependencies: No
LLM response: The paper states: "We use the R implementations of laGP, TGP and dynaTree from the authors." However, it does not provide specific version numbers for R or for the mentioned implementations (laGP, TGP, dynaTree).
Experiment Setup: Yes
LLM response: "For a model fit, we use the n-nearest neighborhood as local data, where the value of n varies over {25, 35, 50}. We varied the noise σ² over {1, 4, 9}, which correspond to signal-to-noise ratios of 14 decibels (dB), 8 dB and 4.4 dB respectively. ... We varied the training data size N over {600, 1200, 2500}. ... We varied the input dimension d from 2 to 10 and the training data size N over {1000, 3000, 5000, 10000, 20000, 30000}, while we fixed Nt = 150 and chose the local neighborhood size n = 25, as suggested from our simulation study in Section 3."
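The quoted SNR figures are consistent with SNR(dB) = 10·log10(signal variance / σ²) under the assumption (not stated in this excerpt) of a signal variance of about 25, since that value reproduces all three reported numbers. A quick check:

```python
import math

SIGNAL_VAR = 25.0  # assumed signal variance; chosen because it reproduces the reported SNRs

# Noise variances from the quoted experiment setup
snrs_db = {nv: 10 * math.log10(SIGNAL_VAR / nv) for nv in (1.0, 4.0, 9.0)}
for nv, snr in snrs_db.items():
    print(f"sigma^2 = {nv:g} -> SNR = {snr:.1f} dB")
```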