reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood

Authors: Yuta Shikuri

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Theoretical and experimental results demonstrate that the approximation performance is influenced by the granularity of summarized data relative to the length scale of covariance functions. Experiments on a real-world dataset highlight the practicality of our method for spatial modeling.
Researcher Affiliation	Industry	Yuta Shikuri, Tokio Marine Holdings, Inc., Tokyo, Japan
Pseudocode	No	The paper describes mathematical formulations and procedures but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code	No	The text states, 'The code was implemented in the python programming language 3.7.3 version.' However, it does not provide an explicit statement of code release or a link to a repository for the methodology described in the paper. It refers to third-party tools like SciPy and GPy but not to the authors' own implementation code.
Open Datasets	Yes	We investigate the usage of our method in spatial modeling tasks, using the California housing dataset¹. ¹https://lib.stat.cmu.edu/datasets/
Dataset Splits	Yes	The dataset was randomly shuffled and then split to obtain the $n$ training data points and $n_* = 20640 - n$ test data points. [...] Let $n = 10000$.
Hardware Specification	Yes	We used a 64bit Windows machine with the Intel Core i9-9900K @ 3.60 GHz and 64 GB of RAM.
Software Dependencies	Yes	This optimization was implemented with version 1.7.3 of the SciPy software², with the parameters set to their default values. We used a 64bit Windows machine with the Intel Core i9-9900K @ 3.60 GHz and 64 GB of RAM. The code was implemented in the python programming language 3.7.3 version. As a hyperparameter, we also optimized the variance $\sigma \in (0, \infty)$ of the Gaussian likelihood. [...] implemented in version 1.12.0 of the GPy software³.
Experiment Setup	Yes	The mean function was $\tau(\cdot) = g^{-1}\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)$. The covariance functions were designed by multiplying a constant kernel with the functions used in fig. 3 and adding white noise. [...] The hyperparameters of the covariance functions were optimized using the L-BFGS-B algorithm, starting from an initial value of 1. This optimization was implemented with version 1.7.3 of the SciPy software², with the parameters set to their default values. [...] As a hyperparameter, we also optimized the variance $\sigma \in (0, \infty)$ of the Gaussian likelihood. [...] The domain $[32.54, 41.95] \times [-124.35, -114.31]$ in latitude and longitude was divided into grids. The data points within each grid were summarized, with the centers of the grids serving as the representative features, and the summary statistics being the sample means.
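
The split and grid summarization quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. This is not the author's implementation, since the entry above records that no code was released. The function names `shuffle_split` and `summarize_on_grid`, the random seed, and the grid resolution (`n_lat`, `n_lon`) are hypothetical; the quoted text gives only the domain bounds, the use of grid centers as representative features, and sample means as summary statistics.

```python
import random
from collections import defaultdict

# Domain bounds quoted in the Experiment Setup row (latitude, longitude).
LAT_MIN, LAT_MAX = 32.54, 41.95
LON_MIN, LON_MAX = -124.35, -114.31


def shuffle_split(points, n_train, seed=0):
    """Randomly shuffle the points and return (train, test).

    The seed is an assumption; the quoted text does not state one.
    """
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def summarize_on_grid(points, n_lat, n_lon):
    """Summarize (lat, lon, y) points on an n_lat x n_lon grid.

    Returns one (center_lat, center_lon, mean_y, count) tuple per
    non-empty cell. The grid resolution is a free parameter here.
    """
    lat_step = (LAT_MAX - LAT_MIN) / n_lat
    lon_step = (LON_MAX - LON_MIN) / n_lon
    cells = defaultdict(list)
    for lat, lon, y in points:
        # Clamp indices so boundary points fall into the edge cells.
        i = min(max(int((lat - LAT_MIN) / lat_step), 0), n_lat - 1)
        j = min(max(int((lon - LON_MIN) / lon_step), 0), n_lon - 1)
        cells[(i, j)].append(y)

    summary = []
    for (i, j), ys in sorted(cells.items()):
        center_lat = LAT_MIN + (i + 0.5) * lat_step
        center_lon = LON_MIN + (j + 0.5) * lon_step
        summary.append((center_lat, center_lon, sum(ys) / len(ys), len(ys)))
    return summary
```

With 20640 records and `n_train=10000`, `shuffle_split` leaves 10640 test points, matching the split described above. Whether summarization is applied only to the training portion is not stated in the quoted text, so the sketch leaves that choice to the caller.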