Patchwork Kriging for Large-scale Gaussian Process Regression
Authors: Chiwoo Park, Daniel Apley
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using three spatial datasets and three higher dimensional datasets, we investigate the numerical performance of the approach and compare it to several state-of-the-art approaches. Keywords: Local Kriging, Model Split and Merge, Pseudo Observations, Spatial Partition |
| Researcher Affiliation | Academia | Chiwoo Park EMAIL Department of Industrial and Manufacturing Engineering Florida State University 2525 Pottsdamer St., Tallahassee, FL 32310-6046, USA. Daniel Apley EMAIL Dept. of Industrial Engineering and Management Sciences Northwestern University 2145 Sheridan Rd., Evanston, IL 60208-3119, USA. |
| Pseudocode | Yes | Algorithm 1 Computation Steps for Patchwork Kriging |
| Open Source Code | No | The paper does not explicitly state that the authors provide open-source code for their methodology. It mentions using the R package `RandomFields`, but this is a third-party tool, not the authors' own implementation. |
| Open Datasets | Yes | The second dataset, ICETHICK, contains ice thickness measurements at 32,481 locations on the western Antarctic ice sheet and is available at http://nsidc.org/. It has two predictors that represent the longitude and latitude of a measurement location, and the corresponding independent variable is the ice thickness measurement. The third dataset, PROTEIN, has nine input variables that describe the tertiary structure of proteins and one independent variable that describes the physiochemical property of proteins. These data, which are available at https://archive.ics.uci.edu/ml/datasets, consist of 45,730 observations. The fourth dataset, SARCOS, contains measurements from a seven degrees-of-freedom SARCOS anthropomorphic robot arm. There are 21 predictors that describe the positions, moving velocities and accelerations of seven joints of the robot arm, and the seven response variables are the corresponding torques at the seven joints. We only use the first response variable for this numerical study. The dataset, which is available at http://www.gaussianprocess.org/gpml/data/, contains 44,484 training observations and 4,449 test observations. The last dataset, FLIGHT, consists of 800,000 flight records randomly selected from the database available at http://stat-computing.org/dataexpo/2009/. |
| Dataset Splits | Yes | We randomly split each dataset into a training set containing 90% of the total observations and a test set containing the remaining 10% of the observations. |
| Hardware Specification | Yes | All numerical experiments were performed on a desktop computer with Intel Xeon Processor W3520 and 6GB RAM. |
| Software Dependencies | No | The paper mentions using the R package `RandomFields` to generate synthetic data, but it gives no version numbers for this package or for any other software used in the authors' implementation; it names only algorithms such as Cholesky decomposition. |
| Experiment Setup | Yes | For patchwork kriging, we varied B ∈ {3, 5} and K ∈ {256, 512, 1024}. For the PGP, we used K = 623, while the number of finite element meshes per local region was varied from 5 to 25 with step size 5. For RBCM, we varied the number of local experts K ∈ {100, 150, 200, 250, 300, 600}. For PIC, K was varied over {100, 200, 300, 400, 600}, and the total number of pseudo inputs was also varied over {50, 70, 80, 100, 150, 200, 300}. (Example 1: TCO.L2 Dataset) |
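
The Dataset Splits and Experiment Setup rows describe a random 90%/10% split and a small grid over B and K for patchwork kriging. The sketch below shows one way such a setup might be reproduced. It is not the authors' code: the helper names `random_split` and `grid_configs`, the use of NumPy, and the seed value are assumptions made for illustration, since the paper does not report a random seed or an implementation language.

```python
import itertools

import numpy as np


def random_split(n_obs, train_frac=0.9, seed=0):
    """Return (train_idx, test_idx) for a random split of n_obs rows.

    The seed is a hypothetical choice; the paper does not report one.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_obs)
    n_train = int(round(train_frac * n_obs))
    return perm[:n_train], perm[n_train:]


def grid_configs(grid):
    """Enumerate every combination of the values listed in a parameter grid."""
    keys = list(grid)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(grid[k] for k in keys))
    ]


# Values of B and K taken from the Experiment Setup row for patchwork kriging.
PATCHWORK_GRID = {"B": [3, 5], "K": [256, 512, 1024]}

# 32,481 is the ICETHICK sample size quoted in the Open Datasets row.
train_idx, test_idx = random_split(32481)
configs = grid_configs(PATCHWORK_GRID)

print(len(train_idx), len(test_idx), len(configs))
```

Fixing the seed makes the split repeatable across the compared methods, so every method sees the same training and test rows.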