On Perfect Clustering for Gaussian Processes

Authors: Juan Cuesta-Albertos, Subhajit Dutta

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Good empirical performance of the proposed methodology is demonstrated using simulated as well as benchmark data sets, when compared with some popular parametric and nonparametric methods for such functional data. Relevant sections: 4 Analysis of Simulated Datasets; 5 Analysis of Benchmark Datasets.
Researcher Affiliation Academia Juan A. Cuesta-Albertos, Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, Spain; Subhajit Dutta, Department of Mathematics and Statistics, IIT Kanpur, India
Pseudocode Yes A pseudo-code for this procedure is given in Algorithm 2 (see Section O of the Appendix). Algorithm 1: Clustering Algorithm. Algorithm 2: Cross Validation Algorithm to Choose the Value of d.
Open Source Code Yes The R codes for our methods are available here: GP clustering. R codes for our clustering methods are available from this link: GP clustering.
Open Datasets Yes We have applied our proposed methods to some benchmark data sets, Wheat (from the R package fds), Satellite (available at https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NPFDA/index.html), Cars (kindly provided by the first author of Torrecilla et al. (2020)) and Velib (from the R package funFEM).
Dataset Splits No The paper does not provide specific train/test/validation splits for its experiments; it reports only sample sizes for the simulations and mentions existing class assignments or single executions for the benchmark data. For our simulation study, we consider two-class problems (J = 2). The sample size of each class was set to be 250. To evaluate the clustering algorithms, we ran a single execution (without splitting).
Hardware Specification No The paper does not provide any specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments.
Software Dependencies No We have used the function optishrink available in the R package denoiseR. We computed the adjusted Rand index using the function RRand in the R package phyclust. Several competent methods for functional clustering using functional mixed mixture models are implemented in the function funcit from the R package funcy. The methodology developed by Chiou and Li (2007) is available in the function FClust from the R package fdapace. Velib (from the R package funFEM). The DHP method is available from the journal website, and we used those Matlab codes for our comparisons. No specific version numbers for R or the listed packages are provided.
Experiment Setup Yes We set s = 1 for location-only problems. In location and scale problems, we fixed s = 3, while for scale-only problems the mean functions $\mu_{Z_1}$ and $\mu_{Z_2}$ were set to be the constant function 0 and s = 3 was retained. The sample size of each class was set to be 250. Our experiment was replicated 100 times. We repeat this partitioning B (= 50) times and average it over these B samples to get $\hat{D}^{CV}_d$.
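
The rows above report that clustering quality was measured with the adjusted Rand index (computed in R with RRand from phyclust) and that simulations used two classes of 250 observations, replicated 100 times. The following is a minimal Python sketch of that index and of averaging it over replications. It is not the authors' code: the function names adjusted_rand_index and mean_ari are hypothetical, and the example labelings are illustrative only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same observations."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts n_ij and their row/column margins.
    pair_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (index - expected) / (max_index - expected)


def mean_ari(replications):
    """Average the index over (true_labels, predicted_labels) pairs."""
    scores = [adjusted_rand_index(t, p) for t, p in replications]
    return sum(scores) / len(scores)


# Two classes of 250 observations each, matching the reported sample sizes.
truth = [0] * 250 + [1] * 250
swapped = [1] * 250 + [0] * 250  # same partition, cluster names exchanged
perfect = adjusted_rand_index(truth, swapped)
```

The index depends only on the partition, not on the cluster names, so the relabeled partition in the example counts as a perfect match. In a replication study, mean_ari would be given one (true, predicted) pair per simulated data set.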