Hierarchical and Stochastic Crystallization Learning: Geometrically Leveraged Nonparametric Regression with Delaunay Triangulation
Authors: Jiaqi Gu, Guosheng Yin
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the asymptotic properties of our method and conduct numerical experiments on both synthetic and real data to demonstrate the advantages of our method over the existing ones. |
| Researcher Affiliation | Academia | Jiaqi Gu (EMAIL), Department of Mathematics and Statistics, University of South Florida, Tampa, FL 33620, USA; Guosheng Yin (EMAIL), Department of Statistics and Actuarial Science, School of Computing and Data Science, University of Hong Kong, Hong Kong SAR, China |
| Pseudocode | Yes | Algorithm 1 DELAUNAYSPARSE (Chang et al., 2020) ... Algorithm 2 Crystallization search ... Algorithm 3 Stochastic crystallization search |
| Open Source Code | No | The paper discusses various algorithms (Algorithm 1, 2, 3) and compares methods but does not provide any explicit statement about releasing its own implementation code or a link to a code repository. The license link provided is for the paper itself, not the code. |
| Open Datasets | Yes | We conduct numerical experiments on both synthetic and real data... apply the deterministic crystallization learning to several real data sets from the UCI repository. The critical assessment of protein structure prediction (CASP) data set (Betancourt and Skolnick, 2001)... The Concrete data set (Yeh, 1998)... Parkinson's telemonitoring data set... We also apply the hierarchical crystallization learning to the Year Prediction MSD data set from the UCI repository. |
| Dataset Splits | Yes | For each data set, we simulate 100 training data sets {(xi, yi) : i = 1, . . . , n} under different values of sample size n and dimension d. For each training data set, we evaluate the prediction performance of our method on 100 randomly generated target points z1, . . . , z100. ... For each data set, we take 100 bootstrap samples without replacement of size n (n = 200, 500, 1000 or 2000) for training and 100 bootstrap samples of size 100 for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or cloud computing instance types used for running the experiments. It only refers to general 'computation time/power' in a theoretical context. |
| Software Dependencies | No | The paper mentions several methods and algorithms such as k-NN, local linear regression, kernel regression, Gaussian process models, and GAM (Generalized Additive Models, Hastie and Tibshirani (1990)), but it does not specify the version numbers of any software libraries, programming languages, or tools used for their implementation. |
| Experiment Setup | Yes | We implement the crystallization learning with L = 3 for d = 5, 10 and L = 2 for d = 20, 50... For k = 1, . . . , 100, we implement the stochastic crystallization learning to estimate µ(zk) with B = 100 randomly generated sets of simplices under the energy distribution (5) with the maximal energy loss Λ = 0, 0.1, . . . , 3.0. ... We implement the two-layer hierarchical crystallization learning with n = 2C representative points and L = 2... |
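Since the paper releases no implementation code, the following is only an illustrative sketch of the geometric primitive the method builds on: predicting at a target point by barycentric interpolation within the Delaunay simplex that contains it. It uses `scipy.spatial.Delaunay` and hypothetical synthetic data; it is not the authors' crystallization-learning algorithm, which searches for a local sub-triangulation rather than triangulating all of the data.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)

# Hypothetical training data: n covariate points in d dimensions
# with a smooth noisy response (not the paper's simulation designs).
n, d = 200, 2
X = rng.uniform(size=(n, d))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Delaunay triangulation of the covariates.
tri = Delaunay(X)

def delaunay_interpolate(z):
    """Predict at target z by barycentric interpolation over the
    vertices of the Delaunay simplex containing z.
    Returns NaN if z lies outside the convex hull of the data."""
    s = tri.find_simplex(z)
    if s == -1:
        return np.nan
    # Barycentric coordinates of z within simplex s,
    # via the precomputed affine transform for that simplex.
    T = tri.transform[s]
    b = T[:d].dot(np.asarray(z) - T[d])
    w = np.append(b, 1 - b.sum())  # weights sum to 1
    return w.dot(y[tri.simplices[s]])

print(delaunay_interpolate(np.array([0.5, 0.5])))
```

A full triangulation of all n points is what the paper's crystallization search avoids: its Algorithms 2 and 3 grow only a local set of simplices around the target point, which is what makes the approach feasible for larger d.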