Supervised Score-Based Modeling by Gradient Boosting

Authors: Changyuan Zhao, Hongyang Du, Guangyuan Liu, Dusit Niyato

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Via ablation experiments on selected examples, we demonstrate the outstanding performance of the proposed techniques. Additionally, we compare our model with other probabilistic models, including Natural Gradient Boosting (NGBoost), Classification and Regression Diffusion Models (CARD), Diffusion Boosted Trees (DBT), and non-probabilistic gradient boosting models. Experiments on regression and classification tasks show that SSM outperforms existing methods in accuracy and significantly shortens the inference time.
Researcher Affiliation Academia ¹College of Computing and Data Science, Nanyang Technological University; ²CNRS@CREATE, 1 Create Way, #08-01 Create Tower, Singapore 138602; ³Department of Electrical and Electronic Engineering, University of Hong Kong; ⁴The Energy Research Institute @ NTU, Interdisciplinary Graduate Program. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode Yes
Algorithm 1: Training
Initialization: noise scales {σ_i}_{i=1}^L, training set D, loss coefficient λ(σ_i)
1: repeat
2:   choose (x, y) ∈ D, σ ∈ {σ_i}_{i=1}^L, and ỹ ~ N(y, σ)
3:   take a gradient descent step on
4:     ∇_θ [ (1/2) λ(σ_i) ‖ s_θ(ỹ, σ, x) + (ỹ − y)/σ² ‖²_2 ]
5: until converged

Algorithm 2: Inference
Initialization: {σ_i}_{i=1}^L, step size ϵ, steps T, thresholds {β(σ_i)}_{i=1}^{L−1}, input x^I
1: initialize y_0
2: for i ← 1 to L − 1 do
3:   α_i ← ϵ · σ_i² / σ_L²
4:   repeat
5:     y_t ← y_{t−1} + α_i · s_θ(y_{t−1}, σ_i, x^I)
6:   until σ_i² ‖s_θ(y_t, σ_i, x^I)‖ < β(σ_i)
7:   y_0 ← y_T
8: end for
9: for t ← 1 to T do
10:   y_t ← y_{t−1} + ϵ · s_θ(y_{t−1}, σ_L, x^I)
11: end for
12: return y_T
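The annealed inference loop of Algorithm 2 can be sketched in NumPy. This is an illustrative reconstruction, not the authors' implementation: the learned gradient-boosted score model s_θ(y, σ, x) is replaced by a closed-form Gaussian score, the stopping thresholds β(σ_i) are optional, and all names and values are this sketch's own assumptions.

```python
import numpy as np

def gaussian_score(y, sigma, mu):
    # Score of N(mu, sigma^2): d/dy log p(y) = -(y - mu) / sigma^2.
    # Stands in here for the learned model s_theta(y, sigma, x).
    return -(y - mu) / sigma**2

def annealed_langevin_inference(score, sigmas, eps, T, beta=None, max_inner=1000):
    """Sketch of Algorithm 2: sweep noise scales sigma_1 > ... > sigma_L
    coarse-to-fine, then take T final steps at the smallest scale.
    `score(y, sigma)` plays the role of s_theta(y, sigma, x^I)."""
    rng = np.random.default_rng(0)
    y = rng.normal(0.0, sigmas[0])                  # initialize y_0
    L = len(sigmas)
    for i in range(L - 1):
        alpha = eps * sigmas[i]**2 / sigmas[-1]**2  # alpha_i = eps * sigma_i^2 / sigma_L^2
        for _ in range(max_inner):                  # "repeat ... until" loop
            y = y + alpha * score(y, sigmas[i])
            # Stop early once sigma_i^2 * |score| drops below beta(sigma_i).
            if beta is not None and sigmas[i]**2 * abs(score(y, sigmas[i])) < beta[i]:
                break
    for _ in range(T):                              # final refinement at sigma_L
        y = y + eps * score(y, sigmas[-1])
    return y
```

With the Gaussian stand-in score, each update contracts y toward the mode, so the loop converges to the target mean; with a learned score model the same loop would walk samples toward high-density regions of p(y | x).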
Open Source Code No The paper does not contain any explicit statement or link indicating the release of its own source code for the methodology described.
Open Datasets Yes we first perform experiments on 5 selected toy examples (linear regression, quadratic regression, log-log linear regression, log-log cubic regression, and sinusoidal regression) proposed in (Han, Zheng, and Zhou 2022). We further evaluate our model on 10 UCI regression tasks (Dua and Graff 2017). For classification tasks, we compare our model with CARD on CIFAR-10 and CIFAR-100, focusing on both accuracy and the inference time (Krizhevsky 2009).
Dataset Splits No The paper states: "We employ the same experimental settings as those used in the CARD model (Han, Zheng, and Zhou 2022)" for UCI regression tasks. For toy examples and CIFAR datasets, it does not explicitly provide the specific percentages or counts for training, validation, and test splits within the paper's text.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments.
Software Dependencies No The paper does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment.
Experiment Setup No The paper states: "As discussed in (Song and Ermon 2020), we need to design many parameters to ensure the effectiveness of training and inference, including (i) the choices of noise scales {σ_i}_{i=1}^L; (ii) the step size ϵ in Langevin dynamics; (iii) the number of inference steps T in the Langevin equation." It also notes: "We employ the same experimental settings as those used in the CARD model (Han, Zheng, and Zhou 2022)" for the UCI tasks. However, the main text does not report the concrete hyperparameter values actually used (e.g., learning rates, batch sizes, epochs, or the numerical values of {σ_i}_{i=1}^L, ϵ, and T); it instead defers to external work or describes parameter-selection techniques.
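Since the paper defers these choices to Song and Ermon (2020), a reader attempting reproduction would need to pick the noise scales themselves. A common default in that line of work is a geometric sequence between σ_max and σ_min; the sketch below shows that convention, with endpoint values that are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def geometric_noise_scales(sigma_max, sigma_min, L):
    """Geometric sequence sigma_1 > ... > sigma_L, the standard noise
    schedule in score-based modeling (Song and Ermon 2020). The actual
    values used by SSM are not reported in the paper."""
    return np.geomspace(sigma_max, sigma_min, L)
```

Example: `geometric_noise_scales(1.0, 0.01, 10)` yields 10 scales decaying by a constant ratio from 1.0 down to 0.01, which would then feed the {σ_i} slots of Algorithms 1 and 2.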