Adaptive Sampling to Reduce Epistemic Uncertainty Using Prediction Interval-Generation Neural Networks
Authors: Giorgio Morales, John W. Sheppard
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates. The results demonstrate that our method consistently converges faster to minimum epistemic uncertainty levels compared to Normalizing Flows Ensembles, MC-Dropout, and simple GPs. |
| Researcher Affiliation | Academia | Gianforte School of Computing, Montana State University, Bozeman, MT 59717, USA. EMAIL; EMAIL |
| Pseudocode | No | The paper describes the proposed method and algorithms using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/NISL-MSU/AdaptiveSampling |
| Open Datasets | Yes | We considered three 1-D problems: cos (Morales and Sheppard 2023b), hetero (Depeweg et al. 2018), and cosqr. All three problems are affected by heteroscedastic noise, and their function equations are shown in Table 1. ... In particular, we consider the following yield function: y = f(x) = (x_P / (π + 1)) tanh(0.1 x_Nr), where x = [x_P, x_A, x_VH, x_Nr] comprises the following site-specific covariates: annual precipitation (mm), terrain aspect (radians), Sentinel-1 backscattering coefficient from the Vertical Transmit-Horizontal Receive Polarization band, and applied N rate (lbs/ac), respectively. The aleatoric noise is modeled as ε_a(x) = N(0, (x_P + x_Nr)/150). |
| Dataset Splits | No | The paper describes an adaptive sampling process where initial incomplete datasets are generated, and then samples are added iteratively. It does not provide traditional fixed training/test/validation dataset splits of a pre-existing dataset. The text says: "For each case, we generated incomplete datasets as initial states, as shown in Fig. 3." and "The AS process was executed for each problem for 50 iterations. This process is repeated 10 times, initializing the problems with a different seed each time." |
| Hardware Specification | No | Computational efforts were performed on the Tempest HPC System, operated by University Information Technology Research Cyberinfrastructure at MSU. This mentions a system name but lacks specific hardware details like GPU/CPU models, memory, etc. |
| Software Dependencies | No | The paper discusses various methods used (ASPINN, Dual AQD, MC-Dropout, NF-Ensemble, standard GP, RBF kernel) and provides citations for some, but it does not specify software names with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | For ASPINN, we trained feed-forward NNs with varying depths: two hidden layers with 100 units for problems cos and hetero; and three hidden layers with 500, 100, and 50 units, respectively, for cosqr. The networks f̂_t and ĝ_t share the same architecture except for the last layer, as f̂_t uses one output, while ĝ_t uses two outputs. Furthermore, ASPINN uses two hyperparameters: the neighbor distance threshold θ and the kernel length r. We performed a grid search with the values θ = [0.1, 0.15, 0.2, 0.25] and r = [0.1, 0.15, 0.2, 0.25], and selected θ = 0.25 and r = 0.15 for all experiments. Dual AQD, the PI-generation method used by ASPINN, uses a hyperparameter η as a scale factor... We chose a scale factor η = 0.1. ... For NF-Ensemble, we used flows with 200 hidden units for problems cos and hetero and 300 hidden units for problem cosqr. We employed ensembles consisting of five models trained for 30,000 epochs. For the standard GP, we used the same RBF kernel used by ASPINN. We utilized an inference implementation based on black-box matrix-matrix multiplication (Gardner et al. 2018) that uses 3,000 training epochs. |
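The synthetic agricultural dataset quoted in the Open Datasets row can be sketched as a data generator. This is a minimal reading of the extracted text, not the authors' code: the grouping x_P / (π + 1) is reconstructed from a garbled fraction, and the second parameter of N(·, ·) is read here as a variance; both are assumptions, as are all function names.

```python
import math
import random

def yield_mean(x_P, x_Nr):
    # Reconstructed mean yield f(x) = (x_P / (pi + 1)) * tanh(0.1 * x_Nr).
    # x_P = annual precipitation (mm), x_Nr = applied N rate (lbs/ac);
    # the covariates x_A (aspect) and x_VH (backscatter) do not appear
    # in the quoted mean function.
    return (x_P / (math.pi + 1.0)) * math.tanh(0.1 * x_Nr)

def sample_yield(x_P, x_Nr, rng):
    # Aleatoric noise eps_a(x) ~ N(0, (x_P + x_Nr) / 150), interpreting
    # the second parameter as the variance (an assumption).
    sigma = math.sqrt((x_P + x_Nr) / 150.0)
    return yield_mean(x_P, x_Nr) + rng.gauss(0.0, sigma)

rng = random.Random(0)
y = sample_yield(x_P=400.0, x_Nr=80.0, rng=rng)
```

Note that under this reading the noise variance grows with precipitation and N rate, so the heteroscedasticity the paper reports for its 1-D problems carries over to the multi-dimensional case.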