Estimating Epistemic and Aleatoric Uncertainty with a Single Model

Authors: Matthew Chan, Maria Molina, Chris Metzler

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our method on two distinct real-world tasks: X-ray computed tomography reconstruction and weather temperature forecasting. Source code is publicly available at https://github.com/matthewachan/hyperdm."
Researcher Affiliation | Academia | "Matthew A. Chan, Department of Computer Science, University of Maryland, College Park, MD 20742, EMAIL; Maria J. Molina, Department of Atmospheric and Oceanic Science, University of Maryland, College Park, MD 20742, EMAIL; Christopher A. Metzler, Department of Computer Science, University of Maryland, College Park, MD 20742, EMAIL"
Pseudocode | No | The paper includes diagrams to illustrate concepts (e.g., Figure 1 for the HyperDM framework) but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Source code is publicly available at https://github.com/matthewachan/hyperdm."
Open Datasets | Yes | "Using the Lung Nodule Analysis 2016 (LUNA16) [51] dataset, we form a target image distribution X by extracting 1,200 CT images, applying 4× pixel binning to produce 128×128 resolution images, and normalizing each image by mapping pixel values between [−1000, 3000] Hounsfield units to the interval [−1, 1]."
Dataset Splits | Yes | "The dataset is finally split into a training dataset comprised of 1,000 image-measurement pairs and a validation dataset of 200 data pairs."
Hardware Specification | Yes | "All baselines are trained on a single NVIDIA RTX A6000 using a batch size of 32, an Adam [31] optimizer, and a learning rate of 1 × 10^-4."
Software Dependencies | No | The paper mentions using an "Adam [31] optimizer" and "PyTorch" (in the NeurIPS checklist justification), but it does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | "All baselines are trained on a single NVIDIA RTX A6000 using a batch size of 32, an Adam [31] optimizer, and a learning rate of 1 × 10^-4. Training is run over 500 epochs in our initial experiment and 400 epochs in our CT and weather experiments. DMs are trained using a Markov chain of T = 100 timesteps."
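The CT preprocessing and split quoted above (4× pixel binning of LUNA16 slices down to 128×128, Hounsfield-unit normalization to [−1, 1], and a 1,000/200 train/validation split) can be sketched in NumPy. This is a minimal reconstruction from the quoted description, not the released HyperDM code; the function names and the shuffle seed are illustrative:

```python
import numpy as np

def bin_pixels(img, factor=4):
    """Downsample by averaging non-overlapping factor x factor blocks (pixel binning)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def normalize_hu(img, lo=-1000.0, hi=3000.0):
    """Map Hounsfield units in [lo, hi] linearly onto [-1, 1], clipping outliers."""
    img = np.clip(img, lo, hi)
    return 2.0 * (img - lo) / (hi - lo) - 1.0

def train_val_split(pairs, n_train=1000, n_val=200, seed=0):
    """Shuffle image-measurement pairs and split 1,000 / 200 as described."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    return ([pairs[i] for i in idx[:n_train]],
            [pairs[i] for i in idx[n_train:n_train + n_val]])

# A 512x512 CT slice binned 4x yields the 128x128 images the paper reports.
slice_hu = np.random.default_rng(1).uniform(-1000.0, 3000.0, size=(512, 512))
target = normalize_hu(bin_pixels(slice_hu))
```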
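The setup row states that the diffusion models are trained over a Markov chain of T = 100 timesteps. A minimal sketch of the corresponding forward (noising) process, assuming the standard linear DDPM beta schedule since the paper's exact schedule is not quoted here:

```python
import numpy as np

T = 100                                # number of diffusion timesteps, per the paper
betas = np.linspace(1e-4, 0.02, T)     # linear schedule (assumed DDPM default)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)        # cumulative products, monotonically decreasing

def forward_diffuse(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return x_t, noise
```

With only 100 steps the chain reaches near-pure noise quickly, which keeps per-sample inference cheap relative to the common T = 1000 setting.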