Credal Bayesian Deep Learning
Authors: Michele Caprio, Souradeep Dutta, Kuk Jin Jang, Vivian Lin, Radoslav Ivanov, Oleg Sokolsky, Insup Lee
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present our experimental results in section 4, and we examine the related work in section 5. Section 6 concludes our work. In the appendices, we give further theoretical and philosophical arguments and we prove our claims. ... We evaluate our CBDL method on 4 standard image datasets, CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), Fashion-MNIST (Xiao et al., 2017), and MNIST (Lecun et al., 1998). ... Table 3: In-Distribution evaluation on noise corruptions of increasing severity. ... Table 4: Out-Of-Distribution evaluation. ... Table 5: Out-Of-Distribution evaluation. AUROC for OOD detection is computed using both aleatoric and epistemic predictive uncertainty measures of the different approaches. |
| Researcher Affiliation | Academia | Michele Caprio EMAIL Department of Computer Science, University of Manchester Souradeep Dutta EMAIL Department of Computer Science, University of British Columbia Kuk Jin Jang EMAIL Department of Computer and Information Science, University of Pennsylvania Vivian Lin EMAIL Department of Computer and Information Science, University of Pennsylvania Radoslav Ivanov EMAIL Department of Computer Science, Rensselaer Polytechnic Institute Oleg Sokolsky EMAIL Department of Computer and Information Science, University of Pennsylvania Insup Lee EMAIL Department of Computer and Information Science, University of Pennsylvania |
| Pseudocode | Yes | Algorithm 1 Credal Bayesian Deep Learning (CBDL) Training and Inference |
| Open Source Code | Yes | Code and datasets for implementation and replication of results will be available at https://github.com/PRECISE/credal-bayesian-deep-learning.git. |
| Open Datasets | Yes | We evaluate our CBDL method on 4 standard image datasets, CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), Fashion-MNIST (Xiao et al., 2017), and MNIST (Lecun et al., 1998). ... We use the problem settings in Tumu et al. (2023) to define the problem of obtaining prediction sets for future positions of an autonomous racing agent. ... The dataset Dall is created by collecting simulated trajectories of autonomous race cars in the F1Tenth-Gym (O'Kelly et al., 2020); for details, see Tumu et al. (2023). ... Fortunately, the UVa-Padova simulator (Dalla Man et al., 2013) allows us to create datasets with and without meal inputs. |
| Dataset Splits | Yes | The dataset Dall is created by collecting simulated trajectories of autonomous race cars in the F1Tenth-Gym (O'Kelly et al., 2020); for details, see Tumu et al. (2023). As shown in Figure 7, different racing lines were utilized including the center, right, left, and optimal racing line for the Spielberg track. ... In total, the Dall consists of 34686 train instances, 4336 validation instances, and 4336 test instances. ... In this case study, the training data was obtained by randomly initializing the BG value in the range [120, 190], and simulating the patient for 720 minutes. The controller was executed at 5 minutes intervals. |
| Hardware Specification | No | The paper does not describe the specific hardware used to run its experiments, such as GPU or CPU models. It mentions the computational efficiency of methods, but not its own compute setup. |
| Software Dependencies | No | We implement and train a Resnet-20 Bayesian Neural Network model inside the library Bayesian-torch (Krishnan et al., 2019). ... The HDR of a well-known distribution (for instance, a Normal) can be routinely obtained in R, e.g. using package HDInterval (Juat et al., 2022). ... We implemented a simple model predictive controller, using an off-the-shelf implementation of the covariance matrix adaptation evolution strategy (CMA-ES). The paper mentions software libraries and tools but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We use a learning-rate of 0.001, batch-size of 128, and train the networks using Mean-Field Variational Inference for 200 epochs. The inference is carried out by performing multiple forward passes through parameters drawn from the posterior distribution. We used 20 Monte-Carlo samples in the experiments. ... The Bayesian Neural Network models have 2 hidden layers, with 10 neurons each, for the case study with 4 different seeds. ... The neural networks were trained for 200 time steps, with a learning rate of 0.001, and batch size of 128 using Mean-Field Variational Inference. The training dataset consisted of 28400 training samples, recorded without meals. ... The model predictive control planning horizon was k = 5, with a fixed seed for the randomized solver. |
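The inference procedure quoted above (multiple forward passes through weights drawn from the posterior, 20 Monte-Carlo samples) is typically paired with a split of predictive uncertainty into aleatoric and epistemic parts, as in the OOD evaluation of Table 5. A standard way to do this for a classifier is the entropy / mutual-information decomposition. The sketch below is a minimal pure-Python illustration of that generic decomposition under the paper's 20-sample setting; it is an assumption on our part, not necessarily the exact uncertainty measures CBDL uses.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a class-probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def uncertainty_decomposition(mc_probs):
    """Split predictive uncertainty from Monte-Carlo forward passes.

    mc_probs: one class-probability vector per posterior weight draw
              (e.g. 20 vectors for 20 Monte-Carlo samples).
    Returns (total, aleatoric, epistemic), where
      total     = entropy of the averaged predictive distribution,
      aleatoric = average entropy of the per-draw distributions,
      epistemic = total - aleatoric (the mutual information).
    """
    n = len(mc_probs)
    k = len(mc_probs[0])
    mean_p = [sum(p[c] for p in mc_probs) / n for c in range(k)]
    total = entropy(mean_p)
    aleatoric = sum(entropy(p) for p in mc_probs) / n
    return total, aleatoric, total - aleatoric
```

Disagreement between draws inflates the epistemic term: two confident but contradictory draws yield maximal total entropy with low aleatoric entropy, while 20 identical draws yield zero epistemic uncertainty.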
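The Software Dependencies row notes that the HDR of a well-known distribution can be obtained in R with the HDInterval package. For Monte-Carlo output, a sample-based analogue is the shortest interval containing a given probability mass. The sketch below is a hypothetical stand-in for that computation, not the paper's implementation.

```python
import math

def hdr_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the empirical sample,
    a sample-based analogue of a highest density region (HDR)
    for a unimodal distribution."""
    xs = sorted(samples)
    n = len(xs)
    m = max(1, math.ceil(mass * n))  # points the interval must cover
    best = (xs[0], xs[m - 1])
    # Slide a window of m consecutive order statistics and keep
    # the narrowest one.
    for i in range(n - m + 1):
        lo, hi = xs[i], xs[i + m - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best
```

For a unimodal sample this matches the HDInterval notion of the highest-density credible interval; multimodal posteriors would need a union of intervals instead.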