Enhancing Uncertainty Estimation and Interpretability with Bayesian Non-negative Decision Layer
Authors: Xinyue Hu, Zhibin Duan, Bo Chen, Mingyuan Zhou
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that with enhanced disentanglement capabilities, BNDL not only improves the model's accuracy but also provides reliable uncertainty estimation and improved interpretability. (...) 5 EXPERIMENTS |
| Researcher Affiliation | Academia | 1 National Key Laboratory of Radar Signal Processing, Xidian University, Xi'an, 710071, China. 2 School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi, China. 3 McCombs School of Business, The University of Texas at Austin, Austin, TX 78712 |
| Pseudocode | No | The paper describes the variational inference network and process using equations, but no explicit pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is available at https://github.com/XYHu122/BNDL. (...) The novel methods introduced in this paper are accompanied by detailed descriptions (Sec. 3), and their implementations are provided at https://github.com/XYHu122/BNDL. |
| Open Datasets | Yes | We assessed the effectiveness of BNDL across multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet-1k. (...) The CIFAR-10, CIFAR-100, and ImageNet-1k datasets we used are all publicly available standard datasets. As for Places-10, it is a subset of Places365 (Zhou et al., 2017) containing the classes "airport terminal", "boat deck", "bridge", "butcher's shop", "church outdoor", "hotel room", "laundromat", "river", "ski slope" and "volcano". |
| Dataset Splits | No | The paper mentions using "publicly available standard datasets" but does not explicitly state the dataset splits (e.g., train/validation/test percentages or sample counts) or cite a specific source for predefined splits. |
| Hardware Specification | Yes | All experiments are conducted on Linux servers equipped with 32 AMD EPYC 7302 16-Core Processors and 2 NVIDIA 3090 GPUs. |
| Software Dependencies | Yes | Models are implemented in PyTorch version 1.12.1, scikit-learn version 1.0.2 and Python 3.7. |
| Experiment Setup | Yes | Training from scratch setup: For ResNet-18 on CIFAR-10 and CIFAR-100, we set the batch size to 128, learning rate to 0.1, training epochs to 150, and weight decay to 5e-4. For ResNet-50 on ImageNet-1k, we set the batch size to 256, weight decay to 1e-4, epochs to 200, and learning rate to 0.1. (...) Fine-tuning setup: For Places-10, we set the learning rate to 0.1, batch size to 128, and epochs to 100. For ImageNet-1k, CIFAR-10, and CIFAR-100, we set the learning rate to 0.001 and epochs to 200. |
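The reported hyperparameters from the experiment-setup row can be collected into a single configuration table, which makes the settings easy to check against the released code. This is a minimal sketch: the config keys and grouping are our own naming, not from the paper, and only the values explicitly quoted above are filled in.

```python
# Hedged restatement of the hyperparameters quoted in the report.
# Keys and structure are illustrative; values come directly from the
# "Experiment Setup" row. Fields not stated in the report (e.g. batch
# size for some fine-tuning runs) are deliberately left out.
TRAIN_CONFIGS = {
    # Training from scratch
    "resnet18_cifar10_cifar100": {
        "batch_size": 128, "lr": 0.1, "epochs": 150, "weight_decay": 5e-4,
    },
    "resnet50_imagenet1k": {
        "batch_size": 256, "lr": 0.1, "epochs": 200, "weight_decay": 1e-4,
    },
    # Fine-tuning
    "finetune_places10": {
        "batch_size": 128, "lr": 0.1, "epochs": 100,
    },
    "finetune_imagenet1k_cifar10_cifar100": {
        "lr": 0.001, "epochs": 200,
    },
}

def get_config(name):
    """Look up a reported training configuration by name."""
    return TRAIN_CONFIGS[name]
```

A dictionary like this can then be unpacked into an optimizer or training loop (e.g. `torch.optim.SGD(params, lr=cfg["lr"], weight_decay=cfg["weight_decay"])`), though the paper's exact optimizer and schedule would need to be confirmed from the released code at https://github.com/XYHu122/BNDL.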