BARNN: A Bayesian Autoregressive and Recurrent Neural Network

Authors: Dario Coscia, Max Welling, Nicola Demo, Gianluigi Rozza

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on PDE modelling and molecular generation demonstrate that BARNN not only achieves comparable or superior accuracy compared to existing methods, but also excels in uncertainty quantification and modelling long-range dependencies.
Researcher Affiliation | Collaboration | Dario Coscia (1,2), Max Welling (2), Nicola Demo (3), Gianluigi Rozza (1). (1) Mathematics Area, International School of Advanced Studies, Italy; (2) Informatics Institute, University of Amsterdam, The Netherlands; (3) FAST Computing Srl, Italy.
Pseudocode | Yes | Section D.1 (Pseudocodes) lists: Algorithm 1, the local reparametrization trick (given a minibatch of activations H, for each layer l = 1, ..., L use the local reparametrization trick to compute the linear layer output); Algorithm 2, Training Bayesian Neural PDE Solvers; Algorithm 3, Training Bayesian Recurrent Neural Networks.
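Algorithm 1's core idea, the local reparametrization trick (Kingma et al., 2015), can be sketched for a single Bayesian linear layer as follows. This is a minimal pure-Python illustration of the general technique, not the paper's implementation; the function name `local_reparam_linear` and its argument layout are assumptions.

```python
import math
import random

def local_reparam_linear(h, w_mu, w_var, rng=random):
    """Local reparametrization trick for one Bayesian linear layer.

    Instead of sampling a weight matrix and multiplying, sample the
    layer's pre-activations directly from their induced Gaussian:
    under a factorized Gaussian weight posterior, the j-th output has
    mean sum_i h[i] * w_mu[i][j] and variance sum_i h[i]^2 * w_var[i][j].

    h: input activations (one sample), length in_features.
    w_mu, w_var: per-weight posterior mean/variance, [in][out].
    """
    out_features = len(w_mu[0])
    out = []
    for j in range(out_features):
        # Mean and variance of the j-th pre-activation.
        mean = sum(h[i] * w_mu[i][j] for i in range(len(h)))
        var = sum((h[i] ** 2) * w_var[i][j] for i in range(len(h)))
        # Reparametrized sample: mean + sqrt(var) * eps, eps ~ N(0, 1).
        out.append(mean + math.sqrt(var) * rng.gauss(0.0, 1.0))
    return out
```

With the posterior variances set to zero the layer reduces to an ordinary deterministic linear map, which is a convenient sanity check; in practice this is done per minibatch on GPU tensors rather than per sample in Python.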
Open Source Code | Yes | We make our code publicly available at https://github.com/dario-coscia/barnn.
Open Datasets | Yes | We use the dataset provided in (Özçelik et al., 2024), which contains a collection of 1.9 M SMILES extracted from the ChEMBL dataset (Zdrazil et al., 2024).
Dataset Splits | Yes | We generate 1024 trajectories for training and 100 for testing. All models are trained using the same architecture, a 2-layer neural network with 64 units and ReLU activation. Table 5 (PDE parameters setup): the discretization in time and space is indicated by nt and nx respectively, tmax represents the simulation physical time, and L is the domain length. Train Burgers: 0.1, 0.25, 14, 2π ... Test Burgers: 0.1, 0.25, 32, 2π.
Hardware Specification | Yes | Computations were performed on a single Quadro RTX 4000 GPU with 8 GB of memory and required approximately one day to train all models. The training was distributed across four Tesla P100 GPUs with 16 GB of memory each, and we used a batch size of 256 for memory requirements.
Software Dependencies | No | We perform the PDE experiments using the PINA software (Coscia et al., 2023), a Python library based on PyTorch (Paszke et al., 2019) and PyTorch Lightning (Falcon & the PyTorch Lightning team, 2019) for scientific machine learning, which includes neural PDE solvers, physics-informed networks, and more. For the molecule experiments we used PyTorch Lightning and RDKit for post-processing analysis of the molecules. Explanation: The paper mentions software such as PyTorch, PyTorch Lightning, PINA, and RDKit but does not provide specific version numbers for these libraries, only citations to their foundational papers.
Experiment Setup | Yes | All models were trained for 7000 epochs using Adam with a learning rate of 5e-4 and a weight decay of 1e-8 for regularization. ... All models were trained for 12 epochs using Adam with a learning rate of 2e-4 and a weight decay of 1e-8 for regularization. ... We used three LSTM-type layers with a hidden dimension of 1024, accounting for a 25 M-parameter model overall. For the dropout LSTM we used a dropout coefficient of 0.2.
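As a concrete reading of the reported optimizer settings, here is a minimal pure-Python sketch of a single Adam update with the quoted learning rate (5e-4) and weight decay (1e-8). The function `adam_step` and its single-scalar-parameter layout are illustrative assumptions, not the authors' training code, which would use `torch.optim.Adam` over full parameter tensors.

```python
import math

def adam_step(p, g, m, v, t, lr=5e-4, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=1e-8):
    """One Adam update for a single scalar parameter p with gradient g.

    m, v are the running first/second-moment estimates and t is the
    1-based step count.  weight_decay is applied as a classic L2 penalty
    added to the gradient (PyTorch Adam's default behaviour), not as
    AdamW's decoupled decay.  Returns (new_p, new_m, new_v).
    """
    g = g + weight_decay * p              # L2 penalty folded into the gradient
    m = beta1 * m + (1 - beta1) * g       # first-moment EMA
    v = beta2 * v + (1 - beta2) * g ** 2  # second-moment EMA
    m_hat = m / (1 - beta1 ** t)          # bias correction
    v_hat = v / (1 - beta2 ** t)
    return p - lr * m_hat / (math.sqrt(v_hat) + eps), m, v
```

On the very first step with unit gradient, the bias-corrected update is essentially -lr, i.e. about -5e-4 with the reported setting, which is a quick way to check the hyperparameters are wired in as intended.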