reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Solving Probabilistic Linear Diophantine Equations

Authors: Patrick Kreitzberg, Oliver Serang

JMLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Table 2: Table comparing the runtimes of untrimmed vs. trimmed p-convolution trees. The same problem, Y = X1 + X2 + ... + Xm with p = 1, was used for both trees, where X1, ..., Xm ∈ {0, 1} and Y ∈ {0, ..., m}. The times reported are to get posteriors on all variables. Table 3: Tables of error and time analysis for Evergreen versus specialized methods. All methods used n = 2 and varied with the number of the input variables, m. Times are averaged over ten iterations.
Researcher Affiliation	Academia	Patrick Kreitzberg EMAIL Department of Mathematics University of Montana Missoula, MT 59812-0003, USA Oliver Serang EMAIL Department of Computer Science University of Montana Missoula, MT 59812-0003, USA
Pseudocode	No	The paper describes the methods in detailed prose, such as in sections 2.1 Trimmed p-convolution Trees, 2.4 Generalizing the Noisy-or for Use in Trimmed p-convolution Trees, and 2.5 Underflow/overflow Trimming, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The C++11 code for the Evergreen Forest engine, its modules, demos, and utilities for visualizing graphs in Python are freely available under an MIT software license and can be downloaded at https://bitbucket.org/orserang/evergreenforest.
Open Datasets	Yes	The 18mix data set is a mixture of eighteen proteins from several different species: bovine, E. coli, B. licheniformis, rabbit, horse, and chicken (Klimek et al., 2008). The IPRG data set contains 5,592 E. coli proteins (Lee et al., 2018). The yeast data set has 3,443 yeast proteins (Ramakrishnan et al., 2009).
Dataset Splits	No	The paper mentions using specific datasets such as '18mix data set', 'IPRG data set', and 'yeast data set' for protein inference experiments. However, it does not provide details on how these datasets were split into training, validation, or test sets.
Hardware Specification	Yes	The computer used had dual AMD EPYC 7351 16-core processors with 256 GB of RAM.
Software Dependencies	Yes	All programs for benchmarking results are written in C++ and compiled with g++ version 7.4.0 using compiler options -std=c++11 -O3 -march=native -mtune=native. p-convolution tree runtimes were calculated using the Evergreen Forest engine.
Experiment Setup	Yes	Fido parameters (IPRG / 18mix / Yeast data sets): α = 0.09017 / 0.01818 / 0.02439; β = 0.1459 / 0.2868 / 0.1622; γ = 0.1854 / 0.005026 / 0.003104. For this reason, we set the convergence threshold to zero so that we could measure the time taken for one million iterations, instead of measuring how long until convergence. α, β, and γ were found using a golden-section search (Kiefer, 1953).
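
The Experiment Setup entry quotes the paper as finding α, β, and γ with a golden-section search (Kiefer, 1953). For readers unfamiliar with the technique, the following is a minimal one-dimensional sketch in Python. It is not the authors' code: the function name `golden_section_minimize`, the tolerance, and the placeholder objective are all assumptions made for illustration, and the excerpt does not say how the three parameters were searched jointly or what objective was minimized.

```python
import math

# 1/phi, the ratio by which the bracket shrinks each iteration (about 0.618).
INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0


def golden_section_minimize(f, lo, hi, tol=1e-8, max_iter=200):
    """Approximate the minimizer of a unimodal function f on [lo, hi].

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Two interior probe points placed symmetrically inside the bracket.
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Placeholder objective (an assumption): a parabola with its minimum at 0.3.
    best = golden_section_minimize(lambda x: (x - 0.3) ** 2, 0.0, 1.0)
    print(best)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation is needed per step. This matters when a single evaluation is expensive, as a full inference run presumably would be.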