reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Improving Reproducibility in AI Research: Four Mechanisms Adopted by JAIR

Authors: Odd Erik Gundersen, Malte Helmert, Holger Hoos

JAIR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	This article involving the current editor-in-chief (Malte Helmert) and editor-in-chief at the time it was first written (Holger Hoos) has an editorial character, as it concerns the running of the journal itself and is explicitly not a research article.
Researcher Affiliation	Academia	Odd Erik Gundersen (Norwegian University of Science and Technology); Malte Helmert (University of Basel); Holger Hoos (RWTH Aachen University & Universiteit Leiden)
Pseudocode	No	The paper discusses the importance of pseudocode descriptions in other research articles, as seen in Appendix A: "4. Conceptual outlines and/or pseudo-code descriptions of the AI methods introduced in this work are provided, and important implementation details are discussed." However, the paper itself, being a position paper, does not present any pseudocode for its own proposed mechanisms.
Open Source Code	No	The paper discusses the sharing of source code as a mechanism for improving reproducibility in AI research, stating: "Open source: All code used for experiments in the paper is shared in a public repository..." It also mentions in Appendix A: "1. All source code required for conducting experiments is included in an online appendix or will be made publicly available upon publication of the paper." However, this paper does not present its own methodology with associated source code, as it is an editorial/position piece.
Open Datasets	No	The paper describes criteria for open data badges for other research articles, stating: "Open data: All data used in the paper is shared in a public repository, following best practices for long-term accessibility." It also includes a section in Appendix A for "Articles using data sets" with details about data availability. However, this paper itself does not use or provide any specific dataset.
Dataset Splits	No	The paper discusses dataset splits in the context of reproducibility for other research, noting in Appendix A: "7. All methods used for preprocessing, augmenting, batching or splitting data sets (e.g., in the context of hold-out or cross-validation) are described in detail." However, this paper does not report on experiments or use datasets, therefore no dataset splits are provided.
Hardware Specification	No	The paper mentions the description of execution environments as part of reproducibility checklists for computational experiments in Appendix A: "8. The execution environment for experiments, i.e., the computing infrastructure (hardware and software) used for running them, is described, including GPU/CPU makes and models; amount of memory (cache and RAM); make and version of operating system; names and versions of relevant software libraries and frameworks." However, this paper does not conduct experiments and therefore does not specify any hardware.
Software Dependencies	No	The paper mentions the description of execution environments as part of reproducibility checklists for computational experiments in Appendix A: "8. The execution environment for experiments, i.e., the computing infrastructure (hardware and software) used for running them, is described, including GPU/CPU makes and models; amount of memory (cache and RAM); make and version of operating system; names and versions of relevant software libraries and frameworks." However, this paper does not conduct experiments and therefore does not list software dependencies with version numbers.
Experiment Setup	No	The paper refers to the description of parameters and hyperparameters for experiments in Appendix A: "13. All (hyper-) parameter settings for the algorithms/methods used in experiments have been reported, along with the rationale or method for determining them." And "14. The number and range of (hyper-) parameter settings explored prior to conducting final experiments have been indicated, along with the effort spent on (hyper-) parameter optimisation." However, this paper does not describe any experiments of its own, so no specific experimental setup details are provided.
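
Checklist item 8 quoted in the Hardware Specification and Software Dependencies rows asks authors to report the operating system, processor and library versions used for experiments. The following is a minimal sketch of how part of that information could be collected automatically; it is not taken from the paper. The function name `describe_environment` and the example package list are hypothetical, and only Python standard-library calls are used. GPU models and memory sizes are not available from the standard library and would have to be recorded separately.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Collect basic facts about the current execution environment.

    `packages` is a list of distribution names whose installed versions
    should be recorded (an illustrative choice, not a prescribed list).
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "python": sys.version.split()[0],
        "packages": versions,
    }


if __name__ == "__main__":
    # Example: record the environment alongside experiment results.
    print(json.dumps(describe_environment(["numpy"]), indent=2))
```

Saving this output next to each experiment's results gives a machine-readable record of the software side of the execution environment. Packages that are not installed are reported as `null` rather than raising an error.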