Risk Measures and Upper Probabilities: Coherence and Stratification

Authors: Christian Fröhlich, Robert C. Williamson

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically demonstrate how this new approach to uncertainty helps tackling practical machine learning problems. ... For our experiments (Section 6), we suggest two ways to evaluate the tail risk of a loss distribution."
Researcher Affiliation | Academia | Christian Fröhlich (EMAIL), Robert C. Williamson (EMAIL), University of Tübingen and Tübingen AI Center, Tübingen, Germany
Pseudocode | No | The paper describes methods mathematically and in prose but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about providing source code or links to code repositories.
Open Datasets | Yes | "We use the MNIST [26] and the adult [27] data set." Footnotes: [26] http://yann.lecun.com/exdb/mnist/ [27] https://archive.ics.uci.edu/ml/datasets/Adult
Dataset Splits | Yes | "For MNIST, we use 6000 training images and test on the remaining 54,000 images. For adult we use 10,000 data points as the training set and the remaining 38,842 as the test set. ... We use 80% of the red wines (1279 examples) in the training set and correspondingly, 1279 examples of white wine. The test set consists of 320 red wines and 3619 white wines."
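The split sizes quoted above can be sketched as index partitions. Note this is a sketch under assumptions: the report does not say how the splits were drawn, so the random permutation here (and the use of placeholder index arrays instead of the real data) is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# MNIST: 6,000 training images, test on the remaining 54,000.
# (Random permutation is an assumption; the report only gives the sizes.)
n_mnist = 60_000
perm = rng.permutation(n_mnist)
mnist_train, mnist_test = perm[:6_000], perm[6_000:]

# adult: 10,000 training points, remaining 38,842 as the test set.
n_adult = 48_842
perm = rng.permutation(n_adult)
adult_train, adult_test = perm[:10_000], perm[10_000:]

print(len(mnist_test), len(adult_test))  # 54000 38842
```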
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory amounts) used for experiments.
Software Dependencies | No | "We use the pytorch library to implement our experiments and the Adam optimizer with a learning rate of 0.001. We initialize the matrix Vk with the classical PCA solution, obtained from sklearn.decomposition.PCA." While specific software libraries are mentioned, their version numbers are not provided.
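The PCA initialization mentioned in the quote can be sketched as below. The data matrix, its dimensions, and the orientation of Vk (columns as principal directions) are assumptions for illustration; only the use of sklearn.decomposition.PCA and the component counts come from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data matrix standing in for the real features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))

# The paper uses k = 50 components for MNIST and k = 5 for adult.
k = 5
pca = PCA(n_components=k).fit(X)

# Initialize V_k from the classical PCA solution; here each column is
# one principal direction (orientation is an assumption).
V_k = pca.components_.T  # shape (20, 5)
```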
Experiment Setup | Yes | "For MNIST we use k = 50 components and for adult k = 5. We train for 2000 epochs with a learning rate of 0.001. ... We pretrain for 2000 epochs using the expectation as the risk measure and the Adam optimizer with a learning rate of 0.01. Using this initialization, we then train for each risk measure for 5000 epochs with a learning rate of 0.001. For winequality we use a simple feedforward network (one hidden layer of 24 units followed by a nonlinear ReLU activation) with the ℓ1 loss. We train for 3000 epochs using the Adam optimizer with a learning rate of 0.01."
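The winequality setup quoted above can be sketched in PyTorch. The architecture (one hidden layer of 24 ReLU units), the ℓ1 loss, and Adam with lr = 0.01 come from the paper; the input dimension of 11, the synthetic data, and the shortened run of 100 epochs (the paper trains for 3000) are assumptions for a self-contained example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data; 11 input features is an assumption based on
# the standard wine-quality feature set, not stated in the report.
X = torch.randn(64, 11)
y = torch.randn(64, 1)

# Feedforward network: one hidden layer of 24 units with ReLU.
model = nn.Sequential(nn.Linear(11, 24), nn.ReLU(), nn.Linear(24, 1))
loss_fn = nn.L1Loss()  # the l1 loss from the paper
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(100):  # shortened from the paper's 3000 epochs
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```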