Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning

Authors: Daniel C. Castro, Jeremy Tan, Bernhard Kainz, Ender Konukoglu, Ben Glocker

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To address this issue we introduce Morpho-MNIST, a framework that aims to answer: to what extent has my model learned to represent specific factors of variation in the data? We extend the popular MNIST dataset by adding a morphometric analysis enabling quantitative comparison of trained models, identification of the roles of latent variables, and characterisation of sample diversity. We further propose a set of quantifiable perturbations to assess the performance of unsupervised and supervised methods on challenging tasks such as outlier detection and domain adaptation."
Researcher Affiliation | Academia | Daniel C. Castro (EMAIL), Jeremy Tan (EMAIL), Bernhard Kainz (EMAIL), Biomedical Image Analysis Group, Imperial College London, London SW7 2AZ, United Kingdom; Ender Konukoglu (EMAIL), Computer Vision Laboratory, ETH Zürich, 8092 Zürich, Switzerland; Ben Glocker (EMAIL), Biomedical Image Analysis Group, Imperial College London, London SW7 2AZ, United Kingdom
Pseudocode | No | The paper describes a "Processing Pipeline" in Section 2.1 as a sequence of numbered steps, but does not present it as formal pseudocode or an algorithm block; the steps are given in natural language.
Open Source Code | Yes | "Data and code are available at https://github.com/dccastro/Morpho-MNIST."
Open Datasets | Yes | "We extend the popular MNIST dataset by adding a morphometric analysis... Data and code are available at https://github.com/dccastro/Morpho-MNIST. ...The MNIST (modified NIST) dataset (LeCun et al., 1998) was constructed from handwritten digits in NIST Special Databases 1 and 3, now released as Special Database 19 (Grother and Hanaoka, 2016)."
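Since the released data extend MNIST, the image and label files are presumably distributed in MNIST's IDX binary format (this is an assumption about the release, not something the quoted text states). A minimal, self-contained reader for that format looks like:

```python
import gzip
import struct

import numpy as np


def load_idx(path):
    """Read an IDX file (the binary format used by MNIST), plain or gzipped.

    Header layout: two zero bytes, a dtype code (0x08 = uint8 for both
    MNIST images and labels), the number of dimensions, then one
    big-endian uint32 per dimension.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        zeros, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zeros != 0 or dtype_code != 0x08:
            raise ValueError(f"{path} is not a uint8 IDX file")
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)
```

For example, `load_idx("train-images-idx3-ubyte.gz")` would return a `(60000, 28, 28)` uint8 array for standard MNIST-style releases; the Morpho-MNIST repository also provides its own I/O utilities.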
Dataset Splits | Yes | "Figure A.1: Distribution of morphological attributes for plain MNIST digits. Top: training set; bottom: test set. ...We take the global dataset, restrict the training data to digits whose thickness is below 3 px, and evaluate the model's response to test digits across the entire thickness range..."
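The quoted domain-shift split is straightforward to reproduce once per-image morphometric attributes are available. A minimal sketch, assuming `thickness` is a per-image attribute array in pixels (the array names and threshold default are ours; only the "below 3 px" rule comes from the paper):

```python
import numpy as np


def thickness_restricted_split(train_images, train_thickness, threshold_px=3.0):
    """Restrict the training set to thin digits, as in the quoted setup.

    Training uses only digits with thickness below `threshold_px`;
    evaluation is then run on test digits across the entire thickness
    range, so the test set is left untouched.
    """
    mask = train_thickness < threshold_px
    return train_images[mask], mask
```

Calling it with the full training images and their measured thicknesses yields the restricted training subset plus the boolean mask, which is also handy for subsetting labels or other attribute columns consistently.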
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions scikit-image and "scikit-image defaults (van der Walt et al., 2014)" but does not provide version numbers for it or any other software library used.
Experiment Setup | Yes | "We train a β-VAE with β = 4, as in Higgins et al.'s (2017) experiments on small binary images, and a vanilla GAN with non-saturating loss, both with 64-dimensional latent space. ...Models were trained for 20 epochs using 64 images per batch, with no hyperparameter tuning. ...k-nearest-neighbours (kNN) using k = 5 neighbours and ℓ1 distance weighting, a support vector machine (SVM) with polynomial kernel and penalty parameter C = 100, a multi-layer perceptron (MLP) with 784-200-200-L architecture (L: number of outputs), and a LeNet-5 (LeCun et al., 1998) convolutional neural network (CNN)."
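The supervised baselines quoted above can be instantiated directly in scikit-learn. This is a sketch, not the authors' code: only the hyperparameters named in the quote (k = 5, ℓ1 metric with distance weighting, polynomial-kernel SVM with C = 100, two 200-unit hidden layers) are taken from the paper; everything else is a library default, and LeNet-5 is omitted since it needs a deep-learning framework.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def make_baselines():
    """Classical baselines matching the paper's quoted hyperparameters."""
    return {
        # k = 5 neighbours, l1 (Manhattan) metric, distance-based weighting
        "kNN": KNeighborsClassifier(
            n_neighbors=5, metric="manhattan", weights="distance"
        ),
        # polynomial kernel, penalty parameter C = 100
        "SVM": SVC(kernel="poly", C=100),
        # 784 -> 200 -> 200 -> L fully connected network; input and output
        # sizes are inferred by scikit-learn from the training data
        "MLP": MLPClassifier(hidden_layer_sizes=(200, 200), max_iter=50),
    }
```

Each model exposes the usual `fit`/`predict` interface, so all three can be trained and evaluated in a single loop over `make_baselines().items()`.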