Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning
Authors: Daniel C. Castro, Jeremy Tan, Bernhard Kainz, Ender Konukoglu, Ben Glocker
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To address this issue we introduce Morpho-MNIST, a framework that aims to answer: to what extent has my model learned to represent specific factors of variation in the data? We extend the popular MNIST dataset by adding a morphometric analysis enabling quantitative comparison of trained models, identification of the roles of latent variables, and characterisation of sample diversity. We further propose a set of quantifiable perturbations to assess the performance of unsupervised and supervised methods on challenging tasks such as outlier detection and domain adaptation. |
| Researcher Affiliation | Academia | Daniel C. Castro, Jeremy Tan, Bernhard Kainz (Biomedical Image Analysis Group, Imperial College London, London SW7 2AZ, United Kingdom); Ender Konukoglu (Computer Vision Laboratory, ETH Zürich, 8092 Zürich, Switzerland); Ben Glocker (Biomedical Image Analysis Group, Imperial College London, London SW7 2AZ, United Kingdom) |
| Pseudocode | No | The paper describes a "Processing Pipeline" in Section 2.1 with numbered steps but does not present it in a formal pseudocode or algorithm block; the steps are described in natural language. |
| Open Source Code | Yes | Data and code are available at https://github.com/dccastro/Morpho-MNIST. |
| Open Datasets | Yes | We extend the popular MNIST dataset by adding a morphometric analysis... Data and code are available at https://github.com/dccastro/Morpho-MNIST. ...The MNIST (modified NIST) dataset (LeCun et al., 1998) was constructed from handwritten digits in NIST Special Databases 1 and 3, now released as Special Database 19 (Grother and Hanaoka, 2016). |
| Dataset Splits | Yes | Figure A.1: Distribution of morphological attributes for plain MNIST digits. Top: training set; bottom: test set. ...We take the global dataset, restrict the training data to digits whose thickness is below 3 px, and evaluate the model's response to test digits across the entire thickness range... |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'scikit-image' and 'scikit-image defaults (van der Walt et al., 2014)' but does not provide specific version numbers for this or any other software library used. |
| Experiment Setup | Yes | We train a β-VAE with β = 4, as in Higgins et al. (2017)'s experiments on small binary images, and a vanilla GAN with non-saturating loss, both with 64-dimensional latent space. ...Models were trained for 20 epochs using 64 images per batch, with no hyperparameter tuning. ...k-nearest-neighbours (kNN) using k = 5 neighbours and ℓ1 distance weighting, a support vector machine (SVM) with polynomial kernel and penalty parameter C = 100, a multi-layer perceptron (MLP) with 784-200-200-L architecture (L: number of outputs), and a LeNet-5 (LeCun et al., 1998) convolutional neural network (CNN). |
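The classifier benchmark quoted in the Experiment Setup row can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: only the hyperparameters stated in the paper (k = 5 with ℓ1-metric distance weighting, polynomial-kernel SVM with C = 100, MLP hidden layers of 200 units each) are taken from the source; the synthetic stand-in data, `max_iter`, and random seed are assumptions for demonstration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

classifiers = {
    # p=1 selects the l1 (Manhattan) metric; weights="distance" matches
    # the paper's "l1 distance weighting".
    "kNN": KNeighborsClassifier(n_neighbors=5, weights="distance", p=1),
    "SVM": SVC(kernel="poly", C=100),
    # The 784-dim input and L-dim output are inferred from the data;
    # only the two 200-unit hidden layers are specified explicitly.
    "MLP": MLPClassifier(hidden_layer_sizes=(200, 200), max_iter=50),
}

# Tiny synthetic stand-in for 28x28 MNIST-like images, flattened to 784.
rng = np.random.RandomState(0)
X, y = rng.rand(40, 784), np.repeat([0, 1], 20)
preds = {name: clf.fit(X, y).predict(X[:5]) for name, clf in classifiers.items()}
```

Swapping in the real Morpho-MNIST arrays for `X` and `y` reproduces the grid of models the table describes; the LeNet-5 CNN is omitted here since it requires a deep-learning framework beyond scikit-learn.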