ProtoCaps: A Fast and Non-Iterative Capsule Network Routing Method
Authors: Miles Everett, Mingjun Zhong, Georgios Leontidis
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we conduct benchmark tests on five diverse datasets, with most being standard in Capsule Network research. We commence our evaluations with the MNIST dataset (LeCun et al., 2010), considered the foundational dataset for image classification. This initial phase allows us to ascertain our network's base performance, prior to engaging in more computationally demanding tests. Subsequently, we employ the more challenging Fashion MNIST (Xiao et al., 2017) and CIFAR10 datasets (Krizhevsky et al., 2009) to test our network's proficiency in classifying more complex images. |
| Researcher Affiliation | Academia | Miles Everett (Department of Computing Science, University of Aberdeen, UK); Mingjun Zhong (Department of Computing Science, University of Aberdeen, UK); Georgios Leontidis (Interdisciplinary Centre for Data and AI, Department of Computing Science, University of Aberdeen, UK) |
| Pseudocode | No | Our routing algorithm uses a non-iterative approach to reduce computational overhead. In the following, we describe how the lower-level Capsules are mapped to the upper-level Capsules using our proposed prototype routing mechanism which is visualised in Figure 2. Each ProtoCaps routing layer functions by projecting the pose matrices of the lower-level Capsules into a shared subspace denoted as S. The projection is executed by using the following multi-layer perceptron (MLP_proj): pose_i^proj = MLP_proj(pose_i) (Eq. 5) |
| Open Source Code | Yes | Code is available at https://github.com/mileseverett/ProtoCaps. |
| Open Datasets | Yes | We commence our evaluations with the MNIST dataset (LeCun et al., 2010)... Subsequently, we employ the more challenging Fashion MNIST (Xiao et al., 2017) and CIFAR10 datasets (Krizhevsky et al., 2009)... we utilize the Small NORB dataset (LeCun et al., 2004)... we make use of the Imagewoof dataset (Howard, 2019). As a subset of ImageNet (Deng et al., 2009)... |
| Dataset Splits | Yes | We commence our evaluations with the MNIST dataset (LeCun et al., 2010), considered the foundational dataset for image classification. [...] we employ the more challenging Fashion MNIST (Xiao et al., 2017) and CIFAR10 datasets (Krizhevsky et al., 2009) to test our network's proficiency in classifying more complex images. [...] we utilize the Small NORB dataset (LeCun et al., 2004). [...] we make use of the Imagewoof dataset (Howard, 2019). |
| Hardware Specification | No | Explanation: The paper mentions GPU memory limitations for other methods ('Unfortunately the iterative methods require too much GPU memory (>80GB) in order to process the Imagewoof dataset at comparable sizes') but does not provide specific hardware details (GPU/CPU models, memory amounts, etc.) used for its own experiments. |
| Software Dependencies | No | Explanation: The paper mentions the 'FVCore library (FAIR, 2023)' and 'Timm library (Wightman, 2019)' but does not provide specific version numbers for these software libraries, only the publication years of their respective papers. |
| Experiment Setup | Yes | To train our networks, we use the cross entropy loss function and Adam optimizer with weight decay. We train for 350 epochs and use a fixed learning rate scheduler which decreases the learning rate at 150 and 250 epochs, with a batch size of 64. This is the same as in the SRCaps paper (Hahn et al., 2019). For Small NORB, we instead train for 100 epochs to avoid overfitting on this simple dataset. |
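The routing description quoted above (Eq. 5) maps each lower-level capsule pose into a shared subspace S with a small MLP. The sketch below illustrates that projection step only; it is not the authors' implementation, and all names (`mlp_proj`, `pose_dim`, `subspace_dim`) and the two-layer ReLU architecture are illustrative assumptions.

```python
# Minimal sketch of the ProtoCaps projection step, Eq. 5:
#   pose_i^proj = MLP_proj(pose_i)
# A two-layer MLP maps a capsule pose vector into the shared subspace S.
# Dimensions and layer count are assumptions for illustration.
import random

random.seed(0)

def linear(x, w, b):
    """Affine map y = Wx + b over plain Python lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def mlp_proj(pose, w1, b1, w2, b2):
    """Project one lower-level capsule pose into the shared subspace S."""
    return linear(relu(linear(pose, w1, b1)), w2, b2)

# Illustrative sizes: 16-d pose, 32-d hidden layer, 8-d shared subspace.
pose_dim, hidden_dim, subspace_dim = 16, 32, 8
w1 = [[random.gauss(0, 0.1) for _ in range(pose_dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [[random.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(subspace_dim)]
b2 = [0.0] * subspace_dim

pose = [random.gauss(0, 1) for _ in range(pose_dim)]
pose_proj = mlp_proj(pose, w1, b1, w2, b2)  # lives in the shared subspace S
```

Because every lower-level capsule is projected into the same subspace, upper-level capsules can be routed by comparing projections against prototypes without iterative agreement steps, which is the computational saving the paper claims.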
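The experiment-setup cell describes a fixed step schedule: the learning rate is decreased at epochs 150 and 250 over 350 epochs. A minimal sketch of such a schedule is below; the base learning rate and the decay factor (0.1) are assumptions, as the excerpt does not state them.

```python
# Hedged sketch of a fixed (piecewise-constant) learning rate schedule
# with drops at epochs 150 and 250, as described in the setup.
# base_lr and gamma are assumed values, not taken from the paper.
def learning_rate(epoch, base_lr=1e-3, milestones=(150, 250), gamma=0.1):
    """Return the LR for a given epoch: multiplied by gamma at each milestone."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Epochs 0-149 use base_lr; 150-249 use base_lr * gamma; 250+ use base_lr * gamma^2.
schedule = [learning_rate(e) for e in range(350)]
```

This mirrors the behaviour of PyTorch's `MultiStepLR` scheduler, which is a common way to implement such a schedule, though the excerpt does not say which framework the authors used.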