Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification

Authors: Ardhendu Behera, Zachary Wharton, Pradeep R P G Hewage, Asish Bera

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach using six state-of-the-art (SotA) backbone networks and eight benchmark datasets. Our method significantly outperforms the SotA approaches on six datasets and is very competitive with the remaining two.
Researcher Affiliation | Academia | Ardhendu Behera, Zachary Wharton, Pradeep R P G Hewage and Asish Bera, Department of Computer Science, Edge Hill University, St Helen Road, Lancashire, United Kingdom, L39 4QP. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the approach in text and figures, but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://ardhendubehera.github.io/cap/.
Open Datasets | Yes | We comprehensively evaluate our model on eight widely used benchmark FGVC datasets: Aircraft (Maji et al. 2013), Food-101 (Bossard, Guillaumin, and Gool 2014), Stanford Cars (Krause et al. 2013), Stanford Dogs (Khosla et al. 2011), Caltech Birds (CUB-200) (Wah et al. 2011), Oxford Flower (Nilsback and Zisserman 2008), Oxford-IIIT Pets (Parkhi et al. 2012), and NABirds (Van Horn et al. 2015).
Dataset Splits | Yes | Statistics of datasets and their train/test splits are shown in Table 1. We use the top-1 accuracy (%) for evaluation. Experimental settings: in all our experiments, we resize images to 256 × 256, apply the data augmentation techniques of random rotation (±15 degrees) and random scaling (1 ± 0.15), and then random cropping to select the final size of 224 × 224 from 256 × 256.
Hardware Specification | Yes | The model is trained for 150 epochs using an NVIDIA Titan V GPU (12 GB).
Software Dependencies | No | We use Keras+TensorFlow to implement our algorithm. The paper does not specify version numbers for these software dependencies.
Experiment Setup | Yes | In all our experiments, we resize images to 256 × 256, apply the data augmentation techniques of random rotation (±15 degrees) and random scaling (1 ± 0.15), and then random cropping to select the final size of 224 × 224 from 256 × 256. We set the cluster size to 32 in our learnable pooling approach. We apply the Stochastic Gradient Descent (SGD) optimizer to optimize the categorical cross-entropy loss function. The SGD is initialized with a momentum of 0.99 and an initial learning rate of 1e-4, which is multiplied by 0.1 after every 50 epochs. The model is trained for 150 epochs using an NVIDIA Titan V GPU (12 GB).
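The reported training recipe amounts to a step-decay learning-rate schedule plus a fixed augmentation pipeline. The sketch below is not the authors' released code; it is a minimal pure-Python illustration of the quoted hyperparameters (function and dictionary names are our own, chosen for clarity):

```python
def step_decay_lr(epoch, base_lr=1e-4, drop=0.1, epochs_per_drop=50):
    """Schedule quoted above: start at 1e-4 and multiply
    the learning rate by 0.1 after every 50 epochs."""
    return base_lr * (drop ** (epoch // epochs_per_drop))

# Quoted augmentation settings, expressed as an illustrative config dict
# (key names are our own, not from the released code).
augmentation = {
    "resize": (256, 256),        # initial resize of every image
    "random_rotation_deg": 15,   # rotation sampled in +/- 15 degrees
    "random_scale": 0.15,        # scaling sampled in 1 +/- 0.15
    "random_crop": (224, 224),   # final crop taken from the 256 x 256 image
}

# Distinct learning rates over the 150-epoch run: 1e-4, then 1e-5, then 1e-6.
rates = sorted({step_decay_lr(e) for e in range(150)}, reverse=True)
```

With this schedule the 150-epoch run splits into three equal phases, one per learning rate, matching the "multiplied by 0.1 after every 50 epochs" statement.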