MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments
Authors: Spyros Gidaris, Andrei Bursuc, Oriane Siméoni, Antonín Vobecký, Nikos Komodakis, Matthieu Cord, Patrick Perez
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments. Setup. We evaluate our MOCA method by training ViT-B/16 models on the ImageNet-1k (Russakovsky et al., 2015) dataset. ... We evaluate the learned representations on the k-NN ImageNet classification task. ... In Tab. 5a, we compare our method MOCA against other self-supervised methods... We further evaluate our method on the Cityscapes (Cordts et al., 2016) semantic segmentation dataset... We present results on COCO detection and instance segmentation in Tab. 7. |
| Researcher Affiliation | Collaboration | Spyros Gidaris1, Andrei Bursuc1, Oriane Siméoni1, Antonín Vobecký1,2,3, Nikos Komodakis4,5,6, Matthieu Cord1, Patrick Pérez1. 1Valeo.ai 2Czech Institute of Informatics, Robotics and Cybernetics at the Czech Technical University in Prague 3Czech Technical University in Prague, Faculty of Electrical Engineering 4University of Crete 5IACM-Forth 6Archimedes/Athena RC. Correspondence: EMAIL |
| Pseudocode | Yes | C.4 Image augmentations pseudo-code. Here we provide PyTorch pseudo-code for the image augmentations used in MOCA for generating the two unmasked random views x1 and x2. import torchvision.transforms as T normalize = T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)) aug_view1 = T.Compose([...]) aug_view2 = T.Compose([...]) |
| Open Source Code | Yes | We provide the implementation code at https://github.com/valeoai/MOCA. |
| Open Datasets | Yes | We evaluate our MOCA method by training ViT-B/16 models on the ImageNet-1k (Russakovsky et al., 2015) dataset. ... We further evaluate our method on the Cityscapes (Cordts et al., 2016) semantic segmentation dataset... We use the COCO 2017 set consisting of 118K training images and 5k validation. |
| Dataset Splits | Yes | We train using the full Cityscapes training set of 2975 images as well as 100 or 374 training images, representing 1/30 and 1/8 of the full training set. For these 100 and 374 low-shot settings, we use three different splits of 100 or 374 training images respectively following the protocol of French et al. (2020) and report the average mIoU performance over the three splits. ... We use the COCO 2017 set consisting of 118K training images and 5k validation. ... Low-shot ImageNet-1k classification. Here we adopt the low-shot evaluation protocol of MSN (Assran et al., 2022) and use as few as 1, 2, or 5 training images per class as well as using 1% of the ImageNet-1k's training data, which corresponds to 13 images per class. |
| Hardware Specification | Yes | The batch size is 2048 split over 8 A100 GPUs. ... Time and Memory: per-epoch training time and GPU memory footprint measured with a single 8-A100 node and batch size 2048. |
| Software Dependencies | No | The PyTorch pseudo-code for this augmentation strategy is provided in Appendix C.4. ... For the logistic regression, we use the cyanure package (Mairal, 2019). (Does not provide specific version numbers for PyTorch or cyanure.) |
| Experiment Setup | Yes | Setup. We evaluate our MOCA method by training ViT-B/16 models on the ImageNet-1k (Russakovsky et al., 2015) dataset. We use the AdamW optimizer (Loshchilov & Hutter, 2019) with β1 = 0.9, β2 = 0.999 and weight decay 0.05. The batch size is 2048 split over 8 A100 GPUs. For the learning rate lr, we use a linear warm-up from 0 to its peak value for 30 epochs and then decrease it over the remaining epochs with a cosine annealing schedule. The peak lr is 1.5 × 10⁻⁴ and the number of training epochs is 100 or 200. More implementation details are provided in the appendix. In Tab. 11a we provide the implementation details for the ViT-B/16-based MOCA model that we used for producing the results of Sec. 4.4 in the main paper and in Tab. 11b the optimization setting for its training. |
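The learning-rate schedule quoted in the Experiment Setup row (linear warm-up from 0 to a peak of 1.5 × 10⁻⁴ over 30 epochs, then cosine annealing over the remaining epochs) can be written out explicitly. The sketch below is an illustration of that schedule shape only, not the authors' exact training code; the function name and per-epoch granularity are assumptions.

```python
import math

def lr_at_epoch(epoch, peak_lr=1.5e-4, warmup_epochs=30, total_epochs=100):
    """Warm-up + cosine schedule as described in the MOCA setup:
    linear ramp from 0 to peak_lr, then cosine decay toward 0."""
    if epoch < warmup_epochs:
        # Linear warm-up phase.
        return peak_lr * epoch / warmup_epochs
    # Cosine annealing over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# The schedule starts at 0, reaches peak_lr right after warm-up,
# and decays back toward 0 by the final epoch.
start, peak, end = lr_at_epoch(0), lr_at_epoch(30), lr_at_epoch(100)
```

In practice this per-epoch rule would be evaluated per optimizer step (or via PyTorch's built-in schedulers), but the shape is the same.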
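The Research Type row mentions evaluating frozen representations with k-NN ImageNet classification. As a toy illustration of what such an evaluation does, here is a minimal similarity-weighted k-NN vote over feature vectors. This is a generic sketch, not MOCA's exact protocol (which may use a different k, distance, or weighting); the function name and the tiny 2-D "features" are hypothetical.

```python
def knn_classify(query, train_feats, train_labels, k=3):
    """Classify `query` by a cosine-similarity weighted vote among its
    k nearest training features (features assumed L2-normalized, so the
    dot product equals cosine similarity)."""
    sims = [sum(q * t for q, t in zip(query, feat)) for feat in train_feats]
    # Indices of the k most similar training features.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    # Accumulate similarity-weighted votes per class label.
    votes = {}
    for i in top:
        votes[train_labels[i]] = votes.get(train_labels[i], 0.0) + sims[i]
    return max(votes, key=votes.get)

# Toy frozen "features" (unit-norm 2-D vectors) with two classes.
train_feats = [(1.0, 0.0), (0.8, 0.6), (0.0, 1.0)]
train_labels = ["cat", "cat", "dog"]
```

In the real evaluation the features come from the frozen ViT-B/16 backbone over the full ImageNet-1k train/val sets; only the scale differs.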