solo-learn: A Library of Self-supervised Methods for Visual Representation Learning

Authors: Victor Guilherme Turrisi da Costa, Enrico Fini, Moin Nabi, Nicu Sebe, Elisa Ricci

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Benchmarks. We benchmarked the available SSL methods on CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009) and ImageNet-100 (Deng et al., 2009) and made the pretrained checkpoints public. For Barlow Twins, BYOL, MoCo V2+, NNCLR, SimCLR and VICReg, hyperparameters were heavily tuned, reaching higher performance than reported in the original papers or by third parties. Tab. 1 presents the top-1 and top-5 accuracy values for the online linear evaluation. For ImageNet-100, traditional offline linear evaluation is also reported. We also compare with the results reported by Lightly in Tab. 3.
Researcher Affiliation | Collaboration | Victor G. Turrisi da Costa EMAIL University of Trento, Trento, Italy; Enrico Fini EMAIL University of Trento, Trento, Italy; Moin Nabi EMAIL SAP AI Research, Berlin, Germany; Nicu Sebe EMAIL University of Trento, Trento, Italy; Elisa Ricci EMAIL University of Trento and Fondazione Bruno Kessler, Trento, Italy.
Pseudocode | No | The paper describes the architecture of the solo-learn library and its components, including a diagram in Figure 1, but does not present any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/vturrisi/solo-learn.
Open Datasets | Yes | Benchmarks. We benchmarked the available SSL methods on CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009) and ImageNet-100 (Deng et al., 2009) and made the pretrained checkpoints public.
Dataset Splits | No | The paper mentions 'online linear evaluation' and 'offline linear evaluation' and provides accuracy tables for these. However, it does not specify the training/validation/test splits used for the main self-supervised learning task or for the linear evaluation beyond these general terms.
Hardware Specification | No | The paper mentions supporting 'distributed training pipelines with mixed-precision' and targeting 'researchers with fewer resources, namely from 1 to 8 GPUs'. However, it does not specify exact GPU models (e.g., NVIDIA A100) or other hardware components used for the experiments.
Software Dependencies | No | Implemented in Python, using PyTorch and PyTorch Lightning. The code that powers the library is written in Python, using PyTorch (Paszke et al., 2019) and PyTorch Lightning (PL) (Team, 2019) as back-ends and NVIDIA DALI for fast data loading. While the software is named and cited by year, specific version numbers (e.g., PyTorch 1.9) are not provided.
Experiment Setup | Yes | For consistency, we ran three different methods (Barlow Twins, BYOL and NNCLR) for 20 epochs on ImageNet-100.
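The Research Type row cites top-1 and top-5 accuracy for linear evaluation (Tab. 1). As a hedged sketch of how such top-k metrics are typically computed from classifier logits (illustrative only; this is not code from the solo-learn library, and `topk_accuracy` is a name chosen here):

```python
import numpy as np

def topk_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest logits.

    logits: (n_samples, n_classes) array of classifier scores.
    labels: (n_samples,) array of integer class labels.
    """
    # Indices of the k largest logits per sample (argsort is ascending,
    # so we take the last k columns).
    topk = np.argsort(logits, axis=1)[:, -k:]
    # A sample counts as correct if its true label appears among those k.
    hits = (topk == labels[:, None]).any(axis=1)
    return hits.mean()

# Tiny 3-class example (values chosen to avoid ties).
logits = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.1, 0.3, 0.6]])
labels = np.array([1, 1, 2])
print(topk_accuracy(logits, labels, k=1))  # 2 of 3 samples correct
print(topk_accuracy(logits, labels, k=2))  # all 3 within the top 2
```

Top-5 accuracy on CIFAR or ImageNet-100 follows the same pattern with `k=5` over the full logit matrix.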
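The Software Dependencies row flags that exact version numbers are missing. A minimal sketch of how a reproducibility report could record them, using only the standard library (`importlib.metadata`, Python 3.8+); the package names passed in below are assumptions, not pins taken from solo-learn:

```python
import sys
from importlib import metadata

def dependency_versions(packages=("torch", "pytorch-lightning")):
    """Return the running Python version plus installed versions of the
    given distributions, or 'not installed' when a package is absent."""
    versions = {"python": sys.version.split()[0]}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return versions

print(dependency_versions())
```

Emitting such a mapping alongside published checkpoints would close the gap this row identifies without requiring readers to guess compatible versions.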