Beyond Random Augmentations: Pretraining with Hard Views
Authors: Fabio Ferreira, Ivo Rapant, Jörg Franke, Frank Hutter
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We are the first to demonstrate hard view pretraining's effectiveness at scale, particularly training on the full ImageNet-1k dataset, and evaluating across multiple SSL methods, ConvNets, and ViTs. As a result, HVP sets a new state-of-the-art on DINO ViT-B/16, reaching 78.8% linear evaluation accuracy (a 0.6% improvement) and consistent gains of 1% for both 100 and 300 epoch pretraining, with similar improvements across transfer tasks in DINO, SimSiam, iBOT, and SimCLR. |
| Researcher Affiliation | Academia | Fabio Ferreira (University of Freiburg), Ivo Rapant (University of Freiburg), Jörg K.H. Franke (University of Freiburg), Frank Hutter (ELLIS Institute Tübingen & University of Freiburg) |
| Pseudocode | Yes | Algorithm 1 Pretraining with Hard Views |
| Open Source Code | Yes | We make our PyTorch (Paszke et al., 2019) code, models, and all used hyperparameters publicly available under https://github.com/automl/hvp. |
| Open Datasets | Yes | To the best of our knowledge, we are the first to demonstrate the effectiveness of a hard view sampling strategy at scale, particularly on modern architectures like Vision Transformers (ViTs) and training on the full ImageNet dataset. Table 2: HVP compares favorably against models trained without it when fine-tuned (F.T.) to or linearly evaluated (Lin.) on other datasets (averaged over 3 seeds; 100-ep. pretraining). In Table 2, we apply both the linear evaluation (Lin.) and finetuning (F.T.) protocols to our models across a diverse set of datasets consisting of CIFAR10 (Krizhevsky, 2009), CIFAR100, Flowers102 (Nilsback & Zisserman, 2008), Food101 (Bossard et al., 2014), and iNaturalist 2021 (iNaturalist 2021 competition dataset). For object detection and instance segmentation, we use the COCO (Lin et al., 2014) dataset with Cascade Mask R-CNN (Cai & Vasconcelos, 2019; He et al., 2017). |
| Dataset Splits | Yes | We report the top-1 validation accuracy on frozen features, as well as the k-NN classifier performance, in Table 1. In Table 2, we apply both the linear evaluation (Lin.) and finetuning (F.T.) protocols to our models across a diverse set of datasets consisting of CIFAR10 (Krizhevsky, 2009), CIFAR100, Flowers102 (Nilsback & Zisserman, 2008), Food101 (Bossard et al., 2014), and iNaturalist 2021 (iNaturalist 2021 competition dataset). |
| Hardware Specification | Yes | We primarily ran our experiments with 8x NVIDIA GeForce RTX 2080 Ti nodes, with which the pretraining and linear evaluation duration ranged from 3.5 to 25 days. (Appendix J, Computational Overhead of HVP) The details of hardware and software used for this analysis are: one single compute node with 8 NVIDIA RTX 2080 Ti, AMD EPYC 7502 (32-Core Processor), 512GB RAM, Ubuntu 22.04.3 LTS, PyTorch 2.0.1, CUDA 11.8. |
| Software Dependencies | Yes | The details of hardware and software used for this analysis are: one single compute node with 8 NVIDIA RTX 2080 Ti, AMD EPYC 7502 (32-Core Processor), 512GB RAM, Ubuntu 22.04.3 LTS, PyTorch 2.0.1, CUDA 11.8. |
| Experiment Setup | Yes | For DINO, we additionally compare ResNet-50 (He et al., 2016) against the ViT-S/16 (Dosovitskiy et al., 2020) architecture. Table 8: Pretraining ImageNet hyperparameters for the runs with DINO ViT-S/16. For 300 epochs, we use a batch size of 1024. Table 9: Pretraining ImageNet hyperparameters for the runs with DINO ViT-B/16. Table 10: Pretraining ImageNet hyperparameters for the runs with SimSiam. For 300 epochs, we use a batch size of 1024. Table 11: Pretraining ImageNet hyperparameters for the runs with SimCLR. Table 12: Finetuning hyperparameters for DINO ViT-S/16. Table 13: Finetuning hyperparameters for SimSiam and ResNet-50. Table 14: Hyperparameters for object detection and instance segmentation on COCO. |
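The "Pseudocode" row cites Algorithm 1 (Pretraining with Hard Views). The core selection logic — sample several candidate view pairs, score each with the SSL loss, and take the gradient step only on the hardest pair — can be sketched in a few lines. The sketch below is a hypothetical toy illustration, not the paper's implementation: the "encoder" is a single scalar weight, the SSL loss is a mean squared distance, the augmentation is additive noise, and the gradient is taken by finite differences; the actual method plugs the same selection step into real SSL losses (DINO, SimSiam, iBOT, SimCLR) and networks.

```python
import random

# Hypothetical toy stand-ins for illustration only: a scalar-weight "encoder",
# an MSE "SSL loss", and additive-noise "augmentation".
def encode(w, view):
    return [w * v for v in view]

def ssl_loss(z1, z2):
    # Mean squared distance between the two view embeddings.
    return sum((a - b) ** 2 for a, b in zip(z1, z2)) / len(z1)

def augment(x, rng):
    # Placeholder for the random crops / color distortions used in practice.
    return [v + 0.1 * rng.gauss(0, 1) for v in x]

def hvp_step(w, x, rng, num_candidates=4, lr=0.01, eps=1e-4):
    """One hard-view pretraining step: sample candidate view pairs, keep the
    pair with the highest SSL loss, and update the model on that pair only
    (here via a finite-difference gradient on the scalar weight w)."""
    pairs = [(augment(x, rng), augment(x, rng)) for _ in range(num_candidates)]
    losses = [ssl_loss(encode(w, v1), encode(w, v2)) for v1, v2 in pairs]
    hardest = max(range(num_candidates), key=losses.__getitem__)  # hardest pair
    v1, v2 = pairs[hardest]
    grad = (ssl_loss(encode(w + eps, v1), encode(w + eps, v2))
            - ssl_loss(encode(w - eps, v1), encode(w - eps, v2))) / (2 * eps)
    return w - lr * grad, losses[hardest]

rng = random.Random(0)
w, hard_loss = hvp_step(1.0, [0.5, -0.2, 0.3], rng)
```

Because only the forward pass is needed to score candidates, the extra cost over standard pretraining is a few gradient-free forward passes per step, which matches the overhead analysis referenced in the Hardware row.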