Collapse-Proof Non-Contrastive Self-Supervised Learning

Authors: Emanuele Sansone, Tim Lebailly, Tinne Tuytelaars

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our theoretical findings on image datasets, including SVHN, CIFAR-10, CIFAR-100, and ImageNet-100. Our approach effectively combines the strengths of feature decorrelation and cluster-based self-supervised learning methods, overcoming training failure modes while achieving strong generalization in clustering and linear classification tasks. The experimental analysis is divided into four main parts. Firstly, we compare CPLearn against non-contrastive approaches from the families of feature decorrelation and cluster-based methods on three image datasets, i.e. SVHN (Netzer et al., 2011), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009)."
Researcher Affiliation | Academia | "Department of Electrical Engineering (ESAT), KU Leuven, Belgium; CSAIL, MIT, US."
Pseudocode | Yes | Figure 2 caption: "In CPLearn, minimizing the proposed objective together with the corresponding projector ensures that the embedding representations are clustered and, at the same time, that their features are decorrelated. This guarantees that the representations are collapse-proof, meaning that dimensional, cluster, intra-cluster and representation collapses are prevented." The paper also provides Algorithm 1, "Pseudocode for CPLearn".
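The quoted caption describes an objective that simultaneously clusters embeddings and decorrelates their features. As an illustrative sketch only, not the paper's actual Algorithm 1: the off-diagonal covariance penalty below follows the common feature-decorrelation recipe, and the two entropy terms are a standard heuristic for confident yet balanced cluster assignments; the function names and coefficients are invented for this example.

```python
import numpy as np

def decorrelation_loss(z, off_diag_weight=0.005):
    # z: (N, D) embeddings; standardize each feature dimension
    z = (z - z.mean(0)) / (z.std(0) + 1e-8)
    c = (z.T @ z) / len(z)                       # (D, D) feature covariance
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()    # keep per-feature variance alive
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # decorrelate features
    return on_diag + off_diag_weight * off_diag

def cluster_loss(z, prototypes):
    # soft assignments of embeddings to cluster prototypes
    logits = z @ prototypes.T
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    # low sample entropy => confident assignments (fights intra-cluster collapse)
    sample_entropy = -(p * np.log(p + 1e-8)).sum(1).mean()
    # high marginal entropy => balanced clusters (fights cluster collapse)
    marginal = p.mean(0)
    marginal_entropy = -(marginal * np.log(marginal + 1e-8)).sum()
    return sample_entropy - marginal_entropy

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 32))          # stand-in batch of embeddings
protos = rng.normal(size=(10, 32))      # stand-in cluster prototypes
total = decorrelation_loss(z) + cluster_loss(z, protos)
```

In a training loop, `total` would be minimized end-to-end over the backbone, projector, and prototypes.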
Open Source Code | No | "We use the repository from (da Costa et al., 2022) for SVHN and CIFAR experiments, and the one from (Caron et al., 2021) for ImageNet-100 experiments." This refers to external codebases the authors built on, not a code release for the presented method.
Open Datasets | Yes | "We validate our theoretical findings on image datasets, including SVHN, CIFAR-10, CIFAR-100, and ImageNet-100. We use a ResNet-18 backbone network with f = 128 for SVHN and CIFAR-10, and with f = 256 for CIFAR-100, following the methodology from (Sansone, 2023). For ImageNet-100, we use a standard small ViT with f = 384, following the methodology from (Caron et al., 2021)."
Dataset Splits | Yes | "The experimental analysis is divided into four main parts. Firstly, we compare CPLearn against non-contrastive approaches from the families of feature decorrelation and cluster-based methods on three image datasets, i.e. SVHN (Netzer et al., 2011), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009)." Figure 4 shows a "realization of embedding covariance (left) and adjacency matrices (right) for the whole CIFAR-10 test dataset." On evaluation: "For linear probe evaluation, we followed standard practice by removing the projector head and training a linear classifier on the backbone representation."
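The linear probe protocol quoted above trains a linear classifier on frozen backbone features. As a minimal sketch under stated assumptions: the random `features` array stands in for frozen backbone representations, and a closed-form ridge regression on one-hot labels replaces the SGD-trained linear classifier typically used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 128

# stand-ins for frozen backbone features and ground-truth labels
features = rng.normal(size=(1000, feat_dim))
labels = rng.integers(0, num_classes, size=1000)

# linear probe: ridge regression onto one-hot targets (closed form)
Y = np.eye(num_classes)[labels]
lam = 1e-3
W = np.linalg.solve(features.T @ features + lam * np.eye(feat_dim),
                    features.T @ Y)

preds = (features @ W).argmax(1)
acc = (preds == labels).mean()  # probe accuracy on these features
```

With real (non-random) backbone features, `acc` measures how linearly separable the learned representation is.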
Hardware Specification | Yes | "We used a ViT-small backbone network and train it for 100 epochs with learning rate equal to 5e-4 and batch size per GPU equal to 64 on a node with 8 NVIDIA A100 GPUs."
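The quoted setup gives the batch size per GPU, not the global batch size; the latter follows directly from the stated numbers:

```python
# per-GPU batch of 64 across the stated 8 A100 GPUs
per_gpu_batch = 64
num_gpus = 8
global_batch = per_gpu_batch * num_gpus  # 512
```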
Software Dependencies | No | The paper mentions software by name (e.g., "PyTorch-like pseudo-code", "Adam optimizer", "solo-learn", "DINO codebase") but does not provide specific version numbers for these components.
Experiment Setup | Yes | "We use a ResNet-18 backbone network with f = 128 for SVHN and CIFAR-10, and with f = 256 for CIFAR-100... The β parameter in Eq. 7 is chosen from the range {0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}... We used a ResNet-18 backbone network on CIFAR-10 and train it for 1000 epochs with Adam optimizer, learning rate equal to 1e-3 and batch size equal to 64 on 1 A100 GPU." (Table 6 in Appendix I lists further hyperparameters such as batch size, epochs, Adam betas, learning rate, and data augmentation parameters.)
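The quoted CIFAR-10 setup could be collected into a single config; a hedged sketch where the field names are invented for illustration, while the values are the ones quoted above:

```python
# illustrative config dict; keys are made up, values come from the quoted setup
config = {
    "backbone": "resnet18",
    "optimizer": "adam",
    "learning_rate": 1e-3,
    "batch_size": 64,
    "epochs": 1000,
    "gpus": 1,
    # grid from which the β weight of Eq. 7 is selected
    "beta_grid": [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10],
}

# a sweep would launch one run per β value
runs = [{**config, "beta": b} for b in config["beta_grid"]]
```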