Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods
Authors: Yuchen Lu, Zhen Liu, Aristide Baratin, Romain Laroche, Aaron Courville, Alessandro Sordoni
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a large-scale empirical study with a diverse family of SSL algorithms, we find that CLID better correlates with in-distribution model performance than other competing recent evaluation schemes. We also benchmark CLID on out-of-domain generalization, where CLID serves as a predictor of the transfer performance of SSL models on several visual classification tasks, yielding improvements with respect to the competing baselines. |
| Researcher Affiliation | Collaboration | Yuchen Lu (Mila, University of Montreal); Zhen Liu (Mila, University of Montreal); Aristide Baratin (SAIT AI Lab, Montreal); Romain Laroche (Microsoft Research); Aaron Courville (Mila, University of Montreal, CIFAR); Alessandro Sordoni (Microsoft Research, Mila) |
| Pseudocode | No | The paper describes methods like Intrinsic Dimension (ID) and Cluster Learnability (CL) using mathematical formulations and descriptive text, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references official implementations for baseline methods (Wang & Isola (2020) and MCR2 (Yu et al., 2020)) with URLs, but it does not provide explicit source code for the methodology (CLID) described in this paper. There is no statement of release for their own code. |
| Open Datasets | Yes | We select in total 28 self-supervised learning checkpoints trained on ImageNet over different algorithms, architectures, and training epochs. A complete list can be found in Table 4 in the appendix. [...] We collect 7 out-of-domain downstream visual classification tasks. |
| Dataset Splits | Yes | We use the KNN evaluation on the validation data using the ground-truth labels to measure the performance of the model, which has been shown to be well correlated with the linear evaluation but computationally less expensive (Caron et al., 2021). [...] We re-use the dataset split to assess the performance of a KNN classifier on this labelled dataset. |
| Hardware Specification | Yes | All our experiments are computed on a single V100 GPU. |
| Software Dependencies | No | The paper mentions using K-means clustering, KNN classifier, and MINE (Mutual Information Neural Estimation), but it does not specify any version numbers for these software components or other libraries/frameworks used. |
| Experiment Setup | Yes | For the computation of cluster learnability, we choose the square root of the dataset size as the number of clusters in K-means. We report results with 1 neighbor for our KNN learner. We normalize the features and use cosine distance for the K-means clustering and KNN learner. [...] We follow the official implementation with α = 2 and t = 2 as default values for the tunable parameters in Eqn 1. [...] We use a batch size of 128, learning rate 0.0005 and weight decay 0.001. The network is trained for 50000 steps on the training images, and we report MINE on the validation data. |
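The cluster-learnability setup quoted above (K-means with sqrt(N) clusters on normalized features, cosine distance, a 1-NN learner predicting the pseudo-labels of held-out points) can be sketched as follows. This is a minimal NumPy illustration of that recipe, not the authors' released code; the function name `cluster_learnability` and the 50/50 train/held-out split are assumptions for the sketch.

```python
import numpy as np

def cluster_learnability(feats, n_iters=20, seed=0):
    """Hypothetical sketch: cluster features with spherical K-means
    (k = sqrt(N), cosine distance on L2-normalized features), then score
    how well a 1-NN learner fit on half the data predicts the cluster
    pseudo-labels of the other half."""
    rng = np.random.default_rng(seed)
    # Normalize so dot products equal cosine similarity.
    X = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    n = len(X)
    k = max(2, int(np.sqrt(n)))  # sqrt of dataset size, per the paper
    # Spherical K-means: Lloyd iterations, assigning by max cosine similarity.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iters):
        labels = np.argmax(X @ centers.T, axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):
                m = members.mean(axis=0)
                centers[c] = m / (np.linalg.norm(m) + 1e-12)
    labels = np.argmax(X @ centers.T, axis=1)
    # 1-NN learner (cosine distance) trained on half the pseudo-labeled data,
    # evaluated on the held-out half.
    perm = rng.permutation(n)
    tr, te = perm[: n // 2], perm[n // 2:]
    nearest = np.argmax(X[te] @ X[tr].T, axis=1)
    preds = labels[tr][nearest]
    return float((preds == labels[te]).mean())
```

A higher score means the K-means pseudo-labels are easier for a simple learner to pick up, which is the learnability signal CLID combines with intrinsic dimension.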