High Fidelity Visualization of What Your Self-Supervised Representation Knows About

Authors: Florian Bordes, Randall Balestriero, Pascal Vincent

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our first experiments aim at evaluating the ability of our model to generate realistic-looking images whose representations are close to the conditioning. To do so, we trained our Representation-Conditioned Diffusion Model (RCDM), conditioned on the 2048-dimensional representation given by a ResNet-50 (He et al., 2016) trained with DINO (Caron et al., 2021) on ImageNet (Russakovsky et al., 2015). Then we compute the representations of a set of images from the ImageNet validation data to condition the sampling from the trained RCDM. Fig. 1a shows it is able to sample images that are visually very close to the one used to obtain the conditioning. We also evaluated the generation abilities of our model on out-of-distribution data. Fig. 1b shows that our model is able to sample new views of an OOD image. We also quantitatively validate that the generated images' representations are close to the original image representation in Tab. 1b, Fig. 17, Fig. 18 and Fig. 19.
Researcher Affiliation | Collaboration | Florian Bordes (EMAIL), Meta AI Research; Mila, Université de Montréal. Randall Balestriero, Meta AI Research. Pascal Vincent, Meta AI Research; Mila, Université de Montréal; Canada CIFAR AI Chair.
Pseudocode | No | The paper describes methods and processes through narrative text and diagrams (e.g., Figure 6b), but it does not include any explicitly labeled pseudocode blocks or algorithms with structured steps.
Open Source Code | Yes | Code and trained models are available at https://github.com/facebookresearch/RCDM.
Open Datasets | Yes | DINO (Caron et al., 2021) on ImageNet (Russakovsky et al., 2015). ... We used cell images from a microscope and a photo of a statue (both from Wikimedia Commons), sketches and cartoons from PACS (Li et al., 2017), an image segmentation from Cityscapes (Cordts et al., 2016), and an image of the Earth by NASA.
Dataset Splits | Yes | Then we compute the representations of a set of images from the ImageNet validation data to condition the sampling from the trained RCDM. ... We compute the FID (Heusel et al., 2017) and IS (Salimans et al., 2016) on 10,000 samples generated by each model, with 10,000 images from the validation set of ImageNet as reference. ... we compute the rank and mean reciprocal rank (MRR) of the image used as conditioning within the set of closest neighbors in the representation space of the samples generated from the validation set (50K samples).
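The rank/MRR evaluation described in that row can be sketched as follows. This is a minimal NumPy illustration of the metric (rank each conditioning image by cosine similarity to a generated sample's representation, then average the reciprocal ranks), not the authors' implementation; all function and variable names here are my own:

```python
import numpy as np

def mean_reciprocal_rank(cond_reps, sample_reps):
    """cond_reps[i] is the representation used to condition sample i;
    sample_reps[i] is the representation of the generated sample i.
    Returns the 1-based rank of the true conditioning image for each
    sample, plus the mean reciprocal rank (MRR)."""
    # L2-normalize so dot products are cosine similarities.
    c = cond_reps / np.linalg.norm(cond_reps, axis=1, keepdims=True)
    s = sample_reps / np.linalg.norm(sample_reps, axis=1, keepdims=True)
    sims = s @ c.T                     # (n_samples, n_conditionings)
    order = np.argsort(-sims, axis=1)  # neighbors sorted by similarity
    # Position of the true conditioning image i in row i's ranking.
    ranks = np.array([int(np.where(order[i] == i)[0][0]) + 1
                      for i in range(len(order))])
    return ranks, float(np.mean(1.0 / ranks))
```

If a generated sample's representation is closer to its own conditioning image than to any other, its rank is 1 and it contributes 1.0 to the MRR; a perfectly faithful generator therefore gets MRR = 1.0.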
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU or CPU models, memory amounts) used for running the experiments. It mentions that the code is based on other works and uses their hyperparameters, but no hardware specifications are given.
Software Dependencies | No | RCDM is based on the same code as Dhariwal & Nichol (2021) (https://github.com/openai/guided-diffusion) ... To obtain our conditional RCDM, one just needs to replace the Group Normalization layers in that architecture with a conditional batch normalization layer of Brock et al. (2019) (using the code from https://github.com/ajbrock/BigGAN-PyTorch). ... The self-supervised pretrained models we used to extract the conditioning representations were obtained from the model zoo of VISSL (Goyal et al., 2021) (code from https://github.com/facebookresearch/vissl). The paper references several codebases/frameworks but does not provide specific version numbers for underlying software dependencies such as Python, PyTorch, or CUDA.
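The conditional-normalization swap mentioned in that row can be illustrated with a toy sketch: instead of fixed affine parameters, the per-channel scale and shift are predicted from the conditioning representation. This is a minimal NumPy illustration of the idea, not the BigGAN-PyTorch code; the class and parameter names are hypothetical:

```python
import numpy as np

class ConditionalBatchNorm:
    """Toy conditional batch norm: normalize features over the batch,
    then apply a per-channel scale/shift predicted linearly from the
    conditioning representation (e.g., an SSL embedding)."""

    def __init__(self, rep_dim, num_channels, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        # Hypothetical linear maps from representation to gamma/beta.
        self.W_gamma = rng.normal(scale=0.01, size=(rep_dim, num_channels))
        self.W_beta = rng.normal(scale=0.01, size=(rep_dim, num_channels))
        self.eps = 1e-5

    def __call__(self, x, rep):
        # x: (batch, channels) features; rep: (batch, rep_dim) conditioning.
        mu = x.mean(axis=0, keepdims=True)
        var = x.var(axis=0, keepdims=True)
        x_hat = (x - mu) / np.sqrt(var + self.eps)
        gamma = 1.0 + rep @ self.W_gamma  # scale, centered on identity
        beta = rep @ self.W_beta          # shift
        return gamma * x_hat + beta
```

The design point is that the conditioning signal enters the network at every normalization layer, rather than only at the input, which is how class or representation conditioning is typically injected into BigGAN-style and diffusion U-Net architectures.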
Experiment Setup | No | RCDM is based on the same code as Dhariwal & Nichol (2021) (https://github.com/openai/guided-diffusion) and uses the same hyperparameters (see Appendix I of Dhariwal & Nichol (2021) for details about the hyperparameters).