High Fidelity Visualization of What Your Self-Supervised Representation Knows About
Authors: Florian Bordes, Randall Balestriero, Pascal Vincent
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our first experiments aim at evaluating the ability of our model to generate realistic-looking images whose representations are close to the conditioning. To do so, we trained our Representation-Conditioned Diffusion Model (RCDM), conditioned on the 2048-dimensional representation given by a ResNet-50 (He et al., 2016) trained with DINO (Caron et al., 2021) on ImageNet (Russakovsky et al., 2015). We then compute the representations of a set of images from the ImageNet validation data to condition sampling from the trained RCDM. Fig. 1a shows it is able to sample images that are visually very close to the one used to obtain the conditioning. We also evaluated the generation ability of our model on out-of-distribution data. Fig. 1b shows that our model is able to sample new views of an OOD image. We also quantitatively validate that the representations of the generated images are close to the original image representation in Tab. 1b, Fig. 17, Fig. 18 and Fig. 19. |
| Researcher Affiliation | Collaboration | Florian Bordes (Meta AI Research; Mila, Université de Montréal), Randall Balestriero (Meta AI Research), Pascal Vincent (Meta AI Research; Mila, Université de Montréal; Canada CIFAR AI Chair) |
| Pseudocode | No | The paper describes methods and processes through narrative text and diagrams (e.g., Figure 6b), but it does not include any explicitly labeled pseudocode blocks or algorithms with structured steps. |
| Open Source Code | Yes | Code and trained models are available at https://github.com/facebookresearch/RCDM. |
| Open Datasets | Yes | DINO (Caron et al., 2021) on ImageNet (Russakovsky et al., 2015). ... We used cell images from a microscope and a photo of a statue (both from Wikimedia Commons), sketches and cartoons from PACS (Li et al., 2017), an image segmentation from Cityscapes (Cordts et al., 2016) and an image of the Earth by NASA. |
| Dataset Splits | Yes | We then compute the representations of a set of images from the ImageNet validation data to condition sampling from the trained RCDM. ... We compute the FID (Heusel et al., 2017) and IS (Salimans et al., 2016) on 10,000 samples generated by each model, with 10,000 images from the validation set of ImageNet as reference. ... we compute the rank and mean reciprocal rank (MRR) of the image used as conditioning within the set of closest neighbors, in representation space, of the samples generated from the validation set (50K samples). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU or CPU models, memory amounts) used for running the experiments. It mentions that the code is based on other works and uses their hyperparameters, but no hardware specifications are given. |
| Software Dependencies | No | RCDM is based on the same code as Dhariwal & Nichol (2021) (https://github.com/openai/guided-diffusion) ... To obtain our conditional RCDM, one just needs to replace the Group Normalization layers in that architecture with a conditional batch normalization layer of Brock et al. (2019) (using the code from https://github.com/ajbrock/BigGAN-PyTorch). ... The self-supervised pretrained models we used to extract the conditioning representations were obtained from the model zoo of VISSL (Goyal et al., 2021) (code from https://github.com/facebookresearch/vissl). The paper references several codebases/frameworks but does not provide specific version numbers for underlying software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | No | RCDM is based on the same code as Dhariwal & Nichol (2021) (https://github.com/openai/guided-diffusion) and uses the same hyper-parameters (see Appendix I of Dhariwal & Nichol (2021) for details). |
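The rank/MRR evaluation quoted under "Dataset Splits" can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: it assumes sample `i` was conditioned on image `i`, that representations are stored as row vectors, and that L2 distance is used for the nearest-neighbor search; all names are illustrative.

```python
import numpy as np

def rank_and_mrr(cond_reps, sample_reps):
    """For each generated sample, rank all candidate conditioning
    representations by L2 distance to the sample's representation and
    record the rank of the true conditioning image (1 = closest).
    Returns per-sample ranks and the mean reciprocal rank (MRR)."""
    # Pairwise L2 distances: samples (rows) vs. conditioning reps (cols).
    dists = np.linalg.norm(
        sample_reps[:, None, :] - cond_reps[None, :, :], axis=-1)
    order = np.argsort(dists, axis=1)        # nearest first
    true_idx = np.arange(len(sample_reps))   # sample i conditioned on image i
    ranks = np.argmax(order == true_idx[:, None], axis=1) + 1
    mrr = np.mean(1.0 / ranks)
    return ranks, mrr

# Toy usage: 5 conditioning "representations" of dimension 4; the
# "generated samples" are slightly perturbed copies, so each sample's
# nearest conditioning representation should be its own.
rng = np.random.default_rng(0)
reps = rng.normal(size=(5, 4))
noisy = reps + 0.01 * rng.normal(size=reps.shape)
ranks, mrr = rank_and_mrr(reps, noisy)
```

An MRR of 1.0 means every generated sample's representation is closest to the image that conditioned it; lower values indicate the conditioning image being pushed down the neighbor list.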
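The architectural swap quoted under "Software Dependencies" (Group Normalization replaced by a conditional batch normalization driven by the representation) can be sketched as below. This is a NumPy illustration of the mechanism, not the authors' PyTorch implementation: it assumes per-sample scale and shift are linear projections of the conditioning representation, and all weight names are illustrative.

```python
import numpy as np

def conditional_norm(x, rep, W_gamma, b_gamma, W_beta, b_beta, eps=1e-5):
    """Normalize feature maps x (N, C, H, W) per channel over the batch,
    then apply a scale and shift predicted from the conditioning
    representation rep (N, D) -- the idea behind conditional batch norm."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    gamma = rep @ W_gamma + b_gamma   # (N, C) per-sample scale
    beta = rep @ W_beta + b_beta      # (N, C) per-sample shift
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]

# Usage: zero projection weights reduce the layer to plain normalization
# (scale 1, shift 0), which makes the behavior easy to check.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))   # batch of feature maps
rep = rng.normal(size=(2, 8))       # 8-dim conditioning representation
W_g, b_g = np.zeros((8, 3)), np.ones(3)
W_b, b_b = np.zeros((8, 3)), np.zeros(3)
out = conditional_norm(x, rep, W_g, b_g, W_b, b_b)
```

With non-zero projection weights, each sample's feature maps are modulated by its own representation, which is how the conditioning signal enters every normalization layer of the diffusion UNet.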