Discriminating image representations with principal distortions
Authors: Jenelle Feather, David Lipshutz, Sarah Harvey, Alex Williams, Eero Simoncelli
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a demonstration of our method, we generated principal distortions for computational models previously proposed to capture aspects of the human visual system. All models were implemented in PyTorch (Ansel et al., 2024) and simulations were performed on NVIDIA GPUs (RTX A6000 and A100 models). |
| Researcher Affiliation | Academia | 1Center for Computational Neuroscience, Flatiron Institute, Simons Foundation 2Department of Neuroscience, Baylor College of Medicine 3Center for Neural Science, New York University |
| Pseudocode | Yes | Algorithm 1: Computing the principal distortions via projected gradient descent |
| Open Source Code | Yes | The code used to generate principal distortions, details of loading the models, and example principal distortion generation code is available at https://github.com/LabForComputationalVision/principal-distortions |
| Open Datasets | Yes | Distortions were generated for images from the Kodak TID2008 dataset (Ponomarenko et al., 2009). ImageNet object classification task; AlexNet (Krizhevsky et al., 2012) and ResNet50 (He et al., 2016) |
| Dataset Splits | Yes | We use a subset of 100 ImageNet images where each image was chosen from a unique class (randomly chosen from the set of images at https://github.com/EliSchwartz/imagenet-sample-images). For the experiments with background and foreground blur, we used the ImageNet-9 dataset with foreground/background masks (Xiao et al., 2021). We chose eight random images from each of the nine categories, resulting in 72 total images. |
| Hardware Specification | Yes | simulations were performed on NVIDIA GPUs (RTX A6000 and A100 models). |
| Software Dependencies | Yes | All models were implemented in PyTorch (Ansel et al., 2024) |
| Experiment Setup | Yes | For each image and each comparison, we ran the gradient descent procedure for principal distortion optimization for 2500 iterations, using an exponentially decaying learning rate that started at 10.0 and decayed to 0.001 by the final step. The exception to this is the experiment with ViT (Base-Patch16-224) vs. EfficientNet-B0, where we ran the optimization for only 500 iterations. We used a target distortion size of α = 0.1, and at each step of the optimization we also scaled ϵ so that the image s + 1000ϵ would not be clipped when the RGB values were represented between 0 and 1; that is, we scaled ϵ so that 0 ≤ s[i] + 1000ϵ[i] ≤ 1 for each value i in the image. |
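The learning-rate schedule and the clipping constraint quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' released code: `lr_schedule` and `rescale_eps` are hypothetical helper names, and the specific rescaling rule (shrinking ϵ by the largest factor that keeps s + 1000ϵ inside [0, 1]) is an assumption about how the constraint was enforced.

```python
import numpy as np

def lr_schedule(step, n_steps=2500, lr_start=10.0, lr_end=0.001):
    """Exponential decay from lr_start at step 0 to lr_end at the final step.

    Matches the described schedule (10.0 -> 0.001 over 2500 iterations);
    the exact interpolation is an assumption.
    """
    return lr_start * (lr_end / lr_start) ** (step / (n_steps - 1))

def rescale_eps(s, eps, gain=1000.0):
    """Shrink eps so that s + gain * eps stays in [0, 1] per pixel.

    `gain` mirrors the factor of 1000 in the paper's constraint
    0 <= s[i] + 1000 * eps[i] <= 1; the shrink-by-largest-feasible-factor
    rule is a hypothetical implementation choice.
    """
    gain_eps = gain * eps
    with np.errstate(divide="ignore", invalid="ignore"):
        # Largest t in (0, 1] with s + t * gain * eps <= 1 where eps > 0 ...
        over = np.where(gain_eps > 0, (1.0 - s) / gain_eps, np.inf)
        # ... and with s + t * gain * eps >= 0 where eps < 0.
        under = np.where(gain_eps < 0, -s / gain_eps, np.inf)
    t = min(1.0, over.min(), under.min())
    return eps * t
```

In this sketch the rescaling is a projection step: after each gradient update, ϵ is scaled down just enough that the displayed image remains a valid RGB image, consistent with the projected gradient descent framing of Algorithm 1.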