LAION-C: An Out-of-Distribution Benchmark for Web-Scale Vision Models

Authors: Fanfei Li, Thomas Klein, Wieland Brendel, Robert Geirhos, Roland S. Zimmermann

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To address this, we introduce LAION-C as a benchmark alternative for ImageNet-C. LAION-C consists of six novel distortion types specifically designed to be OOD, even for web-scale datasets such as LAION. In a comprehensive evaluation of state-of-the-art models, we find that the LAION-C dataset poses significant challenges to contemporary models, including MLLMs such as Gemini and GPT-4o. We additionally conducted a psychophysical experiment to evaluate the difficulty of our corruptions for human observers, enabling a comparison of models to lab-quality human robustness data.
Researcher Affiliation | Collaboration | ¹Max Planck Institute for Intelligent Systems, Tübingen, Germany; ²ELLIS Institute Tübingen; ³Tübingen AI Center; ⁴Google DeepMind. Correspondence to: Fanfei Li <EMAIL>.
Pseudocode | No | The paper describes the construction of new OOD distortions (e.g., Mosaic, Glitched, Vertical Lines, Geometric Shapes, Stickers, Luminance Checkerboard) with parameters for different intensity levels (Tables 2-7). However, it does not present these procedures as formal pseudocode or algorithm blocks.
Open Source Code | Yes | The evaluation code for LAION-C is publicly available at: https://github.com/FanfeiLi/LAION-C.
Open Datasets | Yes | The LAION-C dataset is published on Zenodo. A link to the dataset is provided via the GitHub repository.
Dataset Splits | No | Since the dataset is primarily used for benchmarking purposes, splitting specifics are not provided. Essentially, the entire dataset is a validation set.
Hardware Specification | Yes | Our experiments were conducted in a darkened cabin, using a 22" VIEWPixx 3D light LCD monitor (VPixx Technologies, Saint-Bruno, Canada) at a refresh rate of 120 Hz (scanning backlight mode on). The screen measures 484 × 302 mm, at a resolution of 1920 × 1200 pixels. ... The experiment was implemented using the Psychophysics Toolbox (Kleiner et al., 2007, version 3.0.12) in MATLAB (Release 2016a, The MathWorks, Inc., Natick, Massachusetts, United States) using a 12-core desktop computer (AMD HD7970 graphics card Tahiti by AMD, Sunnyvale, California, United States) running Kubuntu 14.04 LTS.
Software Dependencies | Yes | The experiment was implemented using the Psychophysics Toolbox (Kleiner et al., 2007, version 3.0.12) in MATLAB (Release 2016a, The MathWorks, Inc., Natick, Massachusetts, United States) using a 12-core desktop computer (AMD HD7970 graphics card Tahiti by AMD, Sunnyvale, California, United States) running Kubuntu 14.04 LTS.
Experiment Setup | Yes | Participants were given 2.5 s to view each image, followed by a 2 s response window to classify the image by clicking on a set of icons. ... To motivate high performance, a monetary bonus was awarded for surpassing fixed, predetermined performance thresholds for each block.
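
Because the whole dataset serves as a validation set, a benchmark run reduces to scoring predictions per distortion type and intensity level. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code: the record layout, the function names, the class labels, and the intensity values are all hypothetical, and the toy records are illustrative only. The distortion names are the only part taken from the table above.

```python
from collections import defaultdict

# Distortion names as listed in the Pseudocode row above.
CORRUPTIONS = [
    "Mosaic", "Glitched", "Vertical Lines",
    "Geometric Shapes", "Stickers", "Luminance Checkerboard",
]


def accuracy_by_condition(records):
    """Return accuracy per (corruption, intensity level).

    `records` is a hypothetical iterable of tuples
    (corruption, intensity_level, true_label, predicted_label).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for corruption, level, truth, prediction in records:
        key = (corruption, level)
        totals[key] += 1
        hits[key] += int(truth == prediction)
    return {key: hits[key] / totals[key] for key in totals}


def mean_accuracy(per_condition):
    """Unweighted mean over all (corruption, level) conditions."""
    return sum(per_condition.values()) / len(per_condition)


# Toy records with made-up labels, only to show the call pattern.
toy_records = [
    ("Mosaic", 1, "dog", "dog"),
    ("Mosaic", 1, "cat", "dog"),
    ("Stickers", 3, "bird", "bird"),
]
per_condition = accuracy_by_condition(toy_records)
overall = mean_accuracy(per_condition)
```

Averaging over conditions rather than over images keeps one heavily sampled distortion from dominating the summary; whether the paper aggregates this way is an assumption here.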