Forte: Finding Outliers with Representation Typicality Estimation

Authors: Debargha Ganguly, Warren Morningstar, Andrew Yu, Vipin Chaudhary

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrating Forte's superior performance compared to state-of-the-art supervised and unsupervised baselines on various OOD detection tasks and synthetic image detection, including photorealistic images generated by advanced techniques like Stable Diffusion.
Researcher Affiliation | Collaboration | Debargha Ganguly¹, Warren Morningstar², Andrew Yu¹, Vipin Chaudhary¹. ¹Case Western Reserve University, Cleveland, OH, USA; ²Google Research, Mountain View, CA, USA.
Pseudocode | Yes | Algorithm 1: OOD Detection Using Per-Point PRDC Metrics in Forte.
Open Source Code | Yes | Our code is available at github.com/DebarghaG/forte.
Open Datasets | Yes | Using public datasets like FastMRI (Zbontar et al., 2018; Knoll et al., 2020) and the Osteoarthritis Initiative (OAI) (Nevitt et al., 2006), we simulate realistic scenarios where models trained on one dataset (treated as in-distribution) are confronted with another (considered OOD).
Dataset Splits | Yes | We split the data into three parts: one-third for held-out testing, one-third as the reference distribution, and one-third as a test distribution that is drawn from the reference distribution.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory amounts, or cloud instance types) are provided in the paper.
Software Dependencies | No | The image generation pipeline is implemented using the Hugging Face Transformers (Wolf et al., 2020) and Diffusers (von Platen et al., 2022) libraries, which provide high-level APIs for working with pre-trained models. Although these libraries are named, no version numbers are given, which limits reproducibility.
Experiment Setup | Yes | We train One-Class SVM (Schölkopf et al., 2001), Gaussian Kernel Density Estimation (Parzen, 1962), and Gaussian Mixture Model (Reynolds et al., 2009) on the reference summary statistics. For reliable measurements, Forte is run with 10 random seeds. We use the Stable Diffusion 2.0 base model and generate images with varying strength parameters (0.3, 0.5, 0.7, 0.9, 1.0) to control the influence of the input image on the generated output.
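
The three-way split and the detector-fitting step quoted in the Dataset Splits and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of scikit-learn, the helper names (`three_way_split`, `fit_detectors`, `typicality_scores`), all hyperparameters, and the synthetic four-column arrays standing in for the per-point summary statistics are assumptions made here. How the third split enters the actual pipeline is not described in the excerpt, so the sketch only returns it.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KernelDensity
from sklearn.svm import OneClassSVM


def three_way_split(features, seed):
    """Shuffle the rows and cut them into three near-equal parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    held_out, reference, reference_test = np.array_split(features[order], 3)
    return held_out, reference, reference_test


def fit_detectors(reference_stats, seed):
    """Fit the three detector families named in the setup.

    Hyperparameters are illustrative assumptions, not values from the paper.
    """
    return {
        "ocsvm": OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(reference_stats),
        "kde": KernelDensity(kernel="gaussian", bandwidth=1.0).fit(reference_stats),
        "gmm": GaussianMixture(n_components=2, random_state=seed).fit(reference_stats),
    }


def typicality_scores(detectors, stats):
    """Return one score per row; higher means more typical of the reference."""
    return {
        "ocsvm": detectors["ocsvm"].decision_function(stats),
        "kde": detectors["kde"].score_samples(stats),
        "gmm": detectors["gmm"].score_samples(stats),
    }


if __name__ == "__main__":
    data_rng = np.random.default_rng(0)
    # Synthetic stand-ins for summary statistics (assumption: 4 columns).
    in_dist = data_rng.normal(loc=0.0, size=(300, 4))
    out_dist = data_rng.normal(loc=4.0, size=(100, 4))

    # Repeat over 10 seeds, mirroring the seed count reported in the setup.
    for seed in range(10):
        held_out, reference, reference_test = three_way_split(in_dist, seed)
        detectors = fit_detectors(reference, seed)
        scores_in = typicality_scores(detectors, held_out)
        scores_out = typicality_scores(detectors, out_dist)
        gaps = {k: float(scores_in[k].mean() - scores_out[k].mean()) for k in scores_in}
        print(seed, gaps)
```

Each detector exposes a "higher is more typical" score, so the same thresholding or AUROC computation could be applied to all three. Averaging over seeds smooths out variation from the random split and from the mixture model's initialization.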