Few-Shot Learner Generalizes Across AI-Generated Image Detection

Authors: Shiyu Wu, Jing Liu, Jing Li, Yequan Wang

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments show that FSD achieves state-of-the-art performance by +11.6% average accuracy on the GenImage dataset with only 10 additional samples. More importantly, our method is better able to capture the intra-category commonality in unseen images without further training.
Researcher Affiliation Academia (1) Institute of Automation, Chinese Academy of Sciences, Beijing, China; (2) Beijing Academy of Artificial Intelligence, Beijing, China; (3) University of Chinese Academy of Sciences, Beijing, China; (4) Harbin Institute of Technology, Shenzhen, China.
Pseudocode Yes Pseudocode to compute the loss J(ϕ) is provided in Algorithm 1.
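Algorithm 1 itself is given in the paper; since the experiment setup mentions prototype vectors and Squared Euclidean Distance in a metric space, the loss plausibly follows a prototypical-network-style objective. A minimal sketch under that assumption (function and variable names are illustrative, not the paper's):

```python
import torch
import torch.nn.functional as F

def prototypical_loss(support_emb, support_lbl, query_emb, query_lbl, n_way):
    """Cross-entropy over negative squared Euclidean distances to class prototypes."""
    # Class prototype = mean embedding of that class's support samples.
    prototypes = torch.stack(
        [support_emb[support_lbl == c].mean(dim=0) for c in range(n_way)]
    )  # (n_way, dim)
    # Squared Euclidean distance from each query embedding to each prototype.
    sq_dists = torch.cdist(query_emb, prototypes, p=2) ** 2  # (n_query, n_way)
    # Closer prototype -> higher logit.
    return F.cross_entropy(-sq_dists, query_lbl)
```

In an episodic setting this would be computed once per episode, with the backbone producing `support_emb` and `query_emb`.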
Open Source Code Yes Our code is available at https://github.com/teheperinko541/Few-Shot-AIGI-Detector.
Open Datasets Yes We evaluate our proposed method on the widely used GenImage dataset (Zhu et al., 2023), which contains 1,331,167 real images from ImageNet (Deng et al., 2009) and 1,350,000 generated images from 7 diffusion models and one GAN.
Dataset Splits Yes The real images are first divided into 8 subsets. Each subset is subsequently partitioned into a training part and a test part. Each generator is associated with one specific subset and uses the category labels of the real images within the subset to produce synthetic images. Consequently, the GenImage dataset consists of 8 fake-vs-real subsets for analysis. (...) At every training step, 3 classes are randomly selected from the training set, with 5 samples chosen from each class for the support set and another 5 samples per class for the query set. (...) To maintain a consistent total number of test samples across different settings, we set a fixed ratio of 1 : 3 between the support set and the query set.
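The episodic sampling described above (3 classes per step, 5 support plus 5 query samples per class) can be sketched as follows; the function and its helpers are illustrative, not taken from the released code:

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=3, k_shot=5, k_query=5, rng=random):
    """Pick n_way classes, then disjoint support/query index sets per class."""
    # Group dataset indices by class label.
    by_class = defaultdict(list)
    for idx, lbl in enumerate(labels):
        by_class[lbl].append(idx)
    # Only classes with enough samples for both sets are eligible.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + k_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], k_shot + k_query)
        support.extend(chosen[:k_shot])   # first k_shot -> support set
        query.extend(chosen[k_shot:])     # remaining k_query -> query set
    return support, query
```

Each call yields 15 support and 15 query indices, which would then be fed through the backbone to form prototypes and compute the episode loss.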
Hardware Specification Yes Our method is implemented with the PyTorch library and all the experiments are conducted on a single A100 GPU with 40GB memory.
Software Dependencies No Our method is implemented with the PyTorch library.
Experiment Setup Yes To be more comparable with previous works, we adopt the ResNet-50 (He et al., 2016) pretrained on ImageNet (Deng et al., 2009) as the backbone of our model, which outputs a prototype vector of 1024 dimensions. Following Wang et al. (2023), the input images are first resized to 256×256 and then randomly cropped to 224×224 with random horizontal flipping during training. In contrast, only a center crop to 224×224 is performed after resizing the images during testing. The distance in the metric space is measured with Squared Euclidean Distance. (...) We employ Adam as the optimizer to minimize the cross-entropy loss with a base learning rate of 10^-4. We also adopt a StepLR scheduler with γ = 0.5 and step size = 80000. Each classifier is trained for 200,000 steps, with a batch size of 16.