How to Evaluate and Mitigate IP Infringement in Visual Generative AI?
Authors: Zhenting Wang, Chen Chen, Vikash Sehwag, Minzhou Pan, Lingjuan Lyu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluation, we discovered that the state-of-the-art visual generative models can generate content that bears a striking resemblance to characters protected by intellectual property rights... Experiments on well-known character IPs like Spider-Man, Iron Man, and Superman demonstrate the effectiveness of our proposed defense method. |
| Researcher Affiliation | Collaboration | 1Rutgers University 2Sony AI 3Northeastern University. Correspondence to: Lingjuan Lyu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 outlines our mitigation method. The input and output of our defensive generation paradigm are the text prompt P and the final generated content I, respectively. In Lines 2-3, we defend against name-based intellectual property infringement using the LLM-based method described in Section 5.3. |
| Open Source Code | Yes | Our data and code can be found at https://github.com/ZhentingWang/GAI_IP_Infringement. |
| Open Datasets | Yes | Our data and code can be found at https://github.com/ZhentingWang/GAI_IP_Infringement. The potential reason for this IP infringement phenomenon is that visual generative models memorize their training data (e.g., the LAION dataset (Schuhmann et al., 2022) and the WebVid dataset (Bain et al., 2021)), which may contain a large amount of publicly available copyrighted data. |
| Dataset Splits | No | The paper describes how images were generated for evaluation (e.g., 100 images for open-source models, 20 for closed-source models per character/prompt type) but does not specify traditional training/test/validation dataset splits for model development or reproduction. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware specifications such as GPU or CPU models, memory, or cloud computing instances used for running experiments. |
| Software Dependencies | No | The paper mentions using GPT-4 and GPT-4V(ision) as models/APIs and Stable Diffusion as a base, but it does not list specific software libraries or frameworks with their version numbers (e.g., Python, PyTorch, TensorFlow, or diffusers library versions) that would be needed to reproduce the implementation. |
| Experiment Setup | Yes | Wang et al. (2024b) found that a guidance scale of 7.5 provides a good balance in most classifier-free diffusion guidance settings. Therefore, we adopt 7.5 as the default value in our setup. |
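The pseudocode row describes a defensive generation loop: screen the text prompt for protected character names (the paper uses an LLM for this step), rewrite it if needed, then generate. A minimal sketch of that control flow is below; `detect_ip_name`, `rewrite_prompt`, and the placeholder generation call are hypothetical toy stand-ins for illustration, not the authors' implementation, and the name list is illustrative only.

```python
# Sketch of a defensive generation loop in the spirit of Algorithm 1.
# In the paper, the name check is LLM-based (e.g., GPT-4); here we use a
# trivial substring heuristic so the sketch is self-contained.

KNOWN_IP_NAMES = {"spider-man", "iron man", "superman"}  # illustrative only


def detect_ip_name(prompt: str) -> bool:
    """Return True if the prompt mentions a protected character (toy check)."""
    lowered = prompt.lower()
    return any(name in lowered for name in KNOWN_IP_NAMES)


def rewrite_prompt(prompt: str) -> str:
    """Replace IP names with a generic description (toy rewrite)."""
    lowered = prompt.lower()
    for name in KNOWN_IP_NAMES:
        lowered = lowered.replace(name, "a generic superhero")
    return lowered


def defensive_generate(prompt: str) -> str:
    """Defend against name-based IP infringement before generating."""
    if detect_ip_name(prompt):  # corresponds to Lines 2-3 of Algorithm 1
        prompt = rewrite_prompt(prompt)
    # Placeholder for the actual generative-model call.
    return f"<image generated from: {prompt}>"
```

For example, `defensive_generate("Spider-Man swinging through the city")` produces an image request for "a generic superhero swinging through the city", with the protected name removed before the model is invoked.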