The Superposition of Diffusion Models Using the Itô Density Estimator
Authors: Marta Skreta, Lazar Atanackovic, Joey Bose, Alexander Tong, Kirill Neklyudov
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the utility of using SUPERDIFF for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, as well as improved conditional molecule generation and unconditional de novo structure design of proteins. We test the applicability of our approach SUPERDIFF using the two proposed superimposition strategies for image and protein generation tasks. For images, we first demonstrate the ability to combine the outputs of two models trained on disjoint datasets such that they yield better performance than the model trained on both of the datasets (see Sec. 4.1). In addition, we demonstrate the ability to interpolate between densities which correspond to concept interpolation in the image space (see Sec. 4.2). For proteins, we demonstrate that combining two different generative models leads to improvements in designability and novelty generation (see Sec. 4.3). Across all our experimental settings, we find that combining pre-trained diffusion models using SUPERDIFF leads to higher fidelity generated samples that better match task specification in text-conditioned image generation and also produce more diverse generated protein structures than comparable composition strategies. |
| Researcher Affiliation | Academia | Marta Skreta (1,2), Lazar Atanackovic (1,2), Avishek Joey Bose (3,4), Alexander Tong (4,5), Kirill Neklyudov (4,5). Affiliations: 1 University of Toronto, 2 Vector Institute, 3 University of Oxford, 4 Mila – Quebec AI Institute, 5 Université de Montréal |
| Pseudocode | Yes | Algorithm 1: SUPERDIFF pseudocode (for OR and AND operations) |
| Open Source Code | Yes | https://github.com/necludov/super-diffusion |
| Open Datasets | Yes | We empirically demonstrate the utility of using SUPERDIFF for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, as well as improved conditional molecule generation and unconditional de novo structure design of proteins. ... Protein Data Bank (PDB, (Berman et al., 2000)) |
| Dataset Splits | No | We split CIFAR-10 into two disjoint training sets of equal size (first 5 labels and last 5 labels), train two diffusion models on each part, and generate the samples jointly using both models. The paper describes a specific partition of the training data for the base models but does not explicitly state the train/test/validation splits used for evaluating the SUPERDIFF approach or refer to standard splits for evaluation in the main text. |
| Hardware Specification | No | The paper makes general references to 'large computing hardware' but does not specify any particular GPU models, CPU types, or other hardware configurations used for their experiments. |
| Software Dependencies | No | The paper mentions several tools and models, including 'Proteus', 'FrameDiff', 'ESMFold', 'ProteinMPNN', 'Foldseek', 'UMAP', 'KMeans', 'Stable Diffusion v1-4', and 'LDMol'. However, none of these are accompanied by specific version numbers. Essential software dependencies such as Python, PyTorch, or CUDA versions are not provided. |
| Experiment Setup | Yes | For the choice of hyperparameters, architecture, and data preprocessing, we follow (Song et al., 2020). ... we set κ = 0.5. ... We generate 20 images for 20 different concept pairs (tasks)... All results are averages over 50 generated proteins at lengths {100, 150, 200, 250, 300} for 500 timesteps. ... We use temperature values T = 1 for all variants of SUPERDIFF. ... Metrics are taken from 5 runs of batch-size 512. |
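The pseudocode row above points to Algorithm 1 (SUPERDIFF, with OR and AND operations), which combines the reverse-diffusion updates of two pretrained models using per-model Itô log-density estimates. As a rough illustration of the kind of per-step combination involved, here is a hypothetical sketch: the function name `superdiff_step` is invented, the AND branch uses equal weights as a stand-in (the paper instead solves for the mixing coefficient that keeps both density estimates equal along the trajectory), and the Euler–Maruyama noise term is a simplification of the full SDE sampler.

```python
import numpy as np

def superdiff_step(x, t, dt, scores, logq, mode="OR", T=1.0):
    """One simplified reverse-diffusion step combining two pretrained models.

    x      -- current sample (numpy array)
    scores -- list of callables s_i(x, t) returning each model's score/drift
    logq   -- running Ito log-density estimates, one per model (maintained
              by the caller; used here only to compute mixing weights)
    mode   -- "OR" samples from a density-weighted mixture of the models;
              "AND" targets regions where both densities are high
    T      -- temperature for the OR weighting (the report notes T = 1)
    """
    if mode == "OR":
        # OR: softmax over the density estimates, so the model assigning
        # higher density to x dominates this step's update.
        w = np.exp(np.array(logq) / T)
        w = w / w.sum()
    else:
        # AND (stand-in): equal weights; the actual algorithm picks the
        # weight that equalizes the two density estimates over time.
        w = np.full(len(scores), 1.0 / len(scores))
    drift = sum(wi * s(x, t) for wi, s in zip(w, scores))
    noise = np.sqrt(dt) * np.random.randn(*x.shape)
    return x + drift * dt + noise
```

A caller would maintain `logq` by integrating the Itô density estimator alongside the trajectory; with both estimates equal, the OR weights reduce to 0.5 each, matching a plain model average.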