Lifelong Scalable Generative System via Online Maximum Mean Discrepancy
Authors: Fei Ye, Adrian G. Bors
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive results show that the proposed approach performs better than the state-of-the-art. We perform a series of experiments on unsupervised image generation, showing that DEMU significantly relieves forgetting in the DDPM model. The contributions of this research study include: (4) We perform a series of experiments showing that the proposed methodology achieves state-of-the-art performance. |
| Researcher Affiliation | Academia | 1School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 2Department of Computer Science, University of York, York YO10 5GH, UK EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm Implementation. We illustrate the learning procedure of a single model for DEMU in Fig. 2. Training DEMU consists of four steps: Step 1 (Training). Initially, if the memory is empty (M = ∅), we build the first memory unit M1 = {x1} and add it to M. We train the DDPM model ϵθ and the representation network {encϕ, decη} on M ∪ {xi} using the DDPM objective function and the reconstruction loss, respectively. Step 2 (Sample selection). We form the joint set Mk ∪ {xi} and sort it using Eq. (8). Then the current memory unit Mk is updated using Eq. (9). Step 3 (Memory expansion). We build the second memory unit M2 at training step T100 using the memory expansion criterion from Eq. (5). Subsequent samples are added to the memory buffers in the same way. Step 4 (Memory reduction). If the memory buffer is full, we repeatedly apply Eq. (10) and Eq. (11) to remove memory units containing redundant information, until the memory reaches its maximum capacity, |M| = D. |
| Open Source Code | Yes | The code is available at https://github.com/dtuzi123/DEMU |
| Open Datasets | Yes | Results per dataset (columns: DEMU, DEMU-D, LTS, LGM, R-VAE, R-DDPM, CGKD-GAN, CVA, CGKD-WAE, CGKD-VAE): Split MNIST: 20.18, 19.23, 71.67, 66.31, 55.67, 63.26, 54.34, 21.46, 47.98, 48.72; Split Fashion: 43.34, 37.64, 128.84, 109.20, 103.25, 82.23, 85.23, 67.28, 87.92, 88.16; Split SVHN: 62.12, 53.49, 87.25, 72.60, 65.18, 87.22, 101.26, 57.14, 100.15, 102.87; Split CIFAR10: 83.67, 78.70, 124.22, 177.15, 155.72, 106.18, 115.38, 74.97, 162.12, 163.75; ... CelebA-3DChair: 81.53, 79.79, 186.25, 241.14, 210.18, 183.72, 132.12, 142.62, 154.45, 156.62; CelebA-CACD: 73.11, 69.85, 124.87, 117.76, 121.52, 103.52, 78.00, 92.83, 142.52, 145.23; Split MINIImageNet: 141.66, 123.65, 179.78, 216.06, 205.12, 181.15, 176.18, 177.17, 241.11, 243.37 |
| Dataset Splits | Yes | We divide each original dataset into five independent parts, where each part contains samples from two consecutive classes, as in (Aljundi, Kelchtermans, and Tuytelaars 2019), resulting in Split MNIST, Split Fashion, Split SVHN and Split CIFAR10. ... The Split MINIImageNet data stream is divided into 16 tasks, each containing samples from five successive classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using specific models like Denoising Diffusion Generative Model (DDPM), Variational Autoencoders (VAEs), and Generative Adversarial Nets (GANs), but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.9, Python 3.8, TensorFlow 2.x). |
| Experiment Setup | Yes | Baselines and hyperparameters. ... The batch size b for each processing time is b = 64. The maximum memory size for all models is D = 2,000. ... We vary the threshold λ ∈ [0.02, 0.05] in Eq. (5) when training on Split MNIST and examine DEMU's performance. ... The proposed mechanism evaluates expansion signals measured by the MMD distance between the information recorded by each expert and that of the incoming data batch, expanding the model whenever needed. ... λ2 ∈ [0, 3] is a threshold controlling the dynamic expansion process. |
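The expansion signal quoted above hinges on the Maximum Mean Discrepancy between stored memory units and an incoming batch. The sketch below is not the paper's implementation; it is a minimal NumPy illustration of the general idea, assuming a Gaussian (RBF) kernel, a biased MMD² estimator, and a hypothetical `should_expand` helper standing in for the criterion of Eq. (5), with `threshold` playing the role of λ.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between two sample sets of shape (n, d).
    sq = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2.0 * x @ y.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of the squared Maximum Mean Discrepancy between
    # the empirical distributions of x and y.
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean())

def should_expand(memory_units, batch, threshold):
    # Hypothetical expansion check: signal expansion when the incoming
    # batch is far (in MMD) from every existing memory unit, i.e. even
    # the closest unit exceeds the threshold.
    distances = [mmd2(unit, batch) for unit in memory_units]
    return min(distances) > threshold
```

For example, a batch drawn from the same distribution as a stored unit yields a near-zero MMD² and no expansion, while a batch from a clearly shifted distribution exceeds the threshold and triggers it. The threshold scale here is illustrative only and is unrelated to the λ ranges reported in the paper.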