Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion
Authors: Enrico Ventura, Beatrice Achilli, Gianluigi Silvestri, Carlo Lucibello, Luca Ambrogioni
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our main contributions are: I) an in-depth theoretical random-matrix analysis of the distribution of Jacobian spectra in diffusion models on linear manifolds and II) a detailed experimental analysis of Jacobian spectra extracted from trained networks on linear manifolds and on image datasets. The analysis of these spectra is important as it provides a detailed picture of the latent geometry that guides the generative diffusion process. We show that the linear theory predicts several phenomena that we observed in trained networks. |
| Researcher Affiliation | Academia | 1Department of Computing Sciences, BIDSA, Bocconi University, Milan, MI 20100, Italy. 2Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6500 HD Nijmegen, the Netherlands. 3One Planet Research Center, imec-the Netherlands, Wageningen, the Netherlands. |
| Pseudocode | Yes | Algorithm 1: Estimate singular values at x0; Algorithm 2: Estimate singular values at x0 with central difference |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-sourcing the code for the described methodology. |
| Open Datasets | Yes | Fig. 5 shows the temporal evolution of the spectrum estimated numerically from the Jacobian of models trained on MNIST, CIFAR-10 and CelebA. |
| Dataset Splits | No | For each data-set, the number of used training data-points amounts to the full set of data available. For the linear models, we used a Variance Exploding continuous score model trained with 2M steps (batch size 128). |
| Hardware Specification | Yes | For all experiments we primarily utilized NVIDIA Tesla V100 GPUs with 32 GB of memory. |
| Software Dependencies | No | The paper mentions using a "PixelCNN++" model architecture but does not specify any software libraries or frameworks with version numbers for reproducibility. |
| Experiment Setup | Yes | We use the variance scheduler with βmin = 10^-4 and βmax = 2 * 10^-2, T = 1000 time steps, and a PixelCNN++ score model backbone (Salimans et al., 2017). Furthermore, for each of the datasets, we adjusted the parameters to account for the different complexity (see Table 1). For the linear models, we used a Variance Exploding continuous score model trained with 2M steps (batch size 128). The model had a residual architecture with 128 hidden channels in each layer and two residual blocks, each comprising two linear layers with SiLU activations. |
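The pseudocode row above refers to estimating the singular values of the score Jacobian at a point x0, with a central-difference variant (Algorithm 2). A minimal sketch of that idea in NumPy, assuming a generic `score_fn` (not the authors' implementation; the linear test score `A @ x` below is a hypothetical example):

```python
import numpy as np

def estimate_singular_values(score_fn, x0, eps=1e-4):
    """Estimate the singular values of the Jacobian of score_fn at x0
    using central finite differences, one forward/backward pair per
    input coordinate, followed by an SVD of the assembled Jacobian."""
    d = x0.size
    out_dim = score_fn(x0).size
    J = np.zeros((out_dim, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        # central difference along coordinate i: (f(x+e) - f(x-e)) / (2*eps)
        J[:, i] = (score_fn(x0 + e) - score_fn(x0 - e)) / (2 * eps)
    return np.linalg.svd(J, compute_uv=False)

# Sanity check: a linear score s(x) = A x has Jacobian A, so the
# estimate should recover A's singular values exactly. A zero singular
# value signals a flat manifold direction (a "spectral gap").
A = np.diag([3.0, 1.0, 0.0])  # rank-2 linear map
svals = estimate_singular_values(lambda x: A @ x, np.ones(3))
```

For linear maps the central difference is exact, so the recovered spectrum is [3, 1, 0]; for a trained score network the estimate inherits O(eps^2) truncation error plus floating-point noise.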