Hessian Geometry of Latent Space in Generative Models

Authors: Alexander Lobashev, Dmitry Guskov, Maria Larchenko, Mikhail Tamm

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The method approximates the posterior distribution of latent variables given generated samples and uses it to learn the log-partition function, which defines the Fisher metric for exponential families. Theoretical convergence guarantees are provided, and the method is validated on the Ising and TASEP models, outperforming existing baselines in reconstructing thermodynamic quantities. Applied to diffusion models, the method reveals a fractal structure of phase transitions in the latent space, characterized by abrupt changes in the Fisher metric. ... To support our theoretical reasoning, in this section we validate our method on exactly solvable statistical models, namely Ising and TASEP, and compare our estimation of log Z(t) with the ground truth. Then, we evaluate our method on two-dimensional slices of diffusion models, comparing the path length and curvature of the learned trajectories with those produced by other methods.
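The core identity behind the method described above is that, for an exponential family, the Fisher metric is the Hessian of the log-partition function log Z. The paper's actual estimator is learned from samples; the snippet below is only a minimal sketch of the identity itself on a toy one-spin model (Z(θ) = 2 cosh θ), comparing a finite-difference Hessian of log Z against the analytic Fisher information sech²θ. All names here are illustrative, not taken from the paper's code.

```python
import numpy as np

def log_Z(theta):
    # Log-partition function of a single ±1 spin: Z(θ) = Σ_{x=±1} exp(θx) = 2 cosh θ.
    return np.log(2 * np.cosh(theta))

def fisher_metric(theta, h=1e-4):
    # Fisher information of an exponential family = Hessian of log Z,
    # estimated here with a central finite difference in the 1D parameter θ.
    return (log_Z(theta + h) - 2 * log_Z(theta) + log_Z(theta - h)) / h**2

theta = 0.7
numeric = fisher_metric(theta)
analytic = 1.0 / np.cosh(theta) ** 2  # d²/dθ² log(2 cosh θ) = sech²θ
print(numeric, analytic)
```

In higher dimensions the same construction yields a full Hessian matrix G(θ) = ∇² log Z(θ), which is the metric whose abrupt changes the paper reads as phase transitions.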
Researcher Affiliation | Collaboration | 1 Glam AI, San Francisco, USA; 2 Artificial Neural Computing Corp., Weston, FL, USA; 3 Magicly AI, Dubai, UAE; 4 School of Digital Technologies, Tallinn University, Tallinn, Estonia. Correspondence to: Alexander Lobashev <EMAIL>, Dmitry Guskov <EMAIL>.
Pseudocode | No | The paper describes methods such as approximating the posterior distribution and estimating the log-partition function, but it does not present these procedures in structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our source code is available at https://github.com/alobashev/hessian-geometry-of-diffusion-models.
Open Datasets | No | Our dataset consists of N = 5.4 × 10^5 samples of spin configurations on the square lattice of size L × L = 128 × 128 with periodic boundary conditions. We consider the parameter ranges β⁻¹ = T ∈ [Tmin, Tmax] = [1, 5], H ∈ [Hmin, Hmax] = [−2, 2], similar to the ranges used in (Walker, 2019). ... We generate a dataset of N = 150000 stationary TASEP configurations on a 1D lattice with M = 16384 sites. The rates α (β) of adding (removing) particles at the left (right) boundary are sampled from the uniform prior distribution over the square [0, 1] × [0, 1].
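The Ising part of this dataset can in principle be regenerated with a standard Metropolis sampler; the sketch below shows the idea on a deliberately small lattice (the paper uses L = 128 with T ∈ [1, 5] and H ∈ [−2, 2], which is far more expensive). This is a generic sampler written for illustration, not the authors' generation code.

```python
import numpy as np

def ising_metropolis(L=16, T=2.5, H=0.0, sweeps=200, seed=None):
    # Single-spin-flip Metropolis sampler for the 2D Ising model with
    # periodic boundary conditions, temperature T and external field H.
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=(L, L))
    beta = 1.0 / T
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = rng.integers(L), rng.integers(L)
            # Sum of the four nearest neighbours with periodic wrap-around.
            nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                  + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
            dE = 2 * spins[i, j] * (nn + H)  # energy cost of flipping spin (i, j)
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                spins[i, j] *= -1
    return spins

config = ising_metropolis(L=8, T=2.0, sweeps=50, seed=0)
print(config.shape)
```

Sampling many (T, H) pairs from the stated ranges and storing the resulting configurations would reproduce the dataset's structure, though matching the paper exactly would also require its burn-in and decorrelation choices, which the review does not quote.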
Dataset Splits | Yes | In all our experiments the training set consists of 80% of the samples and the other 20% are used for testing.
Hardware Specification | Yes | For all our numerical experiments the training was performed on a single Nvidia HGX compute node with 8 A100 GPUs.
Software Dependencies | No | The paper mentions using U2-Net (Qin et al., 2020) and the Adam optimizer, but does not provide specific version numbers for these or other software libraries/environments.
Experiment Setup | Yes | For our generation we use 50 inference steps with the classifier-free guidance scale set to 5. The prompt is "High quality picture, 4k, detailed" and the negative prompt is "blurry, ugly, stock photo". ... We trained U2-Net using the Adam optimizer with learning rate 0.00001 and a batch size of 2048 for N_steps^U2-Net = 20000 gradient update steps.
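The quoted optimizer settings (Adam, learning rate 1e-5, 20000 update steps) can be made concrete with a self-contained sketch. The objective below is a toy quadratic standing in for the U2-Net loss, and the Adam update is written out in NumPy so the example runs anywhere; none of this is the authors' training code.

```python
import numpy as np

# Settings quoted in the review: Adam, lr = 1e-5, 20000 gradient steps.
LR, BETA1, BETA2, EPS = 1e-5, 0.9, 0.999, 1e-8

def adam_minimize(grad_fn, w0, steps=20000, lr=LR):
    # Standard Adam update loop with bias-corrected moment estimates.
    w = np.asarray(w0, dtype=float)
    m = np.zeros_like(w)  # first-moment (mean) estimate
    v = np.zeros_like(w)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = BETA1 * m + (1 - BETA1) * g
        v = BETA2 * v + (1 - BETA2) * g**2
        m_hat = m / (1 - BETA1**t)
        v_hat = v / (1 - BETA2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + EPS)
    return w

# Toy loss ||w - 1||²: with a nearly constant gradient sign, Adam's effective
# step is about lr per iteration, so 20000 steps at lr = 1e-5 move w by ~0.2.
w = adam_minimize(lambda w: 2.0 * (w - 1.0), np.zeros(3))
print(w)
```

The small step count per unit of progress illustrates why such a low learning rate is paired with a large number of update steps in the quoted setup.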