Controllable Data Generation with Hierarchical Neural Representations

Authors: Sheyang Tang, Xiaoyu Xu, Jiayan Qiu, Zhou Wang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across different modalities demonstrate that CHINR improves generalizability and offers flexible hierarchical control over the generated content. We conduct extensive experiments across various modalities, demonstrating CHINR's superior performance in reconstruction and generation metrics.
Researcher Affiliation | Academia | ¹Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Canada; ²School of Computing and Mathematical Sciences, University of Leicester, Leicester, United Kingdom.
Pseudocode | No | The paper describes the proposed method in Sections 3, 3.1, 3.2, 3.3.1, and 3.3.2, detailing the LoE network and the Hierarchical Conditional Diffusion Model (HCDM) through descriptive text and mathematical formulations. However, it does not include any explicitly labeled pseudocode or algorithm blocks with structured, step-by-step procedures in a code-like format.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide any links to a code repository. The text does not indicate that code is available in supplementary materials or through an anonymous review link.
Open Datasets | Yes | In our experiments, both Stage-1 and Stage-2 are trained and evaluated on the CelebA-HQ 64² (Karras et al., 2018), ShapeNet 64³ (Chang et al., 2015), SRN-Cars (Sitzmann et al., 2019), and AMASS (Mahmood et al., 2019) datasets.
Dataset Splits | Yes | We use retrieval to compare the generalizability of CHINR and mNIF on the CelebA-HQ dataset. Specifically, we generate samples and retrieve the closest images from the training set. ... Figure 14 shows this process by using images from the test-split of CelebA-HQ as the targets, and train-split images as the searched set.
Hardware Specification | Yes | All experiments are implemented in PyTorch and run on a single NVIDIA RTX 3090 GPU.
Software Dependencies | No | The paper mentions that experiments are "implemented in PyTorch" but does not specify a version number for PyTorch or any other software libraries or dependencies with their respective version numbers.
Experiment Setup | Yes | In Stage-1, we train LoEs via meta-learning on CelebA-HQ, ShapeNet, and AMASS, and with auto-decoding on SRN-Cars. We use a batch size of 32, an outer learning rate of 1e-4, an inner learning rate of 1 with 3 steps, and train the LoE for 800 epochs in the meta-learning setting. For auto-decoding experiments on SRN-Cars, we use a batch size of 8, a learning rate of 1e-4, and train the LoE for 1000 epochs. In both settings, we use the AdamW (Loshchilov, 2017) optimizer without weight decay. In Stage-2, we set the training batch size to 32, the learning rate to 1e-4, and use a cosine scheduler with a minimum learning rate of 0.0. We train the HCDM for 1000 epochs with the AdamW optimizer.
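The quoted Stage-2 optimizer and scheduler settings can be sketched in PyTorch as follows. This is a minimal illustration of the stated hyperparameters only; the model here is a placeholder, since the paper's actual HCDM architecture is not reproduced in this report.

```python
import torch
from torch import nn

# Placeholder model standing in for the HCDM (hypothetical; the real
# architecture is described in the paper, not here).
model = nn.Linear(16, 16)

# AdamW without weight decay, learning rate 1e-4, as stated in the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.0)

# Cosine schedule annealing the learning rate to a minimum of 0.0
# over the stated 1000 training epochs.
epochs = 1000
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs, eta_min=0.0
)
```

Under this schedule the learning rate starts at 1e-4 and decays along a cosine curve to 0.0 by the final epoch, with one `scheduler.step()` call per epoch after `optimizer.step()`.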