Controllable Data Generation with Hierarchical Neural Representations
Authors: Sheyang Tang, Xiaoyu Xu, Jiayan Qiu, Zhou Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across different modalities demonstrate that CHINR improves generalizability and offers flexible hierarchical control over the generated content. We conduct extensive experiments across various modalities, demonstrating CHINR's superior performance in reconstruction and generation metrics. |
| Researcher Affiliation | Academia | 1 Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Canada; 2 School of Computing and Mathematical Sciences, University of Leicester, Leicester, United Kingdom. |
| Pseudocode | No | The paper describes the proposed method in Sections 3, 3.1, 3.2, 3.3.1, and 3.3.2, detailing the LoE network and the Hierarchical Conditional Diffusion Model (HCDM) through descriptive text and mathematical formulations. However, it does not include any explicitly labeled pseudocode or algorithm blocks with structured, step-by-step procedures in a code-like format. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide any links to a code repository. The text does not indicate that code is available in supplementary materials or through an anonymous review link. |
| Open Datasets | Yes | In our experiments, both Stage-1 and Stage-2 are trained and evaluated on the CelebA-HQ 64² (Karras et al., 2018), ShapeNet 64³ (Chang et al., 2015), SRN-Cars (Sitzmann et al., 2019), and AMASS (Mahmood et al., 2019) datasets. |
| Dataset Splits | Yes | We use retrieval to compare the generalizability of CHINR and mNIF on the CelebA-HQ dataset. Specifically, we generate samples and retrieve the closest images from the training set. ... Figure 14 shows this process by using images from the test-split of CelebA-HQ as the targets, and train-split images as the searched set. |
| Hardware Specification | Yes | All experiments are implemented in PyTorch and run on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions that experiments are "implemented in PyTorch" but does not specify a version number for PyTorch or any other software libraries or dependencies with their respective version numbers. |
| Experiment Setup | Yes | In Stage-1, we train LoEs via meta-learning on CelebA-HQ, ShapeNet, and AMASS, and with auto-decoding on SRN-Cars. We use a batch size of 32, an outer learning rate of 1e-4, an inner learning rate of 1 with 3 steps, and train the LoE for 800 epochs in the meta-learning setting. For auto-decoding experiments on SRN-Cars, we use a batch size of 8, a learning rate of 1e-4, and train the LoE for 1000 epochs. In both settings, we use the AdamW (Loshchilov, 2017) optimizer without weight decay. In Stage-2, we set the training batch size to be 32, learning rate 1e-4, and cosine scheduler with minimum learning rate 0.0. We train the HCDM for 1000 epochs with the AdamW optimizer. |
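The Stage-2 setup above (base learning rate 1e-4, cosine scheduler decaying to a minimum of 0.0 over 1000 epochs) can be sketched as follows. This is a minimal, hedged illustration: the paper only states "cosine scheduler with minimum learning rate 0.0", so the exact annealing formula and the function name `cosine_lr` are assumptions, not the authors' code.

```python
import math

def cosine_lr(epoch: int, total_epochs: int = 1000,
              base_lr: float = 1e-4, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at a given epoch.

    Assumed standard cosine annealing: starts at base_lr (epoch 0)
    and decays smoothly to min_lr (epoch total_epochs).
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# At epoch 0 the rate equals the base learning rate; at the final
# epoch it reaches the stated minimum of 0.0, halfway it is base_lr / 2.
print(cosine_lr(0), cosine_lr(500), cosine_lr(1000))
```

In PyTorch this behavior corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` with `eta_min=0.0`, paired with `torch.optim.AdamW(..., lr=1e-4, weight_decay=0.0)` to match the "AdamW without weight decay" detail reported for Stage-1.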