Gradient-Free Generation for Hard-Constrained Systems
Authors: Chaoran Cheng, Boran Han, Danielle Maddix, Abdul Fatir Ansari, Andrew Stuart, Michael W. Mahoney, Yuyang (Bernie) Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate that our framework consistently outperforms baseline approaches in various zero-shot constrained generation tasks and also achieves competitive results in the regression tasks without additional fine-tuning. Our code is available at https://github.com/amazon-science/ECI-sampling. We demonstrate the effectiveness of our approach across various PDE systems, showing that ECI-guided generation strictly adheres to physical constraints and accurately captures complex distribution shifts induced by these constraints. |
| Researcher Affiliation | Collaboration | Chaoran Cheng¹, Boran Han², Danielle C. Maddix², Abdul Fatir Ansari², Andrew Stuart³, Michael W. Mahoney⁴, Yuyang Wang². ¹University of Illinois Urbana-Champaign, ²Amazon Web Services, ³Stores Foundational AI, Amazon, ⁴Amazon SCOT |
| Pseudocode | Yes | Algorithm 1: Sampling from FFM (Euler Method). 1: Input: learned vector field v_θ, Euler steps N. 2: Sample noise function u₀ ∼ µ₀(u). 3: for t = 0, 1/N, 2/N, …, (N−1)/N do 4: u_{t+1/N} ← u_t + v_θ(u_t, t)/N. 5: return u₁ |
| Open Source Code | Yes | Our code is available at https://github.com/amazon-science/ECI-sampling. |
| Open Datasets | No | This section provides a more detailed description of the datasets used in this work and their generation procedure. In addition to the statistics in Table 2, we provide additional information in Table 7 for pre-training the FFM as our generative prior. For dataset types, "synthetic" indicates that the exact solutions are computed on the fly from randomly sampled PDE parameters for both the training and test datasets; we assign 5k solutions to the training set and 1k solutions to the test set. "Simulated" indicates that the solutions are pre-generated with numerical PDE solvers and differ between the training and test datasets. |
| Dataset Splits | Yes | We assign 5k solutions to the training set and 1k solutions to the test set. "Simulated" indicates that the solutions are pre-generated with numerical PDE solvers and differ between the training and test datasets. |
| Hardware Specification | Yes | All FFMs are trained on a single NVIDIA A100 GPU with a batch size of 256, an initial learning rate of 3 × 10⁻⁴, and 20k iterations (approximately 1000 epochs for a 5k training dataset). For 3D data, we use a two-layer FNO with a frequency cutoff of 16 × 16 × 16, a time embedding channel of 16, a hidden channel of 32, and a projection dimension of 256, giving 9.46M trainable parameters for efficiency reasons. This model is trained on 4 NVIDIA A100 GPUs with a batch size of 24 (per GPU) for a total of approximately 2M iterations (about 5000 epochs) with an initial learning rate of 3 × 10⁻⁴. |
| Software Dependencies | No | The paper does not explicitly provide specific version numbers for key software components or libraries, beyond mentioning the use of the Fourier Neural Operator (FNO) model and the Dopri5 ODE solver. |
| Experiment Setup | Yes | For 2D data, we use a four-layer FNO with a frequency cutoff of 32 × 32, a time embedding channel of 32, a hidden channel of 64, and a projection dimension of 256, giving 17.9M trainable parameters. All FFMs are trained on a single NVIDIA A100 GPU with a batch size of 256, an initial learning rate of 3 × 10⁻⁴, and 20k iterations (approximately 1000 epochs for a 5k training dataset). For 3D data, we use a two-layer FNO with a frequency cutoff of 16 × 16 × 16, a time embedding channel of 16, a hidden channel of 32, and a projection dimension of 256, giving 9.46M trainable parameters for efficiency reasons. This model is trained on 4 NVIDIA A100 GPUs with a batch size of 24 (per GPU) for a total of approximately 2M iterations (about 5000 epochs) with an initial learning rate of 3 × 10⁻⁴. |
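
The Euler sampling loop quoted in the Pseudocode row (Algorithm 1) can be sketched as below. This is a minimal illustration, not the paper's implementation: `sample_ffm` and the toy vector field `v(u, t) = -u` are placeholders standing in for the trained FNO vector field `v_θ`.

```python
import numpy as np

def sample_ffm(v_theta, u0, n_steps):
    """Euler integration of a learned vector field from t=0 to t=1,
    following Algorithm 1 (Sampling from FFM, Euler Method):
    u_{t+1/N} = u_t + v_theta(u_t, t) / N."""
    u = u0
    for i in range(n_steps):
        t = i / n_steps
        u = u + v_theta(u, t) / n_steps
    return u

# Toy usage: the field v(u, t) = -u contracts the state toward zero,
# so u1 should approximate u0 * (1 - 1/N)^N ~ u0 * e^{-1}.
rng = np.random.default_rng(0)
u0 = rng.standard_normal(8)  # stand-in for the sampled noise function u0 ~ mu0(u)
u1 = sample_ffm(lambda u, t: -u, u0, n_steps=100)
```

In the paper's setting, `v_theta` would be the FNO-parameterized vector field and `u0` a noise function on the PDE domain; the loop itself is the same fixed-step Euler scheme.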