Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching
Authors: Aaron J Havens, Benjamin Kurt Miller, Bing Yan, Carles Domingo-Enrich, Anuroop Sriram, Daniel S. Levine, Brandon M Wood, Bin Hu, Brandon Amos, Brian Karrer, Xiang Fu, Guan-Horng Liu, Ricky T. Q. Chen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models where we perform amortized conformer generation across many molecular systems. |
| Researcher Affiliation | Collaboration | 1University of Illinois Urbana-Champaign 2FAIR at Meta 3New York University 4Microsoft Research New England. |
| Pseudocode | Yes | Algorithm 1 Adjoint Sampling |
| Open Source Code | Yes | Code & benchmarks provided at github.com/facebookresearch/adjoint_sampling. |
| Open Datasets | Yes | We are releasing the data we used in this paper as a challenging Conformation Benchmark with the goal of fostering development of scalable, amortized sampling algorithms. |
| Dataset Splits | Yes | We split the SPICE molecules into a train set and an 80-molecule test set, allowing us to validate our sampler's ability to extrapolate to unseen molecules. |
| Hardware Specification | No | Each GPU maintains its own buffer... across 8 GPUs. We use 8 GPUs with 1024 8 initial molecules... The paper only mentions the number of GPUs used, but not specific models or types of GPUs, which is required for a 'Yes' answer. |
| Software Dependencies | Yes | We used the default hyperparameters from CREST version 3.0.2 including a 6 kcal/mol cutoff on final conformers. |
| Experiment Setup | Yes | For the DW-4 energy we use an EGNN with 3 layers and 128 hidden features. We train Adjoint Sampling for 1000 outer iterations, generating 512 new samples and energy evaluations per iteration into a buffer of max size 10000. In each iteration we optimize on 500 batches of batch size 512 from the replay buffer, using a learning rate of 3 × 10⁻⁴. We use a geometric noise schedule with σ_min = 10⁻⁴ and σ_max = 3.0. |
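The geometric noise schedule quoted above (σ_min = 10⁻⁴, σ_max = 3.0) is a standard construction; the sketch below shows the common geometric interpolation σ(t) = σ_min^(1−t) · σ_max^t. The function name and exact parameterization are assumptions for illustration, not code from the paper's repository.

```python
def geometric_sigma(t: float, sigma_min: float = 1e-4, sigma_max: float = 3.0) -> float:
    """Geometric noise schedule: interpolate between sigma_min (t=0)
    and sigma_max (t=1) on a log scale. Illustrative sketch only."""
    assert 0.0 <= t <= 1.0
    return sigma_min ** (1.0 - t) * sigma_max ** t
```

At t = 0 this returns σ_min and at t = 1 it returns σ_max, with log-linear growth in between, matching the reported endpoints of the schedule.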