Regulatory DNA Sequence Design with Reinforcement Learning
Authors: Zhao Yang, Bing Su, Chuan Cao, Ji-Rong Wen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on promoter design tasks in two yeast media conditions and enhancer design tasks for three human cell types, demonstrating its ability to generate high-fitness CREs while maintaining sequence diversity. The paper includes multiple tables (e.g., Table 2, 3, 4, 5) showing performance comparisons, ablation studies, and evaluation metrics across different datasets and settings, which are characteristic of empirical studies. |
| Researcher Affiliation | Collaboration | Zhao Yang¹, Bing Su¹, Chuan Cao², Ji-Rong Wen¹ — ¹Gaoling School of Artificial Intelligence, Renmin University of China; ²Microsoft Research AI4Science |
| Pseudocode | Yes | The complete workflow is detailed in Appendix G, Algorithm 1 ("TACO: RL-Based Fine-tuning for Autoregressive DNA Models"). |
| Open Source Code | Yes | The code is available at https://github.com/yangzhao1230/TACO. |
| Open Datasets | Yes | The yeast promoter dataset includes two types of growth media: complex (de Boer et al., 2020) and defined (Vaishnav et al., 2022). The human enhancer dataset consists of three cell lines: Hep G2, K562, and SK-N-SH (Gosai et al., 2024). |
| Dataset Splits | Yes | To simulate a progression from low-fitness to high-fitness sequences, we further partitioned D into a subset Dlow for pre-training the policy. Specifically, we define three difficulty levels (hard, medium, and easy) based on fitness percentiles of 20–40, 40–60, and 60–80, respectively, for both media conditions in the yeast dataset. For the human enhancer datasets, we define the hard fitness range as values below 0.2, the medium range as values between 0.2 and 0.75, and the easy range as values between 0.75 and 2.5. |
| Hardware Specification | Yes | All experiments were conducted on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions software such as "LightGBM (Ke et al., 2017)" and "HyenaDNA (Nguyen et al., 2024b)" but does not provide specific version numbers for these libraries or other ancillary software components. |
| Experiment Setup | Yes | During optimization, we set the learning rate to 5e-4 for the yeast task and 1e-4 for the human task. The hyperparameter α, which controls the strength of the TFBS reward in Equation 5, was set to 0.01. For each round, we generated K = 256 sequences, each with a fixed length, consistent with the datasets. A total of E = 100 optimization rounds were conducted. Table 9 (hyperparameters for training the LightGBM regression model): objective = regression; metric = MAE; boosting type = GBDT; number of leaves = 63; learning rate = 0.05; feature fraction = 0.7; seed = random state. |
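The "Dataset Splits" row describes two labeling rules: percentile bands for the yeast promoter data and absolute fitness cutoffs for the human enhancer data. A minimal sketch of both, assuming fitness is a flat array of scalar scores; the function names are illustrative and not taken from the TACO codebase:

```python
import numpy as np

def split_yeast_by_percentile(fitness):
    """Label sequences hard/medium/easy by fitness percentile bands
    (20-40 / 40-60 / 60-80); everything else is left out of Dlow."""
    fitness = np.asarray(fitness, dtype=float)
    p20, p40, p60, p80 = np.percentile(fitness, [20, 40, 60, 80])
    labels = np.full(fitness.shape, "unused", dtype=object)
    labels[(fitness >= p20) & (fitness < p40)] = "hard"
    labels[(fitness >= p40) & (fitness < p60)] = "medium"
    labels[(fitness >= p60) & (fitness < p80)] = "easy"
    return labels

def split_human_by_value(fitness):
    """Label human enhancer sequences by the absolute cutoffs quoted
    in the table: <0.2 hard, 0.2-0.75 medium, 0.75-2.5 easy."""
    labels = []
    for f in fitness:
        if f < 0.2:
            labels.append("hard")
        elif f < 0.75:
            labels.append("medium")
        elif f <= 2.5:
            labels.append("easy")
        else:
            labels.append("unused")
    return labels
```

Boundary handling (half-open intervals) is an assumption; the paper only states the ranges, not which endpoint is inclusive.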
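The Table 9 settings quoted in the "Experiment Setup" row map directly onto a LightGBM parameter dict. A sketch, with key names taken from LightGBM's standard parameter documentation; the commented training call and `num_boost_round` value are illustrative, not the authors' code, and since "seed = random state" in the paper is ambiguous, no fixed seed is set here:

```python
# Table 9 hyperparameters for the LightGBM fitness-regression model.
params = {
    "objective": "regression",
    "metric": "mae",
    "boosting_type": "gbdt",
    "num_leaves": 63,
    "learning_rate": 0.05,
    "feature_fraction": 0.7,
}

# Hypothetical usage, assuming `lightgbm` is installed and
# X (features) / y (fitness labels) are already prepared:
# import lightgbm as lgb
# model = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=100)
```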