Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics
Authors: Josiah Kratz, Jacob Adamczyk
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that model-free deep RL is able to recover exact solutions and control cell populations even in the presence of long-range temporal dynamics. To further test our approach in more realistic settings, we demonstrate robust RL-based control strategies in environments with measurement noise and dynamic memory strength. ... Figure 3: Performance comparison of constant drug application, solution for the memoryless case, resistant fraction-based pulsing technique, and policy learned by RL. ... Figure 4: PPO and SAC fail to find a bang-bang control policy and have a lower performance than DQN, highlighting the need for discrete-action algorithms, as informed by optimal control. |
| Researcher Affiliation | Academia | Josiah C. Kratz Computational Biology Department Carnegie Mellon University Pittsburgh, PA 15213, USA EMAIL Jacob Adamczyk Department of Physics University of Massachusetts Boston IAIFI Boston, MA 02125 EMAIL |
| Pseudocode | No | The paper describes methodologies in prose and mathematical equations but does not contain a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | All code to reproduce our experimental results can be found at https://github.com/JacobHA/RL4Dosing. |
| Open Datasets | No | The paper describes a novel memory-based model for phenotypic switching and simulates cellular population dynamics based on this model. It does not mention using any specific publicly available datasets for its experiments; rather, the data is generated from the proposed dynamical system. |
| Dataset Splits | No | The paper uses a simulated environment based on a novel population model, rather than external datasets, and thus does not describe training/test/validation splits for a dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using open-source code from Stable-Baselines3 and implementations of DQN and FQF but does not provide specific version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | We tune over several hyperparameters (whose values we list in the Appendix). ... Table 2: Hyperparameters for Double DQN ... Table 3: Hyperparameters for (Noisy Net) FQF. ... The discount factor γ = 0.999, gradient steps per update (1), frames stacked (5) and buffer size (100,000) are fixed throughout. |
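The fixed settings quoted in the Experiment Setup row include a 5-frame observation stack, which is how the agent copes with the non-Markovian dynamics: stacking recent measurements gives a feed-forward Q-network a fixed-size input that encodes short-term history. The sketch below is a minimal, hypothetical illustration of such a frame stack (the class name, reset/step interface, and scalar resistant-fraction observation are assumptions for the example; the paper's actual environment code lives in the linked repository).

```python
from collections import deque

import numpy as np


class FrameStack:
    """Stack the last `n_stack` observations so a memoryless policy can
    infer temporal structure from a fixed-size input (the paper fixes
    n_stack = 5 frames). Hypothetical sketch, not the authors' code."""

    def __init__(self, n_stack=5):
        self.n_stack = n_stack
        self.frames = deque(maxlen=n_stack)

    def reset(self, obs):
        # Pad the history with copies of the initial observation.
        self.frames.clear()
        for _ in range(self.n_stack):
            self.frames.append(obs)
        return self.state()

    def step(self, obs):
        # Append the newest observation; the oldest falls off the deque.
        self.frames.append(obs)
        return self.state()

    def state(self):
        return np.concatenate(self.frames)


# Example: observations are a scalar measurement such as the resistant
# fraction of the cell population.
stack = FrameStack(n_stack=5)
s0 = stack.reset(np.array([0.1]))
s1 = stack.step(np.array([0.2]))
print(s0)  # [0.1 0.1 0.1 0.1 0.1]
print(s1)  # [0.1 0.1 0.1 0.1 0.2]
```

The stacked vector would then be the input to the Q-network trained with the Double DQN hyperparameters listed in the Appendix (γ = 0.999, buffer size 100,000, one gradient step per update).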