Learning system dynamics without forgetting
Authors: Xikun Zhang, Dongjin Song, Yushan Jiang, Yixin Chen, Dacheng Tao
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we thoroughly investigate the influence of the system sequence configuration on model performance in CDL with both the widely adopted physics systems and our newly constructed Bio-CDL, which demonstrates the advantage of MS-GODE over existing state-of-the-art techniques. |
| Researcher Affiliation | Academia | Xikun Zhang (Generative AI Lab, College of Computing and Data Science, Nanyang Technological University, Singapore 639798); Dongjin Song & Yushan Jiang (Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA); Yixin Chen (Washington University in St. Louis, St. Louis, MO, USA); Dacheng Tao (Generative AI Lab, College of Computing and Data Science, Nanyang Technological University, Singapore 639798) |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations in detail but does not include any explicitly labeled pseudocode or algorithm blocks. It explains the workflow and components in prose and equations. |
| Open Source Code | Yes | Our code is available at https://github.com/QueuQ/MS-GODE. |
| Open Datasets | No | Moreover, we construct a novel benchmark of biological dynamic systems for CDL, Bio-CDL, featuring diverse systems with disparate dynamics and significantly enriching the research field of machine learning for dynamic systems. Our new benchmark is constructed based on the EGFR receptor interaction model and Ran transportation model. But we have provided all code and instructions for generating the data, and will release the complete Bio-CDL benchmark to the public later. |
| Dataset Splits | Yes | During training, the first 60% part of the trajectory of each system is fed to the model to generate prediction for the remaining 40%. For each system sequence, 1,000 sequences are used for training, and another 1,000 sequences are used for testing. ... For each cellular system, we generate 100 iterations of simulation using VCell. Different from the physics systems, the temporal interval between two time steps of cellular system is not constant. Similar to the physics systems, the first 60% part of the observation of each system simulation is fed into the model to reconstruct the remaining 40% during training. For each system, we use 20 trajectories for training and 20 trajectories for testing. |
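The 60%/40% per-trajectory split quoted above can be sketched in plain Python. This is a minimal illustration, not code from the paper's repository; the function name and the toy trajectory are assumptions.

```python
def split_trajectory(trajectory, observed_frac=0.6):
    """Split one trajectory: the first `observed_frac` of the time steps
    is fed to the model as observations; the remainder is the target the
    model must predict (or reconstruct, for the cellular systems)."""
    cut = int(len(trajectory) * observed_frac)
    return trajectory[:cut], trajectory[cut:]

# Toy example: a 10-step trajectory of scalar states.
observed, target = split_trajectory(list(range(10)))
# observed covers the first 60% of steps, target the remaining 40%.
```

Note that this split is along the time axis of each trajectory; the train/test division (e.g., 1,000 sequences each for the physics systems, 20 trajectories each for the cellular systems) is a separate split across whole trajectories.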
| Hardware Specification | Yes | All experiments are repeated 5 times on a Nvidia Titan Xp GPU. |
| Software Dependencies | Yes | The model implementation is based on the following packages: [Python 3.6.10](https://www.python.org/), [PyTorch 1.4.0](https://pytorch.org/), [pytorch_geometric 1.4.3](https://pytorch-geometric.readthedocs.io/), torch-cluster==1.5.3, torch-scatter==2.0.4, torch-sparse==0.6.1, [torchdiffeq](https://github.com/rtqichen/torchdiffeq), [NumPy 1.16.1](https://numpy.org/) |
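The pinned environment above can be expressed as pip requirement specifiers. The helper below is a hypothetical sketch that only formats the versions quoted in the paper (torchdiffeq is listed without a pinned version, so it is omitted here):

```python
# Versions as quoted in the paper's dependency list.
PINNED = {
    "torch": "1.4.0",
    "torch-geometric": "1.4.3",
    "torch-cluster": "1.5.3",
    "torch-scatter": "2.0.4",
    "torch-sparse": "0.6.1",
    "numpy": "1.16.1",
}

def requirements_lines(pinned):
    """Render pinned packages as `name==version` pip specifiers."""
    return [f"{name}=={version}" for name, version in pinned.items()]

# e.g. write "\n".join(requirements_lines(PINNED)) to a requirements.txt
```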
| Experiment Setup | Yes | For the encoder network, the number of layers is 2, and the hidden dimension is 64. 1 head is used for the attention mechanism. The interaction network of the generator is configured as a 1-layer network, and the number of hidden dimensions is 128. Finally, the decoder network is a fully-connected layer. The model is trained for 20 epochs over each system in the given sequence. We adopt the AdamW optimizer (Loshchilov & Hutter, 2017) and set the learning rate as 0.0005. In our work, task boundaries are provided to the models during training. ... --dropout_mask 0.0 --batch-size 10 |
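The hyperparameters quoted in the setup can be collected into a single configuration record. This plain-Python sketch only transcribes the values stated above; the key names are illustrative assumptions, not flags from the authors' codebase (except `dropout_mask` and `batch_size`, which appear in the quoted command line):

```python
# Hyperparameters transcribed from the quoted experiment setup.
CONFIG = {
    "encoder_layers": 2,          # encoder network depth
    "encoder_hidden_dim": 64,     # encoder hidden dimension
    "attention_heads": 1,         # heads in the attention mechanism
    "generator_layers": 1,        # interaction network of the generator
    "generator_hidden_dim": 128,
    "decoder": "fully_connected", # decoder is a single FC layer
    "epochs_per_system": 20,      # per system in the sequence
    "optimizer": "AdamW",
    "learning_rate": 0.0005,
    "dropout_mask": 0.0,          # from the quoted command line
    "batch_size": 10,             # from the quoted command line
}
```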