Large Language-Geometry Model: When LLM meets Equivariance
Authors: Zongzhao Li, Jiacheng Cen, Bing Su, Tingyang Xu, Yu Rong, Deli Zhao, Wenbing Huang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that EquiLLM delivers significant improvements over previous methods across molecular dynamics simulation, human motion simulation, and antibody design, highlighting its promising generalizability. |
| Researcher Affiliation | Collaboration | 1Gaoling School of Artificial Intelligence, Renmin University of China 2Beijing Key Laboratory of Research on Large Models and Intelligent Governance 3Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE 4DAMO Academy, Alibaba Group, Hangzhou, China 5Hupan Lab, Hangzhou, China. |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations (e.g., equations for Equivariant Adapter in 3.2) but does not include a distinct block labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper mentions reproducing results using the 'official implementation' of a baseline method (GeoAB) but does not provide any explicit statement or link to the source code for the proposed EquiLLM framework. |
| Open Datasets | Yes | In the dynamic simulation task, to demonstrate the broad applicability of our model across varying scales, we conduct experiments on two distinct datasets: the molecular-level MD17 (Chmiela et al., 2017) dataset and the macro-level Human Motion Capture (De la Torre et al., 2009) dataset. Following the previous study MEAN (Kong et al., 2022), we selected complete antibody-antigen complexes from the SAbDab (Dunbar et al., 2014) dataset to construct the training and validation sets. ... For the test set, we selected 60 diverse complexes from the RAbD (Adolf-Bryfogle et al., 2018) dataset to evaluate the performance of different methods. |
| Dataset Splits | Yes | In order to expedite the dynamics simulations, we implement a sampling strategy based on previous research (Huang et al., 2022) to extract a subset of trajectories for the purposes of training, validation, and testing. This approach involves randomly selecting an initial point and then sampling 2T timestamps. The first T timestamps are utilized as input for the models, while the remaining T timestamps represent the future states that the models need to predict. B.1 Implementation Details on MD17 dataset: The training, validation, and testing sets consist of 500, 2000, and 2000 samples, respectively. B.2 Implementation Details on Motion Capture: For subject #35 (Walk), the dataset comprises 1100 training, 600 validation, and 600 testing trajectories, whereas subject #102 (Basketball) includes 600 training, 300 validation, and 300 testing trajectories. |
| Hardware Specification | Yes | All computational experiments, encompassing model training, validation and testing phases, are executed on a single NVIDIA A100-80G GPU. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies or their version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | To ensure a fair comparison, all hyperparameters (e.g. learning rate, number of training epochs) are kept consistent across our model and all other baselines. Detailed information can be found in Appendix B.1. Table 6: Hyper-parameters of EquiLLM and other methods. The previous length Tp denotes the length of the input sequence, the future length Tf denotes the length of the output sequence, the time lag t denotes the interval between two timestamps, the hidden size denotes the size of hidden states in all Multi-Layer Perceptrons (MLPs) within the EquiLLM framework, and the layer denotes the number of layers. B.3 Implementation Details on Antibody Design: EquiLLM maintains hyper-parameters identical to MEAN: a 64-dimensional trainable embedding for each amino acid type, 128-dimensional hidden states, 3 network layers, a batch size of 16, and 20 training epochs. The Adam optimizer is employed with an initial learning rate of 0.001, which decays by 5% per epoch. |
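The trajectory split quoted under "Dataset Splits" (pick a random start, take 2T consecutive timestamps, feed the first T to the model and predict the last T) can be sketched as follows. This is an illustrative reconstruction, not the paper's code: `sample_trajectory` is a hypothetical helper, and a stride of 1 is assumed, whereas the paper additionally uses a time lag t between timestamps.

```python
import numpy as np

def sample_trajectory(traj, T, rng=None):
    """Split one trajectory into a T-step input and a T-step target.

    Randomly selects a start index, takes 2*T consecutive timestamps,
    and returns the first T as model input and the remaining T as the
    future states the model must predict.
    """
    rng = rng if rng is not None else np.random.default_rng()
    start = rng.integers(0, len(traj) - 2 * T + 1)
    window = traj[start:start + 2 * T]
    return window[:T], window[T:]

# Toy example: a trajectory of 10 scalar states, T = 3.
x, y = sample_trajectory(np.arange(10), T=3, rng=np.random.default_rng(0))
assert len(x) == 3 and len(y) == 3      # input and target each have T steps
assert int(y[0]) == int(x[-1]) + 1      # target continues where input ends
```

With stride 1 the target window always begins at the timestamp immediately after the input window, which the final assertion checks.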
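The antibody-design optimizer setup quoted under "Experiment Setup" (Adam, initial learning rate 0.001, decaying by 5% per epoch) implies a simple multiplicative schedule. The sketch below assumes the decay is multiplicative per epoch; the paper does not state which scheduler implements it, and `lr_at_epoch` is a hypothetical helper for illustration.

```python
def lr_at_epoch(epoch, lr0=1e-3, decay=0.05):
    """Learning rate after `epoch` full epochs of 5% multiplicative decay."""
    return lr0 * (1.0 - decay) ** epoch

# Over the 20 reported training epochs the rate falls from 1e-3 at
# epoch 0 to lr0 * 0.95**19, roughly 3.77e-4, at the final epoch.
```

In PyTorch-style training code this would correspond to an exponential scheduler with a decay factor of 0.95, though the paper does not confirm that choice.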