Large Language-Geometry Model: When LLM meets Equivariance

Authors: Zongzhao Li, Jiacheng Cen, Bing Su, Tingyang Xu, Yu Rong, Deli Zhao, Wenbing Huang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that EquiLLM delivers significant improvements over previous methods across molecular dynamics simulation, human motion simulation, and antibody design, highlighting its promising generalizability.
Researcher Affiliation | Collaboration | 1. Gaoling School of Artificial Intelligence, Renmin University of China; 2. Beijing Key Laboratory of Research on Large Models and Intelligent Governance; 3. Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE; 4. DAMO Academy, Alibaba Group, Hangzhou, China; 5. Hupan Lab, Hangzhou, China.
Pseudocode | No | The paper describes the model architecture and mathematical formulations (e.g., the Equivariant Adapter equations in Section 3.2) but does not include a distinct block labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | No | The paper mentions reproducing results using the 'official implementation' of a baseline method (GeoAB) but provides no explicit statement of, or link to, source code for the proposed EquiLLM framework.
Open Datasets | Yes | In the dynamic simulation task, to demonstrate the broad applicability of our model across varying scales, we conduct experiments on two distinct datasets: the molecular-level MD17 (Chmiela et al., 2017) dataset and the macro-level Human Motion Capture (De la Torre et al., 2009) dataset. Following the previous study MEAN (Kong et al., 2022), we selected complete antibody-antigen complexes from the SAbDab (Dunbar et al., 2014) dataset to construct the training and validation sets. ... For the test set, we selected 60 diverse complexes from the RAbD (Adolf-Bryfogle et al., 2018) dataset to evaluate the performance of different methods.
Dataset Splits | Yes | To expedite the dynamics simulations, we implement a sampling strategy based on previous research (Huang et al., 2022) to extract a subset of trajectories for training, validation, and testing. This approach involves randomly selecting an initial point and then sampling 2T timestamps: the first T timestamps are used as input to the models, while the remaining T timestamps represent the future states the models must predict. B.1 Implementation Details on MD17: The training, validation, and testing sets consist of 500, 2000, and 2000 samples, respectively. B.2 Implementation Details on Motion Capture: For subject #35 (Walk), the dataset comprises 1100 training, 600 validation, and 600 testing trajectories, whereas subject #102 (Basketball) includes 600 training, 300 validation, and 300 testing trajectories.
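The trajectory-sampling scheme reported above (pick a random start, take 2T consecutive timestamps, feed the first T to the model and predict the last T) could be sketched as follows; the function and variable names are illustrative, not taken from the paper's code:

```python
import numpy as np

def sample_trajectory(positions, T, rng):
    """Sample one training example from a full trajectory.

    positions: array of shape (num_timestamps, num_nodes, 3).
    Picks a random initial point, takes 2T consecutive timestamps;
    the first T are the model input, the last T are the targets.
    """
    start = rng.integers(0, len(positions) - 2 * T + 1)
    window = positions[start : start + 2 * T]
    return window[:T], window[T:]  # (input states, future states)

# Illustrative usage on synthetic data (e.g. 13 atoms, 1000 frames)
rng = np.random.default_rng(0)
traj = rng.normal(size=(1000, 13, 3))
x_in, x_fut = sample_trajectory(traj, T=10, rng=rng)
```

In the paper's setup this sampling is repeated to build the 500/2000/2000 MD17 splits.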
Hardware Specification | Yes | All computational experiments, encompassing the model training, validation, and testing phases, are executed on a single NVIDIA A100-80G GPU.
Software Dependencies | No | The paper does not explicitly mention specific software dependencies or their version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | To ensure a fair comparison, all hyperparameters (e.g., learning rate, number of training epochs) are kept consistent across our model and all baselines. Detailed information can be found in Appendix B.1. Table 6: Hyper-parameters of EquiLLM and other methods. The previous length Tp denotes the length of the input sequence, the future length Tf denotes the length of the output sequence, the time lag t denotes the interval between two timestamps, the hidden size denotes the size of the hidden states in all Multi-Layer Perceptrons (MLPs) within the EquiLLM framework, and the layer denotes the number of layers. B.3 Implementation Details on Antibody Design: EquiLLM maintains identical hyper-parameters with MEAN: a 64-dimensional trainable embedding for each amino acid type, 128-dimensional hidden states, 3 network layers, a batch size of 16, and 20 training epochs. The Adam optimizer is employed with an initial learning rate of 0.001, which decays by 5% per epoch.
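The reported optimizer schedule (initial learning rate 0.001, decayed by 5% after each epoch over 20 epochs) amounts to a fixed multiplicative decay, equivalent to PyTorch's ExponentialLR with gamma=0.95. A minimal sketch of the resulting per-epoch learning rates, with illustrative names not taken from the paper:

```python
def lr_schedule(initial_lr=1e-3, decay=0.05, num_epochs=20):
    """Per-epoch learning rates under a fixed multiplicative decay.

    Epoch e uses initial_lr * (1 - decay) ** e, matching the reported
    setup: lr 0.001, reduced by 5% per epoch for 20 epochs.
    """
    return [initial_lr * (1.0 - decay) ** epoch for epoch in range(num_epochs)]

lrs = lr_schedule()
# epoch 0 uses 0.001; by epoch 19 the rate has fallen to about 3.77e-4
```

Hyperparameter counts for the other tasks (Tp, Tf, time lag, hidden size, layers) are listed in the paper's Table 6 and are not reproduced here.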