Geometric Hyena Networks for Large-scale Equivariant Learning
Authors: Artem Moskalev, Mangal Prakash, Junjie Xu, Tianyu Cui, Rui Liao, Tommaso Mansi
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on all-atom property prediction of large RNA molecules and full protein molecular dynamics, Geometric Hyena outperforms existing equivariant models while requiring significantly less memory and compute than equivariant self-attention. Notably, our model processes the geometric context of 30k tokens 20× faster than the equivariant transformer and allows 72× longer context within the same budget. |
| Researcher Affiliation | Industry | Artem Moskalev, Mangal Prakash, Junjie Xu, Tianyu Cui, Rui Liao, Tommaso Mansi (Johnson and Johnson Innovative Medicine) |
| Pseudocode | Yes | Code 1. Pytorch implementation of the equivariant vector long convolution. |
| Open Source Code | No | No explicit statement about open-sourcing the code or a repository link for the Geometric Hyena implementation is provided in the paper. Code 1 shows an implementation snippet, but it is not a declaration of open-source availability for the full methodology. |
| Open Datasets | Yes | These include OpenVaccine (Das et al., 2020) and Ribonanza-2k (He et al., 2024) datasets for stability and degradation prediction where our model outperforms other state-of-the-art equivariant baselines by up to 15%. Moreover, it achieves a 6% improvement on the mRNA switching-factor prediction task (Groher et al., 2018) and outperforms other equivariant methods by 9% on all-atom protein molecular dynamics prediction. |
| Dataset Splits | Yes | For each experiment, we sample 2600 sequences for training, and 200 sequences each for validation and testing. |
| Hardware Specification | Yes | We record all runtimes on NVIDIA A10G GPU with CUDA 12.2. |
| Software Dependencies | Yes | We record all runtimes on NVIDIA A10G GPU with CUDA 12.2. CUDA 12.2 is the only explicitly versioned software dependency; the PyTorch implementation in Code 1 additionally implies PyTorch. |
| Experiment Setup | Yes | We train all models for 400 epochs with a batch size of 8. We employ Adam optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.001 and cosine learning rate scheduler (Loshchilov & Hutter, 2016) with 10 epochs of linear warm-up. The weight decay is set to 0.00001. |
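The training recipe quoted in the Experiment Setup row (Adam, initial learning rate 0.001, cosine schedule with 10 epochs of linear warm-up, 400 epochs) can be sketched as a plain-Python schedule. This is a minimal sketch of the standard warm-up-plus-cosine pattern; the function name and the exact warm-up interpolation are assumptions, not taken from the paper:

```python
import math

def learning_rate(epoch, base_lr=1e-3, warmup_epochs=10, total_epochs=400):
    """Cosine LR schedule with linear warm-up (hypothetical helper;
    mirrors the setup quoted above, exact interpolation assumed)."""
    if epoch < warmup_epochs:
        # Linear warm-up: ramp from base_lr/warmup_epochs to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to ~0 over the remaining epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

In a PyTorch training loop this would typically be realized with `torch.optim.lr_scheduler.LinearLR` chained with `CosineAnnealingLR` via `SequentialLR`, alongside `Adam(params, lr=1e-3, weight_decay=1e-5)`.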
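Code 1 in the paper gives a PyTorch implementation of the equivariant vector long convolution; a minimal NumPy sketch of the underlying idea might look like the following. It applies one FFT-based circular long convolution identically to each Cartesian component, which is what makes the operation rotation-equivariant. The function name and the circular (rather than zero-padded) convolution are assumptions, not the paper's exact implementation:

```python
import numpy as np

def vector_long_conv(vectors, kernel):
    """FFT-based circular long convolution over the sequence axis.

    vectors: (N, 3) array of 3D vector features, one per token.
    kernel:  (N,) scalar filter as long as the sequence.

    The same scalar filter acts on the x, y, z components
    independently, so the operation commutes with global rotations:
    vector_long_conv(v @ R.T, k) == vector_long_conv(v, k) @ R.T
    for any rotation matrix R.
    """
    n = vectors.shape[0]
    v_hat = np.fft.rfft(vectors, n=n, axis=0)   # per-component FFT
    k_hat = np.fft.rfft(kernel, n=n)[:, None]   # broadcast over x, y, z
    return np.fft.irfft(v_hat * k_hat, n=n, axis=0)
```

Because the convolution is linear and touches only the sequence axis, rotating the inputs and then convolving gives the same result as convolving and then rotating, which can be checked numerically with a random rotation matrix.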