Geometric Hyena Networks for Large-scale Equivariant Learning

Authors: Artem Moskalev, Mangal Prakash, Junjie Xu, Tianyu Cui, Rui Liao, Tommaso Mansi

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluated on all-atom property prediction of large RNA molecules and full protein molecular dynamics, Geometric Hyena outperforms existing equivariant models while requiring significantly less memory and compute than equivariant self-attention. Notably, our model processes the geometric context of 30k tokens 20× faster than the equivariant transformer and allows 72× longer context within the same budget."
Researcher Affiliation | Industry | Artem Moskalev (1), Mangal Prakash (1), Junjie Xu (1,2,3), Tianyu Cui (1), Rui Liao (1), Tommaso Mansi (1); (1) Johnson and Johnson Innovative Medicine
Pseudocode | Yes | "Code 1. PyTorch implementation of the equivariant vector long convolution."
Open Source Code | No | No explicit statement about open-sourcing the code or a repository link for the Geometric Hyena implementation is provided in the paper. Code 1 shows an implementation snippet, but it is not a declaration of open-source availability for the full methodology.
Open Datasets | Yes | "These include Open Vaccine (Das et al., 2020) and Ribonanza-2k (He et al., 2024) datasets for stability and degradation prediction where our model outperforms other state-of-the-art equivariant baselines by up to 15%. Moreover, it achieves a 6% improvement on the mRNA switching-factor prediction task (Groher et al., 2018) and outperforms other equivariant methods by 9% on all-atom protein molecular dynamics prediction."
Dataset Splits | Yes | "For each experiment, we sample 2600 sequences for training, and 200 sequences each for validation and testing."
Hardware Specification | Yes | "We record all runtimes on NVIDIA A10G GPU with CUDA 12.2."
Software Dependencies | Yes | "We record all runtimes on NVIDIA A10G GPU with CUDA 12.2."
Experiment Setup | Yes | "We train all models for 400 epochs with a batch size of 8. We employ Adam optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.001 and cosine learning rate scheduler (Loshchilov & Hutter, 2016) with 10 epochs of linear warm-up. The weight decay is set to 0.00001."
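The Pseudocode row cites Code 1, a PyTorch implementation of the equivariant vector long convolution. The paper's exact code is not reproduced here, but the core mechanism, an FFT-based global convolution in which each output vector is a scalar-weighted sum of input vectors (and therefore commutes with rotations), can be sketched in plain NumPy. All names and shapes below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def vector_long_conv(x, h):
    """Global (long) causal convolution of a sequence of 3D vectors
    with a scalar filter, computed via FFT in O(L log L).

    x : (L, 3) array of vector features
    h : (L,)   array of scalar filter taps
    Returns (L, 3): y[i] = sum_{j<=i} h[i-j] * x[j]
    """
    L = x.shape[0]
    n = 2 * L  # zero-pad so the circular FFT conv equals a linear conv
    H = np.fft.rfft(h, n=n)             # filter spectrum, shape (n//2+1,)
    X = np.fft.rfft(x, n=n, axis=0)     # per-coordinate spectra, (n//2+1, 3)
    y = np.fft.irfft(X * H[:, None], n=n, axis=0)
    return y[:L]
```

Because every output vector is a linear combination of input vectors with scalar coefficients, rotating all inputs by the same matrix R rotates the outputs identically, which is what makes this long convolution rotation-equivariant.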