Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries
Authors: Zongzhao Li, Jiacheng Cen, Wenbing Huang, Taifeng Wang, Le Song
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on our new dataset rRNAsolo and the existing dataset ARES (Townshend et al., 2021) show that our model achieves better performance across all metrics than SOTA methods, establishing its superiority. |
| Researcher Affiliation | Collaboration | 1 Gaoling School of Artificial Intelligence, Renmin University of China; 2 Beijing Key Laboratory of Big Data Management and Analysis Methods; 3 BioMap Research |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual descriptions of processes (e.g., atom-level, subunit-level, nucleotide-level message passing) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions: "We use the default configurations in the corresponding source codes for all baselines." and "The results reported in our paper for other baselines are all obtained by retraining using publicly available code on rRNAsolo dataset." This refers to the code of baseline methods, not the code for the proposed EquiRNA model itself. |
| Open Datasets | Yes | We introduce a new dataset named rRNAsolo for assessing size generalization in RNA structure evaluation. It covers a broader range of RNA sizes, includes more RNA types, and features more recent RNA structures when compared to existing datasets. ... We devise a novel dataset from the RNAsolo database (Adamczyk et al., 2022), a publicly available online repository that comprises a diverse array of biomolecular information concerning RNA. We call this new dataset as rRNAsolo. ... Additionally, we test our approach on ... an existing dataset ARES (Townshend et al., 2021). |
| Dataset Splits | Yes | rRNAsolo: In our meticulously designed dataset, we employ candidate structures of RNAs with 50-100 nt as training set and candidate structures with 100-200 nt as validation and test sets. The dataset rRNAsolo consists of 80k/6k/6k candidate structures generated from 200/15/15 RNAs for training, validation, and test sets, respectively. ... Given that RNAs with over 200 nucleotides require considerable time for candidate conformation generation, our validation and test sets primarily focus on RNAs with 100-200 nucleotides, using a training set composed of RNAs with 50-100 nucleotides. |
| Hardware Specification | Yes | Both our approach and all other baseline methods are trained and tested on a single NVIDIA A100-80G GPU. |
| Software Dependencies | No | The paper mentions tools like "PyMOL", "RNAcentral", "BLAST", "Infernal", and "MAFFT" in the context of data generation and analysis, but does not specify version numbers for these or for any core deep learning libraries used for model implementation. |
| Experiment Setup | Yes | Table 8 presents the hyper-parameters of EquiRNA used in two experiments of this paper. Additionally, the results reported in our paper for other baselines are all obtained by retraining using publicly available code on rRNAsolo dataset. Each layer here consists of the Eq. (1), Eq. (2), and Eq. (3). ... Hyperparameters (rRNAsolo dataset / ARES dataset): learning rate 1e-4 / 1e-4; epochs 20 / 20; nucleotide template size 16 / 16; atom template size 16 / 16; hidden size 128 / 128; n layers 3 / 3; K 16 / 16; M 26 / 26. |
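
Since no official code is reported, the following is only a minimal sketch of how the quoted hyperparameters and the size-based split could be written down when attempting a reproduction. The class name `EquiRNAHyperparams`, its field names, and the function `assign_split` are hypothetical and do not come from the paper. The quoted ranges overlap at exactly 100 nt, so assigning 100 nt to the training side is an assumption, as is labelling everything outside 50-200 nt as excluded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EquiRNAHyperparams:
    """Values quoted from Table 8 (identical for rRNAsolo and ARES).

    Field names are illustrative, not taken from any released code.
    """
    learning_rate: float = 1e-4
    epochs: int = 20
    nucleotide_template_size: int = 16
    atom_template_size: int = 16
    hidden_size: int = 128
    n_layers: int = 3
    K: int = 16
    M: int = 26


def assign_split(num_nucleotides: int) -> str:
    """Map an RNA length to a split, following the quoted size ranges.

    Assumption: exactly 100 nt counts as training, because the quoted
    ranges (50-100 nt and 100-200 nt) overlap at that boundary.
    Length alone cannot separate validation from test, so both share
    one label here.
    """
    if 50 <= num_nucleotides <= 100:
        return "train"
    if 100 < num_nucleotides <= 200:
        return "val_or_test"
    return "excluded"


if __name__ == "__main__":
    config = EquiRNAHyperparams()
    print(config)
    for length in (75, 150, 300):
        print(length, assign_split(length))
```

Training on short RNAs and evaluating only on longer ones is what makes the benchmark a test of size generalization. The reported counts (80k/6k/6k candidates from 200/15/15 RNAs) also work out to roughly 400 candidate structures per RNA in every split.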