reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

CryoFM: A Flow-based Foundation Model for Cryo-EM Densities

Authors: Yi Zhou, Yilai Li, Jing Yuan, Quanquan Gu

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 EXPERIMENTS
Researcher Affiliation	Industry	Yi Zhou, Yilai Li, Jing Yuan, Quanquan Gu ByteDance Research EMAIL
Pseudocode	Yes	Algorithm 1 Flow Posterior Sampling
Open Source Code	No	No explicit statement about releasing the code for the methodology described in this paper is provided. The paper mentions using publicly available code from Stable Diffusion and Hugging Face's diffusers for related components, but not their own implementation of CryoFM.
Open Datasets	Yes	Our training dataset consists of deposited sharpened density maps from the EMDB (wwPDB Consortium, 2023)... The EMDB IDs of the training and testing data used in this paper have been uploaded to https://figshare.com/s/9ef2614108391c04d910.
Dataset Splits	Yes	This curation resulted in a total of 3479 density maps, where 32 density maps were selected as test set and excluded from training.
Hardware Specification	Yes	Training Hardware 8 A100
Software Dependencies	No	The paper mentions the use of 'Fairseq Adam (Ott et al., 2019) optimizer' but does not provide a specific version number for the Fairseq library itself or any other key software dependencies with their versions.
Experiment Setup	Yes	In all experiments, we employed the Fairseq Adam (Ott et al., 2019) optimizer with a default learning rate of 1e-4, betas set to (0.9, 0.98), and a weight decay of 0.01. A linear warm-up strategy was applied during the first 2000 steps of training.
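
The hyperparameters quoted in the Experiment Setup row can be sketched as a small schedule function. This is an illustrative sketch only, not the authors' implementation: the name `warmup_lr`, the starting rate of 0, and the constant rate after warm-up are assumptions, since the quoted text states only that a linear warm-up was applied during the first 2000 steps.

```python
# Optimizer hyperparameters as quoted in the Experiment Setup row.
OPTIMIZER_CONFIG = {"lr": 1e-4, "betas": (0.9, 0.98), "weight_decay": 0.01}


def warmup_lr(step: int, base_lr: float = 1e-4, warmup_steps: int = 2000) -> float:
    """Return the learning rate at a given training step.

    Assumption: the rate rises linearly from 0 to base_lr over warmup_steps,
    then stays constant. The source does not describe the post-warm-up schedule.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step >= warmup_steps:
        return base_lr
    return base_lr * step / warmup_steps


if __name__ == "__main__":
    # Print the rate at a few steps before, at, and after the end of warm-up.
    for step in (0, 500, 1000, 2000, 5000):
        print(step, warmup_lr(step))
```

In a PyTorch setup, a function like this could be turned into a multiplier and passed to a scheduler such as `torch.optim.lr_scheduler.LambdaLR`; the Fairseq Adam optimizer named in the paper is not reproduced here.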