CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction

Authors: Yunfei Teng, Yuxuan Ren, Kai Chen, Xi Chen, Zhaoming Chen, Qiwei Ye

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENT: In this section, we compare our method with IsoNet, the state-of-the-art missing-wedge correction technique, across various experiments. First, we validate our hypothesis using simple geometric shapes, as described in Section 5.1. Next, we assess our algorithm on a simulated dataset and benchmark it against other approaches in Section 5.2. Finally, we evaluate our method on real-world examples to test its robustness, as discussed in Section 5.3. Additional implementation and data processing details are provided in the Appendix, where we also present results from the latest simultaneous missing-wedge correction and denoising method, DeepDeWedge (Wiedemann & Heckel, 2024). ... Table 1: Quantitative evaluation of image quality for tomography reconstructions using different methods, comparing PSNR and SSIM metrics (higher values indicate better performance for both metrics) on the sphere, prism, and Vipp1 assembly datasets.
Researcher Affiliation | Academia | Beijing Academy of Artificial Intelligence (BAAI)
Pseudocode | Yes | Algorithm 1: Training prediction model.
Require: tomogram dataset Y; noise levels σ², σ_g², σ_h² > 0; estimated noise variance σ_n²; energy model E_ϕ: ℝ^d → ℝ₊; prediction model g_θ: ℝ^d → ℝ^d; learning rate η; energy penalty term λ ≥ 0.
repeat
    Randomly generate noise ε ~ N(0, σ²I), ε_g ~ N(0, σ_g²I), ε_h ~ N(0, σ_h²I)
    Set f(ϕ, θ) = E_ϕ(T_M ∘ R ∘ g_θ(y + ε_g)) + (1/2σ_n²) ‖T_M ∘ g_θ(y + ε_g) − y‖²₂
    Update θ ← θ − η ∇_θ ‖R⁻¹ ∘ g_θ ∘ T_M ∘ R(g_θ(y) + ε_h) − [g_θ(y)]‖²₂
    Update ϕ ← ϕ − ηλ ∇_ϕ [E_ϕ(y + ε) − f(ϕ, θ)]
    Update θ ← ηλ θ + (1 − ηλ) θ − ηλ ∇_θ f(ϕ, θ)
    Reduce the energy penalty term λ
until convergence
return g_θ
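The structure of this training loop can be sketched as a runnable toy. Everything in the sketch is an illustrative assumption, not the paper's implementation: the prediction model g_θ is reduced to a scalar gain, the energy model E_ϕ to a weighted mean square, T_M to a binary 1-D mask, and the rotation R to the identity (so the rotation-consistency update of θ is omitted); the multiplicative λ decay is likewise assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D stand-ins for the paper's operators (all assumptions):
#   g_theta(y) = theta * y        -- prediction model as a scalar gain
#   E_phi(x)   = phi * mean(x^2)  -- energy model as a weighted mean square
#   T_M        = binary mask      -- crude stand-in for the missing wedge
y = np.ones(32)                      # "observed" patch
mask = np.ones(32); mask[24:] = 0.0  # zero a band to mimic the wedge

def f_grad_theta(theta, phi, yn, sigma_n):
    """Analytic theta-gradient of the toy objective
    f = phi*mean((mask*pred)^2) + mean((mask*pred - y)^2)/(2*sigma_n^2)."""
    pred = theta * yn
    mp = mask * pred
    grad = (2.0 * phi * np.mean(mp * mask * yn)
            + np.mean((mp - y) * mask * yn) / sigma_n ** 2)
    return grad, np.mean(mp ** 2)

theta, phi, lam = 0.5, 1.0, 1.0
eta, sigma, sigma_g, sigma_n = 0.01, 0.05, 0.05, 0.1
for _ in range(200):
    eps = rng.normal(0.0, sigma, y.shape)
    eps_g = rng.normal(0.0, sigma_g, y.shape)
    grad, e_pred = f_grad_theta(theta, phi, y + eps_g, sigma_n)
    theta -= eta * lam * grad                            # theta step on f
    # phi step on E_phi(y + eps) - f(phi, theta); only the energy terms
    # depend on phi, so the phi-gradient is the difference of mean squares
    phi -= eta * lam * (np.mean((y + eps) ** 2) - e_pred)
    lam *= 0.99                                          # assumed lambda decay
print(round(theta, 3))  # settles close to 1.0 for this toy
```

With the data-fidelity weight 1/σ_n² dominating the energy penalty, the toy gain converges near the fidelity optimum, mirroring how λ balances the energy term against reconstruction fidelity in the algorithm.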
Open Source Code | No | The paper does not provide an explicit statement of code release or a link to a code repository.
Open Datasets | Yes | Following the approach of IsoNet, we first evaluated our performance on the publicly available atomic model apoferritin (PDB: 6Z6U) (Yip et al., 2020). Additionally, we selected the recently published electron microscopy dataset of C13 Vipp1 stacked rings (EMDB: 18424) (Junglas et al., 2024). ... We collected all seven tilt series from the EMPIAR-10045 dataset and applied the same preprocessing steps as IsoNet, detailed in Appendix A.8.5. ... Raw tilt series for the HIV capsid is downloaded from the EMPIAR database.
Dataset Splits | Yes | For training, the volume is split into ten 96×96×96-pixel subtomograms with randomly chosen origins. These subtomograms are then randomly cropped to 64×64×64 pixels before being input into the models. ... For training, a tomogram is split into seventy 80×80×80-pixel subtomograms, resulting in a total of 490 subtomograms. These are randomly cropped to 64×64×64 pixels before being fed into the models. ... During training, each tomogram is split into one hundred 96×96×96-pixel subtomograms, resulting in 300 subtomograms. These are randomly cropped to 64×64×64 pixels before being fed into the models.
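The split-then-crop pipeline described in these quotes can be sketched in a few lines of NumPy. The function names and the toy 200³ volume are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_subtomograms(volume, n, size):
    """Extract n cubic subtomograms of `size` voxels per axis at random origins."""
    subs = []
    for _ in range(n):
        o = [int(rng.integers(0, s - size + 1)) for s in volume.shape]
        subs.append(volume[o[0]:o[0]+size, o[1]:o[1]+size, o[2]:o[2]+size])
    return np.stack(subs)

def random_crop(sub, crop):
    """Randomly crop a subtomogram down to `crop` voxels per axis."""
    o = [int(rng.integers(0, s - crop + 1)) for s in sub.shape]
    return sub[o[0]:o[0]+crop, o[1]:o[1]+crop, o[2]:o[2]+crop]

# Example: ten 96^3 subtomograms from a toy volume, each cropped to 64^3
vol = rng.normal(size=(200, 200, 200)).astype(np.float32)
subs = sample_subtomograms(vol, n=10, size=96)
batch = np.stack([random_crop(s, 64) for s in subs])
print(subs.shape, batch.shape)  # (10, 96, 96, 96) (10, 64, 64, 64)
```

Cropping larger subtomograms down to 64³ at a random offset each time acts as cheap data augmentation: the model never sees the same 64³ window twice in the same position.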
Hardware Specification | Yes | For instance, with an NVIDIA V100, CryoGEN requires only two hours, whereas IsoNet takes around 20 hours on the Section 5.3 experiment.
Software Dependencies | No | The paper mentions software tools like ChimeraX and AreTomo2, but does not provide specific version numbers for these or any other libraries or programming languages used.
Experiment Setup | Yes | CryoGEN is trained with the Adam optimizer (Kingma & Ba, 2015) with batch size one for simulated shapes and protein subtomograms and batch size four for real-world examples. The learning rate is set to 0.0004 with a linear warm-up phase over the initial one-tenth of training steps, followed by a linear decay schedule thereafter. Unlike IsoNet, which progressively increases the noise scale, we apply random noise levels across all training steps. Specifically, a random number is sampled from a uniform distribution within the range (0, 1] and multiplied by the set noise scale at each step. Additionally, the penalty term λ is kept constant during the first epoch and then decays linearly throughout the subsequent epochs.
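The schedule quoted above (linear warm-up over the first tenth of steps, then linear decay) and the per-step noise-level sampling can be sketched as below. Decaying the learning rate to exactly zero at the final step is an assumption, since the quote does not state the decay endpoint.

```python
import numpy as np

rng = np.random.default_rng(0)

def lr_at(step, total_steps, base_lr=4e-4):
    """Linear warm-up over the first tenth of training, then linear decay
    (assumed here to reach zero at the final step)."""
    warmup = max(1, total_steps // 10)
    if step < warmup:
        return base_lr * (step + 1) / warmup
    return base_lr * (total_steps - step) / (total_steps - warmup)

def sample_noise_level(noise_scale):
    """Per-step noise level: u ~ U(0, 1] times the configured noise scale."""
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
    return u * noise_scale
```

Sampling a fresh uniform multiplier each step exposes the model to the full range of noise levels throughout training, rather than following IsoNet's progressive noise-scale ramp.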