Feedforward Few-shot Species Range Estimation

Authors: Christian Lange, Max Hamilton, Elijah Cole, Alexander Shepard, Samuel Heinrich, Angela Zhu, Subhransu Maji, Grant Van Horn, Oisin Mac Aodha

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach on two challenging benchmarks, where we obtain state-of-the-art range estimation performance, in a fraction of the compute time, compared to recent alternative approaches."
Researcher Affiliation | Collaboration | "1University of Edinburgh, 2UMass Amherst, 3GenBio AI, 4iNaturalist, 5Cornell. Correspondence to: Christian Lange <EMAIL>."
Pseudocode | No | The paper describes methods and procedures in narrative text, often accompanied by mathematical formulations (e.g., the L_AN-full loss in Section 3.1 and L_AN-full-b in Section 3.2), but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code for FS-SINR is available at: https://github.com/Chris-lange/fs-sinr
Open Datasets | Yes | "Data. We train FS-SINR on the presence-only dataset from Cole et al. (2023), which comprises 35.5 million citizen-science records each annotated with latitude, longitude, and species label for 47,375 diverse species including plants, fungi, and animals from the iNaturalist platform (iNaturalist, 2025)."
Dataset Splits | Yes | "Importantly, we hold out any species from the union of these two datasets from the training set so that species from the evaluation set are not observed during training. As a result, by default, FS-SINR is trained on data from 44,422 species."
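The hold-out described above (excluding every evaluation species from training) can be sketched in a few lines. This is a minimal illustration with hypothetical species identifiers, not the paper's actual data-loading code:

```python
# Hypothetical sketch of the species hold-out: any species appearing in either
# evaluation benchmark is removed from the training pool before training.
# All identifiers below are illustrative placeholders.
train_species = {"sp_a", "sp_b", "sp_c", "sp_d", "sp_e"}
eval_benchmark_1 = {"sp_b"}
eval_benchmark_2 = {"sp_d", "sp_f"}

# Union of species across both evaluation datasets.
held_out = eval_benchmark_1 | eval_benchmark_2

# Exclude held-out species so none are observed during training.
train_species = train_species - held_out

print(sorted(train_species))  # → ['sp_a', 'sp_c', 'sp_e']
```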
Hardware Specification | Yes | "Training takes approximately ten hours on a single NVIDIA A6000 GPU, requiring approximately six gigabytes of RAM."
Software Dependencies | No | The paper mentions software components such as PyTorch (Paszke et al., 2019), the Adam optimizer (Kingma & Ba, 2015), GritLM (Muennighoff et al., 2025), EVA-02 ViT (Fang et al., 2024), and scikit-learn (Pedregosa et al., 2011). However, it only provides citations and publication years for these, not the specific version numbers needed for replication.
Experiment Setup | Yes | "For all training we use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0005, and an exponential learning rate scheduler with a learning rate decay of 0.98 per epoch, and we use a batch size of 2048."
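The reported training configuration maps directly onto standard PyTorch components. Below is a minimal sketch under the stated hyperparameters; the model is a stand-in placeholder and the epoch count is illustrative, since neither the FS-SINR architecture nor the number of epochs is specified in the quoted text:

```python
import torch

# Placeholder model; the actual FS-SINR architecture is not reproduced here.
model = torch.nn.Linear(256, 47375)

# Adam optimizer with the reported learning rate of 0.0005.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)

# Exponential decay of 0.98 per epoch, as reported.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.98)

batch_size = 2048  # reported batch size

num_epochs = 3  # illustrative; the paper does not state the epoch count here
for epoch in range(num_epochs):
    # ... one training epoch over batches of size 2048 would run here ...
    scheduler.step()  # decay the learning rate once per epoch

# After 3 epochs the learning rate is 0.0005 * 0.98**3 ≈ 0.00047060.
print(optimizer.param_groups[0]["lr"])
```

With this scheduler the learning rate is multiplied by 0.98 after every epoch, matching the "decay of 0.98 per epoch" in the quoted setup.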