Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scaling Channel-Adaptive Self-Supervised Learning

Authors: Alice V. De Lorenci, Seung Eun Yi, Théo Moutakanni, Piotr Bojanowski, Camille Couprie, Juan C. Caicedo, Wolfgang Maximilian Anton Pernice

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We here conduct a first, large-scale comparative study into the scaling properties of channel-adaptive methods, and compare their performance across uniform model architectures, learning objectives, and rigorous benchmarks. We validate this result along an extensive set of experiments on various datasets from cell microscopy to geospatial imagery. Our DINO BoC approach sets a new state-of-the-art across challenging benchmarks, including generalization to out-of-distribution tasks and unseen channel combinations.
Researcher Affiliation | Collaboration | 1 University of São Paulo, Brazil (work done during an internship at FAIR, Meta); 2 Meta FAIR, Paris, France; 3 University of Wisconsin-Madison, USA; 4 Columbia University, New York, USA
Pseudocode | No | The paper describes methods like the Channel-Adaptive SSL baseline (Section 3.1), Channel-Agnostic SSL (Section 3.2), and the Channel-Adaptive Hierarchical Attention model (Section 3.3) in prose and mathematical notation, and illustrates strategies in Figures 3 and 4. However, it does not contain a clearly labeled pseudocode block or an algorithm section with structured steps.
Open Source Code | Yes | We open-source code and model weights for a new general-purpose feature extractor for fluorescent microscopy. The code is available at https://github.com/facebookresearch/dinov2/blob/main/docs/README_CHANNEL_ADAPTIVE_DINO.md.
Open Datasets | Yes | We leverage multiple microscopy datasets with varying numbers of channels. In particular, we use the Human Protein Atlas, WTC-11, JUMP-CP, and Cyclops datasets for evaluation tasks. Additionally, we employ the CHAMMI benchmark, a standardized evaluation framework for channel-adaptive models, as well as the Meter-ML dataset, which contains images acquired by multiple sensors.
Dataset Splits | Yes | The images from each data source present in the CHAMMI dataset are split into one training set and several test sets, designed for specific tasks. For HPA, we used the same train/val splits as Doron et al. (2023). For WTC, we created uniformly distributed 80%/10%/10% train/val/test splits.
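The uniform 80%/10%/10% split described for WTC can be sketched as follows; the helper name and seed are illustrative, not taken from the paper's released code:

```python
import random

def train_val_test_split(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items uniformly at random, then cut into train/val/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder absorbs rounding
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Letting the test set absorb the rounding remainder keeps the three subsets disjoint and exhaustive for any dataset size.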
Hardware Specification | No | The paper mentions training on '16 nodes, or 4 nodes for the DINO BoC model' and a 'batch size per GPU'. However, it does not provide specific details about the GPU models (e.g., NVIDIA A100, Tesla V100) or CPU processors used.
Software Dependencies | No | The paper mentions using 'DINOv2' and 'ViT-Large models' with specific parameters, but it does not specify software dependencies with version numbers such as Python, PyTorch, or CUDA versions.
Experiment Setup | Yes | Unless specified otherwise, we trained ViT-Large models with the default patch size of 16 and the default parameters of DINOv2, except for a drop path rate of 0.1, a teacher momentum of 0.996, a learning rate of 5.0 × 10^-4, and 20 warm-up epochs. The batch size was 1024 for all models, and the batch size per GPU was set to 8, except for the DINO BoC model, for which a per-GPU batch size of 32 fits in memory. We used the AdamW optimizer and a one-cycle cosine scheduler.
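The schedule quoted above (20 warm-up epochs ramping to a base rate of 5.0 × 10^-4, then cosine decay) can be sketched as below. The total epoch count and final learning rate are assumptions for illustration; the excerpt does not state them.

```python
import math

def lr_at_epoch(epoch, base_lr=5.0e-4, warmup_epochs=20,
                total_epochs=100, final_lr=1.0e-6):
    """One-cycle cosine schedule: linear warm-up, then cosine decay.

    base_lr and warmup_epochs follow the reported setup; total_epochs
    and final_lr are illustrative assumptions.
    """
    if epoch < warmup_epochs:
        # Linear ramp from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr toward final_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return final_lr + 0.5 * (base_lr - final_lr) * (1 + math.cos(math.pi * progress))

print(lr_at_epoch(19))  # end of warm-up: 0.0005
```

In practice such a schedule would be handed to an optimizer step callback (e.g., updating each parameter group's learning rate per epoch); the closed-form function above just makes the shape of the curve explicit.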