Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scaling Channel-Adaptive Self-Supervised Learning

Authors: Alice V. De Lorenci, Seung Eun Yi, Théo Moutakanni, Piotr Bojanowski, Camille Couprie, Juan C. Caicedo, Wolfgang Maximilian Anton Pernice

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We here conduct a first, large-scale comparative study into the scaling properties of channel-adaptive methods, and compare their performance across uniform model architectures, learning objectives, and rigorous benchmarks. We validate this result along an extensive set of experiments on various datasets from cell microscopy to geospatial imagery. Our DINO BoC approach sets a new state-of-the-art across challenging benchmarks, including generalization to out-of-distribution tasks and unseen channel combinations.
Researcher Affiliation | Collaboration | 1 University of São Paulo, Brazil (work done during an internship at FAIR, Meta); 2 Meta FAIR, Paris, France; 3 University of Wisconsin-Madison, USA; 4 Columbia University, New York, USA
Pseudocode | No | The paper describes methods like the Channel-Adaptive SSL baseline (Section 3.1), Channel-Agnostic SSL (Section 3.2), and the Channel-Adaptive Hierarchical Attention model (Section 3.3) in prose and mathematical notation, and illustrates strategies in Figures 3 and 4. However, it does not contain a clearly labeled pseudocode block or an algorithm section with structured steps.
Open Source Code | Yes | We open-source code and model weights for a new general-purpose feature extractor for fluorescent microscopy. The code is available at https://github.com/facebookresearch/dinov2/blob/main/docs/README_CHANNEL_ADAPTIVE_DINO.md.
Open Datasets | Yes | We leverage multiple microscopy datasets with varying numbers of channels. In particular, we use the Human Protein Atlas, WTC-11, JUMP-CP, and Cyclops datasets for evaluation tasks. Additionally, we employ the CHAMMI benchmark, a standardized evaluation framework for channel-adaptive models, as well as the Meter-ML dataset, which contains images acquired by multiple sensors.
Dataset Splits | Yes | The images from each data source present in the CHAMMI dataset are split into one training set and several test sets, designed for specific tasks. For HPA, we used the same train/val splits as Doron et al. (2023). For WTC, we created uniformly distributed 80%/10%/10% train/val/test splits.
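The uniform 80%/10%/10% split described for WTC can be sketched as follows; the helper name and seed are illustrative, not taken from the paper's released code:

```python
import random

def train_val_test_split(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items uniformly at random, then cut into train/val/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder absorbs rounding
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Letting the test set absorb the rounding remainder keeps the three subsets disjoint and exhaustive for any dataset size.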
Hardware Specification | No | The paper mentions training on '16 nodes, or 4 nodes for the DINO BoC model' and a 'batch size per GPU'. However, it does not provide specific details about the GPU models (e.g., NVIDIA A100, Tesla V100) or CPU processors used.
Software Dependencies | No | The paper mentions using 'DINOv2' and 'ViT-Large models' with specific parameters, but it does not specify software dependencies with version numbers such as Python, PyTorch, or CUDA versions.
Experiment Setup | Yes | Unless specified otherwise, we trained ViT-Large models with the default patch size of 16 and the default parameters of DINOv2, except for a drop path rate of 0.1, a teacher momentum of 0.996, a learning rate of 5.0 × 10^-4, and 20 warm-up epochs. The batch size was 1024 for all models, and the batch size per GPU was set to 8, except for the DINO BoC model, for which a per-GPU batch size of 32 fits in memory. We used the AdamW optimizer and a one-cycle cosine scheduler.
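The schedule quoted above (20 warm-up epochs ramping to a base rate of 5.0 × 10^-4, then cosine decay) can be sketched as below. The total epoch count and final learning rate are assumptions for illustration; the excerpt does not state them.

```python
import math

def lr_at_epoch(epoch, base_lr=5.0e-4, warmup_epochs=20,
                total_epochs=100, final_lr=1.0e-6):
    """One-cycle cosine schedule: linear warm-up, then cosine decay.

    base_lr and warmup_epochs follow the reported setup; total_epochs
    and final_lr are illustrative assumptions.
    """
    if epoch < warmup_epochs:
        # Linear ramp from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr toward final_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return final_lr + 0.5 * (base_lr - final_lr) * (1 + math.cos(math.pi * progress))

print(lr_at_epoch(19))  # end of warm-up: 0.0005
```

In practice such a schedule would be handed to an optimizer step callback (e.g., updating each parameter group's learning rate per epoch); the closed-form function above just makes the shape of the curve explicit.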