The Double-Ellipsoid Geometry of CLIP

Authors: Meir Yossef Levi, Guy Gilboa

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our statistical analysis and many experimental results are based on the MS-COCO (Lin et al., 2014) validation set, a common standard image-text dataset. In Fig. 2, normalized histograms are shown for features 93, 134 and 494 of the CLIP latent vector. To empirically analyze the uniformity and alignment terms in Eq. 8 alongside the overall loss in Eq. 7, we use the MS-COCO validation set. The results show that the loss for correctly classified samples decreases monotonically with the shift toward the origin.
Researcher Affiliation | Academia | Viterbi Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology, Haifa, Israel. Correspondence to: Meir Yossef Levi <EMAIL>, Guy Gilboa <EMAIL>.
Pseudocode | No | The paper describes methods and analyses using mathematical equations and prose, but it does not contain any explicitly labeled pseudocode blocks or algorithms in a structured, code-like format.
Open Source Code | No | The paper references existing frameworks and methods (e.g., "The unCLIP framework (Ramesh et al., 2022)", "text inversion (Han et al., 2024; Gal et al., 2022; Mokady et al., 2023)"), but it does not contain any statement from the authors about releasing their own source code for the methodology described in this paper, nor does it provide a link to a code repository.
Open Datasets | Yes | Our statistical analysis and many experimental results are based on the MS-COCO (Lin et al., 2014) validation set, a common standard image-text dataset. We provide additional visualizations of high- and low-conformity images across various datasets. Figure 19 illustrates examples of sketches from ImageNet-R, while Figure 20 showcases examples from ImageNet-A.
Dataset Splits | Yes | Our statistical analysis and many experimental results are based on the MS-COCO (Lin et al., 2014) validation set, a common standard image-text dataset. We treat the entire validation set (5k samples) as a single batch.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | No | The paper discusses the normalized temperature-scaled cross-entropy loss (NT-Xent) used in CLIP's training and analyzes its behavior, including varying a parameter 'alpha' for analytical purposes. However, it does not specify concrete hyperparameters such as learning rates, batch sizes, number of epochs, or other system-level training settings for its own experiments or for reproducing the analyzed CLIP model.
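For context on the loss discussed above: the NT-Xent (InfoNCE-style) objective used in CLIP-style training treats matched image-text pairs in a batch as positives and all other pairings as negatives. The following is a minimal NumPy sketch of the standard symmetric form, not the authors' code; the function name, temperature default, and array shapes are illustrative assumptions.

```python
import numpy as np

def nt_xent_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric NT-Xent loss over a batch of paired embeddings.

    image_feats, text_feats: (N, d) arrays; row i of each is a positive pair.
    Hypothetical sketch of the standard CLIP-style contrastive objective.
    """
    # L2-normalize so dot products become cosine similarities
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)

    logits = img @ txt.T / temperature   # (N, N); diagonal holds positives
    labels = np.arange(logits.shape[0])

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits, labels)
                  + cross_entropy(logits.T, labels))
```

With orthonormal, perfectly matched features (e.g. `np.eye(4)` for both modalities) the loss is near zero, while mismatching the pairs drives it up, which is the contrastive behavior the paper's analysis builds on.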