reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Enhance Vision-Language Alignment with Noise

Authors: Sida Huang, Hongyuan Zhang, Xuelong Li

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The evaluation across 11 datasets demonstrates its effectiveness. 4 Experiments
Researcher Affiliation	Collaboration	(1) School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, P. R. China; (2) Institute of Artificial Intelligence (TeleAI), China Telecom, P. R. China; (3) The University of Hong Kong
Pseudocode	No	The paper describes the methodology in prose and mathematical formulations but does not contain a distinct pseudocode or algorithm block.
Open Source Code	Yes	Code: https://github.com/hyzhang98/PiNI
Open Datasets	Yes	To evaluate the performance of PiNI, 11 datasets covering a wide range of visual concepts are selected. They include two generic object datasets, ImageNet (Deng et al. 2009) and Caltech101 (Fei-Fei, Fergus, and Perona 2004); five fine-grained datasets, Oxford Pets (Parkhi et al. 2012), Stanford Cars (Krause et al. 2013), Flowers102 (Nilsback and Zisserman 2008), Food101 (Bossard, Guillaumin, and Van Gool 2014) and FGVCAircraft (Maji et al. 2013), which contain fine-grained categories of pets, cars, flowers, food and aircraft, respectively. The other datasets are scene recognition dataset SUN397 (Xiao et al. 2010), action recognition dataset UCF101 (Soomro, Zamir, and Shah 2012), describable textures dataset DTD (Cimpoi et al. 2014) and EuroSAT (Helber et al. 2019) which contains satellite images.
Dataset Splits	Yes	In the few-shot learning experiments, the train dataset is randomly sampled with 1, 2, 4, 8, and 16 shots per category. The model is tested on all data in the test dataset.
Hardware Specification	No	The paper does not specify the hardware used for running the experiments. It only mentions model architectures like ViT-B/16 and RN-50.
Software Dependencies	No	The paper does not provide specific software dependencies or their version numbers.
Experiment Setup	Yes	The noise sample number m in Eq. (13) is set to 1. To ensure the fairness of the experiment, the best-performing ViT-B/16 is selected as the visual encoder unless otherwise noted. The default parameter configurations are used for these baselines.
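
The few-shot protocol quoted under Dataset Splits (1, 2, 4, 8, and 16 randomly sampled training shots per category, with evaluation on the full test set) can be illustrated with a short sketch. This is not the authors' code: the function name `sample_few_shot`, the fixed seed, and the toy label list are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def sample_few_shot(labels, shots, seed=0):
    """Return sorted indices of a subset with `shots` random examples per class.

    `labels` is a sequence of class labels, one per training example.
    Hypothetical helper for illustration; not taken from the paper's repository.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for label, indices in by_class.items():
        if len(indices) < shots:
            raise ValueError(f"class {label!r} has fewer than {shots} examples")
        # Sample without replacement within each class.
        selected.extend(rng.sample(indices, shots))
    return sorted(selected)


# Toy data (assumed): 3 classes with 20 training examples each.
toy_labels = [c for c in range(3) for _ in range(20)]

# One training subset per shot setting; the test set is left untouched.
subsets = {k: sample_few_shot(toy_labels, k) for k in (1, 2, 4, 8, 16)}
```

Each entry of `subsets` holds the training indices for one shot setting. Under the protocol described above, the full test set would be used unchanged for every setting.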