Attention Bootstrapping for Multi-Modal Test-Time Adaptation

Authors: Yusheng Zhao, Junyu Luo, Xiao Luo, Jinsheng Huang, Jingyang Yuan, Zhiping Xiao, Ming Zhang

AAAI 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments on the benchmarks validate the effectiveness of the proposed ABPEM in comparison with competing baselines." |
| Researcher Affiliation | Academia | (1) State Key Laboratory for Multimedia Information Processing, School of Computer Science, PKU-Anker LLM Lab, Peking University, Beijing, China; (2) Department of Computer Science, University of California, Los Angeles, CA, USA; (3) Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA |
| Pseudocode | Yes | Algorithm 1: Optimization Algorithm of ABPEM |
| Open Source Code | Yes | More details can be found at https://github.com/YushengZhao/ABPEM. |
| Open Datasets | Yes | "Benchmarks. The experiments are performed on two benchmarks: Kinetics50-C and VGGSound-C (Yang et al. 2024), which are based on the widely used Kinetics (Kay et al. 2017) and VGGSound (Chen et al. 2020) datasets." |
| Dataset Splits | No | The paper mentions models "pretrained on the corresponding training set (Kinetics or VGGSound)" and adaptation "using unlabeled test data D_te", implying the existence of training and test sets. However, it does not explicitly provide dataset split percentages, sample counts, or direct citations for the splits used in these benchmarks. |
| Hardware Specification | No | The paper does not report hardware details such as GPU models, CPU types, or memory used to run the experiments. |
| Software Dependencies | No | The paper mentions the "Adam optimizer (Kingma and Ba 2014)" and "CAV-MAE (Gong et al. 2023) as the architecture of M", but does not list software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | "We set k in Eq. 10 to about 8 for Kinetics50-C and 30 for VGGSound-C, and λ to 1 by default. Moreover, we also use a class-balancing loss in alignment with (Yang et al. 2024). For optimization, we use the Adam optimizer (Kingma and Ba 2014) and the model is optimized within a single epoch, with a learning rate of 1 × 10⁻⁴." |
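The reported optimization setup (Adam, learning rate 1 × 10⁻⁴, a single epoch of test-time updates) can be sketched with a minimal pure-Python Adam step. This is an illustrative reimplementation of the Adam update rule (Kingma and Ba 2014) with the paper's stated learning rate; the toy quadratic loss and helper names are our own, not part of ABPEM.

```python
import math

def adam_step(params, grads, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma and Ba 2014) over a flat list of parameters."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias-corrected estimates.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params

# Toy quadratic loss f(p) = p^2, so the gradient is 2p (illustration only).
params = [1.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
for _ in range(100):
    grads = [2 * p for p in params]
    params = adam_step(params, grads, state, lr=1e-4)
print(params[0])
```

With lr = 1 × 10⁻⁴ each step moves the parameter by roughly the learning rate, so after 100 steps the value has decreased by about 0.01; in the actual method this update would instead be driven by the test-time adaptation loss over the unlabeled test data.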