TdAttenMix: Top-Down Attention Guided Mixup

Authors: Zhiming Wang, Lin Gu, Feng Lu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the TdAttenMix boosts the performance and achieve state-of-the-art top1 accuracy in CIFAR100, Tiny-ImageNet, CUB-200 and ImageNet-1k. Additionally, we introduce a new metric based on the human gaze and use this metric to investigate the issue of image-label inconsistency.
Researcher Affiliation | Academia | (1) State Key Laboratory of VR Technology and Systems, School of CSE, Beihang University; (2) RIKEN AIP; (3) The University of Tokyo, Japan
Pseudocode | No | The paper describes methods using mathematical formulas and descriptive text but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Code https://github.com/morning12138/TdAttenMix
Open Datasets | Yes | Extensive experiments demonstrate the TdAttenMix boosts the performance and achieve state-of-the-art top1 accuracy in CIFAR100, Tiny-ImageNet, CUB-200 and ImageNet-1k. ... We use ADE20k (Zhou et al. 2017) to evaluate the performance of semantic segmentation task. ... We compute the Jaccard similarity over the PASCAL VOC12 benchmark (Everingham et al. 2015). ... We evaluate our TDAttenMix on two out-of-distribution datasets. (1) The ImageNet-A dataset (Hendrycks et al. 2021). ... (2) The ImageNet-O (Hendrycks et al. 2021). ... we utilize ARISTO dataset (Liu et al. 2022b)
Dataset Splits | Yes | ADE20k is a challenging scene parsing dataset covering 150 semantic categories, with 20k, 2k, and 3k images for training, validation and testing.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper refers to various model architectures and frameworks (e.g., ResNet, ViT, DeiT-S, UperNet) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | Table 6 shows that the best results are obtained when β is set to 0.5. ... We evaluate three different task adaptive balanced attention strategies: 1) σ = 0, 2) σ = 0.5, 3) σ = 1, 4) σ = 2, 5) σ = 3, 6) σ = 4.
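
The Open Datasets entry mentions computing the Jaccard similarity over the PASCAL VOC12 benchmark. This summary does not say how the paper derives or thresholds the masks it compares, so the sketch below shows only the standard definition, |A ∩ B| / |A ∪ B|, for two binary masks. The function name, the nested-list mask representation, and the convention for two empty masks are illustrative assumptions, not taken from the paper or its repository.

```python
def jaccard_similarity(pred_mask, true_mask):
    """Jaccard index |A ∩ B| / |A ∪ B| for two equally sized binary masks.

    Masks are nested lists of 0/1 (or False/True) values. This is the
    textbook definition only; the paper's evaluation protocol may differ.
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of rows")

    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        if len(pred_row) != len(true_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, true_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel set in both masks
            union += p or t          # pixel set in at least one mask

    # Assumption: two completely empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 2x3 masks: 2 pixels overlap, 4 pixels are set in either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
score = jaccard_similarity(pred, true)  # 2 / 4 = 0.5
```

A benchmark-level score would typically average this quantity over images or classes. Which averaging the paper uses is not stated in this summary.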