TdAttenMix: Top-Down Attention Guided Mixup
Authors: Zhiming Wang, Lin Gu, Feng Lu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that TdAttenMix boosts performance and achieves state-of-the-art top-1 accuracy on CIFAR100, Tiny-ImageNet, CUB-200 and ImageNet-1k. Additionally, we introduce a new metric based on the human gaze and use this metric to investigate the issue of image-label inconsistency. |
| Researcher Affiliation | Academia | 1 State Key Laboratory of VR Technology and Systems, School of CSE, Beihang University; 2 RIKEN AIP; 3 The University of Tokyo, Japan |
| Pseudocode | No | The paper describes methods using mathematical formulas and descriptive text but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/morning12138/TdAttenMix |
| Open Datasets | Yes | Extensive experiments demonstrate that TdAttenMix boosts performance and achieves state-of-the-art top-1 accuracy on CIFAR100, Tiny-ImageNet, CUB-200 and ImageNet-1k. ... We use ADE20k (Zhou et al. 2017) to evaluate the performance of semantic segmentation task. ... We compute the Jaccard similarity over the PASCAL VOC12 benchmark (Everingham et al. 2015). ... We evaluate our TdAttenMix on two out-of-distribution datasets. (1) The ImageNet-A dataset (Hendrycks et al. 2021). ... (2) The ImageNet-O (Hendrycks et al. 2021). ... we utilize ARISTO dataset (Liu et al. 2022b) |
| Dataset Splits | Yes | ADE20k is a challenging scene parsing dataset covering 150 semantic categories, with 20k, 2k, and 3k images for training, validation and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper refers to various model architectures and frameworks (e.g., ResNet, ViT, DeiT-S, UperNet) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Table 6 shows that the best results are obtained when β is set to 0.5. ... We evaluate six different task-adaptive balanced attention strategies: 1) σ = 0, 2) σ = 0.5, 3) σ = 1, 4) σ = 2, 5) σ = 3, 6) σ = 4. |