An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques

Authors: Chunxiao Li, Xiaoxiao Wang, Boming Miao, Chuanlong Xie, Zizhe Wang, Yao Zhu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have conducted extensive experiments across 17 prevalent deep model architectures with different training methods, including both CNN-based models such as ResNet and Transformer-based models like ViT, to demonstrate the effectiveness of the proposed DBMEF. Specifically, the framework yields a 1.51% performance improvement for ResNet50 on the ImageNet dataset and 3.02% on the ImageNet-A dataset.
Researcher Affiliation | Academia | 1Beijing Normal University, Beijing, China; 2University of Chinese Academy of Sciences, Beijing, China; 3Tsinghua University, Beijing, China. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | The PyTorch-like pseudo-algorithm of DBMEF is given in Appendix A.
Open Source Code | Yes | Code: https://github.com/ChunXiaostudy/DBMEF
Open Datasets | Yes | The evaluation experiments are conducted on the ImageNet2012-1k validation set. ... We select the ImageNet-A (Hendrycks et al. 2021), ImageNet-V2 (Recht et al. 2019), ImageNet-S (Gao et al. 2022), and ImageNet-E (Li et al. 2023b) datasets to assess the robustness of our proposed framework against distribution shifts. ... Additionally, we chose the CIFAR-10 and CIFAR-100 datasets for our experiments.
Dataset Splits | Yes | The evaluation experiments are conducted on the ImageNet2012-1k validation set. ... We select the ImageNet-A (Hendrycks et al. 2021), ImageNet-V2 (Recht et al. 2019), ImageNet-S (Gao et al. 2022), and ImageNet-E (Li et al. 2023b) datasets ... Additionally, we chose the CIFAR-10 and CIFAR-100 datasets for our experiments. ... on their respective test sets as the baseline.
Hardware Specification | No | The paper mentions "GeForce RTX 4090 GPU" in the context of inference time for *other* diffusion-based classifiers, but it does not specify the hardware used for the authors' own experiments and evaluations of DBMEF.
Software Dependencies | No | The pretrained weights of nine supervised backbone models and ViTb-CLIP, ViTh-CLIP, as well as the image preprocessing methods used during inference, were sourced from the TIMM library. The paper mentions the TIMM library but does not specify its version. It also mentions a "PyTorch-like pseudo algorithm" but not the PyTorch version.
Experiment Setup | Yes | For DeiT-b, ViTh-CLIP, ViTb-CLIP, ViTb-MAE, ViTl-MAE, and ViTb-DINOv2, we set Prot at 0.99, while for other models, Prot is set at 0.95. The number of time steps is set to 30, λ is fixed at 1.1, and the number of sub-models participating in voting is set to 5. ... Regarding the diffusion model, we adopt the most popular and mature architecture, Stable Diffusion V1-5.
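As a rough illustration, the reported hyperparameters can be collected into a per-backbone configuration, with the higher Prot threshold applied only to the listed ViT/DeiT variants. This is a minimal sketch, not the authors' code: the key names (`prot_threshold`, `num_voters`, etc.) and the `config_for` helper are assumptions; only the values come from the quoted experiment setup.

```python
# Hypothetical sketch of the DBMEF inference settings quoted above.
# Key names are illustrative assumptions; values follow the paper's text.

DEFAULT_CONFIG = {
    "prot_threshold": 0.95,   # Prot for most backbones
    "timesteps": 30,          # diffusion time steps
    "lambda": 1.1,            # fixed weighting factor λ
    "num_voters": 5,          # sub-models participating in voting
    "diffusion_model": "stable-diffusion-v1-5",
}

# Backbones reported to use the higher 0.99 threshold.
HIGH_THRESHOLD_BACKBONES = {
    "deit-b", "vith-clip", "vitb-clip", "vitb-mae", "vitl-mae", "vitb-dinov2",
}

def config_for(backbone: str) -> dict:
    """Return the per-backbone config, raising Prot to 0.99 for the
    backbones listed above and keeping 0.95 otherwise."""
    cfg = dict(DEFAULT_CONFIG)
    if backbone.lower() in HIGH_THRESHOLD_BACKBONES:
        cfg["prot_threshold"] = 0.99
    return cfg

print(config_for("resnet50")["prot_threshold"])  # 0.95
print(config_for("DeiT-b")["prot_threshold"])    # 0.99
```

A table like this makes it easy to audit which quoted settings apply to which backbone when attempting a reproduction.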