Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation

Authors: Zihao Tang, Zheqi Lv, Shengyu Zhang, Yifan Zhou, Xinyu Duan, Fei Wu, Kun Kuang

ICLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments in 3 datasets and 8 settings demonstrate the stability and superiority of our approach.
Researcher Affiliation | Collaboration | Zihao Tang, Zheqi Lv, Shengyu Zhang (Zhejiang University); Yifan Zhou (Shanghai Jiao Tong University); Xinyu Duan (Huawei Cloud); Fei Wu & Kun Kuang (Zhejiang University)
Pseudocode | Yes | For space issues, we leave the pseudo-code of our overall method in Appendix A. Appendix A states: "The pseudo-code of our proposed method is displayed in Algorithm 1."
Open Source Code | Yes | Code available at https://github.com/IshiKura-a/AuG-KD
Open Datasets | Yes | The proposed method is evaluated on 3 datasets: Office-31 (Saenko et al., 2010), Office-Home (Venkateswara et al., 2017), and VisDA-2017 (Peng et al., 2017).
Dataset Splits | Yes | For evaluation purposes, the student domain D_s of these two datasets is divided into training, validation, and testing sets using a seed, with proportions set at 8:1:1. As for VisDA-2017, the validation domain is split into 80% training and 20% validation, and the test domain is used directly for testing. (A seeded split of this kind is sketched below.)
Hardware Specification | Yes | Each experiment is conducted using a single NVIDIA GeForce RTX 3090 and takes approximately 1 day to complete.
Software Dependencies | No | The paper mentions 'Optimizer Adam' and shows PyTorch-like code structures, but does not provide specific version numbers for any key software components or libraries.
Experiment Setup | Yes | We summarize the hyperparameters and training schedules of AuG-KD on the three datasets in Table 5. Table 5 lists: Optimizer Adam; Learning Rate (except Encoder) 1e-3; Learning Rate (Encoder) 1e-4; Batch size 2048; N_z 256; Image Resolution 32×32; seed {2021, 2022, …, 2025}; α_g 20; α_e 0.00025; α_a 0.25; β_a 0.1. Notably, the temperature of the KL-divergence in Module 3 is set to 10 (a standard temperature-scaled distillation loss is sketched below).
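
For concreteness, the seeded 8:1:1 split described in the Dataset Splits row can be reproduced in a few lines of PyTorch. This is a minimal sketch only: the `dataset` argument is a generic dataset object and `split_student_domain` is a hypothetical helper name, not the authors' actual loading code.

```python
# Minimal sketch of a seeded 8:1:1 train/val/test split (hypothetical
# helper; the authors' actual data-loading code may differ).
import torch
from torch.utils.data import Dataset, random_split

def split_student_domain(dataset: Dataset, seed: int = 2021):
    """Split a student-domain dataset into train/val/test at 8:1:1."""
    n = len(dataset)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    n_test = n - n_train - n_val  # remainder absorbs rounding error
    generator = torch.Generator().manual_seed(seed)  # seeded, as described
    return random_split(dataset, [n_train, n_val, n_test], generator=generator)
```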
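
Similarly, the Experiment Setup row notes that the KL-divergence temperature in Module 3 is set to 10. Below is a minimal sketch of the conventional temperature-scaled distillation loss; `kd_loss` is a hypothetical name, and the authors' Module 3 may combine this term with additional losses.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            T: float = 10.0) -> torch.Tensor:
    """Temperature-scaled KL divergence between teacher and student.

    The T**2 factor is the conventional scaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    log_p_student = F.log_softmax(student_logits / T, dim=-1)  # log-probs
    p_teacher = F.softmax(teacher_logits / T, dim=-1)          # soft targets
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```

A higher temperature such as T = 10 flattens the teacher's softmax output, so the student is trained against the relative ranking of classes rather than a near-one-hot target.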