Adaptive Energy Alignment for Accelerating Test-Time Adaptation

Authors: Wonjeong Choi, Do-Yeon Kim, Jungwuk Park, Jungmoon Lee, Younghyun Park, Dong-Jun Han, Jaekyun Moon

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we demonstrate the effectiveness of our approach through extensive experiments. We follow the online TTA setting (Zhao et al., 2023) for experiments. ... In Tables 1-3, we report the classification error (%) for online TTA on the CIFAR10-C, CIFAR100-C, and Tiny ImageNet-C datasets, respectively. Experiments are conducted over 3 random trials... In Fig. 5, we report the temporal TTA performance over adaptation batches. ... In Table 5, we conduct an ablation study on each component of our AEA...
Researcher Affiliation | Academia | Wonjeong Choi^1, Do-Yeon Kim^1, Jungwuk Park^1, Jungmoon Lee^1, Younghyun Park^2, Dong-Jun Han^3, Jaekyun Moon^1; ^1 Korea Advanced Institute of Science and Technology (KAIST), ^2 Agency for Defense Development (ADD), ^3 Yonsei University; ^1 EMAIL; ^2 EMAIL; ^3 EMAIL; ^1 EMAIL
Pseudocode | No | The paper describes the methodology using prose and mathematical formulations but does not include explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our source code is publicly available at https://github.com/wonjeongchoi/AEA.
Open Datasets | Yes | For the domain shift datasets, we utilize corrupted image datasets: CIFAR10-C, CIFAR100-C, and Tiny ImageNet-C. The uncorrupted datasets (i.e., CIFAR10, CIFAR100, Tiny ImageNet) are used as the source domain... We also evaluate our method on a style shift dataset (i.e., PACS) in Sec. 4.2 and the ImageNet to ImageNet-C dataset in Sec. B.1. ... Cityscapes dataset (Cordts et al., 2016) ... GTA-5 dataset (Richter et al., 2016)
Dataset Splits | Yes | For CIFAR10/100 pre-training before the TTA stage, we adopt a self-supervised learning scheme (i.e., rotation prediction) with an auxiliary head to compare fairly with auxiliary task-based methods like TTT (Sun et al., 2020), while Tiny ImageNet pre-training follows the official TorchVision setup (maintainers & contributors, 2016). ... Specifically, at the current test-time step i, the model f_S adapts to the mini-batch B_i = {x_j}_{j=1}^{|B|}, consisting of domain-shifted, unlabeled target samples x_j ~ P_T(x), with batch size |B|.
Hardware Specification | Yes | All experiments are performed on a single NVIDIA GeForce RTX 3090, and the other details of computational resources (e.g., workers, memory) are the same as in Zhao et al. (2023).
Software Dependencies | No | The paper mentions using SGD as the optimizer, adapting batch normalization parameters, and following the TorchVision setup for Tiny ImageNet pre-training, but does not provide specific version numbers for software libraries or dependencies.
Experiment Setup | Yes | For all experiments, we use SGD as the optimizer and set the learning rate to 0.001, the batch size to 64, and the number of adaptation iterations per batch to 1. We adapt the model parameters of the batch normalization (BN) layers, similar to Wang et al. (2021). Additionally, several hyperparameters need to be tuned for our AEA, i.e., {λ1, α} for the SFEA loss and {λ2, C0} for the LCS loss. For all datasets in the experiment section, we apply the SFEA loss with λ1 = 1.0 and α = 0.5. For the LCS loss, which depends on the model's confidence scores, we set the threshold C0 to 0.99 for CIFAR10-C and 0.66 for CIFAR100-C and Tiny ImageNet-C; λ2 is set to 25 for all experiments.
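The reported setup (SGD with lr 0.001, batch size 64, one adaptation iteration per batch, updating only BN parameters as in Wang et al., 2021) can be sketched in PyTorch. This is a minimal, hypothetical sketch: the paper's actual SFEA and LCS losses are not reproduced here, so a plain entropy-minimization objective stands in as a placeholder adaptation loss, and the model and batch are toy stand-ins rather than the paper's architectures.

```python
import torch
import torch.nn as nn

def collect_bn_params(model: nn.Module):
    """Freeze all parameters, then re-enable only BN affine params."""
    for p in model.parameters():
        p.requires_grad_(False)
    bn_params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            if m.weight is not None:
                m.weight.requires_grad_(True)
                bn_params.append(m.weight)
            if m.bias is not None:
                m.bias.requires_grad_(True)
                bn_params.append(m.bias)
    return bn_params

def adapt_one_batch(model, optimizer, x):
    """One adaptation iteration on a single unlabeled test batch.

    Placeholder objective: prediction-entropy minimization. The paper's
    SFEA/LCS losses would replace this in a faithful implementation.
    """
    logits = model(x)
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return entropy.item()

# Toy model; batch size 64 matches the reported setting.
model = nn.Sequential(nn.Linear(32, 64), nn.BatchNorm1d(64),
                      nn.ReLU(), nn.Linear(64, 10))
bn_params = collect_bn_params(model)
optimizer = torch.optim.SGD(bn_params, lr=0.001)  # lr from the paper
loss = adapt_one_batch(model, optimizer, torch.randn(64, 32))
```

With the hyperparameters above, each incoming test batch would trigger exactly one such optimizer step before the model's predictions on that batch are recorded.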