Fully Test-time Adaptation for Tabular Data
Authors: Zhi Zhou, Kun-Yang Yu, Lan-Zhe Guo, Yu-Feng Li
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on six benchmark datasets, which are evaluated using three metrics. The experimental results demonstrate that FTAT outperforms state-of-the-art methods by a margin. |
| Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, China 2School of Artificial Intelligence, Nanjing University, China 3School of Intelligence Science and Technology, Nanjing University, China |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical formulations, but it does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Project Homepage https://wnjxyk.github.io/FTTA |
| Open Datasets | Yes | We conduct comprehensive experiments on six benchmark datasets, which are evaluated using three metrics. ... We select six common tabular benchmark datasets from the TableShift benchmark, which exhibit significant performance gaps under distribution shifts. ... Gardner, Popovic, and Schmidt 2023. Benchmarking Distribution Shift in Tabular Data with TableShift. In Advances in Neural Information Processing Systems. |
| Dataset Splits | Yes | In our experiments on tabular tasks, we follow the fully test-time adaptation setting, where the source model is trained on training data and adapted to shifted test data without any access to the source training data. Specifically, we train the source model on training data and select the best model based on the validation set following the TableShift benchmark (Gardner, Popovic, and Schmidt 2023). Then, the FTAT approach and existing FTTA methods are evaluated on the shifted test set. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware components such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experimental setup. |
| Experiment Setup | Yes | As shown in Fig. 2, the optimal learning rates for different backbone models on the same task and the same backbone model on different tasks vary. ... Here, we compare with four base models with different learning rates {1e-3, 1e-4, 5e-4, 1e-5}. ... In the main experiments, the batch size of the data stream is set to 512 ... batch sizes set to {64, 128, 256, 512, 1024}. ... FTAT contains three hyperparameters, i.e., ϵ, α and β. ... with α in {0.08, 0.09, 0.10, 0.11, 0.15, 0.20}, ϵ = Entropy([p, 1 − p]) where p was set to {0.72, 0.71, 0.70, 0.69, 0.65, 0.60}, and β in {0.28, 0.29, 0.30, 0.31, 0.40, 0.50}. |
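The setup row above parameterizes the threshold ϵ via the binary entropy of [p, 1 − p]. As an illustration only (the paper does not state the logarithm base; natural log is assumed here), the ϵ values implied by the listed p grid can be computed with a short sketch:

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy of the distribution [p, 1 - p].

    Natural log is an assumption; the paper does not specify the base.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

# p grid quoted in the Experiment Setup row; each p induces one epsilon.
for p in (0.72, 0.71, 0.70, 0.69, 0.65, 0.60):
    print(f"p={p:.2f}  eps={binary_entropy(p):.4f}")
```

Note that larger p (a more confident prediction) yields a smaller entropy threshold ϵ, so the grid spans progressively stricter confidence cutoffs.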