Test-time Adaptation for Regression by Subspace Alignment

Authors: Kazuki Adachi, Shin'ya Yamaguchi, Atsutoshi Kumagai, Tomoki Hamagami

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally show that SSA outperforms various baselines on real-world datasets. The code is available at https://github.com/kzkadc/regression-tta.
Researcher Affiliation | Collaboration | NTT Corporation, Kyoto University, Yokohama National University
Pseudocode | Yes | The procedure of SSA is listed in Algorithm 1 of the Appendix.
Open Source Code | Yes | The code is available at https://github.com/kzkadc/regression-tta.
Open Datasets | Yes | SVHN (Netzer et al., 2011) and MNIST (LeCun et al., 1998b) are famous digit-recognition datasets. UTKFace (Zhang et al., 2017) is a dataset consisting of face images. Biwi Kinect (Fanelli et al., 2013) is a dataset consisting of person images. California Housing (Nugent, 2017) is a tabular dataset.
Dataset Splits | Yes | UTKFace: We randomly split the dataset into 80% for training and 20% for validation. Biwi Kinect: We split the dataset into male and female images and further randomly split them into 80% for training and 20% for validation. California Housing: We extracted the data of non-coastal areas for the source domain and split them into 90% for training and 10% for validation.
Hardware Specification | Yes | We conducted the experiments with a single NVIDIA A100 GPU.
Software Dependencies | No | We used PyTorch (Paszke et al., 2019) and PyTorch-Ignite (Fomin et al., 2020) to implement the source pre-training, the proposed method, and the baselines.
Experiment Setup | Yes | For source pre-training, we used Adam (Kingma & Ba, 2015) and set the learning rate to 0.0001, weight decay to 0.0005, batch size to 64, and number of epochs to 100. We set the number of dimensions of the feature subspace to K = 100 as the default throughout the experiments. For test-time adaptation, we used Adam (Kingma & Ba, 2015) with learning rate = 0.001, (β1, β2) = (0.9, 0.999), and weight decay = 0, which is the default setting in PyTorch (Paszke et al., 2019). We set the batch size to 64 following other TTA baselines.
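The split and optimizer settings reported above can be sketched in PyTorch. This is a minimal illustration, not the authors' code: the model and dataset below are placeholder stand-ins, while the split ratio and optimizer hyperparameters are taken directly from the table.

```python
# Sketch of the reported experimental setup. Assumptions: the dataset and
# model are toy placeholders; only the hyperparameters come from the report.
import torch
from torch import nn
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset standing in for UTKFace / Biwi Kinect images.
dataset = TensorDataset(torch.randn(100, 3, 8, 8), torch.randn(100, 1))

# 80% training / 20% validation random split, as described for UTKFace.
n_train = int(0.8 * len(dataset))
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])

# Stand-in regressor in place of the actual backbone.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1))

# Source pre-training optimizer: Adam, lr = 0.0001, weight decay = 0.0005.
pretrain_opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-4)

# Test-time adaptation optimizer: PyTorch's default Adam settings,
# lr = 0.001, betas = (0.9, 0.999), weight decay = 0.
tta_opt = torch.optim.Adam(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), weight_decay=0
)

print(len(train_set), len(val_set))  # 80 20
```

Each split would then be wrapped in a `DataLoader` with batch size 64, matching the batch size reported for both pre-training and the TTA baselines.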