Variational Disentanglement for Domain Generalization
Authors: Yufei Wang, Haoliang Li, Hao Cheng, Bihan Wen, Lap-Pui Chau, Alex Kot
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to verify our method on three benchmarks, and both quantitative and qualitative results illustrate the effectiveness of our method. |
| Researcher Affiliation | Academia | Yufei Wang EMAIL Nanyang Technological University Haoliang Li EMAIL City University of Hong Kong Hao Cheng EMAIL Nanyang Technological University Bihan Wen EMAIL Nanyang Technological University Lap-pui Chau EMAIL The Hong Kong Polytechnic University Alex C. Kot EMAIL Nanyang Technological University |
| Pseudocode | Yes | Algorithm 1 Variational Disentanglement for domain generalization. Input: X = {x_i}_{i=1}^N, Y = {y_i}_{i=1}^N, initialized parameters G, Dc, Dx, Ec, Ed and Et. Output: Learned parameters G′, D′c, D′x, E′c, E′d and E′t. |
| Open Source Code | No | No explicit statement about code release or a link to a repository is provided in the main text of the paper. |
| Open Datasets | Yes | We evaluate our method using three benchmarks including Digits-DG (Zhou et al., 2020b), PACS (Li et al., 2017) and mini-Domain Net (Zhou et al., 2020c; Peng et al., 2019a)... We further evaluate the effectiveness of our method on the WILDS (Koh et al., 2021; Sagawa et al., 2021) benchmark. More specifically, we utilize Camelyon17 (Bandi et al., 2018) dataset which is collected from different hospitals. |
| Dataset Splits | Yes | Settings: Digits-DG (Zhou et al., 2020b) is a benchmark composed of MNIST, MNIST-M, SVHN, and SYN where the font and background of the samples are diverse. Except that we additionally use the Fourier-based data augmentation (Xu et al., 2021), we follow the experiment setting in Zhou et al. (2020b)... Following the widely used setting in Carlucci et al. (2019), we only used the official split of the training set to train the model and all the images from the target domain are used for the test... The batch size is 128 with a random sampler that roughly samples the same number of images in each domain... We use the official train/val/test split of the dataset and report the test accuracy when we achieved the highest accuracy on the evaluation set. |
| Hardware Specification | Yes | Parts of the experiments are conducted on a Windows workstation with Ryzen 5900X, 64GB RAM, and an Nvidia RTX 3090. Others are conducted on a Linux server with Intel(R) Xeon(R) Silver 4210R CPU, 256GB RAM, and RTX 2080-TIs... All the experiments are conducted on the same server with RTX A5000s. |
| Software Dependencies | Yes | For the software, PyTorch 1.9 is used. |
| Experiment Setup | Yes | For the hyper-parameters, we set λreg, λrec and λgan as 0.1, 1.0 and 1.0, respectively, for all experiments. The network architectures and training details are illustrated in the supplementary materials. We report the accuracy of the model at the end of the training... SGD is used as the optimizer with an initial learning rate of 0.05 and weight decay of 5e-5. For the rest of the networks, we use RMSprop without momentum as the optimizer with the initial learning rate of 5e-5 and the same weight decay. We train the model for 60 epochs using the batch size of 128 and all the learning rate is decayed by 0.1 at the 50th epoch. |
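The optimizer setup quoted in the Experiment Setup row can be sketched in PyTorch. This is a minimal reconstruction from the quoted hyper-parameters only; the module names (`task_net`, `aux_nets`) are placeholders, since the actual architectures are described in the paper's supplementary materials.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's networks.
task_net = nn.Linear(512, 7)
aux_nets = nn.Linear(512, 512)

# SGD for the main network: initial lr 0.05, weight decay 5e-5.
opt_main = torch.optim.SGD(task_net.parameters(), lr=0.05, weight_decay=5e-5)

# RMSprop without momentum for the remaining networks: lr 5e-5, same decay.
opt_aux = torch.optim.RMSprop(aux_nets.parameters(), lr=5e-5,
                              momentum=0.0, weight_decay=5e-5)

# 60 epochs at batch size 128; both learning rates decayed by 0.1
# at the 50th epoch, expressed as a MultiStepLR schedule.
sched_main = torch.optim.lr_scheduler.MultiStepLR(opt_main,
                                                  milestones=[50], gamma=0.1)
sched_aux = torch.optim.lr_scheduler.MultiStepLR(opt_aux,
                                                 milestones=[50], gamma=0.1)
```

Stepping each scheduler once per epoch reproduces the stated schedule: the main learning rate stays at 0.05 through epoch 49 and drops to 0.005 from epoch 50 onward.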