Gated Domain Units for Multi-source Domain Generalization
Authors: Simon Föll, Alina Dubatovka, Eugen Ernst, Siu Lun Chau, Martin Maritsch, Patrik Okanovic, Gudrun Thaeter, Joachim M. Buhmann, Felix Wortmann, Krikamol Muandet
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify the I.E.D assumption by extensive experiments using the publicly available WILDS benchmark. Specifically, we validate our method on image, text, and graph datasets, showing consistent improvement on out-of-training target domains. These findings support the practicality of the I.E.D assumption and the effectiveness of GDUs for domain generalisation. Our experimental evaluations are then presented in Section 5. |
| Researcher Affiliation | Academia | 1Department of Management, Technology, and Economics, ETH Zurich, Switzerland 2Department of Computer Science, ETH Zurich, Switzerland 3Department of Mathematics, Karlsruhe Institute for Technology, Germany 4Institute for Technology Management, University of St. Gallen, Switzerland 5CISPA Helmholtz Center for Information Security, Germany |
| Pseudocode | No | The paper describes the GDU layer and model training with mathematical equations and formal definitions, such as Definition 1 and Proposition 3.1, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available for TensorFlow (https://github.com/im-ethz/pub-gdu4dg) and PyTorch (https://github.com/im-ethz/gdu4dg-pytorch). |
| Open Datasets | Yes | We verify the I.E.D assumption by extensive experiments using the publicly available WILDS benchmark. Specifically, we validate our method on image, text, and graph datasets, showing consistent improvement on out-of-training target domains. ...we create a multi-source dataset by combining five publicly available digits image datasets, namely MNIST Lecun et al. (1998), MNIST-M Ganin & Lempitsky (2015), SVHN Netzer et al. (2011), USPS, and Synthetic Digits (SYN) Ganin & Lempitsky (2015). ...using eight datasets: Camelyon17, FMoW, Amazon, iWildCam, RxRx1, OGB-MolPCBA, CivilComments, and PovertyMap. |
| Dataset Splits | Yes | Each dataset, except USPS, is split into training and test sets of 25,000 and 9,000 images, respectively. ...Camelyon17 comprises images of tissue patches from five different hospitals. While the first three hospitals are the source domains (302,436 examples), the fourth and fifth are the validation (34,904 examples) and test domain (85,054 examples), respectively. ...training (76,863 images; between 2002 and 2013), validation (19,915 images; between 2013 and 2016), and test (22,108 images; between 2016 and 2017). ...training (245,502 reviews from 1,252 reviewers), validation (100,050 reviews from 1,334 reviewers), test (100,050 reviews from 1,334 reviewers). ...243 training traps (129,809 images), 32 validation traps (14,961 images), and 48 test traps (42,791 images). ...training (40,612 images, 33 domains), validation (9,854 images, 4 domains), and test (34,432 images, 14 domains). ...training (44,930 domains), validation (31,361 domains), and test (43,739 domains). ...training (269,038 comments), validation (45,180 comments), and test (133,782 comments) set. ...The average size of each set across the 5 folds is 10,000 images (13-14 countries) for the training set, 4,000 images (4-5 different countries) for the validation set, and 4,000 images (13-14 countries) for the test set. |
| Hardware Specification | Yes | Although the Gated Domain layer requires more computation resources than the ERM models, all digits experiments were conducted on a single GPU (NVIDIA GeForce RTX 3090). |
| Software Dependencies | Yes | Our Digits experiments are implemented using TensorFlow 2.4.1 and TensorFlow Probability 0.12.1. For the WILDS benchmark we use PyTorch (version 1.11.0). |
| Experiment Setup | Yes | For training, we resorted to the Adam optimizer with a learning rate of 0.001. We used early stopping and selected the best model weights according to the validation accuracy. For the validation data, we used the combined test splits only of the respective source datasets. The batch size was set to 512. The hyper-parameters relevant for our layer are summarized in Table 7 in Appendix B.1. In Table 2, we present the final results for our proof-of-concept experiment. We compare the performance of the Gated Domain layer with different similarity functions (CS, MMD, Projected) trained in fine-tuning (FT) and end-to-end (E2E) modes. |
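The model-selection protocol quoted in the Experiment Setup row (early stopping with the best weights chosen by validation accuracy, alongside Adam with learning rate 0.001 and batch size 512) can be sketched framework-agnostically. The helper below is an illustrative assumption, not the authors' implementation; the `patience` value and the dummy accuracy sequence are hypothetical.

```python
# Sketch of best-weights early stopping as described in the paper's setup
# (Adam, lr=0.001, batch size 512 are the quoted hyper-parameters; the
# EarlyStopping helper itself is an assumption for illustration).
LEARNING_RATE = 1e-3
BATCH_SIZE = 512


class EarlyStopping:
    """Track the best validation accuracy seen so far and keep a snapshot
    of the corresponding weights; stop after `patience` epochs without
    improvement."""

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best_acc = float("-inf")
        self.best_weights = None
        self.bad_epochs = 0

    def step(self, val_acc: float, weights) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_weights = weights  # snapshot of the best model
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical per-epoch validation accuracies for demonstration.
val_accs = [0.61, 0.70, 0.74, 0.73, 0.72, 0.71, 0.70, 0.69, 0.68]
stopper = EarlyStopping(patience=5)
for epoch, acc in enumerate(val_accs):
    if stopper.step(acc, weights=f"weights@epoch{epoch}"):
        break  # stops once five epochs pass without improvement

print(stopper.best_acc, stopper.best_weights)  # 0.74 weights@epoch2
```

In a real TensorFlow or PyTorch run, `weights` would be a deep copy of the model state rather than a string, and `val_acc` would come from evaluating on the combined test splits of the source datasets, as the paper describes.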