Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations

Authors: Xinyu Yang, Huaxiu Yao, Allan Zhou, Chelsea Finn

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TALLY on several benchmarks and real-world datasets and find that it consistently outperforms other state-of-the-art methods in both subpopulation and domain shift. [...] In this section, we conduct extensive experiments to answer the following questions: Q1: How does TALLY perform relative to prior invariant learning and single-domain long-tailed learning approaches under subpopulation shift and domain shift? Q2: Since it is straightforward to combine invariant learning with imbalanced data strategies, how does TALLY compare with such combinations? Q3: What effect does incorporating the prototype representation (Eqn. (9)) have, in comparison with naive representation swapping (Eqn. (4))? Q4: Can TALLY produce models with greater domain invariance?
Researcher Affiliation | Academia | Xinyu Yang (EMAIL), Carnegie Mellon University; Huaxiu Yao (EMAIL), University of North Carolina at Chapel Hill; Allan Zhou (EMAIL), Stanford University; Chelsea Finn (EMAIL), Stanford University
Pseudocode | Yes | Algorithm 1: TALLY Training Process
Open Source Code | No | The paper does not contain any explicit statement about providing source code, nor does it include a link to a code repository. The OpenReview link is for peer review, not code.
Open Datasets | Yes | We curate four multi-domain long-tailed datasets by modifying four existing domain-generalization benchmarks: VLCS (Fang et al., 2013), PACS (Li et al., 2017), Office-Home (Venkateswara et al., 2017), and DomainNet (Peng et al., 2019). [...] To further evaluate TALLY and prior methods, we study two multi-domain datasets that are naturally imbalanced: Terra Incognita (Terra Inc) (Beery et al., 2018) and iWildCam (Beery et al., 2020).
Dataset Splits | Yes | In subpopulation shift, the test set is balanced across domains and classes, which means that each domain-class pair contains the same number of test examples. In domain shift, we use the classical domain generalization setting (Zhang et al., 2022). More specifically, we alternately use one domain as the test domain, and the rest as the training domains. [...] In Terra Inc, the number of training, validation and test domains are 10, 5, 5, respectively. For iWildCam, we follow the same training, validation, and test splits as used in the WILDS benchmark (Koh et al., 2021).
Hardware Specification | No | The paper mentions using a ResNet-50 for all algorithms but does not specify any hardware details, such as the GPU or CPU models used for training or evaluation.
Software Dependencies | No | The paper mentions using ResNet-50 for all algorithms but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries).
Experiment Setup | Yes | The hyperparameters αc and αd in the Beta distribution are set to 0.5 and the warm start epoch T0 is set to 7. We list all hyperparameters in Appendix D.2. [...] Table 6: Hyperparameters for experiments on synthetic data. [...] Table 12: Hyperparameters for experiments on real-world data.
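The reported Beta-distribution hyperparameters (αc = αd = 0.5) suggest mixup-style interpolation coefficients for mixing class-relevant and domain-associated representations. A minimal sketch of how such coefficients are typically sampled is below; the function name, variable names, and the mixing interpretation in the comments are illustrative assumptions, not TALLY's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mix_coefficients(n, alpha_c=0.5, alpha_d=0.5):
    """Draw mixup-style coefficients lam_c, lam_d ~ Beta(alpha, alpha).

    alpha_c = alpha_d = 0.5 match the values the paper reports; this
    helper itself is a hypothetical sketch, not the paper's implementation.
    """
    lam_c = rng.beta(alpha_c, alpha_c, size=n)  # class-relevant mixing weight
    lam_d = rng.beta(alpha_d, alpha_d, size=n)  # domain-associated mixing weight
    return lam_c, lam_d

lam_c, lam_d = sample_mix_coefficients(4)
# Beta(0.5, 0.5) is U-shaped: draws concentrate near 0 and 1, so a
# mixed representation tends to stay close to one of its two sources.
```

With α = 0.5 the Beta density is bowl-shaped rather than uniform, a common choice when the augmentation should usually resemble one source example with a light contribution from the other.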