Learning Multimodal Data Augmentation in Feature Space

Authors: Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

ICLR 2023

Reproducibility Variable Result LLM Response
Research Type: Experimental
  "4 EXPERIMENTS. We evaluate LeMDA over a diverse set of real-world multimodal datasets. We curate a list of public datasets covering image, text, numerical, and categorical inputs. Table 1 provides a summary of the source, statistics, and modality identity. We introduce baselines in Section 4.1, and describe experimental settings in Section 4.2. We provide the main evaluation result in Section 4.3. Finally, we investigate the effects of the consistency regularizer and the choices of augmentation model architecture in Section 4.4."
Researcher Affiliation: Collaboration
  Zichang Liu, Department of Computer Science, Rice University; Zhiqiang Tang, Amazon Web Services; Xingjian Shi, Amazon Web Services; Aston Zhang, Amazon Web Services; Mu Li, Amazon Web Services; Anshumali Shrivastava, Department of Computer Science, Rice University; Andrew Gordon Wilson, New York University and Amazon Web Services
Pseudocode: Yes
  "Algorithm 1: LeMDA Training"
Open Source Code: Yes
  "Code is available at https://github.com/lzcemma/LeMDA/"
Open Datasets: Yes
  "We evaluate LeMDA over a diverse set of real-world multimodal datasets. We curate a list of public datasets covering image, text, numerical, and categorical inputs. Table 1 provides a summary of the source, statistics, and modality identity." Examples include SNLI-VE (Xie et al., 2019a).
Dataset Splits: No
  Table 1 provides train and test set sizes (e.g., Hateful Memes: 7,134 train / 1,784 test) but does not explicitly state a validation split or its size.
Hardware Specification: Yes
  "Experiments were conducted on a server with 8 V100 GPUs."
Software Dependencies: No
  The paper mentions using PyTorch Autograd but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup: Yes
  "For LeMDA, we set the confidence threshold for consistency regularizer α as 0.5... In our main experiment, we use w1 = 0.0001, w2 = 0.1, w3 = 0.1 on all datasets except Melbourne Airbnb and SNLI-VE. On Melbourne Airbnb and SNLI-VE, we use w1 = 0.001, w2 = 0.1, w3 = 0.1."
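The quoted thresholding can be illustrated with a confidence-masked consistency term: predictions on augmented features are regularized toward predictions on the original features only for samples whose original prediction is confident (max softmax probability ≥ α = 0.5). The following is a minimal NumPy sketch; the function name `consistency_loss` and the exact KL form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Confidence threshold alpha = 0.5 for the consistency regularizer,
# as quoted in the experiment setup above.
ALPHA = 0.5

def consistency_loss(p_orig, p_aug, alpha=ALPHA):
    """Confidence-masked KL(p_orig || p_aug) between softmax outputs.

    p_orig: (batch, classes) predictions on the original features.
    p_aug:  (batch, classes) predictions on the augmented features.
    Samples whose original prediction falls below the confidence
    threshold are excluded from the regularizer.
    """
    mask = p_orig.max(axis=1) >= alpha
    if not mask.any():
        return 0.0  # no confident samples: regularizer contributes nothing
    p, q = p_orig[mask], p_aug[mask]
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)
    return float(kl.mean())

# First sample is confident (max prob 0.8 >= 0.5) and contributes;
# the second (max prob 0.4 < 0.5) is masked out.
p_orig = np.array([[0.8, 0.1, 0.1],
                   [0.4, 0.3, 0.3]])
p_aug  = np.array([[0.6, 0.2, 0.2],
                   [0.3, 0.4, 0.3]])
loss = consistency_loss(p_orig, p_aug)
```

Masking on the original prediction's confidence means the regularizer only anchors augmentations to targets the task network is already sure about, rather than propagating uncertain predictions.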