Efficient Model-Agnostic Multi-Group Equivariant Networks

Authors: Razan Baltaji, Sourya Basu, Lav R. Varshney

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For the first design, we provide experiments on multi-image classification where each view is transformed independently with transformations such as rotations. We find equivariant models are robust to such transformations and perform competitively otherwise. For the second design, we consider three applications: extending language compositionality on the SCAN dataset to product groups; fairness in natural language generation from GPT-2 to address intersectionality; and robust zero-shot image classification with CLIP. Overall, our methods are simple and general, competitive with equitune and its variants, while also being computationally more efficient.
Researcher Affiliation | Academia | Razan Baltaji, Sourya Basu, & Lav R. Varshney, Department of Electrical and Computer Engineering, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Pseudocode | No | The paper contains mathematical formulations and descriptions of algorithms, but no explicit 'Pseudocode' or 'Algorithm' blocks with structured, code-like steps.
Open Source Code | Yes | Code is available at https://github.com/baltaci-r/Multi-Group-Equivariant-Networks
Open Datasets | Yes | We perform experiments using two datasets: Caltech101 (Li et al., 2022) and 15Scene (Fei-Fei & Perona, 2005). We work on the SCAN-II dataset, where we have one train dataset and three different test dataset splits. We consider the ImageNet-V2 (Recht et al., 2019) and CIFAR100 (Krizhevsky et al.) image classification datasets. Further, in Tab. 2, we verify that Multi Equi GPT2 has a negligible drop in perplexity on the test sets of WikiText-2 and WikiText-103 compared to GPT2, and close to Equi GPT2.
Dataset Splits | Yes | We partition the train and test datasets for each label into tuples of N. We add random 90° rotations to the test images, and for training, we report results both with and without the transformations. We work on the SCAN-II dataset, where we have one train dataset and three different test dataset splits.
Hardware Specification | Yes | All multi-image classification experiments were done on a single Nvidia A100 GPU with 80GB memory in a compute cluster.
Software Dependencies | No | The paper mentions several software components, such as the SGD and Adam optimizers, BERT, GPT-2, CLIP, ReLU, batch norm, and dropout, but does not provide specific version numbers for these or for the ancillary software libraries/frameworks used for implementation.
Experiment Setup | Yes | We train each model for 100 epochs with a batch size of 64, using an SGD optimizer with a learning rate of 0.01, momentum of 0.9, and a weight decay of 0.001. Each model was pretrained on the train set for 200k iterations using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 10^-4 and a teacher-forcing ratio of 0.5 (Williams & Zipser, 1989). We test the non-equivariant pretrained models, along with equituned and multi-equituned models, where equitune and multi-equitune use a further 10k iterations of training on the train set. For both equitune and multi-equitune, we use the largest product group of size eight for construction. We use the cross-entropy loss as our training objective.
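Several rows above reference equitune-style models and robustness to independent 90° rotations of each view. A minimal, framework-free sketch of the underlying idea for an invariant task such as classification (average a model's class scores over all group-transformed copies of the input, here the four planar rotations of the C4 group) might look like the following; the toy `score` function, the class count, and the 2x2 image are illustrative assumptions, not taken from the paper.

```python
# Group-averaged ("equitune-style") invariant classifier over the C4
# group of 90-degree rotations. Because the orbit of a rotated input is
# the same set of images as the orbit of the original, the averaged
# scores are identical for any 90-degree rotation of the input.

def rot90(img):
    """Rotate a square image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def orbit(img):
    """All four 90-degree rotations of the image (the C4 orbit)."""
    out = [img]
    for _ in range(3):
        out.append(rot90(out[-1]))
    return out

def score(img):
    """Toy non-equivariant 'model': arbitrary weighted pixel sums per class."""
    flat = [p for row in img for p in row]
    return [sum(p * (i + c) for i, p in enumerate(flat)) for c in range(3)]

def equitune_score(img):
    """Average the base model's class scores over the group orbit."""
    scores = [score(g) for g in orbit(img)]
    return [sum(s[c] for s in scores) / len(scores) for c in range(3)]

x = [[1, 2], [3, 4]]
assert equitune_score(x) == equitune_score(rot90(x))  # rotation-invariant
```

Note this is only the invariant special case; for genuinely equivariant outputs the averaged terms are additionally mapped back by the inverse group action before averaging.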
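The Experiment Setup row fixes the optimizer hyperparameters for multi-image classification: SGD with learning rate 0.01, momentum 0.9, and weight decay 0.001. A minimal sketch of what one such update step does, assuming the common PyTorch-style convention of folding the L2 weight-decay term into the gradient before the momentum update (the scalar weight and gradient values are illustrative only):

```python
# One SGD step with momentum and L2 weight decay, PyTorch-style:
#   g <- grad + weight_decay * w
#   v <- momentum * v + g
#   w <- w - lr * v

LR, MOMENTUM, WEIGHT_DECAY = 0.01, 0.9, 0.001  # values quoted above

def sgd_step(w, grad, v, lr=LR, momentum=MOMENTUM, weight_decay=WEIGHT_DECAY):
    g = grad + weight_decay * w  # fold the L2 penalty into the gradient
    v = momentum * v + g         # update the velocity (momentum buffer)
    return w - lr * v, v         # descend along the velocity

w, v = 1.0, 0.0
w, v = sgd_step(w, grad=0.5, v=v)
# first step: g = 0.5 + 0.001*1.0 = 0.501, v = 0.501, w = 1.0 - 0.00501
```

Subsequent steps reuse the returned velocity, so gradients accumulate with the 0.9 momentum factor across the 100 training epochs described above.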