Feedback-guided Data Synthesis for Imbalanced Classification

Authors: Reyhane Askari Hemmat, Mohammad Pezeshki, Florian Bordes, Michal Drozdzal, Adriana Romero-Soriano

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We validate three feedback criteria on long-tailed datasets (ImageNet-LT, Places-LT) as well as a group-imbalanced dataset (NICO++). On ImageNet-LT, we achieve state-of-the-art results, with over 4% improvement on underrepresented classes while being twice as efficient in terms of the number of generated synthetic samples. Similarly, on Places-LT we achieve state-of-the-art results and nearly 4% improvement on underrepresented classes. NICO++ also enjoys marked boosts of over 5% in worst-group accuracy.
Researcher Affiliation Collaboration Reyhane Askari-Hemmat1,2,3, Mohammad Pezeshki1, Florian Bordes1,2,3, Michal Drozdzal1, Adriana Romero-Soriano1,2,4 1FAIR at Meta 2Mila 3Université de Montréal 4McGill University; Canada CIFAR AI Chair
Pseudocode No The paper describes methods using text and equations, and provides an 'Overview of our framework' in Figure 2, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code Yes Code: https://github.com/facebookresearch/Feedback-guided-Data-Synthesis
Open Datasets Yes We validate three feedback criteria on long-tailed datasets (ImageNet-LT, Places-LT) as well as a group-imbalanced dataset (NICO++). The ImageNet-Long Tail (ImageNet-LT) dataset (Liu et al., 2019) is a subset of the original ImageNet (Deng et al., 2009). To further study the effect of feedback guidance on different datasets, we study the Places-Long Tail dataset (Liu et al., 2019). We follow the sub-population shift setup of NICO++ (Zhang et al., 2023; Yang et al., 2023).
Dataset Splits Yes The ImageNet-Long Tail (ImageNet-LT) dataset (Liu et al., 2019) is a subset of the original ImageNet (Deng et al., 2009) consisting of 115.8K images distributed non-uniformly across 1,000 classes. In this dataset, the number of images per class ranges from a minimum of 5 to a maximum of 1,280. However, the test and validation sets are balanced. In line with related literature (Shin et al., 2023), our goal is to synthesize missing data points in a way that, when combined with the real data, results in a uniform distribution of examples across all classes. The Places-LT dataset consists of 365 classes, where the minimum number of examples in a class is 5 and the maximum is 4,980; its test and validation sets are likewise balanced across classes. We follow the sub-population shift setup of NICO++ (Zhang et al., 2023; Yang et al., 2023), which contains 62,657 training examples, 8,726 validation examples, and 17,483 test examples.
Hardware Specification Yes Results are reported in Table 5, averaged over 1,000 samples, computed with the same model on the same GPU machine (V100) without batch generation.
Software Dependencies No For lower computational complexity, we use the float16 datatype in PyTorch. The paper mentions software (PyTorch) and a data type, but does not specify a PyTorch version number.
Experiment Setup Yes For ResNeXt-50, our classifier is trained for 150 epochs; for ViT-B models, the classifier is trained with real data for a total of 100 epochs and then fine-tuned using real and synthetic data for another 10 epochs. To improve model scaling with synthetic data, we modify the training process to include 50% real and 50% synthetic samples in each mini-batch. We apply this balanced mini-batch approach when training all LDM methods. We also use the balanced Softmax loss (Ren et al., 2020) when training the classifier. For ImageNet-LT, the classifier is trained using ERM with a learning rate of 0.1 (decaying), weight decay of 0.0005, and batch size of 32. For NICO++, we use a classifier pre-trained on ImageNet and then fine-tune it on NICO++, using SGD with momentum 0.9 for 50 epochs, with a learning rate of 0.0028, weight decay of 0.0005, and batch size of 512. During generation, we apply dropout on image embeddings with probability 0.5. Furthermore, we apply 30 steps of DDIM with a CLIP-guidance scale of 3 and a feedback-guidance scale of 0.03.
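The quoted setup combines three ingredients: the balanced Softmax loss of Ren et al. (2020), which shifts each logit by the log of its class count; SGD with the stated learning rate, momentum, and weight decay; and mini-batches composed of 50% real and 50% synthetic samples. A minimal PyTorch sketch of one such training step is shown below; the toy model, feature dimension, and class counts are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, targets, class_counts):
    """Balanced Softmax loss (Ren et al., 2020): shift each logit by
    log(n_c), the training-set count of class c, so standard
    cross-entropy implicitly compensates for class imbalance."""
    adjusted = logits + torch.log(class_counts.float())
    return F.cross_entropy(adjusted, targets)

# --- hypothetical toy setup to illustrate the quoted hyperparameters ---
torch.manual_seed(0)
num_classes = 10
model = torch.nn.Linear(64, num_classes)  # stand-in for ResNeXt-50 / ViT-B

# SGD settings quoted in the paper: lr 0.1 (decaying, ImageNet-LT),
# momentum 0.9, weight decay 0.0005.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

# Long-tailed per-class counts (illustrative, not from the paper).
class_counts = torch.tensor([1280, 640, 320, 160, 80, 40, 20, 10, 5, 5])

# Balanced mini-batch: 50% real and 50% synthetic samples (batch size 32).
real_x = torch.randn(16, 64)
real_y = torch.randint(0, num_classes, (16,))
synth_x = torch.randn(16, 64)  # would come from the diffusion model
synth_y = torch.randint(0, num_classes, (16,))
x = torch.cat([real_x, synth_x])
y = torch.cat([real_y, synth_y])

# One training step.
optimizer.zero_grad()
loss = balanced_softmax_loss(model(x), y, class_counts)
loss.backward()
optimizer.step()
```

The logit shift means head classes need a larger raw logit to win, which is equivalent to re-weighting the softmax by class priors at training time while leaving inference unchanged.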