GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data

Authors: Zhiteng Li, Lele Chen, Jerone Andrews, Yunhao Ba, Yulun Zhang, Alice Xiang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments across several image classification tasks demonstrate the effectiveness of our approach. We evaluate GenDataAgent in a supervised learning setting, following prior work (Yuan et al., 2023; Sarıyıldız et al., 2023; He et al., 2022). Our experiments cover two scenarios: (i) training a classifier using synthetic data alone, and (ii) using synthetic data to augment real data.
Researcher Affiliation | Collaboration | 1) Shanghai Jiao Tong University, 2) Sony AI
Pseudocode | Yes | Pseudocode for GenDataAgent is presented in Algorithm 1.
Open Source Code | Yes | https://github.com/SonyResearch/GenDataAgent
Open Datasets | Yes | We evaluate GenDataAgent on ImageNet-100 (IN100) (Tian et al., 2020) and five fine-grained datasets: Oxford-IIIT Pets (Parkhi et al., 2012), Flowers-102 (Nilsback & Zisserman, 2008), Birdsnap (Berg et al., 2014), CUB-200-2011 (Wah et al., 2011), and Food-101 (Bossard et al., 2014).
Dataset Splits | Yes | Table 11: Dataset statistics (# Training Samples, # Test Samples).
Hardware Specification | No | The paper mentions "GPU Hours" in Figure 4 as a metric for time taken, but does not specify any particular GPU models (e.g., NVIDIA A100, RTX 3090) or other hardware components used for the experiments.
Software Dependencies | No | The paper mentions models such as Stable Diffusion v1.5, Llama-2, BLIP-2, and CLIP, but does not specify any programming languages (e.g., Python), frameworks (e.g., PyTorch, TensorFlow), or library versions (e.g., scikit-learn 1.0) that would be needed for replication.
Experiment Setup | Yes | Training hyperparameters for the downstream classifier are listed in Table 12.

Table 12: Training hyperparameters for downstream classification.

                        Pets        CUB         Flowers     Birdsnap    Food        IN100
On-the-fly Iterations   20          20          20          20          20          20
Train Res               224         448         224         224         224         224
Test Res                224         448         224         224         224         224
Epochs                  200         200         200         200         200         200
Batch Size              128 / 8     64 / 8      128 / 8     128 / 8     128 / 8     128 / 8
Optimizer               SGD         SGD         SGD         SGD         SGD         SGD
Learning Rate           0.1         0.2         0.1         0.1         0.1         0.1
LR Decay                Multistep   Multistep   Multistep   Multistep   Multistep   Multistep
Decay Rate              0.2         0.2         0.2         0.2         0.2         0.2
Decay Epochs            50/100/150  50/100/150  50/100/150  50/100/150  50/100/150  50/100/150
Weight Decay            5e-4        5e-4        5e-4        5e-4        5e-4        5e-4
Mixed Precision
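The two evaluation scenarios noted under Research Type (synthetic-only training vs. augmenting real data with synthetic data) can be sketched as follows. This is a minimal illustration, not the authors' code; the dataset contents and the function name `build_training_set` are hypothetical placeholders.

```python
def build_training_set(real, synthetic, scenario):
    """Return the training pool for a given evaluation scenario.

    scenario: "synthetic_only" -> train on generated samples alone;
              "augment"        -> mix synthetic samples into the real set.
    """
    if scenario == "synthetic_only":
        return list(synthetic)
    if scenario == "augment":
        return list(real) + list(synthetic)
    raise ValueError(f"unknown scenario: {scenario}")


# Hypothetical placeholder data for illustration only.
real = ["real_img_1", "real_img_2"]
synthetic = ["syn_img_1"]

print(len(build_training_set(real, synthetic, "synthetic_only")))  # 1
print(len(build_training_set(real, synthetic, "augment")))         # 3
```

In a real pipeline the same split would typically be expressed with dataset concatenation (e.g., combining two image folders) rather than Python lists.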
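The multistep schedule in Table 12 decays the learning rate by a factor of 0.2 at epochs 50, 100, and 150. A small re-implementation of that rule, assuming the common convention that the decay applies from the milestone epoch onward (the base LR of 0.1 matches most datasets in the table; CUB uses 0.2):

```python
import bisect


def multistep_lr(epoch, base_lr=0.1, decay=0.2, milestones=(50, 100, 150)):
    """Learning rate at `epoch` under multistep decay.

    Each milestone that has been reached multiplies the base LR
    by the decay factor once.
    """
    passed = bisect.bisect_right(sorted(milestones), epoch)
    return base_lr * decay ** passed


# Epochs 0-49 use 0.1; 50-99 use 0.02; 100-149 use 0.004; 150+ use 0.0008.
for epoch in (0, 50, 100, 150):
    print(epoch, multistep_lr(epoch))
```

The same schedule is what a framework-level multistep scheduler (e.g., PyTorch's `MultiStepLR` with `gamma=0.2`) would produce.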