Taming Diffusion for Dataset Distillation with High Representativeness

Authors: Lin Zhao, Yushu Wu, Xinru Jiang, Jianyang Gu, Yanzhi Wang, Xiaolin Xu, Pu Zhao, Xue Lin

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive experiments demonstrate that D3HR can achieve higher accuracy across different model architectures compared with state-of-the-art baselines in dataset distillation. (Sections 5 Main Results, 5.1 Experimental Details, 5.2 Comparison with State-of-the-art Methods, 5.3 Cross-architecture Generalization)
Researcher Affiliation | Academia | Northeastern University; The Ohio State University. Correspondence to: Pu Zhao <EMAIL>, Xue Lin <EMAIL>.
Pseudocode | Yes | Algorithm 1: D3HR Algorithm
Open Source Code | Yes | Source code: https://github.com/lin-zhao-resoLve/D3HR
Open Datasets | Yes | For small-scale datasets, we use CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) at 32×32 resolution. For large-scale datasets, we use Tiny-ImageNet (Le & Yang, 2015) with 200 classes (500 images per class, 64×64) and ImageNet-1K (Deng et al., 2009) with 1,000 classes (1M images, 224×224 resolution). (A hedged dataset-loading sketch follows the table.)
Dataset Splits | Yes | Experiments are conducted on both small-scale and large-scale datasets. For small-scale datasets, we use CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) at 32×32 resolution. For large-scale datasets, we use Tiny-ImageNet (Le & Yang, 2015) with 200 classes (500 images per class, 64×64) and ImageNet-1K (Deng et al., 2009) with 1,000 classes (1M images, 224×224 resolution).
Hardware Specification | Yes | All experiments are conducted on NVIDIA RTX A6000 GPUs or NVIDIA A100 40GB GPUs.
Software Dependencies | No | We adopt the pre-trained Diffusion Transformer (DiT) and VAE from Peebles & Xie (2023) in our framework, originally trained on ImageNet-1K. (No specific software versions for libraries such as PyTorch, TensorFlow, or Python are mentioned. A hedged checkpoint-loading sketch follows the table.)
Experiment Setup | Yes | For validation, the parameter settings vary slightly across methods. We adhere to the configurations in (Sun et al., 2024), as detailed in Table A1: Optimizer AdamW; Learning Rate 0.01; Weight Decay 0.01; Batch Size 128; Augmentation Random Resized Crop + Horizontal Flip; LR Scheduler Cosine Annealing; Temperature 20; Epochs 400 (CIFAR-10), 400 (CIFAR-100), 300 (Tiny-ImageNet), 300 (ImageNet-1K). (A hedged training-loop sketch follows the table.)
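
The Open Datasets and Dataset Splits rows name the four benchmarks and their resolutions. Below is a minimal torchvision sketch of loading them at those resolutions; the data roots and the ImageFolder-style directory layout for Tiny-ImageNet and ImageNet-1K are assumptions, as the paper does not specify them.

```python
# Sketch: loading the evaluation datasets at the resolutions quoted above.
# Paths and folder layouts are assumptions, not taken from the paper.
import torchvision.transforms as T
from torchvision import datasets

cifar10 = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=T.ToTensor(),                      # images are natively 32x32
)
cifar100 = datasets.CIFAR100(
    root="./data", train=True, download=True,
    transform=T.ToTensor(),                      # images are natively 32x32
)
# Tiny-ImageNet (200 classes, 64x64) and ImageNet-1K (1,000 classes, 224x224)
# are typically read from ImageFolder-style directories.
tiny_imagenet = datasets.ImageFolder(
    "./data/tiny-imagenet-200/train",
    transform=T.Compose([T.Resize(64), T.CenterCrop(64), T.ToTensor()]),
)
imagenet1k = datasets.ImageFolder(
    "./data/imagenet/train",
    transform=T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()]),
)
```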
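The Software Dependencies row states that the framework uses the pre-trained DiT and VAE from Peebles & Xie (2023), trained on ImageNet-1K. The sketch below shows one way to obtain such weights via the Hugging Face diffusers library; the specific checkpoint name and the use of diffusers (rather than the original DiT repository) are assumptions, not something the paper specifies.

```python
# Sketch: loading a pre-trained class-conditional DiT and its VAE.
# The "facebook/DiT-XL-2-256" checkpoint and diffusers itself are assumptions;
# the paper only says the DiT/VAE of Peebles & Xie (2023) are used.
import torch
from diffusers import DiTPipeline

pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

dit = pipe.transformer   # Diffusion Transformer backbone (ImageNet-1K, 256x256)
vae = pipe.vae           # KL-regularized VAE mapping images to/from latents

# Example: encode a batch of 256x256 images into the DiT latent space.
images = torch.randn(4, 3, 256, 256, dtype=torch.float16, device="cuda")
with torch.no_grad():
    latents = vae.encode(images).latent_dist.sample() * vae.config.scaling_factor
```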
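The Experiment Setup row summarizes the Table A1 validation recipe (AdamW, learning rate 0.01, weight decay 0.01, batch size 128, cosine annealing, random resized crop + horizontal flip, temperature 20, 300-400 epochs). A minimal PyTorch sketch of such a loop is given below; the model, the teacher network, and the interpretation of the temperature as a soft-label distillation temperature (in the style of Sun et al., 2024) are assumptions for illustration only.

```python
# Sketch of the Table A1 validation recipe; the KD-style soft-label loss and
# the teacher/model arguments are assumptions, not the authors' exact code.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def validate_distilled_set(model, teacher, distilled_set, epochs=400, temperature=20.0):
    # distilled_set is assumed to apply RandomResizedCrop + HorizontalFlip in its transform.
    loader = DataLoader(distilled_set, batch_size=128, shuffle=True, num_workers=8)
    optimizer = torch.optim.AdamW(model.parameters(), lr=0.01, weight_decay=0.01)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

    teacher.eval()
    model.train()
    for _ in range(epochs):
        for images, _ in loader:
            images = images.cuda(non_blocking=True)
            with torch.no_grad():
                soft_targets = F.softmax(teacher(images) / temperature, dim=1)
            logits = model(images)
            loss = F.kl_div(
                F.log_softmax(logits / temperature, dim=1),
                soft_targets,
                reduction="batchmean",
            ) * temperature ** 2
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```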