Dataset Condensation with Color Compensation
Authors: Huyu Wu, Duo Su, Junjie Hou, Guang Li
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the superior performance and generalization of DC3, which outperforms SOTA methods across multiple benchmarks. To the best of our knowledge, besides focusing on downstream tasks, DC3 is the first research to fine-tune pre-trained diffusion models with condensed datasets. The Fréchet Inception Distance (FID) and Inception Score (IS) results prove that training networks with our high-quality datasets is feasible without model collapse or other degradation issues. (Supporting sections: 4 Experiments; 4.1 Experimental Settings; 4.2 Comparison with SOTA Methods; 4.3 Cross-architecture Generalization; 4.4 Ablation Study) |
| Researcher Affiliation | Academia | ¹University of Chinese Academy of Sciences, ²Tsinghua University, ³Hong Kong University of Science and Technology, ⁴Hokkaido University. *Corresponding Authors: EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Submodular Sampling. Input: C: data in bins (bins generated by clustering); M: number of data bins; N: images per class (IPC). 1: for C_j in C do 2: for x_k in C_j do 3: compute G(x_k) according to Eq. (4) 4: end for 5: S_j = ∅ 6: C'_j = sort(C_j, G, desc) 7: if N < M then 8: M = N 9: end if 10: S_j = C'_j[0 : N/M]  # selection 11: end for 12: S = ∪_{j=1}^{M} S_j. Output: S: the selected sample set. |
| Open Source Code | Yes | Code and generated data are available at https://github.com/528why/Dataset-Condensation-with-Color-Compensation. |
| Open Datasets | Yes | For large-scale datasets, we include ImageNet-1K (Deng et al., 2009) (224×224) and its subsets, such as Tiny-ImageNet (64×64). Small-scale low-resolution (32×32) analysis utilizes CIFAR-10/100 (Krizhevsky et al., 2009). To quantify task difficulty sensitivity, we benchmark ImageNette and ImageWoof, the subsets of ImageNet with 10 classes. |
| Dataset Splits | Yes | For large-scale datasets, we include ImageNet-1K (Deng et al., 2009) (224×224) and its subsets, such as Tiny-ImageNet (64×64). Small-scale low-resolution (32×32) analysis utilizes CIFAR-10/100 (Krizhevsky et al., 2009). To quantify task difficulty sensitivity, we benchmark ImageNette and ImageWoof, the subsets of ImageNet with 10 classes. Note that ImageWoof poses greater discrimination challenges due to higher inter-class similarity. We use Stable Diffusion-V1.5 and DiT-XL/2-256 as our foundation models. Following the prior works (Sun et al., 2024; Chen et al., 2025), we set IPC to 1, 10, and 50. |
| Hardware Specification | Yes | Performance validation is conducted using PyTorch on 8 NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper mentions using PyTorch for performance validation and Stable Diffusion-V1.5 and DiT-XL/2-256 as foundation models. However, it does not specify version numbers for PyTorch or other software libraries. |
| Experiment Setup | Yes | Table 11: Evaluation details for different datasets. (a) ImageNet: guidance scale 4, network ResNet18, input size 224, optimizer AdamW, learning rate 0.001, weight decay 0.01. (b) CIFAR-10 and CIFAR-100: identical settings except input size 32. (c) ImageWoof and ImageNette: identical settings with input size 224. |
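The submodular sampling pseudocode in the table above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: `gain_fn` stands in for the paper's gain function G(x) (Eq. 4 of the paper), and the guard restricting selection to the first `m` bins when IPC is smaller than the bin count is an assumed reading of the pseudocode's `if N < M then M = N` step.

```python
def submodular_sampling(bins, gain_fn, ipc):
    """Select ~IPC samples for one class by taking the top N/M
    highest-gain samples from each cluster bin.

    bins    : list of per-class cluster bins (C in the pseudocode)
    gain_fn : scores a sample; stand-in for the paper's G(x), Eq. (4)
    ipc     : images per class (N in the pseudocode)
    """
    m = len(bins)                 # M: number of data bins
    if ipc < m:                   # assumed reading of the "if N < M" guard
        m = ipc
    selected = []
    for bin_j in bins[:m]:
        # C'_j = sort(C_j, G, desc): rank samples by gain, descending
        ranked = sorted(bin_j, key=gain_fn, reverse=True)
        # S_j = C'_j[0 : N/M]: keep the top N/M samples from this bin
        selected.extend(ranked[: max(1, ipc // m)])
    return selected               # S = union of all S_j
```

With identity gain and two bins, `submodular_sampling([[1.0, 5.0, 3.0], [2.0, 4.0]], lambda x: x, 2)` picks the single highest-gain sample from each bin.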
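The evaluation settings in Table 11 share every hyperparameter except the input size, so they can be expressed compactly as config dictionaries. The dictionary keys and structure below are illustrative choices, not the paper's code; the values come directly from Table 11.

```python
# Shared evaluation hyperparameters (Table 11); only input_size varies.
COMMON = {
    "guidance_scale": 4,
    "network": "ResNet18",
    "optimizer": "AdamW",
    "learning_rate": 1e-3,
    "weight_decay": 0.01,
}

EVAL_CONFIGS = {
    "ImageNet":   {**COMMON, "input_size": 224},
    "CIFAR-10":   {**COMMON, "input_size": 32},
    "CIFAR-100":  {**COMMON, "input_size": 32},
    "ImageWoof":  {**COMMON, "input_size": 224},
    "ImageNette": {**COMMON, "input_size": 224},
}
```

Factoring the shared settings into `COMMON` makes the single point of variation (32×32 for CIFAR, 224×224 for the ImageNet-derived sets) explicit.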