Distilling Datasets Into Less Than One Image

Authors: Asaf Shul, Eliahu Horwitz, Yedid Hoshen

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Performing extensive experiments that demonstrate the effectiveness of PoDD with as low as 0.3 IPC and achieving a new SoTA on the well-established 1 IPC benchmark.
Researcher Affiliation | Academia | Asaf Shul (EMAIL), The Hebrew University of Jerusalem; Eliahu Horwitz (EMAIL), The Hebrew University of Jerusalem; Yedid Hoshen (EMAIL), The Hebrew University of Jerusalem
Pseudocode | Yes | Algorithm 1 (PoCO): pseudocode for PoCO class ordering; Algorithm 2 (PoDD): pseudocode using PoDDL learned labels
Open Source Code | No | Project page: https://horwitz.ai/podd/ Explanation: The paper provides a 'Project page' URL, but it does not explicitly state that the source code is hosted there or provide a direct link to a code repository.
Open Datasets | Yes | We evaluate PoDD on four datasets commonly used to benchmark dataset distillation methods: i) CIFAR-10: 10 classes, 50k images of size 32×32×3 (Krizhevsky et al., 2009). ii) CIFAR-100: 100 classes, 50k images of size 32×32×3 (Krizhevsky et al., 2009). iii) CUB200: 200 classes, 6k images of size 32×32×3 (Welinder et al., 2010). iv) Tiny-ImageNet: 200 classes, 100k images of size 64×64×3 (Le & Yang, 2015).
Dataset Splits | Yes | Following the protocol of (Zhao & Bilen, 2021; Deng & Russakovsky, 2022), we evaluate the distilled poster using a set of 8 different randomly initialized models with the same ConvNet (Gidaris & Komodakis, 2018) architecture used by DSA, DM, MTT, RaT-BPTT, and others. ... The resulting amount of images from each class in our experiment are: [5000, 4750, 4500, 4250, 4000, 3750, 3500, 3250, 3000, 2750]. We did not modify the test set.
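The per-class counts quoted above follow a simple linear schedule, decreasing by 250 images per class from 5000 down to 2750. A minimal sketch reproducing that list (the helper name `per_class_counts` is ours, not from the paper):

```python
def per_class_counts(num_classes=10, start=5000, step=250):
    """Linearly decreasing per-class image counts: start, start-step, ...

    Hypothetical helper; defaults match the CIFAR-10 split quoted above.
    """
    return [start - step * i for i in range(num_classes)]

counts = per_class_counts()
print(counts)
# [5000, 4750, 4500, 4250, 4000, 3750, 3500, 3250, 3000, 2750]
```

Summing the schedule gives 38,750 training images in total, versus the full 50k CIFAR-10 training set.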
Hardware Specification | Yes | To fit the distillation into a single GPU (we use an NVIDIA A40), we use the maximal batch size we can fit into memory for a given dataset.
Software Dependencies | No | We use the same distillation hyper-parameters used by RaT-BPTT (Feng et al., 2023) except for the batch sizes. Explanation: The paper mentions using the hyperparameters from another method but does not specify any software names with version numbers for their own implementation (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | Concretely, we use: i) CIFAR-10: p = 96 (16×6) patches, bsd = 96, bs = 5000, 4k epochs. ii) CIFAR-100: p = 400 (20×20) patches, bsd = 50, bs = 2000, 2k epochs. iii) CUB200: p = 1800 (60×30) patches, bsd = 200, bs = 3000, 8k epochs. iv) Tiny-ImageNet: p = 800 (40×20) patches, bsd = 30, bs = 500, 500 epochs. We use the learned labels variant of PoDDL for all of our experiments... We use a learning rate of 0.001 for CIFAR-10, CIFAR-100, and CUB200. To fit Tiny-ImageNet into a single GPU, we use a much smaller batch size and a learning rate of 0.0005.
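The per-dataset settings quoted above can be consolidated into one place. This is a hypothetical summary, not the authors' code: the field names echo the paper's notation (p is implied by the patch grid, bsd and bs are the quoted batch sizes, lr the learning rate), while the dict layout and the name `PODD_SETUP` are our own.

```python
# Hypothetical consolidation of the quoted PoDD hyper-parameters.
# "grid" is the patch grid (rows, cols); its product equals the
# reported patch count p for each dataset.
PODD_SETUP = {
    "CIFAR-10":      {"grid": (16, 6),  "bsd": 96,  "bs": 5000, "epochs": 4000, "lr": 1e-3},
    "CIFAR-100":     {"grid": (20, 20), "bsd": 50,  "bs": 2000, "epochs": 2000, "lr": 1e-3},
    "CUB200":        {"grid": (60, 30), "bsd": 200, "bs": 3000, "epochs": 8000, "lr": 1e-3},
    "Tiny-ImageNet": {"grid": (40, 20), "bsd": 30,  "bs": 500,  "epochs": 500,  "lr": 5e-4},
}

# Sanity check: each grid multiplies out to the reported patch count p.
patch_counts = {name: cfg["grid"][0] * cfg["grid"][1] for name, cfg in PODD_SETUP.items()}
print(patch_counts)
# {'CIFAR-10': 96, 'CIFAR-100': 400, 'CUB200': 1800, 'Tiny-ImageNet': 800}
```

A table like this makes the internal consistency of the setup easy to verify: every grid shape matches the patch count p stated in the paper.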