ELFS: Label-Free Coreset Selection with Proxy Training Dynamics

Authors: Haizhong Zheng, Elisa Tsai, Yifu Lu, Jiachen Sun, Brian Bartoldson, Bhavya Kailkhura, Atul Prakash

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate ELFS on four vision benchmarks and show that, given the same vision encoder, ELFS consistently outperforms SOTA label-free baselines. For instance, when using SwAV as the encoder, ELFS outperforms D2 by up to 10.2% in accuracy on ImageNet-1K. |
| Researcher Affiliation | Academia | 1: University of Michigan; 2: Lawrence Livermore National Laboratory |
| Pseudocode | No | The paper describes the methodology in text and illustrates the pipeline in Figure 2, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make our code publicly available on GitHub: https://github.com/eltsai/elfs |
| Open Datasets | Yes | We evaluate ELFS on four vision benchmarks: CIFAR10, CIFAR100 (Krizhevsky et al., 2009), STL10 (Coates et al., 2011), and ImageNet-1K (Deng et al., 2009). |
| Dataset Splits | Yes | After generating the pseudo-labeled dataset, we split it into 90% for training and 10% for validation. We use the validation set to determine the optimal β. |
| Hardware Specification | Yes | The grid search time for a single pruning rate is approximately 17 hours using four A6000 GPUs. |
| Software Dependencies | No | The paper mentions an AdamW optimizer, an SGD optimizer, and a cosine annealing learning rate scheduler, but does not provide version numbers for any software libraries or frameworks such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | For pseudo-label generation, we use the training settings recommended in TEMI (Adaloglou et al., 2023). ... Training is conducted over 200 epochs with a batch size of 512, using an AdamW optimizer (Loshchilov & Hutter, 2017b) with a learning rate of 0.0001 and a weight decay of 0.0001. ... CIFAR10 and CIFAR100: We use a ResNet18 model for 40,000 iterations, with a batch size of 256 and SGD optimizer settings that include 0.9 momentum and 0.0002 weight decay. The initial learning rate is set at 0.1 with a cosine annealing learning rate scheduler (Loshchilov & Hutter, 2017a). |
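The Experiment Setup row quotes a cosine annealing schedule (Loshchilov & Hutter, 2017a) that decays an initial learning rate of 0.1 over 40,000 iterations. As a minimal sketch of what that schedule computes (assuming a minimum learning rate of 0, the scheduler's common default; the paper does not state it):

```python
import math

def cosine_annealing_lr(step, total_steps=40_000, lr_max=0.1, lr_min=0.0):
    """Cosine annealing: decay from lr_max to lr_min over total_steps.

    Assumed defaults match the quoted setup (lr_max=0.1, 40,000 iterations);
    lr_min=0.0 is an assumption, not stated in the paper.
    """
    cos_factor = 0.5 * (1 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos_factor

print(cosine_annealing_lr(0))        # starts at lr_max = 0.1
print(cosine_annealing_lr(20_000))   # halfway through the cosine decay
print(cosine_annealing_lr(40_000))   # reaches lr_min at the final step
```

In a PyTorch training loop this would typically be handled by `torch.optim.lr_scheduler.CosineAnnealingLR` wrapped around the SGD optimizer rather than computed by hand.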