Exploring Learning Complexity for Efficient Downstream Dataset Pruning

Authors: Wenyu Jiang, Zhenlong Liu, Zejian Xie, Songxin Zhang, Bingyi Jing, Hongxin Wei

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments with downstream image and instruction dataset pruning benchmarks demonstrate the effectiveness and efficiency of the proposed approach. In the image pruning benchmark, DLC significantly reduces the pruning time by 35x while establishing state-of-the-art performance with FlexRand.
Researcher Affiliation Academia (1) Department of Statistics and Data Science, Southern University of Science and Technology; (2) State Key Laboratory for Novel Software Technology, Nanjing University
Pseudocode No The paper describes methods and formulas but does not include a clearly labeled pseudocode or algorithm block with structured steps.
Open Source Code Yes The code is available in the supplementary material.
Open Datasets Yes Therefore, we choose diverse downstream datasets from 5 domains (Islam et al., 2021) to construct the large-scale benchmark, including CXRB102, DeepWeeds (Olsen et al., 2019), DTD (Cimpoi et al., 2014), FGVC-Aircraft (Maji et al., 2013), and Sketch (Eitz et al., 2012). For hyperparameter tuning, we split 20% as the validation set. ... Alpaca Cleaned (Taori et al., 2023) and Dolly & HH-RLHF.
Dataset Splits Yes For hyperparameter tuning, we split 20% as the validation set. ... We prune the downstream datasets at 9 pruning ratios, ranging from 10% to 90%, for a thorough verification and comparison. For example, we remove 10% of each category in the original dataset when the pruning ratio is 10%.
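The quoted split protocol prunes each category independently, so class balance is preserved at every pruning ratio. A minimal stratified-subsampling sketch of that idea, assuming a pruning ratio of p means removing fraction p of every class (function and parameter names are illustrative, not from the paper's code):

```python
import random
from collections import defaultdict

def stratified_prune(labels, prune_ratio, seed=0):
    """Return the kept sample indices after removing `prune_ratio`
    of each class, so class proportions are preserved.

    Illustrative sketch only; the paper's released code may differ.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    kept = []
    for idxs in by_class.values():
        rng.shuffle(idxs)  # random subset within each class
        n_keep = int(round(len(idxs) * (1.0 - prune_ratio)))
        kept.extend(idxs[:n_keep])
    return sorted(kept)
```

For example, with 10 samples per class and a 10% pruning ratio, 9 samples per class are kept. The paper's DLC method would replace the random within-class selection with a ranking by learning complexity.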
Hardware Specification Yes The code is based on PyTorch and all the experiments run on NVIDIA L40 GPUs.
Software Dependencies No To ensure reliable reproduction, we have run the compared baselines using the DeepCore (Guo et al., 2022) library. The code is based on PyTorch (Paszke et al., 2019). ... Regarding the pre-trained models and instruction datasets, we use the Hugging Face library. Fine-tuning is based on the PEFT library, and evaluation is based on the LM-Eval library.
Experiment Setup Yes Fine-tuning. We sequentially attach a linear layer on top of the pre-trained encoder for the downstream image classification. Then, the above classifier is fully trained on the pruned dataset for 50 epochs using SGD with a momentum of 0.9, a weight decay of 1e-5, and a batch size of 128. The initial learning rate is 1e-3 and decays by a factor of 10 at the 25th and 37th epochs. ... We fine-tune the base model for 3 epochs using SGD with a batch size of 32, a momentum of 0.9, a learning rate of 7e-6 scheduled by a cosine function, and a weight decay of 0.01. Note that the learning rate increases linearly at the warmup stage (the first 100 steps).
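The quoted setup describes two learning-rate schedules: a step decay for the image classifier and a warmup-plus-cosine schedule for instruction tuning. A minimal sketch of both, using the hyperparameters quoted above (the `total_steps` argument and function names are assumptions for illustration; in PyTorch these would typically be `MultiStepLR` and a warmup-wrapped cosine scheduler):

```python
import math

def image_lr(epoch, base_lr=1e-3, milestones=(25, 37), gamma=0.1):
    """Step schedule for the image experiments: start at 1e-3 and
    decay by a factor of 10 at the 25th and 37th epochs."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

def llm_lr(step, total_steps, base_lr=7e-6, warmup_steps=100):
    """Instruction-tuning schedule: linear warmup over the first
    100 steps, then cosine decay toward zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Under this sketch, the image learning rate is 1e-3 for epochs 0-24, 1e-4 for epochs 25-36, and 1e-5 afterward, while the LLM rate ramps to 7e-6 by step 100 and then follows a cosine curve.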