Adaptive Deep Learning from Crowds

Authors: Hang Yang, Zhiwu Li, Witold Pedrycz

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type Experimental Extensive experiments are conducted on real-world datasets, LabelMe and CIFAR-10H. The experimental results, e.g., the reduction of annotations without performance degradation, demonstrate the effectiveness. Through empirical evaluation and visualization on real-world datasets, including LabelMe and CIFAR-10H, we showcase the effectiveness of AdaCrowd.
Researcher Affiliation Academia 1) Macau Institute of Systems Engineering, Macau University of Science and Technology; 2) Department of Electrical and Computer Engineering, University of Alberta; 3) Systems Research Institute, Polish Academy of Sciences
Pseudocode Yes Algorithm 1: Pseudo-code of AdaCrowd
Input: Instance pool X.
Output: Target classifiers f(T) and the worker transition matrices {A^r}_{r=1}^{R}.
1: Initialize classifiers f(0) and the worker transition matrices {A^r}_{r=1}^{R} with identity weights.
2: for t = 1, ..., T do
3:     for each worker r in parallel do
4:         Infer all instances with the evidential model.
5:         Compute the uncertainty by Eq. (14).
6:         Compute the accumulated belief by Eq. (15).
7:         Select an instance by Eq. (16).
8:         Obtain the annotation y_{ri}.
9:     end for
10:    for E epochs do
11:        Update the parameters of the classifier and worker matrices by Eq. (13).
12:    end for
13: end for
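The per-worker selection step (lines 4-7 above) can be sketched in plain Python. This is a hedged illustration, not the authors' implementation: it assumes a standard subjective-logic evidential model in which, for K classes with non-negative evidence e_k, the uncertainty mass is u = K / (K + Σ e_k), and it stands in for the paper's exact Eqs. (14)-(16), which may differ. All function names are illustrative.

```python
# Hedged sketch of AdaCrowd's uncertainty-driven instance selection
# (Algorithm 1, lines 4-7). Assumes subjective-logic uncertainty
# u = K / (K + sum(evidence)); the paper's Eqs. (14)-(16) may differ.

def evidential_uncertainty(evidence):
    """Uncertainty mass of a Dirichlet opinion: u = K / S, with S = K + sum(e_k)."""
    k = len(evidence)
    return k / (k + sum(evidence))

def select_instance(pool_evidence, labeled):
    """Pick the not-yet-labeled instance whose prediction is most uncertain."""
    candidates = [i for i in range(len(pool_evidence)) if i not in labeled]
    return max(candidates, key=lambda i: evidential_uncertainty(pool_evidence[i]))

# Toy pool: per-instance evidence vectors over 3 classes.
pool = [
    [9.0, 0.5, 0.5],   # confident prediction -> low uncertainty
    [1.0, 1.0, 1.0],   # ambiguous prediction -> high uncertainty
    [5.0, 4.0, 0.1],   # moderately confident
]
picked = select_instance(pool, labeled=set())  # -> 1 (the ambiguous instance)
```

In the full algorithm the selected instance is then annotated by the worker and the classifier and worker matrices are updated (lines 8-12); this sketch covers only the selection criterion.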
Open Source Code Yes Our models are implemented with the PyTorch library, and the code is released on our repository: https://github.com/MISE-MUST/AdaCrowd
Open Datasets Yes The experiments are conducted on crowdsourcing datasets for image classification: LabelMe and CIFAR-10H (download links are given in the paper's footnotes). LabelMe [Russell et al., 2008; Rodrigues and Pereira, 2018] is an open-source dataset collected from Amazon Mechanical Turk. CIFAR-10H [Battleday et al., 2020] is a subset of the well-known CIFAR-10 dataset.
Dataset Splits Yes In the LabelMe dataset, the augmented training set contains 10,000 images, the validation set contains 500 images, and the testing set contains 1,188 images. In the CIFAR-10H dataset, the training and validation sets contain 10,000 images from the original CIFAR-10 test set, and 2,000 unseen images from the CIFAR-10 dataset are sampled for evaluation.
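The CIFAR-10H split above can be sketched as an index partition. Note the sampling protocol below (uniform random with a fixed seed) is an assumption for illustration; the paper does not state how the 2,000 evaluation images are drawn.

```python
import random

def sample_unseen_eval(candidate_indices, n_eval=2000, seed=0):
    """Sample unseen CIFAR-10 images for evaluation.

    CIFAR-10H annotations cover the 10,000-image CIFAR-10 test set, which
    serves as training/validation here; evaluation images are drawn from
    the disjoint CIFAR-10 training set. Random sampling with a fixed seed
    is an assumption, not the authors' documented protocol.
    """
    rng = random.Random(seed)
    return rng.sample(candidate_indices, n_eval)

train_val_ids = list(range(10_000))                 # CIFAR-10 test-set indices
eval_ids = sample_unseen_eval(list(range(50_000)))  # drawn from CIFAR-10 train set
```

Because the two index sets come from disjoint parts of CIFAR-10, the 2,000 evaluation images are guaranteed to be unseen during training and validation.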
Hardware Specification No No specific hardware details (like GPU models, CPU types, or memory amounts) are mentioned in the paper. It only states the use of PyTorch and a VGG-16 backbone model, which are software and model architectures, respectively, not hardware specifications.
Software Dependencies No The paper states, 'Our models are implemented with the PyTorch library,' but does not provide a specific version number for PyTorch or any other software dependency. Therefore, it lacks the version details required for reproducibility.
Experiment Setup Yes The trade-off parameter λ increases with the training step: λ(t) = min(1, t/Tw). The learning rate is selected from [0.0005, 0.001, 0.005, 0.01], the weight decay from [0, 0.003, 0.009, 0.01], the annealing step Tw from [5, 10, 15, 20], and the number of epochs per step E from the range [1, 5]. Based on the validation set, the learning rate is set to 0.001, the weight decay to 0, E to 2, and Tw to 5.
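The annealing schedule λ(t) = min(1, t/Tw) can be written out directly; the function name is illustrative, but the formula is taken from the setup above.

```python
def trade_off_lambda(t, t_w):
    """Annealing schedule for the trade-off parameter: lambda(t) = min(1, t / Tw)."""
    return min(1.0, t / t_w)

# With the chosen Tw = 5, lambda ramps linearly from 0 to 1 over the
# first five training steps, then stays saturated at 1.
schedule = [trade_off_lambda(t, 5) for t in range(7)]
# -> [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.0]
```

Such a warm-up keeps the λ-weighted term from dominating early training, when the classifier's predictions are still unreliable.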