Label Distribution Changing Learning with Sample Space Expanding
Authors: Chao Xu, Hong Tao, Jing Zhang, Dewen Hu, Chenping Hou
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Besides evaluating LDCL on 13 existing data sets, we also apply it to the task of emotion distribution recognition. Experimental results demonstrate the effectiveness of our approach both in tackling the label ambiguity problem and in estimating facial emotion. |
| Researcher Affiliation | Academia | Chao Xu (EMAIL), College of Science, National University of Defense Technology; Hong Tao (EMAIL), College of Science, National University of Defense Technology; Jing Zhang (EMAIL), College of Science, National University of Defense Technology; Dewen Hu (EMAIL), College of Intelligence Science and Technology, National University of Defense Technology; Chenping Hou (EMAIL), College of Liberal Arts and Science, National University of Defense Technology |
| Pseudocode | Yes | Algorithm 1 (Graph): 1: Initialize M(0), p, λ and γ; 2: Calculate y(0) with p; 3: Calculate F; 4: while stopping criterion is not satisfied do 5: Solve p(t+1) by Eq.(22) and the equality 1⊤pᵢ = rᵢ; 6: Calculate y(t+1) with p(t+1); 7: Update M(t+1) by solving Eq.(18) using L-BFGS; 8: t = t + 1; 9: end while |
| Open Source Code | Yes | All the codes are shared by original authors, and we use the suggested default parameters. |
| Open Datasets | Yes | We evaluate our methods and comparative methods on the real-world data sets. There are in total 13 data sets including biology, movie ratings, emotional analysis and so on. The Yeast series datasets are real-world datasets collected from biological experiments with Saccharomyces cerevisiae. ... The Natural Scene dataset is derived from 2000 natural scenes, ... The Movie dataset is about user ratings of movies, ... More details of them can be found in the literature (Geng, 2016). |
| Dataset Splits | Yes | Without loss of generality, we divide the data into two parts, with 10% of the data as the test data and 90% as the training data. The training data is divided into two parts, 10% of the training data is relabeled data with emerging new labels, and 90% of the training data is un-relabeled data. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions "L-BFGS" which is an algorithm, but does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9) for reproducibility. |
| Experiment Setup | Yes | In the experiments, the parameters λ and γ are both selected by grid search from {10^-4, 10^-3, ..., 10^4} by cross-validation on training data. For Algorithm 2, we add a clustering parameter k. Similarly, we use the grid search method to select the best number of clusters from {1, 2, ..., 9} through cross-validation on training data. The maximum iteration is set to 100. The stopping criterion parameter ϵ is set to 10^-3. |
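The alternating scheme in Algorithm 1 can be sketched in Python. This is a minimal illustration only: the paper's actual update rules (Eq.(22) for p and the L-BFGS solve of Eq.(18) for M) are not reproduced here, so the p-step and M-step below are hypothetical stand-ins, and all shapes and helper names are assumptions.

```python
import numpy as np

def solve_ldcl_sketch(X, r, lam=1.0, max_iter=100, eps=1e-3):
    """Hypothetical sketch of Algorithm 1's alternating optimization.

    X : (n, d) feature matrix; r : (n,) per-sample mass constraints.
    The p-step and M-step are placeholders for Eq.(22) and Eq.(18),
    which are solved in closed form / with L-BFGS in the paper.
    """
    n, d = X.shape
    rng = np.random.default_rng(0)
    M = rng.normal(size=(d, 1))      # model parameters, M(0)
    p = np.full(n, 1.0 / n)          # label-mass variables, p(0)
    prev_obj = np.inf
    for t in range(max_iter):
        # p-step (stand-in for Eq.(22)): rescale positive scores so that
        # the total mass matches the constraint sum(p) = sum(r).
        scores = np.exp(np.clip(X @ M, -20, 20)).ravel()
        p = scores / scores.sum() * r.sum()
        # M-step (stand-in for the L-BFGS solve of Eq.(18)): here a
        # ridge-regularized least-squares update.
        M = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ p)[:, None]
        obj = np.linalg.norm(X @ M - p[:, None]) ** 2 + lam * np.linalg.norm(M) ** 2
        if abs(prev_obj - obj) < eps:  # stopping criterion, eps = 1e-3
            break
        prev_obj = obj
    return M, p
```

By construction, every p produced by the p-step satisfies the mass constraint exactly, while the M-step monotonically shrinks the stand-in objective; the real method replaces both updates with the paper's derived solutions.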
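The evaluation protocol described above (10% test / 90% train, 10% of the training data relabeled, and λ, γ searched on a log grid from 10^-4 to 10^4) can be sketched as follows. The scoring function is a placeholder, since the actual cross-validation metric is model-dependent; the function name is an assumption.

```python
import numpy as np
from itertools import product

def split_and_grid(X, seed=0):
    """Sketch of the reported protocol: split the data, then grid-search
    lambda and gamma over {10^-4, ..., 10^4}. The CV score is a placeholder."""
    rng = np.random.default_rng(seed)
    n = len(X)
    idx = rng.permutation(n)
    n_test = n // 10                       # 10% held out as test data
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    n_relabel = len(train_idx) // 10       # 10% of training data relabeled
    relabel_idx = train_idx[:n_relabel]
    unrelabel_idx = train_idx[n_relabel:]  # remaining 90% un-relabeled
    grid = [10.0 ** e for e in range(-4, 5)]
    best = None
    for lam, gamma in product(grid, grid):
        # Placeholder CV score; a real run would cross-validate the model
        # on the training split for each (lam, gamma) pair.
        score = -abs(np.log10(lam)) - abs(np.log10(gamma))
        if best is None or score > best[0]:
            best = (score, lam, gamma)
    return test_idx, relabel_idx, unrelabel_idx, best[1], best[2]
```

The nested 10%/90% splits mean only about 9% of all samples carry the emerging new labels, which is what makes the relabeling budget in the paper's setting small.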