Label Distribution Changing Learning with Sample Space Expanding
Authors: Chao Xu, Hong Tao, Jing Zhang, Dewen Hu, Chenping Hou
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Besides evaluating LDCL on 13 existing data sets, we also apply it to the task of emotion distribution recognition. Experimental results demonstrate the effectiveness of our approach both in tackling the label ambiguity problem and in estimating facial emotion. |
| Researcher Affiliation | Academia | Chao Xu (EMAIL), College of Science, National University of Defense Technology; Hong Tao (EMAIL), College of Science, National University of Defense Technology; Jing Zhang (EMAIL), College of Science, National University of Defense Technology; Dewen Hu (EMAIL), College of Intelligence Science and Technology, National University of Defense Technology; Chenping Hou (EMAIL), College of Liberal Arts and Science, National University of Defense Technology |
| Pseudocode | Yes | Algorithm 1 (Graph): 1: Initialize M(0), p, λ and γ; 2: Calculate y(0) with p; 3: Calculate F; 4: while stopping criterion is not satisfied do 5: Solve p(t+1) by Eq.(22) and the equality 1⊤pᵢ = rᵢ; 6: Calculate y(t+1) with p(t+1); 7: Update M(t+1) by solving Eq.(18) using L-BFGS; 8: t = t + 1; 9: end while |
| Open Source Code | Yes | All the codes are shared by original authors, and we use the suggested default parameters. |
| Open Datasets | Yes | We evaluate our methods and comparative methods on the real-world data sets. There are in total 13 data sets including biology, movie ratings, emotional analysis and so on. The Yeast series datasets are real-world datasets collected from biological experiments with Saccharomyces cerevisiae. ... The Natural Scene dataset is derived from 2000 natural scenes, ... The Movie dataset is about user ratings of movies, ... More details of them can be found in the literature (Geng, 2016). |
| Dataset Splits | Yes | Without loss of generality, we divide the data into two parts, with 10% of the data as the test data and 90% as the training data. The training data is divided into two parts, 10% of the training data is relabeled data with emerging new labels, and 90% of the training data is un-relabeled data. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions "L-BFGS" which is an algorithm, but does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9) for reproducibility. |
| Experiment Setup | Yes | In the experiments, the parameters λ and γ are both selected by grid search from {10^-4, 10^-3, ..., 10^4} by cross-validation on training data. For Algorithm 2, we add a clustering parameter k. Similarly, we use the grid search method to select the best number of clusters from {1, 2, ..., 9} through cross-validation on training data. The maximum iteration is set to 100. The stopping criterion parameter ϵ is set to 10^-3. |
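The alternating scheme in Algorithm 1 can be sketched in Python. This is a minimal illustration only: the paper's actual update rules (Eq.(22) for p and the L-BFGS solve of Eq.(18) for M) are not reproduced here, so the p-step and M-step below are hypothetical stand-ins, and all shapes and helper names are assumptions.

```python
import numpy as np

def solve_ldcl_sketch(X, r, lam=1.0, max_iter=100, eps=1e-3):
    """Hypothetical sketch of Algorithm 1's alternating optimization.

    X : (n, d) feature matrix; r : (n,) per-sample mass constraints.
    The p-step and M-step are placeholders for Eq.(22) and Eq.(18),
    which are solved in closed form / with L-BFGS in the paper.
    """
    n, d = X.shape
    rng = np.random.default_rng(0)
    M = rng.normal(size=(d, 1))      # model parameters, M(0)
    p = np.full(n, 1.0 / n)          # label-mass variables, p(0)
    prev_obj = np.inf
    for t in range(max_iter):
        # p-step (stand-in for Eq.(22)): rescale positive scores so that
        # the total mass matches the constraint sum(p) = sum(r).
        scores = np.exp(np.clip(X @ M, -20, 20)).ravel()
        p = scores / scores.sum() * r.sum()
        # M-step (stand-in for the L-BFGS solve of Eq.(18)): here a
        # ridge-regularized least-squares update.
        M = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ p)[:, None]
        obj = np.linalg.norm(X @ M - p[:, None]) ** 2 + lam * np.linalg.norm(M) ** 2
        if abs(prev_obj - obj) < eps:  # stopping criterion, eps = 1e-3
            break
        prev_obj = obj
    return M, p
```

By construction, every p produced by the p-step satisfies the mass constraint exactly, while the M-step monotonically shrinks the stand-in objective; the real method replaces both updates with the paper's derived solutions.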
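The evaluation protocol described above (10% test / 90% train, 10% of the training data relabeled, and λ, γ searched on a log grid from 10^-4 to 10^4) can be sketched as follows. The scoring function is a placeholder, since the actual cross-validation metric is model-dependent; the function name is an assumption.

```python
import numpy as np
from itertools import product

def split_and_grid(X, seed=0):
    """Sketch of the reported protocol: split the data, then grid-search
    lambda and gamma over {10^-4, ..., 10^4}. The CV score is a placeholder."""
    rng = np.random.default_rng(seed)
    n = len(X)
    idx = rng.permutation(n)
    n_test = n // 10                       # 10% held out as test data
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    n_relabel = len(train_idx) // 10       # 10% of training data relabeled
    relabel_idx = train_idx[:n_relabel]
    unrelabel_idx = train_idx[n_relabel:]  # remaining 90% un-relabeled
    grid = [10.0 ** e for e in range(-4, 5)]
    best = None
    for lam, gamma in product(grid, grid):
        # Placeholder CV score; a real run would cross-validate the model
        # on the training split for each (lam, gamma) pair.
        score = -abs(np.log10(lam)) - abs(np.log10(gamma))
        if best is None or score > best[0]:
            best = (score, lam, gamma)
    return test_idx, relabel_idx, unrelabel_idx, best[1], best[2]
```

The nested 10%/90% splits mean only about 9% of all samples carry the emerging new labels, which is what makes the relabeling budget in the paper's setting small.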