Concentration Distribution Learning from Label Distributions

Authors: Jiawei Tang, Yuheng Jia

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments prove that the proposed approach is able to extract background concentrations from label distributions while producing more accurate prediction results than the state-of-the-art LDL methods. The code is available in https://github.com/seutjw/CDL-LD." ... "Extensive experiments on the benchmark datasets clearly show that the LD predicted by our method is better than that predicted by state-of-the-art LDL methods, and the recovery results of our method on background concentrations also fit well with reality." ... "4. Experiments"
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Southeast University, Nanjing 210096; (2) Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China. Correspondence to: Yuheng Jia <EMAIL>.
Pseudocode | No | The paper describes the proposed model with mathematical equations and a framework diagram (Figure 3), but it does not include an explicit pseudocode block or algorithm steps formatted like code.
Open Source Code | Yes | "The code is available in https://github.com/seutjw/CDL-LD."
Open Datasets | Yes | The experiments are carried out on 12 real-world datasets with label distributions. The statistics of these datasets are summarized in Table 1. Among these, the first eight (from Alpha to Spoem) are from the clustering analysis of genome-wide expression in the yeast Saccharomyces cerevisiae (Eisen et al., 1998). SJAFFE is collected from JAFFE (Lyons et al., 1998), and SBU-3DFE is obtained from BU-3DFE (Yin et al., 2006). Scene consists of multi-label images, where the label distributions are transformed from rankings (Geng & Xia, 2014).
Dataset Splits | Yes | "We run each method for ten-fold cross-validation."
Hardware Specification | Yes | The hardware configuration of the test machine is as follows: AMD EPYC 7K62 48-core CPU, 377 GB of memory, and an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed to replicate the experiment. It mentions machine learning paradigms and specific models, but not the underlying software stack.
Experiment Setup | Yes | The suggested parameters are used for LDLLC, IIS-LLD, LCLR, LDLFs, and DLDL. For LDLLDM, λ1, λ2, and λ3 are tuned over {10^-3, 10^-2, ..., 10^3}, and g is tuned from 1 to 14.
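The tuning protocol reported above (a logarithmic grid over λ values, scored by ten-fold cross-validation) can be sketched as follows. This is a hypothetical illustration, not the authors' CDL code: ridge regression stands in for the actual LDL model, the data is synthetic, and the integer parameter g from the paper is omitted for brevity.

```python
import numpy as np

def ten_fold_indices(n, seed=0):
    """Yield (train_idx, test_idx) pairs for ten-fold cross-validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    folds = np.array_split(perm, 10)
    for k in range(10):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        yield train, test

def cv_score(X, Y, lam):
    """Mean squared error of ridge regression (a stand-in model) under ten-fold CV."""
    errs = []
    for tr, te in ten_fold_indices(len(X)):
        # Closed-form ridge solution: (X'X + lam*I) W = X'Y
        A = X[tr].T @ X[tr] + lam * np.eye(X.shape[1])
        W = np.linalg.solve(A, X[tr].T @ Y[tr])
        errs.append(np.mean((X[te] @ W - Y[te]) ** 2))
    return float(np.mean(errs))

# Synthetic stand-in data: 100 samples, 5 features, 3 label dimensions.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
Y = X @ rng.normal(size=(5, 3)) + 0.1 * rng.normal(size=(100, 3))

# The paper's grid: {10^-3, 10^-2, ..., 10^3}.
grid = [10.0 ** p for p in range(-3, 4)]
best_lam = min(grid, key=lambda lam: cv_score(X, Y, lam))
print("best lambda:", best_lam)
```

In the actual experiments each of λ1, λ2, λ3 would be swept jointly (e.g. via `itertools.product`), which multiplies the grid to 7^3 x 14 configurations per dataset; the sketch keeps a single λ to stay readable.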