LIMEFLDL: A Local Interpretable Model-Agnostic Explanations Approach for Label Distribution Learning

Authors: Xiuyi Jia, Jinchi Li, Yunan Lu, Weiwei Li

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A series of numerical experiments and human experiments were conducted to validate the performance of the proposed algorithm in practical applications. The experimental results demonstrate that the proposed algorithm achieves high fidelity, consistency, and trustworthiness in explaining LDL models.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Nanjing University of Science and Technology, China; 2 Department of Computing, The Hong Kong Polytechnic University, China; 3 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. Correspondence to: Weiwei Li <EMAIL>.
Pseudocode | Yes | Algorithm 1 (LIMEFLDL). Input: instance to be interpreted x, LDL model F, sample size m, weight kernel π. Initialization: A0, u(0), ρ, t = 1. Output: A. Algorithm 2 (Get Summary Statistics): getting summary statistics from training data. Input: training set X = {ζ1, . . . , ζn}, number of bins p.
Open Source Code | No | The paper does not provide an explicit statement or a link to the source code for LIMEFLDL. While Section 4.3 states that "All codes are provided by the original authors and we adopt the recommended parameters reported in their respective studies", this refers to the external LDL methods used for comparison (LDL-SCL, LDL-LRR, AA-KNN, MEM), not the authors' own implementation of LIMEFLDL.
Open Datasets | Yes | We conduct experiments on eight label distribution datasets from different kinds of real-world tasks to evaluate our approach. Yeast alpha (alpha) and Yeast cold (cold) (Geng, 2016) are collected from biological experiments. JAFFE (Lyons et al., 1998), SBU 3DFE (3DFE) (Yin et al., 2006), and Emotion6 (Emo) (Peng et al., 2015) are collected from sentiment mining tasks. Natural Scene (Scene) (Geng, 2016) is generated from a natural scene recognition task.
Dataset Splits | Yes | For each dataset, we performed 10-fold cross-validation and recorded the mean and standard deviation for each evaluation metric.
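The 10-fold protocol quoted above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the fold construction uses a hypothetical seed, and the model training and metric are placeholders.

```python
import numpy as np

def kfold_indices(n, k=10, seed=2024):
    """Shuffle n sample indices and split them into k disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

n_samples = 100
folds = kfold_indices(n_samples, k=10)
scores = []
for i, test_idx in enumerate(folds):
    # Training indices are all folds except fold i.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # ... train/explain the LDL model here; the line below is a placeholder metric ...
    scores.append(len(test_idx) / n_samples)
# Report mean and standard deviation over the 10 folds, as the paper does.
print(f"{np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Each fold serves once as the held-out test set, and the per-fold scores are aggregated into the mean ± standard deviation reported per metric.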
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or other computer specifications used for running the experiments. It only describes the general experimental setup without hardware specifics.
Software Dependencies | No | The paper mentions the use of the Quickshift method from scikit-image, but does not provide a version number for scikit-image or for any other software dependency, such as Python, PyTorch, or TensorFlow. For example, it states that images are "segmented into superpixels using the Quickshift method from scikit-image" but omits the version.
Experiment Setup | Yes | In the fidelity-metric experiments, we set the number of sampled instances m to 100 and select the top 10 ranked features for evaluation on all test sets. For the consistency experiments, to ensure the explanatory model captures sufficient information, we increase m to 1000 and do not impose restrictions on feature rankings. Furthermore, in all experiments, the Lagrange multiplier u(0) is initialized as an all-zero vector. The penalty coefficients λ, v, and ρ are set to 0.01. ... For image data, the first step is to crop each image to a size of (256, 256, 3). The image is then segmented into superpixels using the Quickshift method from scikit-image, with the following parameters: kernel-size=4, max-dist=200, ratio=0.2, and random-seed=2024. ... For tabular data, the parameters are set as follows: discretizer=entropy, random-seed=2024, and discretize-continuous=True. ... For image explanations, the masking operation follows the original LIME settings. The hide-color parameter is set to None... For distance weights, cosine is used for image datasets, while the L2 norm is applied for tabular datasets. To balance accuracy and efficiency, the maximum number of iterations for L-BFGS is set to 50. In addition, all random-seed parameters are fixed to ensure reproducibility and consistency across runs.
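The image preprocessing described above (crop to (256, 256, 3), then Quickshift superpixel segmentation) can be sketched as below. The center-crop choice is an assumption, since the paper only states the target crop size, and the Quickshift call is shown as a comment with the reported parameters because the seed keyword's name differs across scikit-image versions.

```python
import numpy as np

def center_crop(img, size=(256, 256)):
    """Center-crop an H x W x 3 image to the target spatial size.
    (Center cropping is an assumption; the paper only gives the crop size.)"""
    h, w = img.shape[:2]
    top, left = (h - size[0]) // 2, (w - size[1]) // 2
    return img[top:top + size[0], left:left + size[1]]

img = np.zeros((300, 400, 3), dtype=np.uint8)  # dummy input image
cropped = center_crop(img)
print(cropped.shape)  # (256, 256, 3)

# Superpixel segmentation with the parameters reported in the paper:
# from skimage.segmentation import quickshift
# segments = quickshift(cropped, kernel_size=4, max_dist=200, ratio=0.2)
# The paper also fixes random-seed=2024; the corresponding keyword is
# `random_seed` or `rng` depending on the scikit-image version installed.
```

The resulting superpixel map would then drive the LIME-style masking step, with hide-color set to None as stated above.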