Nonparametric Identification of Latent Concepts
Authors: Yujia Zheng, Shaoan Xie, Kun Zhang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical results in both synthetic and real-world settings. |
| Researcher Affiliation | Academia | 1Carnegie Mellon University 2Mohamed bin Zayed University of Artificial Intelligence. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Procedural steps are described in regular paragraph text. |
| Open Source Code | No | Experiments are conducted using the official implementation of GIN (Sorrenson et al., 2020) with an additional ℓ1 regularization on the Jacobians and FrEIA (Ardizzone et al., 2018–2022) for the flow-based generative function. |
| Open Datasets | Yes | To assess the applicability of our proposed structural condition in complex practical contexts, we performed experiments on five different real-world datasets, i.e., the Fashion-MNIST (Xiao et al., 2017), EMNIST (Cohen et al., 2017), Animal Face (Si & Zhu, 2011), Flower102 (Nilsback & Zisserman, 2008), and FFHQ (Karras et al., 2019) datasets. |
| Dataset Splits | No | In the considered setting, different samples may correspond to different classes selected by a mask. We structure the dataset as {(x, c)^{(i)}}_{i=1}^{N}, where N denotes the sample size, and c^{(i)} is a multi-hot vector representing the classes for the data point x^{(i)}. ... For synthetic settings, the sample size is set as 10,000. The paper specifies the total sample size for synthetic data but does not explicitly mention training/validation/test splits for any dataset. |
| Hardware Specification | Yes | Moreover, all experiments are conducted on 12 CPU cores with 16 GB RAM. |
| Software Dependencies | No | Experiments are conducted using the official implementation of GIN (Sorrenson et al., 2020) with an additional ℓ1 regularization on the Jacobians and FrEIA (Ardizzone et al., 2018–2022) for the flow-based generative function. ... We use Generative Flow (Kingma & Dhariwal, 2018) as the nonlinear generating function. The paper mentions specific software tools but does not provide version numbers for them or any other key software libraries. |
| Experiment Setup | Yes | The objective function is defined as L(θ) = E_{(x,c)}[−log p_{f̂⁻¹}(x \| M_{i,:} c) + λR], where λ is the regularization parameter, and R represents the ℓ1 norm applied to M̂ and, if estimating class-independent concepts, also to D_ẑ f̂. Following previous work, we use the Mean Correlation Coefficient (MCC) to measure the alignment between the ground-truth and the recovered latent concepts. The results are from 10 random trials. Additional details and results are provided in Appx. C, such as identification with general noises and supplementary evaluation metrics across various settings. ... The regularization parameter λ is set according to a search over λ ∈ {0.01, 0.1, 1}, and we select λ = 0.1 according to the average MCCs of experiments conducted on synthetic datasets. |
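The MCC metric quoted in the Experiment Setup row is a standard evaluation for latent-variable recovery. As a rough illustration only (not the authors' code), it can be sketched as the mean absolute Pearson correlation between ground-truth and recovered components after an optimal one-to-one matching:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mean_correlation_coefficient(z_true, z_est):
    """Sketch of MCC: mean absolute Pearson correlation between
    ground-truth and recovered latents after an optimal permutation
    matching of components (illustrative, not the paper's code)."""
    d = z_true.shape[1]
    # Cross-correlations between every (true, estimated) component pair
    corr = np.corrcoef(z_true.T, z_est.T)[:d, d:]
    # Match each estimated component to a distinct true component,
    # maximizing the total absolute correlation
    rows, cols = linear_sum_assignment(-np.abs(corr))
    return np.abs(corr[rows, cols]).mean()
```

An MCC near 1 indicates the latents were recovered up to permutation and component-wise scaling, which is the identifiability notion such papers typically target.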
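The quoted objective combines a flow-model negative log-likelihood with an ℓ1 sparsity penalty. A minimal sketch of that combination, assuming the NLL term is already computed by the flow and using illustrative names (`nll`, `M_hat`, `jac` are not from the paper):

```python
import numpy as np

def regularized_objective(nll, M_hat, jac=None, lam=0.1):
    """Sketch of L(θ) = E[−log p_{f̂⁻¹}(x | M_{i,:} c) + λR]:
    `nll` is the (assumed precomputed) expected negative log-likelihood,
    and R is the ℓ1 norm of the mask estimate M̂, plus the Jacobian
    D_ẑ f̂ when estimating class-independent concepts."""
    reg = np.abs(M_hat).sum()
    if jac is not None:
        reg += np.abs(jac).sum()
    return nll + lam * reg
```

The paper selects λ = 0.1 from the grid {0.01, 0.1, 1} by average synthetic-data MCC, so in a reproduction the `lam` argument would be swept over that grid.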