Nonparametric Identification of Latent Concepts

Authors: Yujia Zheng, Shaoan Xie, Kun Zhang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theoretical results in both synthetic and real-world settings.
Researcher Affiliation | Academia | Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks; procedural steps are described in regular paragraph text.
Open Source Code | No | Experiments are conducted using the official implementation of GIN (Sorrenson et al., 2020) with an additional ℓ1 regularization on the Jacobians and FrEIA (Ardizzone et al., 2018-2022) for the flow-based generative function.
Open Datasets | Yes | To assess the applicability of our proposed structural condition in complex practical contexts, we performed experiments on five different real-world datasets, i.e., the Fashion-MNIST (Xiao et al., 2017), EMNIST (Cohen et al., 2017), Animal Face (Si & Zhu, 2011), Flower102 (Nilsback & Zisserman, 2008), and FFHQ (Karras et al., 2019) datasets.
Dataset Splits | No | In the considered setting, different samples may correspond to different classes selected by a mask. We structure the dataset as {(x, c)^{(i)}}_{i=1}^{N}, where N denotes the sample size and c^{(i)} is a multi-hot vector representing the classes for the data point x^{(i)}. ... For synthetic settings, the sample size is set as 10,000. The paper specifies the total sample size for synthetic data but does not explicitly mention training/validation/test splits for any dataset.
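The multi-hot class vector c^{(i)} described in that response can be illustrated with a minimal sketch (the helper name and the `num_classes` value are hypothetical, not taken from the paper):

```python
import numpy as np

def multi_hot(active_classes, num_classes):
    """Encode the set of classes attached to one sample x^(i)
    as a multi-hot vector c^(i) of length num_classes."""
    c = np.zeros(num_classes, dtype=np.float32)
    c[list(active_classes)] = 1.0
    return c

# A dataset {(x, c)^(i)}_{i=1}^N pairs each observation with its
# multi-hot class vector; different samples may activate different
# subsets of classes (selected by a mask in the paper's setting).
dataset = [(x, multi_hot(cls, num_classes=10))
           for x, cls in [(np.zeros(3), {0, 2}), (np.ones(3), {1})]]
```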
Hardware Specification | Yes | Moreover, all experiments are conducted on 12 CPU cores with 16 GB RAM.
Software Dependencies | No | Experiments are conducted using the official implementation of GIN (Sorrenson et al., 2020) with an additional ℓ1 regularization on the Jacobians and FrEIA (Ardizzone et al., 2018-2022) for the flow-based generative function. ... We use Generative Flow (Kingma & Dhariwal, 2018) as the nonlinear generating function. The paper mentions specific software tools but does not provide version numbers for them or for any other key software libraries.
Experiment Setup | Yes | The objective function is defined as L(θ) = E_{(x,c)}[−log p_{f̂^{-1}}(x | M_{i,:} c) + λR], where λ is the regularization parameter and R is the ℓ1 norm applied to M̂ and, if estimating class-independent concepts, also to D_ẑ f̂. Following previous work, we use the Mean Correlation Coefficient (MCC) to measure the alignment between the ground-truth and the recovered latent concepts. The results are from 10 random trials. Additional details and results are provided in Appx. C, such as identification with general noises and supplementary evaluation metrics across various settings. ... The regularization parameter λ is set according to a search over λ ∈ {0.01, 0.1, 1}, and we select λ = 0.1 according to the average MCCs of experiments conducted on synthetic datasets.
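The MCC evaluation mentioned above is commonly computed by matching each recovered latent dimension to a ground-truth dimension via an optimal assignment over the absolute correlation matrix and averaging the matched correlations. A minimal sketch, assuming Pearson correlation and SciPy's assignment solver (not the authors' exact implementation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mean_correlation_coefficient(z_true, z_hat):
    """Mean Correlation Coefficient (MCC) between ground-truth and
    recovered latents, both of shape (n_samples, n_dims).

    Builds the absolute Pearson correlation matrix between the two
    sets of dimensions, finds the permutation that maximizes the
    total matched correlation, and averages the matched entries.
    """
    d = z_true.shape[1]
    # np.corrcoef treats rows as variables; the [:d, d:] block holds
    # correlations between true and estimated dimensions.
    corr = np.abs(np.corrcoef(z_true.T, z_hat.T)[:d, d:])
    row, col = linear_sum_assignment(-corr)  # maximize total correlation
    return corr[row, col].mean()
```

An MCC near 1 indicates the latents are recovered up to permutation and elementwise (here, linear) transformation, which is the granularity at which identifiability results of this kind are typically stated.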