Leveraging Task Structures for Improved Identifiability in Neural Network Representations

Authors: Wenlin Chen, Julien Horwood, Juyeon Heo, José Miguel Hernández-Lobato

TMLR 2024

Reproducibility variables, assessed results, and supporting excerpts (LLM responses):
Research Type: Experimental. Empirically, we find that this straightforward optimization procedure enables our model to outperform more general unsupervised models in recovering canonical representations for both synthetic data and real-world molecular data. [...] Section 4 empirically evaluates the proposed method on both synthetic datasets and real-world molecular datasets. [...] This section empirically validates our model's ability to recover canonical representations up to permutations and scaling for both synthetic and real-world data. [...] In Table 1, we show that MTLCM manages to recover the ground-truth latent factors from h up to permutations and scaling, and the result is scalable as the number of latent factors and the number of causal factors increase.
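Recovery "up to permutations and scaling" is typically quantified by matching each estimated latent to a ground-truth latent via absolute correlation and averaging the matched correlations (an MCC-style metric). The sketch below is illustrative and not taken from the paper; the function name and dimensions are placeholders.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mean_correlation_coefficient(z_true, z_est):
    """Score recovery up to permutation and sign/scaling:
    match factors by absolute Pearson correlation (Hungarian
    algorithm), then average the matched correlations."""
    d = z_true.shape[1]
    corr = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            corr[i, j] = abs(np.corrcoef(z_true[:, i], z_est[:, j])[0, 1])
    # linear_sum_assignment minimizes cost, so negate to maximize correlation.
    row, col = linear_sum_assignment(-corr)
    return corr[row, col].mean()

# Sanity check: a permuted, rescaled copy of the latents scores 1.
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 3))
z_hat = z[:, [2, 0, 1]] * np.array([-2.0, 0.5, 3.0])
print(round(mean_correlation_coefficient(z, z_hat), 4))  # → 1.0
```

A metric of this form is invariant to exactly the transformations (permutation and scaling) that the identifiability result leaves unresolved.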
Researcher Affiliation: Academia. Wenlin Chen (EMAIL), University of Cambridge, Cambridge, United Kingdom, and Max Planck Institute for Intelligent Systems, Tübingen, Germany. Julien Horwood (EMAIL), University of Cambridge, Cambridge, United Kingdom. Juyeon Heo (EMAIL), University of Cambridge, Cambridge, United Kingdom. José Miguel Hernández-Lobato (EMAIL), University of Cambridge, Cambridge, United Kingdom.
Pseudocode: Yes. Algorithm 1: Pseudocode for the data generating process in the synthetic data experiments.
Open Source Code: Yes. Our code is available at https://github.com/jdhorwood/mtlcm.
Open Datasets: Yes. We further evaluate our model on two real-world molecular datasets. [...] The superconductivity dataset (Hamidieh, 2018) consists of 21,263 superconductors. [...] The QM9 dataset (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014) is a popular benchmark for molecular prediction tasks.
Dataset Splits: No. For each task, we first sample the causal indicator variables c_t. [...] we generate 500 tasks of 200 samples each to improve convergence of the multitask model. [...] The superconductivity dataset (Hamidieh, 2018) consists of 21,263 superconductors. [...] The QM9 dataset (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014) is a popular benchmark for molecular prediction tasks consisting of 134,000 enumerated organic molecules. The paper describes how the synthetic data are generated and gives the total sizes of the real-world datasets, but it does not specify training, validation, or test splits.
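The synthetic setup described above (per-task causal indicators c_t, 500 tasks of 200 samples each) can be sketched as follows. This is a hedged illustration, not the paper's Algorithm 1: the latent dimensionality, weight distributions, and noise scale are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_tasks, samples_per_task, num_latents = 500, 200, 5  # 500 tasks of 200 samples

tasks = []
for t in range(num_tasks):
    # Binary causal indicators: which latent factors drive task t's target.
    c_t = rng.integers(0, 2, size=num_latents)
    # Latent factors for this task's samples (placeholder distribution).
    z = rng.normal(size=(samples_per_task, num_latents))
    # Only the causal factors receive non-zero regression weight.
    w_t = rng.normal(size=num_latents) * c_t
    # Task targets with additive observation noise (scale is an assumption).
    y = z @ w_t + 0.1 * rng.normal(size=samples_per_task)
    tasks.append((z, y, c_t))
```

The per-task indicators c_t are what give each task a different subset of causally relevant factors, which is the variation the multi-task model exploits.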
Hardware Specification: No. Part of this work was performed using resources provided by the Cambridge Service for Data Driven Discovery (CSD3) operated by the University of Cambridge Research Computing Service (www.csd3.cam.ac.uk), provided by Dell EMC and Intel using Tier-2 funding from the Engineering and Physical Sciences Research Council (capital grant EP/T022159/1), and DiRAC funding from the Science and Technology Facilities Council (www.dirac.ac.uk). This acknowledgement names computing resources and providers (Dell EMC and Intel) but does not specify GPU or CPU models or other detailed hardware.
Software Dependencies: No. The paper does not mention specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or versions of other libraries used for the implementation).
Experiment Setup: Yes. G Model Configurations: In Stage 1, the learnable parameters of the multi-task regression network (MTRN) are the feature extractor parameters ϕ and the task-specific regression weights w_t for all tasks t. These parameters are learned by maximum likelihood as defined in Equation (3). In Stage 2, the learnable parameters of the multi-task linear causal model (MTLCM) are the linear transformation A, the causal indicators c_t for all tasks t, and the spurious coefficients γ_t for all tasks t. These are free parameters learned by maximum marginal likelihood as defined in Equation (14). [...] H Experiment Settings for the Synthetic Data: This section details the precise data generation process for the synthetic data in both the linear and non-linear experiments of Section 4.1. Algorithm 1 gives the full data generation process, Table 4 lists the experiment hyperparameters for the linear setting, and Table 5 lists those for the non-linear setting.
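The Stage 1 structure described above, a shared feature extractor with per-task linear regression heads, can be sketched as below. This is an assumption-laden toy version: the extractor is a frozen random network for brevity (in the paper ϕ is trained jointly by maximum likelihood), and the ridge term, sizes, and data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def feature_extractor(x, W1, W2):
    # Stand-in for the shared nonlinear feature extractor f_phi.
    return np.tanh(x @ W1) @ W2

# Toy setup: 10 tasks, 50 samples each, 8 input dims, 4-dim representation.
x_dim, hidden, h_dim, n_tasks, n = 8, 16, 4, 10, 50
W1 = rng.normal(size=(x_dim, hidden)) / np.sqrt(x_dim)
W2 = rng.normal(size=(hidden, h_dim)) / np.sqrt(hidden)

task_weights = []
for t in range(n_tasks):
    x = rng.normal(size=(n, x_dim))
    y = rng.normal(size=n)
    h = feature_extractor(x, W1, W2)
    # Per-task regression weights w_t by regularized least squares:
    # for fixed phi, this is the closed-form maximizer of the Gaussian
    # likelihood over w_t (up to the small ridge term for stability).
    w_t = np.linalg.solve(h.T @ h + 1e-3 * np.eye(h_dim), h.T @ y)
    task_weights.append(w_t)
```

Stage 2 would then take the learned representations h and fit the linear transformation A together with the per-task indicators c_t and spurious coefficients γ_t by maximum marginal likelihood, which is where the identifiability argument applies.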