Guided Structural Inference: Leveraging Priors with Soft Gating Mechanisms
Authors: Aoran Wang, Xinnan Dai, Jun Pang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 16 datasets show SGSI improves edge recovery by up to 9% AUROC over baselines, scales to larger graphs (94.2% AUROC), and maintains stable training. SGSI bridges domain expertise with data-driven learning, enabling interpretable and robust structural discovery in dynamical systems. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, University of Luxembourg; ²ShanghaiTech University; ³Institute for Advanced Studies, University of Luxembourg. Correspondence to: Jun Pang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Soft-Gated Structural Inference (SGSI) Training; Algorithm 2: Encoder of SGSI Forward Pass; Algorithm 3: Two-Layer MLP with ELU and Batch Normalization; Algorithm 4: Node-to-Edge Mapping with Gating; Algorithm 5: Edge-to-Node Aggregation with Gating |
| Open Source Code | Yes | The implementation of SGSI is at: https://github.com/wang422003/SGSI-Guided-Structural-Inference-Leveraging-Priors-with-Soft-Gating-Mechanisms. |
| Open Datasets | Yes | Our study first evaluates the SGSI model on two established structural inference datasets: the Spring Simulations dataset (Kipf et al., 2018), which simulates dynamic interactions of balls connected by springs within a symmetric setting, and the NetSim dataset (Smith et al., 2011b), which consists of simulated blood-oxygen-level-dependent imaging data from various brain regions in an asymmetric network. Furthermore, we examined six directed synthetic biological networks (Linear, Linear Long, Cycle, Bifurcating, Trifurcating, and Bifurcating Converging) as outlined in (Pratapa et al., 2020), with abbreviations LI, LL, CY, BF, TF, and BFCV, respectively. We also incorporated data from the StructInfer benchmark (Wang et al., 2024), focusing on Vascular Networks (VN) with node counts ranging from 15 to 100. |
| Dataset Splits | Yes | We collect the trajectories and randomly split them into training, validation, and test sets with a ratio of 8:2:2. We then sample different numbers of trajectories from the raw trajectories and again randomly split them into training, validation, and test datasets with a ratio of 8:2:2. Where the data is already split into training, validation, and test sets, we keep that setting. |
| Hardware Specification | Yes | All experiments were conducted on a single NVIDIA Ampere 40GB HBM graphics card, paired with 2 AMD Rome CPUs (32 cores @ 2.35 GHz). |
| Software Dependencies | No | The SGSI model is implemented with PyTorch (Paszke et al., 2019), while the scikit-learn package was used for the calculation of metrics (Pedregosa et al., 2011). |
| Experiment Setup | Yes | During training, we set the batch size to 128 for datasets with fewer than 30 nodes, to 64 for those with 30 or 50 nodes, and to 16 for the remaining datasets. The learning rate is set to 5e-4. We train the SGSI model for 1000 epochs on every dataset. The choice of the hyperparameters in the loss function plays a non-negligible role in training SGSI, and their values are searched via the Bayesian optimization toolbox Optuna (Akiba et al., 2019). We set the bounds for β, λsparsity, and λdeg to [1.0, 2.5], [1e-3, 1e-2], and [1e-4, 1e-2], respectively. The values of the hyperparameters are summarized in Table 3. |
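The 8:2:2 split protocol quoted in the Dataset Splits row can be sketched as follows; this is a minimal illustration under our own assumptions (the function name and the use of NumPy are not from the paper's released code):

```python
import numpy as np

def split_trajectories(trajectories, ratio=(8, 2, 2), seed=0):
    """Randomly split a list of trajectories into train/val/test
    sets following an 8:2:2 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(trajectories))
    total = sum(ratio)
    n_train = len(trajectories) * ratio[0] // total
    n_val = len(trajectories) * ratio[1] // total
    train = [trajectories[i] for i in idx[:n_train]]
    val = [trajectories[i] for i in idx[n_train:n_train + n_val]]
    test = [trajectories[i] for i in idx[n_train + n_val:]]
    return train, val, test

# Usage: 120 toy trajectories split 80/20/20.
train, val, test = split_trajectories(list(range(120)))
print(len(train), len(val), len(test))  # 80 20 20
```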
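Edge recovery is reported as AUROC, and the Software Dependencies row notes that metrics are computed with scikit-learn. A minimal sketch with hypothetical toy values (the adjacency entries and probabilities below are illustrative, not from the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical toy data: ground-truth off-diagonal adjacency entries
# (1 = edge present) and the model's predicted edge probabilities.
true_edges = np.array([1, 0, 1, 1, 0, 0])
edge_probs = np.array([0.9, 0.2, 0.8, 0.25, 0.3, 0.1])

# AUROC: fraction of (positive, negative) pairs ranked correctly.
auroc = roc_auc_score(true_edges, edge_probs)
print(round(auroc, 3))  # 0.889
```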
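The hyper-parameter search over β, λsparsity, and λdeg within the bounds quoted in the Experiment Setup row can be sketched with Optuna (Akiba et al., 2019). The objective below is a hypothetical stand-in for a full training run (the real objective would train SGSI and return a validation score), and the log-scale sampling for the regularization weights is our assumption:

```python
import optuna

def train_and_evaluate(beta, lam_sparsity, lam_deg):
    # Hypothetical stand-in for one SGSI training run; returns a
    # validation score for the given hyper-parameter setting.
    return -(beta - 1.8) ** 2 - (lam_sparsity - 5e-3) ** 2 - (lam_deg - 1e-3) ** 2

def objective(trial):
    # Search bounds from the paper: beta in [1.0, 2.5],
    # lambda_sparsity in [1e-3, 1e-2], lambda_deg in [1e-4, 1e-2].
    beta = trial.suggest_float("beta", 1.0, 2.5)
    lam_sparsity = trial.suggest_float("lambda_sparsity", 1e-3, 1e-2, log=True)
    lam_deg = trial.suggest_float("lambda_deg", 1e-4, 1e-2, log=True)
    return train_and_evaluate(beta, lam_sparsity, lam_deg)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```

Optuna's default sampler (TPE) performs the Bayesian-optimization-style search mentioned in the paper.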