Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems
Authors: Jie Chen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation on over 800 matrices suggests that the construction time of these graph neural preconditioners (GNPs) is more predictable and can be much shorter than that of other widely used ones, such as ILU and AMG, while the execution time is faster than using a Krylov method as the preconditioner, such as in inner-outer GMRES. |
| Researcher Affiliation | Collaboration | Jie Chen, MIT-IBM Watson AI Lab, IBM Research |
| Pseudocode | Yes | Algorithm 1 FGMRES with M being a nonlinear operator |
| Open Source Code | Yes | The implementation of GNP is available at https://github.com/jiechenjiechen/GNP. |
| Open Datasets | Yes | To this end, we turn to the Suite Sparse matrix collection https://sparse.tamu.edu, which is a widely used benchmark in numerical linear algebra. |
| Dataset Splits | No | The paper describes how training data (b, x) pairs are generated for each matrix using sampling methods ("we sample x from both N(0, Σ_x^m) and N(0, I_n) to form each training batch"), and how evaluation matrices are selected ("We select all square, real-valued, and non-SPD matrices whose number of rows falls between 1K and 100K and whose number of nonzeros is fewer than 2M. This selection results in 867 matrices from 50 application areas."). However, it does not specify how this collection of 867 matrices is split into training, validation, or test sets for the overall approach, as each GNN is trained individually for each matrix. |
| Hardware Specification | Yes | Our experiments are conducted on a machine with one Tesla V100 (16GB) GPU, 96 Intel Xeon 2.40GHz cores, and 386GB main memory. |
| Software Dependencies | No | All code is implemented in Python with PyTorch. The paper mentions software packages like scipy.sparse.linalg.spilu, SuperLU, PyAMG, and AmgX, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | We use L = 8 Res-GCONV layers, set the layer input/output dimension to 16, and use 2-layer MLPs with hidden dimension 32 for lifting/projection. We use Adam (Kingma & Ba, 2015) as the optimizer, set the learning rate to 1e-3, and train for 2000 steps with a batch size of 16. We apply neither dropout nor weight decay. ... We use the ℓ1 residual norm ‖AM(b) − Ax‖₁ as the training loss... We use m = 40 Arnoldi steps when sampling the (x, b) pairs according to (5). Among the 16 pairs in a batch, 8 pairs follow (5) and 8 pairs follow x ∼ N(0, I_n). |
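The Experiment Setup row describes the network shape (a 2-layer lifting MLP with hidden dimension 32, L = 8 residual graph-convolution layers of width 16, and a 2-layer projection MLP). The sketch below is a hedged NumPy illustration of that shape only, not the paper's implementation: the parameter names, initialization scale, activation placement, and the use of a pre-normalized matrix `A_norm` are all assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gnp_forward(A_norm, b, params):
    """Hedged sketch of a GNP-style forward pass: a 2-layer MLP lifts
    the input vector b to width 16, L = 8 residual graph-convolution
    layers mix node features using the (assumed pre-normalized) matrix
    A_norm, and a 2-layer MLP projects back to one output channel.
    All parameter names here are illustrative, not from the paper."""
    h = relu(b[:, None] @ params["lift_W1"]) @ params["lift_W2"]  # n x 16
    for W in params["conv_Ws"]:                                   # 8 layers
        h = h + relu(A_norm @ h @ W)                              # residual GCONV
    out = relu(h @ params["proj_W1"]) @ params["proj_W2"]         # n x 1
    return out[:, 0]

def init_params(rng, d=16, hidden=32, L=8):
    """Random initialization matching the stated widths (illustrative)."""
    s = lambda *shape: rng.standard_normal(shape) * 0.1
    return {
        "lift_W1": s(1, hidden), "lift_W2": s(hidden, d),
        "conv_Ws": [s(d, d) for _ in range(L)],
        "proj_W1": s(d, hidden), "proj_W2": s(hidden, 1),
    }
```

The point of the residual connection `h + relu(A_norm @ h @ W)` is that each layer perturbs rather than replaces the features, which is the usual rationale for "Res-" graph-conv stacks.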
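The same row states that each training batch of 16 mixes 8 pairs sampled via m Arnoldi steps (the paper's equation (5)) with 8 pairs where x ∼ N(0, I_n), trained under the ℓ1 residual loss. A minimal NumPy sketch of that batch construction, assuming a precomputed orthonormal Arnoldi basis `Q` stands in for the subspace sampling in (5), could look like:

```python
import numpy as np

def make_training_batch(A, Q, rng, batch_size=16):
    """Hypothetical sketch of the paper's (x, b) pair sampling:
    half the batch draws x from a subspace spanned by the Arnoldi
    basis Q (a stand-in for N(0, Sigma_x^m) in eq. (5)), and half
    draws x ~ N(0, I_n). Each b is then formed as b = A x."""
    n = A.shape[0]
    m = Q.shape[1]
    xs = []
    for i in range(batch_size):
        if i < batch_size // 2:
            xs.append(Q @ rng.standard_normal(m))  # subspace-structured sample
        else:
            xs.append(rng.standard_normal(n))      # white-noise sample
    X = np.stack(xs, axis=1)                       # n x batch_size
    B = A @ X                                      # b = A x, column-wise
    return X, B

def l1_residual_loss(A, M_out, B):
    """l1 residual training loss ||A M(b) - b||_1, averaged over the
    batch; M_out holds the network outputs M(b) column-wise."""
    return np.abs(A @ M_out - B).sum(axis=0).mean()
```

A network whose output `M_out` exactly solved A M(b) = b would drive this loss to zero, which is the sense in which the GNP is trained to approximate A⁻¹.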
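The Pseudocode row refers to Algorithm 1, FGMRES with M being a nonlinear operator. The defining feature of flexible GMRES is that the preconditioned vectors Z[:, j] = M(v_j) are stored explicitly, so M need not be a fixed linear map. The following is a minimal NumPy sketch of that idea under standard FGMRES conventions, not the paper's Algorithm 1; `apply_M` is a hypothetical stand-in for the trained GNP.

```python
import numpy as np

def fgmres(A, b, apply_M, m=30):
    """Flexible GMRES: apply_M may be a nonlinear operator (e.g., a
    learned GNP). Minimal single-cycle sketch without restarting."""
    n = b.shape[0]
    r = b.copy()                       # initial guess x0 = 0, so r = b
    beta = np.linalg.norm(r)
    V = np.zeros((n, m + 1))           # Arnoldi basis
    Z = np.zeros((n, m))               # preconditioned vectors M(v_j)
    H = np.zeros((m + 1, m))           # Hessenberg matrix
    V[:, 0] = r / beta
    for j in range(m):
        Z[:, j] = apply_M(V[:, j])     # flexible (possibly nonlinear) step
        w = A @ Z[:, j]
        for i in range(j + 1):         # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:        # happy breakdown
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # least-squares solve of min_y || beta * e1 - H y ||
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[: m + 1, :m], e1, rcond=None)
    return Z[:, :m] @ y                # x = x0 + Z y
```

With `apply_M = lambda v: v` this reduces to plain GMRES; substituting a learned operator for `apply_M` is exactly where a GNP would plug in.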