Approximately Equivariant Neural Processes

Authors: Matthew Ashman, Cristiana Diaconu, Adrian Weller, Wessel Bruinsma, Richard E. Turner

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments, showing that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts." |
| Researcher Affiliation | Collaboration | Matthew Ashman (University of Cambridge); Cristiana Diaconu (University of Cambridge); Adrian Weller (University of Cambridge, The Alan Turing Institute); Wessel Bruinsma (Microsoft Research AI for Science); Richard E. Turner (University of Cambridge, Microsoft Research AI for Science, The Alan Turing Institute) |
| Pseudocode | Yes | "Algorithm 1: Forward pass through the ConvCNP (T) for off-the-grid data." (A hedged sketch of such a forward pass is given below the table.) |
| Open Source Code | Yes | "An implementation of our models can be found at cambridge-mlg/aenp." |
| Open Datasets | Yes | "We consider a synthetic 1-D regression task with datasets drawn from a Gaussian process (GP) with the Gibbs kernel [Gibbs, 1998]." [...] "derived from ERA5 [Copernicus Climate Change Service, 2020], consisting of surface air temperatures for the years 2018 and 2019." (A sketch of a Gibbs-kernel GP sampler follows the table.) |
| Dataset Splits | Yes | "For each task, we sample the number of context points Nc ∼ U{1, 64} and set the number of target points to Nt = 128. The context range [xc,min, xc,max] (from which the context points are uniformly sampled) is an interval of length 4, with its centre randomly sampled according to U[−7, 7] for the ID task, and according to U[13, 27] for the OOD task. The target range is [xt,min, xt,max] = [xc,min − 1, xc,max + 1]. This is also applicable during testing, with the test dataset consisting of 80,000 datasets." (A sketch of this task-sampling protocol follows the table.) |
| Hardware Specification | Yes | "We train and evaluate all models on a single 11 GB NVIDIA GeForce RTX 2080 Ti GPU." |
| Software Dependencies | No | The paper mentions "AdamW [Loshchilov and Hutter, 2017]" as the optimizer but does not specify versions for core software libraries such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | "For all models, we optimise the model parameters using AdamW [Loshchilov and Hutter, 2017] with a learning rate of 5 × 10⁻⁴ and a batch size of 16. Gradient value magnitudes are clipped at 0.5. We train for a maximum of 500 epochs, with each epoch consisting of 16,000 datasets (10,000 iterations per epoch)." (A sketch of this optimisation setup follows the table.) |
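
The Pseudocode row refers to the paper's Algorithm 1 for the ConvCNP with off-the-grid data. As a rough illustration, here is a minimal sketch of the standard ConvCNP forward pass for off-the-grid 1-D regression: a SetConv encoder that maps the context set onto a uniform grid, a translation-equivariant CNN, and a SetConv decoder read off at the target inputs. The class name, grid extent and resolution, CNN architecture, and lengthscale initialisation are all illustrative assumptions, not the paper's Algorithm 1.

```python
import torch
import torch.nn as nn

class ConvCNPSketch(nn.Module):
    """Minimal ConvCNP-style model for off-the-grid 1-D regression (illustrative only)."""

    def __init__(self, grid_min=-10.0, grid_max=10.0, points_per_unit=8, hidden=32):
        super().__init__()
        n_grid = int((grid_max - grid_min) * points_per_unit)
        self.register_buffer("grid", torch.linspace(grid_min, grid_max, n_grid))
        # Learnable lengthscale shared by the SetConv encoder and decoder.
        self.log_scale = nn.Parameter(torch.tensor(2.0 / points_per_unit).log())
        # Translation-equivariant CNN acting on the gridded representation.
        self.cnn = nn.Sequential(
            nn.Conv1d(2, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, 2, kernel_size=5, padding=2),
        )

    def rbf(self, dists):
        # Gaussian weights used to move between off-the-grid points and the grid.
        return torch.exp(-0.5 * (dists / self.log_scale.exp()) ** 2)

    def forward(self, xc, yc, xt):
        # xc, yc: (B, Nc); xt: (B, Nt).
        # 1. SetConv encoder: density and (normalised) data channels on the grid.
        w = self.rbf(self.grid[None, None, :] - xc[..., None])   # (B, Nc, G)
        density = w.sum(dim=1)                                   # (B, G)
        signal = (yc[..., None] * w).sum(dim=1) / (density + 1e-8)
        h = torch.stack([density, signal], dim=1)                # (B, 2, G)
        # 2. CNN on the grid.
        h = self.cnn(h)                                          # (B, 2, G)
        # 3. SetConv decoder: interpolate back to the target inputs.
        wt = self.rbf(xt[..., None] - self.grid[None, None, :])  # (B, Nt, G)
        out = torch.einsum("bng,bcg->bnc", wt, h)                # (B, Nt, 2)
        mean = out[..., 0]
        std = nn.functional.softplus(out[..., 1]) + 1e-3         # positive predictive std
        return mean, std

# Usage: 16 tasks, each with 32 context and 128 target points.
mean, std = ConvCNPSketch()(torch.randn(16, 32), torch.randn(16, 32), torch.randn(16, 128))
```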
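
The synthetic data is drawn from a GP with the Gibbs kernel, an EQ-like kernel with an input-dependent lengthscale $\ell(x)$:

$$k(x, x') = \sqrt{\frac{2\,\ell(x)\,\ell(x')}{\ell(x)^2 + \ell(x')^2}} \exp\!\left(-\frac{(x - x')^2}{\ell(x)^2 + \ell(x')^2}\right).$$

A minimal sketch of drawing one function from such a GP follows; the lengthscale function below is a hypothetical choice, not necessarily the paper's.

```python
import numpy as np

def ell(x):
    # Hypothetical input-dependent lengthscale (the paper's exact choice may differ).
    return 0.5 + 1.5 * (1.0 - np.exp(-0.5 * x**2))

def gibbs_kernel(x1, x2):
    l1, l2 = ell(x1)[:, None], ell(x2)[None, :]
    sq = l1**2 + l2**2
    prefactor = np.sqrt(2.0 * l1 * l2 / sq)
    return prefactor * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / sq)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-7.0, 7.0, size=192))      # inputs for one dataset
K = gibbs_kernel(x, x) + 1e-6 * np.eye(len(x))     # jitter for numerical stability
y = rng.multivariate_normal(np.zeros(len(x)), K)   # one GP function draw
```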
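
The Dataset Splits row describes a concrete task-sampling protocol, sketched below under the assumption that target points are sampled uniformly from the target range; function and variable names are illustrative.

```python
import numpy as np

def sample_task(rng, ood=False):
    """Sample context/target inputs for one task (names are assumptions)."""
    Nc = int(rng.integers(1, 65))                    # Nc ~ U{1, ..., 64}
    Nt = 128                                         # fixed number of target points
    centre = rng.uniform(13.0, 27.0) if ood else rng.uniform(-7.0, 7.0)
    xc_min, xc_max = centre - 2.0, centre + 2.0      # context interval of length 4
    xt_min, xt_max = xc_min - 1.0, xc_max + 1.0      # target range, extended by 1 each side
    xc = rng.uniform(xc_min, xc_max, size=Nc)        # context inputs
    xt = rng.uniform(xt_min, xt_max, size=Nt)        # target inputs (uniform: an assumption)
    return xc, xt

rng = np.random.default_rng(0)
xc, xt = sample_task(rng)                 # in-distribution task
xc_ood, xt_ood = sample_task(rng, ood=True)
```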
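
Finally, a sketch of the stated optimisation setup: AdamW with learning rate 5 × 10⁻⁴, batch size 16, gradient values clipped at 0.5, and up to 500 epochs. The model, data, and loss below are placeholders; a real NP would be trained on sampled tasks with a (log-likelihood) objective.

```python
import torch

model = torch.nn.Linear(1, 1)       # placeholder for an (approximately equivariant) NP
opt = torch.optim.AdamW(model.parameters(), lr=5e-4)   # AdamW, lr = 5e-4

for epoch in range(500):            # at most 500 epochs
    for step in range(10_000):      # iterations per epoch, as reported
        x = torch.randn(16, 1)      # batch of 16 tasks (placeholder data)
        loss = ((model(x) - x) ** 2).mean()   # placeholder loss
        opt.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_value_(model.parameters(), 0.5)  # clip gradient values at 0.5
        opt.step()
```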