Discretization Invariant Networks for Learning Maps between Neural Fields
Authors: Clinton Wang, Polina Golland
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply convolutional DI-Nets to toy classification (NF → scalar) and dense prediction (NF → NF) tasks, and analyze their behavior under different discretizations. Our aim is not to compete with discrete networks on these tasks, but rather to illustrate the learning behavior of CNNs compared to DI-Nets with equivalent architectures, without additional techniques or types of layers. We demonstrate that convolutional DI-Nets learn maps that often generalize to unseen discretizations, whereas maps learned by pre-trained CNNs are highly sensitive to perturbations of the discretization. |
| Researcher Affiliation | Academia | Clinton Wang (EMAIL), MIT CSAIL; Polina Golland (EMAIL), MIT CSAIL |
| Pseudocode | Yes | We present the design of such a network in Algorithm 1, with additional details in the Appendix. Algorithm 1: DI-Net approximation of J : π_f ↦ R[f](x). Algorithm 2: Classifier Training. Algorithm 3: Dense Prediction Training. |
| Open Source Code | Yes | Code: https://github.com/clintonjwang/DI-net. |
| Open Datasets | Yes | Data: We perform classification on a dataset of 8,400 NFs fit to a subset of ImageNet-1k (Deng et al., 2009), with 700 samples from each of the 12 superclasses in the big_12 dataset (Engstrom et al., 2019), which is derived from the WordNet hierarchy. Data: We perform semantic segmentation of SIRENs fit to street view images from Cityscapes (Cordts et al., 2016), grouping segmentation labels into 7 categories. |
| Dataset Splits | Yes | For each class, we train DI-Net on 500 SIRENs (Sitzmann et al., 2020b) and evaluate on 200 Gaussian Fourier feature networks (Tancik et al., 2020b). We train on 2975 NFs with coarsely annotated segmentations only, and test on 500 NFs with both coarse and fine annotations (Fig. 4). |
| Hardware Specification | Yes | A 4-layer DI-Net performs a forward pass on a batch of 48 images in 96 ± 4 ms on a single NVIDIA RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions several optimizers and architectures (Adam, AdamW, SIREN, EfficientNet, ConvNeXt-UPerNet) and a tool (TorchKbNufft), but does not provide specific version numbers for any software libraries or dependencies required for reproduction. |
| Experiment Setup | Yes | We fit a SIREN (Sitzmann et al., 2020b) to each image in ImageNet using 5 fully connected layers with 256 channels and sine non-linearities, trained for 2000 steps with an Adam optimizer at a learning rate of 10^-4. Each network is trained for 8K iterations with a learning rate of 10^-3. In training, the CNNs sample neural fields along the 32×32 grid. DI-Nets and the non-uniform network sample 1024 points generated from a scrambled Sobol sequence (QMC discretization). Networks are trained for 10K iterations with a learning rate of 10^-3. We train each network for 1000 iterations with an AdamW optimizer with a batch size of 64 and a learning rate of 0.1 with an MSE loss on the SDF. |
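
The QMC discretization quoted above (1024 points from a scrambled Sobol sequence) can be sketched with SciPy's quasi-Monte Carlo module. This is a minimal illustration, not the paper's code; the rescaling of the unit square to [-1, 1]^2 is an assumed coordinate convention for neural-field inputs.

```python
import numpy as np
from scipy.stats import qmc

# Scrambled Sobol sequence over the unit square: a low-discrepancy
# (QMC) discretization of the image domain.
sampler = qmc.Sobol(d=2, scramble=True, seed=0)
points = sampler.random_base2(m=10)  # 2**10 = 1024 points in [0, 1)^2
coords = 2.0 * points - 1.0          # assumed rescale to [-1, 1]^2
print(coords.shape)                  # (1024, 2)
```

Using `random_base2` keeps the sample count a power of two, which preserves the balance properties of the Sobol sequence.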
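
The SIREN described in the setup (5 fully connected layers, 256 channels, sine non-linearities) amounts to the following forward pass. This is a hedged NumPy sketch: the layer widths follow the quoted excerpt, while `omega0` and the weight initialization are simplified stand-ins for the scheme in Sitzmann et al. (2020b), and the 2-in / 3-out dimensions assume an RGB image field.

```python
import numpy as np

def siren_forward(x, weights, biases, omega0=30.0):
    """Fully connected layers with sine non-linearities.

    omega0 scales the first layer's frequencies, as in SIREN;
    the final layer is linear.
    """
    h = np.sin(omega0 * (x @ weights[0] + biases[0]))
    for W, b in zip(weights[1:-1], biases[1:-1]):
        h = np.sin(h @ W + b)
    return h @ weights[-1] + biases[-1]

rng = np.random.default_rng(0)
dims = [2] + [256] * 4 + [3]  # 5 weight layers, 256 channels (assumed I/O dims)
weights = [rng.normal(scale=np.sqrt(1.0 / din), size=(din, dout))
           for din, dout in zip(dims[:-1], dims[1:])]
biases = [np.zeros(dout) for dout in dims[1:]]

coords = rng.uniform(-1.0, 1.0, size=(1024, 2))  # e.g. QMC sample locations
rgb = siren_forward(coords, weights, biases)
print(rgb.shape)  # (1024, 3)
```

Because the field is queried pointwise, the same network evaluates any discretization of the domain, which is what lets DI-Nets sample it on grids or Sobol points interchangeably.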