PIN: Prolate Spheroidal Wave Function-based Implicit Neural Representations
Authors: Viraj Dhananjaya Bandara Jayasundara Mudiyanselage, Heng Zhao, Demetrio Labate, Vishal Patel
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To address this challenge, we introduce the Prolate Spheroidal Wave Function-based Implicit Neural Representations (PIN), which exploits the optimal space-frequency domain concentration of Prolate Spheroidal Wave Functions (PSWFs) as the nonlinear mechanism in INRs. Our experimental results reveal that PIN excels not only in representing images and 3D shapes but also significantly outperforms existing methods in various vision tasks that require INR generalization, including image inpainting, novel view synthesis, edge detection, and image denoising. Extensive numerical experiments demonstrate that our new INR model excels not only in representation tasks but also in more challenging reconstruction tasks, e.g., image inpainting, where existing INRs often perform poorly. |
| Researcher Affiliation | Academia | 1Johns Hopkins University, 2University of Houston, 3Data Science Platform, The Rockefeller University; {hzhao25@central, dlabate@}.uh.edu |
| Pseudocode | No | The paper describes the methodology using mathematical formulations (e.g., Equation 1 and related expansions) and prose, but does not include any explicit pseudocode blocks or algorithms labeled as such. |
| Open Source Code | No | The paper states: "In the appendix, we provide a comprehensive theoretical analysis of PIN's forward propagation, alongside a combined theoretical and experimental investigation into why existing baseline space-frequency compact INRs fall short compared to PIN. This analysis helps to elucidate the superior performance of PIN in capturing complex details. The appendix also includes the complete numerical implementation of PIN, where we demonstrate its robustness to variations in parameters and provide a detailed examination of how different weight initialization strategies impact its performance." However, while it describes the implementation details in Sections A.2.3 and A.2.1, it does not provide a direct link to a code repository or an explicit statement about the public release of the code. |
| Open Datasets | Yes | For assessing the effectiveness of INRs in image representation tasks, the Kodak Lossless True Color Image Dataset (Franzen, 1999) was employed. Additional image representation results, including a thorough evaluation on the DIV2K dataset (Agustsson & Timofte, 2017) and learning curves, are provided in the supplementary material. For this experiment, two occupancy volumes, namely Asian Dragon and Armadillo, which are shown in figure 4, were obtained from the Stanford 3D shape dataset (Stanford University Computer Graphics Laboratory) and sampled on a grid of 512 × 512 × 512, assigning a value of 1 to each voxel inside the volume and a value of 0 to those outside. |
| Dataset Splits | Yes | We employ two testing strategies: one involves training with 70% of the data sampled randomly, and the other uses a predefined text mask that obscures the image with varying font sizes. For this experiment, we used a vanilla NeRF architecture consisting of two fully connected blocks, each containing four layers, and the drums dataset with 100 training images and 200 testing images. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. It only mentions the use of PyTorch and an Adam optimizer. |
| Software Dependencies | No | For numerical experiments, we used the PyTorch framework and the Adam optimizer (learning rate 0.001) with a Multilayer Perceptron (MLP) comprising layers of 300 neurons in each hidden layer. The paper mentions the 'PyTorch framework' but does not specify its version number. No other software dependencies with version numbers are mentioned. |
| Experiment Setup | Yes | For numerical experiments, we used the PyTorch framework and the Adam optimizer (learning rate 0.001) with a Multilayer Perceptron (MLP) comprising layers of 300 neurons in each hidden layer. This hyperparameter tuning included the variation of PSNR with the number of hidden neurons, while keeping the number of hidden layers constant at 3 (shown in the left figure of figure 7). Additionally, the study examined the variation of PSNR with the number of hidden layers, while maintaining the number of hidden neurons at 300 (shown in the middle figure of figure 7). Lastly, the variation of PSNR with the learning rate was analyzed, with the number of hidden layers kept at 3 and the number of hidden neurons at 300 (shown in the right figure of figure 7). |
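The coordinate-MLP configuration quoted in the Experiment Setup row (3 hidden layers of 300 neurons each) can be sketched as a plain forward pass. This is a minimal illustration only: NumPy and a sine activation stand in for the paper's PyTorch implementation and its PSWF-based nonlinearity, neither of which is given in code in this report, and the 2-D-coordinates-to-RGB mapping and initialization scale are assumptions for the image-representation case.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Return (weights, biases) for an MLP with the given layer sizes.
    The 1/sqrt(fan_in) scale is an assumed, generic initialization."""
    weights = [rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases

def mlp_forward(x, weights, biases, act=np.sin):
    """Forward pass: nonlinearity on hidden layers, linear output layer.
    np.sin is a placeholder for PIN's PSWF-based activation."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = act(h @ W + b)
    return h @ weights[-1] + biases[-1]

# 2-D pixel coordinates in, RGB out; 3 hidden layers of 300 neurons,
# matching the layer counts described in the paper's setup.
sizes = [2, 300, 300, 300, 3]
weights, biases = init_mlp(sizes, rng)
coords = rng.uniform(-1.0, 1.0, size=(1024, 2))  # a batch of query coordinates
rgb = mlp_forward(coords, weights, biases)
print(rgb.shape)  # (1024, 3)
```

In the actual experiments this network would be fitted with the Adam optimizer at learning rate 0.001, as the Experiment Setup row states.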