Enhancing Implicit Neural Representations via Symmetric Power Transformation
Authors: Weixiang Zhang, Shuzhao Xie, Chengwei Ren, Shijia Ge, Mingzi Wang, Zhi Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to verify the performance of the proposed method, demonstrating that our transformation can reliably improve INR compared with other data transformations. We also conduct 1D audio, 2D image and 3D video fitting tasks to demonstrate the effectiveness and applicability of our method. |
| Researcher Affiliation | Academia | Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/zwx-open/Symmetric-Power-Transformation-INR |
| Open Datasets | Yes | We chose SIREN as the backbone, fitting the widely used processed DIV2K dataset (Agustsson and Timofte 2017; Tancik et al. 2020) and Kodak dataset (E. Kodak 1999). We use the LibriSpeech (Panayotov et al. 2015) dataset to evaluate the effectiveness of our method. Evaluated on the ShakeNDry video from the UVG dataset (Mercat, Viitanen, and Vanne 2020). |
| Dataset Splits | Yes | Specifically, we selected the test.clean split of the dataset and cropped each audio to the first 5 seconds at a sampling rate of 16 kHz. Evaluated on the ShakeNDry video from the UVG dataset (Mercat, Viitanen, and Vanne 2020) (the first 30 frames at 1920 × 1080 resolution). |
| Hardware Specification | Yes | All experiments were conducted on 4 GPUs equipped with NVIDIA RTX 3090. |
| Software Dependencies | No | The paper mentions 'l2 loss functions and the Adam optimizer' and 'SIREN' and 'FINER' as backbones, but does not specify version numbers for general software libraries or programming languages required for reproduction. |
| Experiment Setup | Yes | We implemented all experiments using l2 loss functions and the Adam optimizer (Kingma and Ba 2015). We set the total number of iterations to 5000, with hyper-parameters ξ = 0.5, τ = 0.1, and κ = 256 in our method. Following the setting of Siamese SIREN (Lanzendörfer and Wattenhofer 2023), we set ω and ω0 both to 100 in the SIREN backbone. We conducted the video fitting with SIREN and FINER backbones, both using the same network size of 6 × 256. Each scenario was trained for 100 epochs. |
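The Experiment Setup row quotes the hyper-parameter ξ = 0.5 of the proposed power transformation. As a minimal sketch of what a symmetric power transformation looks like (assuming the generic sign-preserving form sign(x)·|x|^ξ on a signal normalized to [-1, 1]; the paper's exact definition, including the roles of τ and κ, is not reproduced here):

```python
import numpy as np

def symmetric_power_transform(x, xi=0.5):
    # Generic sign-preserving power transform: raises the magnitude to
    # the power xi while keeping the sign, so the mapping is symmetric
    # about zero and keeps [-1, 1] within [-1, 1]. Illustrative only;
    # not necessarily the paper's exact formulation.
    return np.sign(x) * np.abs(x) ** xi

def inverse_symmetric_power_transform(y, xi=0.5):
    # Exact inverse, used to recover the original signal range after
    # the INR has been fit on the transformed data.
    return np.sign(y) * np.abs(y) ** (1.0 / xi)

# Normalized 1D signal, as in the audio-fitting setup.
signal = np.linspace(-1.0, 1.0, 101)
transformed = symmetric_power_transform(signal, xi=0.5)
recovered = inverse_symmetric_power_transform(transformed, xi=0.5)
```

With ξ < 1 the transform expands small-magnitude values toward ±1, which redistributes the signal's dynamic range before fitting; the inverse map restores the original values afterward.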