FreSh: Frequency Shifting for Accelerated Neural Representation Learning
Authors: Adam Kania, Marko Mihajlovic, Sergey Prokudin, Jacek Tabor, Przemysław Spurek
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach experimentally, demonstrating improved quality in representation tasks such as image and video overfitting and in an inverse problem, specifically 3D shape modeling with NeRF. We show that these improvements result from a more accurate approximation of all frequencies (see Figure 6). |
| Researcher Affiliation | Academia | Faculty of Mathematics and Computer Science, Jagiellonian University, Krakow, Poland; ETH Zurich; Balgrist University Hospital; IDEAS NCBR |
| Pseudocode | Yes | We present the pseudocode for FreSh in Algorithm 1. The algorithm differs slightly between the image and video/NeRF tasks, because multiple images are available for the latter: Y_sample is built by sampling 10 images for the NeRF and video approximation tasks, while for the image approximation task Y_sample consists of the same image repeated 10 times. Even though the target signal is not random for the image approximation task, it is still measured multiple times because the model output is random. Algorithm 1 (FreSh). Input: Θ, a set of embedding configurations; n > 0. Data: Y_sample, a sample of images; X, input coordinates. Output: θ_best, the embedding configuration with the lowest Wasserstein distance. d_best ← ∞; θ_best ← None; for θ ∈ Θ: distances ← []; for Y in Y_sample: S_target ← full_spectrum(Y)[:n]; φ ← get_random_model_weights(); S_model ← full_spectrum(model(θ, φ, X))[:n]; d ← wasserstein_distance(S_model, S_target); distances.append(d); end for; d_mean ← mean(distances); if d_mean < d_best: d_best ← d_mean; θ_best ← θ; end for; return θ_best. |
| Open Source Code | Yes | The code is available at: https://github.com/gmum/FreSh/ |
| Open Datasets | Yes | We evaluate on image and video overfitting using the first 10 images from FFHQ (Karras et al., 2019) (both the in-the-wild and cropped images at a resolution of 1024×1024), WikiArt (Saleh & Elgammal, 2015), Chest X-Ray (Kermany et al., 2018), and Kodak (Franzen, 2024) datasets. |
| Dataset Splits | No | The paper mentions using "the first 10 images from FFHQ" and "one image per dataset" (Table 1). For videos, it uses "the bikes and cat videos from (Sitzmann et al., 2020b)". These refer to specific data instances or subsets, but the paper does not define explicit train/test/validation splits for a larger dataset that would be needed to reproduce general experimental partitioning. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU models, or cloud computing environments with specifications are mentioned in the paper. |
| Software Dependencies | No | All experiments are implemented in PyTorch (Paszke et al., 2019) and use the Adam optimizer (Kingma & Ba, 2014). This statement names the software but omits specific version numbers for PyTorch and other libraries, which are needed for reproducibility. |
| Experiment Setup | Yes | In all experiments, we use FreSh to select one embedding configuration from σ ∈ {1, 2, . . . , 20}, ω0 ∈ {10, 20, . . . , 200}, ω ∈ {10, 20, . . . , 200}, and k ∈ {0.0, 0.1, . . . , 3.0} (parameters are described in Table 2). Unless stated otherwise, we set the spectrum size hyperparameter, n, to 64. All experiments are implemented in PyTorch (Paszke et al., 2019) and use the Adam optimizer (Kingma & Ba, 2014). For low-frequency image training, we reduced training time by a factor of 10 and lowered the learning rate by a factor of 10. |
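
The embedding-selection loop quoted in the Pseudocode row can be sketched in Python. This is an illustrative simplification, not the authors' implementation: the helper names `spectrum`, `fresh_select`, and `make_model` are assumptions, and the paper's spectrum computation over 2-D images is reduced here to a 1-D magnitude spectrum; the Wasserstein distance comes from SciPy, with spectra treated as weights over frequency bins.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def spectrum(signal, n):
    """Magnitude spectrum truncated to the n lowest-frequency bins,
    normalised to sum to 1. A 1-D stand-in for the paper's
    full_spectrum(Y)[:n]; the exact binning is an assumption."""
    mag = np.abs(np.fft.rfft(np.asarray(signal).ravel()))[:n]
    return mag / (mag.sum() + 1e-12)


def fresh_select(configs, make_model, sample_signals, coords, n=64):
    """Algorithm 1 (FreSh): return the embedding configuration whose
    randomly initialised model output has the lowest mean Wasserstein
    distance to the target spectra.

    make_model(theta) is assumed to build a model with fresh random
    weights phi and return a callable mapping coordinates to outputs.
    """
    bins = np.arange(n, dtype=float)  # frequency-bin positions
    d_best, theta_best = np.inf, None
    for theta in configs:
        distances = []
        for y in sample_signals:
            s_target = spectrum(y, n)
            model = make_model(theta)  # fresh random weights each time
            s_model = spectrum(model(coords), n)
            # Compare the two spectra as distributions over frequency
            distances.append(
                wasserstein_distance(bins, bins,
                                     u_weights=s_model,
                                     v_weights=s_target)
            )
        d_mean = float(np.mean(distances))
        if d_mean < d_best:
            d_best, theta_best = d_mean, theta
    return theta_best
```

As a toy sanity check, feeding a pure sinusoid as the target and letting `make_model` produce sinusoids of configurable frequency makes `fresh_select` pick the matching frequency, since the spectra then coincide exactly.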
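
The search space in the Experiment Setup row can be written out explicitly. A minimal sketch, assuming each grid applies to a different embedding type as described in Table 2 (not reproduced here); the dictionary keys are hypothetical labels, and no Cartesian product across parameters is implied by the paper.

```python
# Hyperparameter grids from the Experiment Setup row; FreSh selects
# one configuration per embedding from the relevant grid.
search_space = {
    "sigma":   list(range(1, 21)),          # sigma in {1, 2, ..., 20}
    "omega_0": list(range(10, 201, 10)),    # omega_0 in {10, 20, ..., 200}
    "omega":   list(range(10, 201, 10)),    # omega in {10, 20, ..., 200}
    "k":       [round(0.1 * i, 1) for i in range(31)],  # {0.0, ..., 3.0}
}
```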