Real-Time Neural Denoising with Render-Aware Knowledge Distillation

Authors: Mengxun Kong, Jie Guo, Chen Wang, Ye Yuan, Yanwen Guo

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on multiple benchmark scenes have demonstrated the effectiveness and superiority of the proposed RAKD framework. In particular, it achieves state-of-the-art Monte Carlo denoising quality with a very small neural network, guaranteeing real-time running performance.
Researcher Affiliation | Academia | Mengxun Kong, Jie Guo*, Chen Wang, Ye Yuan, Yanwen Guo. Nanjing University, Nanjing 210023, China. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology in narrative text and uses network architecture diagrams (Fig. 1, Fig. 2) to illustrate the framework, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions optimizing the model to run in real time using TensorRT and provides a link to NVIDIA's TensorRT developer page (https://developer.nvidia.com/tensorrt), but it does not provide concrete access to the authors' own source-code implementation of the described methodology.
Open Datasets | Yes | We generate our labeled training dataset using 6 publicly available Tungsten scenes (Bitterli 2016), and our unlabeled dataset is generated using 4 publicly available Tungsten scenes and 4 publicly available PBRT (Pharr, Jakob, and Humphreys 2016) scenes. We evaluate the performance of our RAKD framework and two denoisers (the large teacher model and the small student model) on a set of test scenes, including Bistro (Lumberyard 2017), Zero-Day (Winkelmann 2019), Classroom, and Dining room (Bitterli 2016).
Dataset Splits | No | The paper mentions generating a 'labeled training dataset' and an 'unlabeled dataset' from specific scenes and evaluating on a separate set of 'test scenes', and it states 'Our training dataset consists of a total of 600 10-frame sequences.' However, it does not report a numerical train/validation/test split (e.g., percentages or exact counts per split).
Hardware Specification | Yes | It takes about 500 epochs for our teacher model to converge on our labeled dataset, costing about 3 days on a single RTX 4090 GPU, and it takes about 800 epochs for our student model to converge on the union of our labeled dataset and unlabeled dataset, costing about 5 days on a single RTX 4090 GPU. All comparisons were conducted on the NVIDIA RTX 4090 GPU.
Software Dependencies | No | The paper states 'Our denoiser is implemented with PyTorch (Paszke et al. 2017)' and 'We further optimize our model to run in real-time using TensorRT,' but it does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba 2014) with an initial learning rate of 10^-4, and a StepLR scheduler to decay the learning rate by a factor of 0.8 every 100 epochs. We take 256x256 patches from our dataset and augment them with rotations and flips. We use a batch size of 8 to train our model. It takes about 500 epochs for our teacher model to converge on our labeled dataset, costing about 3 days on a single RTX 4090 GPU, and about 800 epochs for our student model to converge on the union of our labeled and unlabeled datasets, costing about 5 days on a single RTX 4090 GPU.
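The learning-rate schedule reported above (Adam starting at 10^-4, decayed by a factor of 0.8 every 100 epochs) can be sketched in closed form. This is an illustrative reconstruction from the reported hyperparameters, not the authors' code; the `lr_at_epoch` helper is hypothetical, mirroring what PyTorch's `torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.8)` would yield.

```python
def lr_at_epoch(epoch: int, base_lr: float = 1e-4,
                gamma: float = 0.8, step_size: int = 100) -> float:
    """Learning rate at a given epoch under a StepLR schedule:
    the base rate is multiplied by `gamma` once every `step_size` epochs,
    matching torch.optim.lr_scheduler.StepLR(opt, step_size=100, gamma=0.8)."""
    return base_lr * gamma ** (epoch // step_size)

# Teacher training runs ~500 epochs, so the rate decays four times:
print(f"{lr_at_epoch(0):.2e}")    # → 1.00e-04 (initial rate)
print(f"{lr_at_epoch(499):.2e}")  # → 4.10e-05 (1e-4 * 0.8**4)
```

Under this schedule the student's ~800-epoch run would see seven decays, ending near 2.1e-5; the paper does not state whether the schedule differs between teacher and student.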