NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
Authors: Amandine Brunetto, Sascha Hornauer, Fabien Moutarde
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that NeRAF generates high-quality audio on SoundSpaces and RAF datasets, achieving significant performance improvements over prior methods while being more data-efficient. Additionally, NeRAF enhances novel view synthesis of complex scenes trained with sparse data through cross-modal learning. NeRAF is designed as a Nerfstudio module, providing convenient access to realistic audio-visual generation. Project page: https://amandinebtto.github.io/NeRAF |
| Researcher Affiliation | Academia | Amandine Brunetto, Sascha Hornauer, Fabien Moutarde Center for Robotics, Mines Paris PSL University Paris, France EMAIL |
| Pseudocode | No | The paper describes the architecture and methodology in detail using text and figures (Figure 2, Figure 4) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release NeRAF's code to the community. Project page: https://amandinebtto.github.io/NeRAF |
| Open Datasets | Yes | We validate our method on SoundSpaces (Chen et al., 2020a; 2022b), a simulated dataset, and on RAF (Chen et al., 2024), a real-world dataset. |
| Dataset Splits | Yes | Similar to (Su et al., 2022; Luo et al., 2022; Liang et al., 2023a), we use 90% of SoundSpaces audio data for training and 10% for testing. For RAF, we follow previous works' experimental setup: we keep 80% of the data for training and 20% for evaluation. Nerfstudio automatically keeps 90% of them for training and 10% for evaluation. |
| Hardware Specification | Yes | We train our method on a single RTX 4090 GPU. |
| Software Dependencies | No | We implement our method using PyTorch framework (Paszke et al., 2019). We optimize NAcF using Adam optimizer (Kingma & Ba, 2014) with β1 = 0.9 and β2 = 0.999 and ϵ = 10⁻¹⁵. For NeRF, just as AV-NeRF we keep default Nerfacto parameters. The paper mentions PyTorch and Nerfstudio but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We optimize NAcF using Adam optimizer (Kingma & Ba, 2014) with β1 = 0.9 and β2 = 0.999 and ϵ = 10⁻¹⁵. The initial learning rate is 10⁻⁴. It decreases exponentially to reach 10⁻⁸. For NeRF, just as AV-NeRF we keep default Nerfacto parameters. For the first 2k iterations, we only train the NeRF part. It allows the grid to be filled and updated several times using batches of 4,096 voxel-centers. After, both NeRF and NAcF are trained jointly. We use batch sizes of 4,096 for NeRF and 2,048 for NAcF. NeRAF is trained for 500k iterations but most runs reach their peak performance before, depending on the room size. We empirically select λA = 10⁻³, λSC = 10⁻¹ and λSL = 1. |
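
The reported optimizer schedule (initial learning rate 10⁻⁴ decaying exponentially to 10⁻⁸ over the 500k training iterations) can be sketched as below. This is a minimal illustration, not code from the NeRAF release: the paper only states that the rate "decreases exponentially", so a constant per-step decay factor is an assumption here, and the names `gamma` and `lr_at` are illustrative.

```python
# Sketch of the reported schedule: exponential decay from an initial
# learning rate of 1e-4 down to 1e-8 over 500k training iterations.
lr_init, lr_final, total_iters = 1e-4, 1e-8, 500_000

# Per-step decay factor chosen so that lr_init * gamma**total_iters == lr_final.
gamma = (lr_final / lr_init) ** (1.0 / total_iters)

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    return lr_init * gamma ** step
```

In PyTorch this would correspond to wrapping an Adam optimizer (built with the reported β1 = 0.9, β2 = 0.999, ϵ = 10⁻¹⁵) in `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)` and stepping the scheduler once per iteration.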