Fast, Accurate Manifold Denoising by Tunneling Riemannian Optimization

Authors: Shiyu Wang, Mariam Avagyan, Yihan Shen, Arnaud Lamy, Tingran Wang, Szabolcs Marka, Zsuzsanna Marka, John Wright

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments on scientific manifolds demonstrate significantly improved complexity-performance tradeoffs compared to nearest neighbor search, which underpins existing provable denoising approaches based on exhaustive search.
Researcher Affiliation Academia 1Department of Electrical Engineering, Columbia University; 2Data Science Institute, Columbia University; 3Department of Computer Science, Columbia University; 4Department of Physics; 5Institute of Advanced Studies (iASK), Chernel utca 14, Kőszeg, 9730, Hungary; 6Columbia Astrophysics Laboratory; 7Department of Applied Physics and Applied Mathematics, Columbia University. Correspondence to: Shiyu Wang <EMAIL>.
Pseudocode Yes Algorithm 1 Manifold Traversal; Algorithm 2 Online Learning for Manifold Traversal; Algorithm 3 101Traversal; Algorithm 4 IncrPCAonMatrix(X, d); Algorithm 5 IncrPCA(x_{i+1} ∈ M, U_i, Λ_i, i+1, d)
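The incremental PCA subroutine listed above (Algorithm 5) maintains a rank-d basis U_i and eigenvalues Λ_i as samples stream in. As a rough, framework-free illustration of how such an update typically works — not the paper's implementation; the function name and the zero-mean simplification are assumptions — a minimal sketch:

```python
import numpy as np

def incr_pca_update(U, lam, i, x, d):
    """One rank-d incremental PCA step on a new sample x.

    Assumes zero-mean data (a simplification, not from the paper).
    U   : (D, d) current orthonormal basis
    lam : (d,) current top-d eigenvalues of the sample covariance
    i   : number of samples seen so far
    x   : (D,) new sample
    """
    proj = U.T @ x            # coefficients of x in the current basis
    resid = x - U @ proj      # component of x orthogonal to the basis
    r = np.linalg.norm(resid)
    # Small (d+1) x (d+1) matrix representing the updated covariance
    # (i * C_i + x x^T) / (i + 1) in the extended basis [U, resid/r].
    Q = np.zeros((d + 1, d + 1))
    Q[:d, :d] = np.diag(i * lam) + np.outer(proj, proj)
    Q[:d, d] = proj * r
    Q[d, :d] = proj * r
    Q[d, d] = r ** 2
    Q /= (i + 1)
    w, V = np.linalg.eigh(Q)
    order = np.argsort(w)[::-1][:d]   # keep the top-d directions
    if r > 1e-12:
        e = (resid / r)[:, None]
    else:                             # x lies in the current subspace
        e = np.zeros((U.shape[0], 1))
    U_ext = np.hstack([U, e])
    return U_ext @ V[:, order], w[order]
```

The key point is that the eigendecomposition is done on a (d+1)×(d+1) matrix rather than the full D×D covariance, which is what makes the streaming variant cheap when d ≪ D.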
Open Source Code Yes The code for our framework and experiments is available at https://github.com/shiyu-w/Manifold_Traversal.
Open Datasets Yes We learn a denoiser on a dataset of 100,000 noisy gravitational waves (Abramovici et al., 1992; Aasi et al., 2015) using the online method as described in Algorithm 2. ...We evaluate our method on large-scale real-world image data by performing patch-level denoising. Specifically, we randomly select 300 RGB images from ImageNet... We conduct an additional experiment to denoise a single natural image from the DIV2K dataset (Agustsson & Timofte, 2017).
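The patch-level denoising setup above turns each image into a collection of flattened patches, so that every patch is one point in R^D. The patch size below is an assumption for illustration only (the excerpt does not state it), and the helper name is hypothetical:

```python
import numpy as np

def extract_patches(img, patch=8, stride=8):
    """Slice an (H, W) or (H, W, C) image into non-overlapping patches,
    flattening each patch into a row vector (one point in R^D)."""
    H, W = img.shape[:2]
    out = []
    for i in range(0, H - patch + 1, stride):
        for j in range(0, W - patch + 1, stride):
            out.append(img[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(out)

# e.g. a 64x64 RGB image yields 8*8 = 64 patches of dimension 8*8*3 = 192
```

With stride equal to the patch size the patches tile the image without overlap; a smaller stride would produce overlapping patches and a larger training set.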
Dataset Splits Yes The training set consists of 100,000 noisy waveforms, the test set contains 20,000 noisy waveforms. ...we use the first 890,000 patches to train our traversal network. ...After shuffling, we use the first 170,000 patches for training.
Hardware Specification No The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments. It mentions the dimensionality of data but not computing resources.
Software Dependencies Yes We generate synthetic gravitational waveforms with the PyCBC package (Nitz et al., 2023) with masses drawn from a Gaussian distribution with mean 35 and variance 15.
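The mass-sampling step quoted above is straightforward to reproduce; the PyCBC waveform generation itself is omitted here to keep the sketch dependency-free. Note that a variance of 15 means a standard deviation of sqrt(15):

```python
import numpy as np

rng = np.random.default_rng(0)
# Component masses ~ N(mean=35, variance=15), as quoted from the paper.
# numpy's `scale` is a standard deviation, so pass sqrt(15), not 15.
masses = rng.normal(loc=35.0, scale=np.sqrt(15.0), size=100_000)
```

Each sampled mass would then be fed to a PyCBC waveform routine to produce one synthetic signal.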
Experiment Setup Yes All autoencoders are trained using the Adam optimizer with a learning rate of 1 × 10^-3. As we can see in Figure 13, high-complexity autoencoders can reach high accuracy. ...We simulate noise as i.i.d. Gaussian with standard deviation σ = 0.01... The parameter called the denoising radius R(i) in Algorithm 2 controls complexity by determining the number of landmarks created. ...Table 1: The choice of hyperparameters yielding each denoiser. N_i corresponds to the number of points assigned to a landmark q_i. For all experiments, σ = 0.01, d = 2, and D = 2048.
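The noise model in the quoted setup (i.i.d. Gaussian, σ = 0.01, ambient dimension D = 2048 per Table 1) can be sketched in a few lines; the function name is hypothetical and the autoencoder training loop is omitted to stay framework-agnostic:

```python
import numpy as np

def add_noise(x_clean, sigma=0.01, rng=None):
    """Add i.i.d. Gaussian observation noise; sigma = 0.01 as in the quoted setup."""
    rng = rng or np.random.default_rng()
    return x_clean + rng.normal(scale=sigma, size=x_clean.shape)

# D = 2048 is the ambient dimension reported in Table 1 of the paper
x_clean = np.zeros(2048)
y = add_noise(x_clean, sigma=0.01, rng=np.random.default_rng(1))
```

A denoiser is then trained to map noisy observations y back toward the clean signals x_clean, with the denoising radius R(i) trading off landmark count (complexity) against accuracy.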