Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline and validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Exploring Invariance in Images through One-way Wave Equations
Authors: Yinpeng Chen, Dongdong Chen, Xiyang Dai, Mengchen Liu, Yinan Feng, Youzuo Lin, Lu Yuan, Zicheng Liu
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we empirically demonstrate that natural images can be reconstructed with high fidelity from compressed representations using a simple first-order norm-plus-linear autoregressive (FINOLA) process without relying on explicit positional information. Our empirical analysis reveals that the learned FINOLA coefficient matrices A and B are typically invertible and that their product, AB⁻¹, is diagonalizable across multiple training runs. The paper includes several tables presenting PSNR values for image reconstruction, comparisons with other techniques, and ablation studies (e.g., "Table 1: Reconstruction PSNR across various resolutions.", "Table 2: Comparison with simpler autoregressive baselines."). |
| Researcher Affiliation | Collaboration | The authors are affiliated with: 1Google Deep Mind, 2Microsoft, 3Meta, 4University of North Carolina at Chapel Hill, 5AMD. This includes a mix of industry affiliations (Google DeepMind, Microsoft, Meta, AMD) and academic affiliations (University of North Carolina at Chapel Hill). |
| Pseudocode | No | The paper describes the FINOLA process and related equations (e.g., Eq. 1: z(x+1, y) = z(x, y) + A ẑ(x, y)) within the main text, and details the implementation in sections like '2. First-order Norm+Linear Autoregression'. However, it does not contain a dedicated section or figure explicitly labeled 'Pseudocode' or 'Algorithm' with structured, code-like steps. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code, nor does it provide a link to a code repository or mention that code is included in supplementary materials. |
| Open Datasets | Yes | The paper refers to well-known publicly available datasets with citations, such as "ImageNet (Deng et al., 2009)", "Kodak (Company, 1999) datasets", and "COCO object detection". |
| Dataset Splits | Yes | The paper mentions using standard benchmark dataset splits: "Our models are trained on the training set and subsequently evaluated on the validation set." It also specifically refers to evaluating on the "ImageNet-1K validation set" and "COCO object detection results on the val2017 dataset". |
| Hardware Specification | Yes | The paper reports the end-to-end encoder/decoder runtime for a 256×256 image on a MacBook Air with an Apple M2 CPU. |
| Software Dependencies | No | The paper mentions several models and frameworks used or compared against, such as GPT, iGPT, PixelCNN, DALL-E, MAE, SimMIM, Mobile-Former, and DETR. However, it does not specify version numbers for any ancillary software dependencies like programming languages (e.g., Python), deep learning libraries (e.g., PyTorch, TensorFlow), or GPU acceleration libraries (e.g., CUDA). |
| Experiment Setup | Yes | The paper provides detailed experimental setup configurations in tables. For instance, Table 14: "Training setting for FINOLA" specifies optimizer (AdamW), base learning rate (1.5e-4), weight decay (0.1), batch size (128), learning rate schedule (cosine decay), warmup epochs (10), training epochs (100), image size (256^2), and augmentation (Random Resize Crop). Additional tables (Table 22, Table 23) provide similar detailed settings for linear probing, tran-1 probing, and end-to-end fine-tuning. |
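Since the paper releases no code (per the Open Source Code row), the one-step FINOLA recurrence quoted above (Eq. 1: z(x+1, y) = z(x, y) + A ẑ(x, y)) can be illustrated with a minimal NumPy sketch. This is an assumption-laden reconstruction, not the authors' implementation: the normalization axis, the epsilon constant, the feature dimension, and the random coefficient matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def finola_step(z, A, eps=1e-6):
    """One first-order norm+linear step: z_next = z + A @ z_hat.

    z_hat is z normalized to zero mean and unit variance across channels.
    The normalization details and eps are assumptions, not the paper's code.
    """
    z_hat = (z - z.mean()) / (z.std() + eps)
    return z + A @ z_hat

def finola_rollout(z0, A, steps):
    """Autoregressively extend a single seed vector z0 along one spatial axis."""
    zs = [z0]
    for _ in range(steps):
        zs.append(finola_step(zs[-1], A))
    return np.stack(zs)  # shape: (steps + 1, C)

C = 8                                     # hypothetical channel count
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(C, C))    # stand-in for the learned matrix A
z0 = rng.normal(size=C)                   # stand-in compressed representation
row = finola_rollout(z0, A, steps=4)
print(row.shape)  # (5, 8)
```

In the paper, the same recurrence (with a second matrix B for the vertical direction) propagates a single compressed vector across the full feature map before decoding, which is why no explicit positional encoding is needed.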