SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups

Authors: Yongxing Zhang, Donglin Yang, Renjie Liao

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To validate the effectiveness of our SymmetricDiffusers, we conduct extensive experiments on three tasks: sorting 4-digit MNIST images, solving jigsaw puzzles on the Noisy MNIST and CIFAR-10 datasets, and addressing traveling salesman problems (TSPs). Our model achieves state-of-the-art or comparable performance across all tasks. ... Ablation Study. We conduct an ablation study to verify our design choices for reverse transition and decoding strategies."
Researcher Affiliation | Academia | Yongxing Zhang (1,3), Donglin Yang (2,3), Renjie Liao (2,3); 1 University of Waterloo, 2 University of British Columbia, 3 Vector Institute.
Pseudocode | No | The paper describes algorithms and processes textually (e.g., shuffling methods, denoising schedule, proofs in the appendix) but does not include any clearly labeled pseudocode blocks or formal algorithm figures.
Open Source Code | Yes | "Our code is released at https://github.com/DSL-Lab/SymmetricDiffusers."
Open Datasets | Yes | "We conduct extensive experiments on three tasks: sorting 4-digit MNIST images, solving jigsaw puzzles on the Noisy MNIST and CIFAR-10 datasets, and addressing traveling salesman problems (TSPs). ... We take the TSP-20 and TSP-50 datasets from Joshi et al. (2021). The train set consists of 1,512,000 graphs, where each node is an i.i.d. sample from the unit square [0, 1]². The labels are optimal TSP tours provided by the Concorde solver (Applegate et al., 2006). The test set consists of 1,280 graphs, with ground-truth tours generated by the Concorde solver as well."
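The TSP node-sampling procedure quoted above is simple enough to sketch. The following is a minimal, hedged illustration (not code from the paper or from Joshi et al. (2021)) of drawing each node i.i.d. from the unit square; the function name and `seed` parameter are our own, and the actual dataset additionally carries optimal tours from the Concorde solver, which this sketch does not reproduce.

```python
import random

def sample_tsp_instance(n_nodes=20, seed=None):
    """Sample one TSP instance: n_nodes points drawn i.i.d.
    from the unit square [0, 1]^2, as described in the dataset quote."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_nodes)]

# Example: one TSP-20 instance (seed fixed for repeatability).
instance = sample_tsp_instance(n_nodes=20, seed=0)
```

Generating the full train set would amount to repeating this 1,512,000 times and labeling each instance with its Concorde-optimal tour.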
Dataset Splits | Yes | "The training set for Noisy MNIST comprises 60,000 images, while the CIFAR-10 training set contains 10,000 images. The Noisy MNIST test set, which is pre-shuffled, also includes 10,000 images. The CIFAR-10 test set, which shuffles images on the fly, contains 10,000 images as well. ... For each training epoch, we generate 60,000 sequences of 4-digit MNIST images ... For testing purposes, we similarly generate 10,000 sequences ... The train set consists of 1,512,000 graphs ... The test set consists of 1,280 graphs ..."
Hardware Specification | Yes | "The Jigsaw Puzzle and Sort 4-Digit MNIST Numbers experiments are trained and evaluated on an NVIDIA A40 GPU. The TSP experiments are trained and evaluated on NVIDIA A40 and A100 GPUs."
Software Dependencies | No | The paper mentions software such as the AdamW optimizer, PyTorch, and various baselines (e.g., Gurobi, Concorde, LKH-3), but it does not specify version numbers for the key software components used in its own methodology, such as Python or PyTorch.
Experiment Setup | Yes | "For the Jigsaw Puzzle experiments, we use the AdamW optimizer (Loshchilov & Hutter, 2019) with weight decay 1e-2, ε = 1e-9, and β = (0.9, 0.98). We use the Noam learning-rate scheduler given in Vaswani et al. (2023) with 51,600 warmup steps for Noisy MNIST and 46,000 steps for CIFAR-10. We train for 120 epochs with a batch size of 64."
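The Noam schedule referenced in the setup quote is fully determined by its formula, so the reported warmup counts can be made concrete. Below is a minimal, hedged sketch of the standard Noam rule (linear warmup, then inverse-square-root decay) with the paper's 51,600 warmup steps for Noisy MNIST; the paper does not state the model width, so `d_model = 512` here is an illustrative assumption, not a reported value.

```python
def noam_lr(step, d_model=512, warmup=51_600):
    """Noam learning rate at a given optimizer step:
    lr = d_model^{-0.5} * min(step^{-0.5}, step * warmup^{-1.5}).
    Rises linearly for `warmup` steps, then decays as step^{-0.5}."""
    step = max(step, 1)  # guard against step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

# The schedule peaks exactly at the warmup step (51,600 for Noisy MNIST;
# the CIFAR-10 runs use 46,000 instead).
peak_lr = noam_lr(51_600)
```

In a PyTorch training loop this would typically be wired in via `torch.optim.lr_scheduler.LambdaLR` on top of the AdamW optimizer configured with the quoted weight decay, ε, and β values.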