AtlasD: Automatic Local Symmetry Discovery
Authors: Manu Bhat, Jonghyun Park, Jianke Yang, Nima Dehmamy, Robin Walters, Rose Yu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate AtlasD is capable of discovering local symmetry groups with multiple connected components in top-quark tagging and partial differential equation experiments. The discovered local symmetry is shown to be a useful inductive bias that improves the performance of downstream tasks in climate segmentation and vision tasks. Our code is publicly available at https://github.com/Rose-STL-Lab/AtlasD. |
| Researcher Affiliation | Collaboration | ¹University of California San Diego, ²IBM Research, ³Northeastern University. Correspondence to: Rose Yu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Automatic Local Symmetry Discovery). Input: atlas A = {(U_c, φ_c)}_{c=1}^{N}, dataset D = {(X_i : M → R^{d_in}, Y_i : M → R^{d_out})}_{i=1}^{n}. Output: Lie algebra basis {B_i}, cosets {C_ℓ}. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/Rose-STL-Lab/AtlasD. |
| Open Datasets | Yes | The goal is to classify between top-quark and light-quark jets in the Top Quark Tagging Reference Dataset (Kasieczka et al., 2019). To highlight the benefits of using our learned results in downstream models, we design a projected MNIST segmentation task. For our final experiment, we evaluate our method on a real-world dataset, ClimateNet, proposed by Prabhat et al. (2021). |
| Dataset Splits | No | The paper mentions several datasets (Top Quark Tagging Reference Dataset, PDE dataset, MNIST, ClimateNet) and refers to training and testing phases (e.g., 'train each model on a dataset where the digits are rotated 60 degrees, and test it on one where digits are rotated 180 degrees' in Section 5.3, and 'roughly 200 input images in the training set' in Section C.4). However, it does not explicitly state the percentages or exact sample counts for the training, validation, or test splits of any dataset, nor does it describe a splitting methodology in enough detail for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory amounts used for running the experiments. It focuses on the methodology and experimental results without mentioning the underlying computational infrastructure. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., programming language versions like Python 3.x, or library versions like PyTorch 1.x) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | In the infinitesimal generator discovery, we seed our basis with 7 generators. ... We run the model for 10 epochs using cross-entropy loss. We set the coefficient of standard basis regularization to be 0.1 and the growth factor of the generators to be 1. ... The learning rate is 0.001. ... We construct the model with 6 group equivariant blocks with 72 hidden dimensions and train it with a batch size of 32 for 35 epochs with dropout rate of 0.2, weight decay rate of 0.01, and learning rate of 0.0003. |
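The hyperparameters quoted in the Experiment Setup row (6 blocks of hidden dimension 72, batch size 32, 35 epochs, dropout 0.2, weight decay 0.01, learning rate 0.0003) can be collected into a minimal training-loop sketch. This is a hedged stand-in, not the paper's code: the model below is a plain MLP placeholder for the group-equivariant blocks, and the input/output dimensions are invented for illustration.

```python
# Sketch of the reported downstream training configuration.
# NOTE: `make_model` builds generic MLP blocks as a stand-in for the paper's
# group-equivariant blocks; in_dim/out_dim here are hypothetical.
import torch
import torch.nn as nn

HIDDEN_DIM = 72       # hidden dimension per block (from the paper)
NUM_BLOCKS = 6        # number of blocks (paper: group-equivariant blocks)
DROPOUT = 0.2
LR = 3e-4
WEIGHT_DECAY = 0.01
BATCH_SIZE = 32
EPOCHS = 35

def make_model(in_dim: int, out_dim: int) -> nn.Module:
    """Stack NUM_BLOCKS placeholder blocks with the reported widths/dropout."""
    layers = [nn.Linear(in_dim, HIDDEN_DIM), nn.ReLU()]
    for _ in range(NUM_BLOCKS - 1):
        layers += [nn.Linear(HIDDEN_DIM, HIDDEN_DIM), nn.ReLU(), nn.Dropout(DROPOUT)]
    layers.append(nn.Linear(HIDDEN_DIM, out_dim))
    return nn.Sequential(*layers)

model = make_model(in_dim=16, out_dim=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
loss_fn = nn.CrossEntropyLoss()  # cross-entropy loss, as in the paper

# One dummy training step on random data to show the loop shape.
x = torch.randn(BATCH_SIZE, 16)
y = torch.randint(0, 2, (BATCH_SIZE,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

AdamW is assumed here because the paper's table row specifies a weight-decay rate alongside the learning rate; the actual optimizer is not stated in the quoted excerpt.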