Does equivariance matter at scale?
Authors: Johann Brehmer, Sönke Behrends, Pim De Haan, Taco Cohen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and on general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. |
| Researcher Affiliation | Industry | Johann Brehmer (Qualcomm AI Research), Sönke Behrends (Qualcomm AI Research), Pim de Haan (Qualcomm AI Research), ... Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc. and/or its subsidiaries. |
| Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper mentions that the data can be generated with 'the openly available repository at https://github.com/google-research/kubric', but this refers to a third-party tool used for data generation, not the authors' implementation code for the models described in the paper. |
| Open Datasets | Yes | We construct a dataset of rigid-body interactions following a proposal by Allen et al. (2023). ... We recreate the MOVi-B dataset used by Allen et al. (2023) as best as we can, using parameters from their paper and private communication; see Appendix A for details. ... Our data can be generated with the openly available repository at https://github.com/google-research/kubric. |
| Dataset Splits | Yes | Our training set consists of 4 × 10⁵ such trajectories, while we use 1000 trajectories each for the validation and test set. |
| Hardware Specification | No | The paper mentions 'GPU utilization' and 'inter-GPU communication' and shows figures for 'single GPU' setups, but it does not specify any particular GPU models, CPU models, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using 'the Kubric simulator (Greff et al., 2022)', 'the PyBullet physics engine (Coumans & Bai, 2016–2024)', and 'the Adam optimizer (Kingma, 2015)'. However, it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We train all models with the Adam optimizer (Kingma, 2015), annealing the learning rate over the course of training from an initial value of 5 × 10⁻⁴ on a cosine schedule. ... a higher learning rate of 10⁻³ or 2 × 10⁻³ ... we use the same batch size of 64 samples ... Early stopping is used in all experiments. ... Our architectures are shown in Tbl. 1. |
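The cosine learning-rate schedule quoted in the experiment-setup cell can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the paper only states the initial value of 5 × 10⁻⁴ and a cosine anneal, so the function name, the final learning rate of 0, and the step/total-step parameterization are assumptions.

```python
import math

def cosine_lr(step, total_steps, lr_init=5e-4, lr_final=0.0):
    """Anneal the learning rate from lr_init to lr_final on a cosine schedule.

    At step 0 this returns lr_init; at step == total_steps it returns lr_final,
    following the standard half-cosine decay curve in between.
    """
    progress = step / total_steps
    return lr_final + 0.5 * (lr_init - lr_final) * (1.0 + math.cos(math.pi * progress))

# Example: the rate starts at 5e-4, halves at the midpoint, and decays to 0.
start = cosine_lr(0, 10_000)      # 5e-4
middle = cosine_lr(5_000, 10_000) # ~2.5e-4
end = cosine_lr(10_000, 10_000)   # ~0.0
```

In practice such a schedule is typically attached to an Adam optimizer via the training framework's scheduler API (e.g. a per-step callback that sets the optimizer's learning rate to `cosine_lr(step, total_steps)`).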