Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics

Authors: Jonas Spinner, Victor Bresó, Pim de Haan, Tilman Plehn, Jesse Thaler, Johann Brehmer

NeurIPS 2024

Reproducibility variables, each with the assessed result and the supporting excerpt extracted from the paper:
Research Type: Experimental
"We now demonstrate L-GATr in three applications. Each addresses a different problem in the data-analysis pipeline sketched in Fig. 1: 4.1 Surrogates for QFT amplitudes; 4.2 Top tagging; 4.3 Generative modelling."
Researcher Affiliation: Collaboration
Jonas Spinner (Heidelberg University), Victor Bresó (Heidelberg University), Pim de Haan (Qualcomm AI Research), Tilman Plehn (Heidelberg University), Jesse Thaler (MIT / IAIFI), Johann Brehmer (Qualcomm AI Research)
Pseudocode: No
The paper describes the architecture and its layers mathematically and in prose, but it does not provide a formal pseudocode block or algorithm.
Open Source Code: Yes
"Our implementation of L-GATr is available at https://github.com/heidelberg-hepml/lorentz-gatr."
Open Datasets: Yes
"We use the reference top quark tagging dataset by Kasieczka et al. [49, 50]. The data samples are structured as point clouds, with each event simulating a measurement by the ATLAS experiment at detector level." The dataset is available at https://zenodo.org/records/2603256 under a CC-BY 4.0 license.
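A minimal loading sketch for this dataset, assuming the published pandas-HDF5 layout of the Zenodo files (per-constituent columns E_i, PX_i, PY_i, PZ_i plus a binary is_signal_new label, as documented on the dataset page rather than in the paper itself):

```python
# Hedged sketch of loading the Kasieczka et al. top-tagging dataset
# (https://zenodo.org/records/2603256). Column names and the "table" key
# follow the dataset's documented pandas-HDF5 layout, not the paper.
import numpy as np
import pandas as pd

df = pd.read_hdf("train.h5", key="table")  # "table" is the conventional key

n_constituents = 200  # events are zero-padded to 200 constituents
cols = [f"{f}_{i}" for i in range(n_constituents)
        for f in ("E", "PX", "PY", "PZ")]
four_momenta = df[cols].to_numpy().reshape(len(df), n_constituents, 4)
labels = df["is_signal_new"].to_numpy()  # 1 = top jet, 0 = QCD background

print(four_momenta.shape, labels.mean())  # point clouds of (E, px, py, pz)
```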
Dataset Splits: Yes
"Each dataset consists of 4 × 10^5 samples for training, 10^5 for validation, and 5 × 10^5 for testing. ... The dataset consists of 1.2 × 10^6 events for training and 4 × 10^5 each for validation and testing. ... On each dataset, 1% of the events are set aside as validation and test splits."
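For concreteness, a small NumPy sketch of the amplitude-surrogate split sizes quoted above; the shuffling and seed are assumptions, since the excerpt does not specify the partitioning scheme:

```python
# Hedged sketch of the quoted split sizes; the authors' actual partitioning
# code may differ. Only the sizes are taken verbatim from the paper.
import numpy as np

n_train, n_val, n_test = 400_000, 100_000, 500_000  # amplitude surrogate sets
rng = np.random.default_rng(0)  # seed is an assumption, not from the paper
idx = rng.permutation(n_train + n_val + n_test)
train_idx, val_idx, test_idx = np.split(idx, [n_train, n_train + n_val])
assert len(test_idx) == n_test
```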
Hardware Specification: Yes
"Our measurements are performed with datasets made up of a single sample, and all models are run on an H100 GPU."
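A hedged sketch of how such a single-sample GPU latency measurement is typically done in PyTorch; the model, input shape, and warm-up and repetition counts below are placeholders, not the authors' benchmark code:

```python
# Minimal single-sample GPU timing sketch. The Linear layer stands in for
# L-GATr, and the event shape is illustrative only.
import time
import torch

model = torch.nn.Linear(4, 4).cuda()       # stand-in for the actual model
x = torch.randn(1, 128, 4, device="cuda")  # one event as a point cloud

for _ in range(10):                        # warm-up to amortize startup costs
    model(x)
torch.cuda.synchronize()                   # flush queued kernels before timing

start = time.perf_counter()
for _ in range(100):
    model(x)
torch.cuda.synchronize()                   # wait for the GPU to finish
print(f"{(time.perf_counter() - start) / 100 * 1e3:.3f} ms per forward pass")
```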
Software Dependencies: Yes
"The tt̄ + n jets, n = 0, ..., 4 dataset is simulated with the MadGraph 3.5.1 event generation toolchain, consisting of MadEvent [4] for the underlying hard process, Pythia 8 [72] for the parton shower, Delphes 3 [35] for a fast detector simulation, and the anti-k_T jet reconstruction algorithm [21] with R = 0.4 as implemented in FastJet [22]."
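Of these dependencies, only the final jet-reconstruction step lends itself to a compact sketch. Below, anti-k_T clustering with R = 0.4 via the FastJet Python bindings, on dummy four-momenta; the MadGraph/Pythia/Delphes stages of the toolchain are configured outside Python and are not shown:

```python
# Hedged sketch of the R = 0.4 anti-kT step of the quoted toolchain, using
# the fastjet Python bindings. The input four-momenta are dummies.
import fastjet

jet_def = fastjet.JetDefinition(fastjet.antikt_algorithm, 0.4)
particles = [fastjet.PseudoJet(px, py, pz, e)  # constructor order: px, py, pz, E
             for px, py, pz, e in [(10.0, 0.5, 1.0, 10.2),
                                   (9.5, 0.3, 0.8, 9.6)]]
cluster = fastjet.ClusterSequence(particles, jet_def)
jets = fastjet.sorted_by_pt(cluster.inclusive_jets())
print([j.pt() for j in jets])  # transverse momenta of the clustered jets
```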
Experiment Setup: Yes
"All models are trained by minimizing a mean squared error (MSE) loss on the preprocessed amplitude targets and by making use of the Adam optimizer. We use a batch size of 256 and a fixed learning rate of 10^-4 for all baselines."
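A minimal PyTorch sketch of this quoted setup (MSE loss, Adam, batch size 256, learning rate 10^-4); the model, data, and epoch count are placeholders, and the authors' actual training code lives in the lorentz-gatr repository:

```python
# Hedged training-loop sketch matching the quoted hyperparameters. The
# Linear model and random tensors stand in for L-GATr and the preprocessed
# amplitude data.
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(4, 1)                       # stand-in for L-GATr
opt = torch.optim.Adam(model.parameters(), lr=1e-4) # fixed learning rate 1e-4
loss_fn = torch.nn.MSELoss()                        # MSE on amplitude targets

x = torch.randn(1024, 4)                            # dummy particle features
y = torch.randn(1024, 1)                            # dummy amplitude targets
loader = DataLoader(TensorDataset(x, y), batch_size=256, shuffle=True)

for epoch in range(10):                             # epoch count is illustrative
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
```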