Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards fully covariant machine learning

Authors: Soledad Villar, David W Hogg, Weichi Yao, George A Kevrekidis, Bernhard Schölkopf

TMLR 2024

Reproducibility variables, with the classified result and the LLM's supporting response:
Research Type — Experimental
LLM response: "Our goal, in this conceptual contribution, is to understand the implications for machine learning of the many passive and active symmetries in play. We illustrate with toy examples how enforcing passive symmetries in model structure and in normalization can improve regressions. We demonstrate that imposing passive symmetries can lead to the discovery of important hidden objects in a data problem. In Figure 2 (left) we perform the following toy experiment: we generate noisy samples of intensities as a function of wavelength and temperature according to (3), and the learning task is to predict the intensity for different values of wavelengths and temperatures. We perform three experiments, described in Appendix B: (A) a units-covariant regression (employing the approach of Villar et al. 2023) using only λ, T, c, k; (B) a units-covariant regression with an extra dimensional constant found by cross-validation; and (C) a standard multi-layer perceptron (MLP) regression with no units constraints. Our results show that no units-covariant regression for the intensity as a function of λ, T, c, k can accurately reproduce the intensity B_λ."
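The quoted toy experiment evaluates blackbody intensities via the Planck law (the paper's eq. (3)). A minimal sketch of the data generation, assuming SI constants and a 5% noise level (the paper does not state the noise magnitude; function names are illustrative, not the authors' code). Note that the Planck law involves h, the "hidden" dimensional constant that a units-covariant regression in λ, T, c, k alone cannot supply:

```python
import numpy as np

# Physical constants (SI units); h is the hidden dimensional constant
# that regressions built only from lambda, T, c, k are missing.
h = 6.626e-34   # Planck constant, J s
c = 2.998e8     # speed of light, m / s
k = 1.381e-23   # Boltzmann constant, J / K

def planck_intensity(lam, T):
    """Blackbody spectral radiance B_lambda(lambda, T), the paper's eq. (3)."""
    x = h * c / (lam * k * T)
    with np.errstate(over="ignore"):  # exp overflows at very short wavelengths
        return (2.0 * h * c**2 / lam**5) / np.expm1(x)

rng = np.random.default_rng(0)
lam = np.logspace(-8, -4, 40)   # 40 log-spaced wavelengths, 1e-8 to 1e-4 m
T = 3000.0                      # one of the five training temperatures, K
B = planck_intensity(lam, T)
B_noisy = B * (1.0 + 0.05 * rng.standard_normal(lam.shape))  # assumed 5% noise
```

As a sanity check, the curve peaks near Wien's displacement wavelength, λ_max ≈ 2.898e-3 / T ≈ 9.7e-7 m for T = 3000 K.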
Researcher Affiliation — Academia
LLM response:
- Soledad Villar (EMAIL): Department of Applied Mathematics and Statistics, Johns Hopkins University; Mathematical Institute for Data Science, Johns Hopkins University; Flatiron Institute, a division of the Simons Foundation
- David W. Hogg (EMAIL): Center for Cosmology and Particle Physics, Department of Physics, New York University; Max Planck Institute for Astronomy, Heidelberg; Flatiron Institute, a division of the Simons Foundation
- Weichi Yao (EMAIL): Michigan Institute for Data Science, University of Michigan
- George A. Kevrekidis (EMAIL): Department of Applied Mathematics and Statistics, Johns Hopkins University; Los Alamos National Laboratory
- Bernhard Schölkopf (EMAIL): Max Planck Institute for Intelligent Systems and ELLIS Institute, Tübingen
Pseudocode — No
LLM response: "The paper describes methods and algorithms in prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks."
Open Source Code — Yes
LLM response: "All code used in this project is available at repositories https://github.com/weichiyao/Towards Fully Covariant ML."
Open Datasets — No
LLM response: "In Figure 2 (left) we perform the following toy experiment: we generate noisy samples of intensities as a function of wavelength and temperature according to (3), and the learning task is to predict the intensity for different values of wavelengths and temperatures. The training data are made by computing intensities B_λ(λ; T) according to (3) at five temperatures (300, 1000, 3000, 10000, and 30000 K), on a grid of 40 wavelengths logarithmically spaced from 10^-8 to 10^-4 m. The training inputs consist of N = 500 different initializations of the pendulum positions and momenta {z^(i)(t^(i)_0)}_{i=1}^N."
Dataset Splits — Yes
LLM response: "The training data are made by computing intensities B_λ(λ; T) according to (3) at five temperatures (300, 1000, 3000, 10000, and 30000 K), on a grid of 40 wavelengths logarithmically spaced from 10^-8 to 10^-4 m. The test data are similarly generated, except at temperatures of 6000, 20000, and 60000 K. That is, the test data extend outside the temperature range of the training data, and the test and training data have no temperatures in common. Once the intensity cut is made, there are 126 data points in the training set and 108 in the test set. The training inputs consist of N = 500 different initializations of the pendulum positions and momenta {z^(i)(t^(i)_0)}_{i=1}^N, and the labels are the sets of positions and momenta {z^(i)(t^(i)_1), z^(i)(t^(i)_2), ..., z^(i)(t^(i)_T)}_{i=1}^N with T = 5. The model is evaluated on a test data set with T = 150 and t_0 = 0."
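The temperature-extrapolating split quoted above can be reconstructed as follows. This is a sketch of the described grids only; it does not reproduce the unspecified intensity cut that reduces the sets to 126 and 108 points, and the array names are illustrative:

```python
import numpy as np

# Train/test temperatures as quoted; the test set shares no temperatures
# with the training set and extends beyond its range (60000 K > 30000 K).
T_train = np.array([300.0, 1000.0, 3000.0, 10000.0, 30000.0])  # K
T_test = np.array([6000.0, 20000.0, 60000.0])                  # K
lam = np.logspace(-8, -4, 40)  # 40 log-spaced wavelengths, metres

# Build (wavelength, temperature) input pairs for each split.
X_train = np.array([(l, T) for T in T_train for l in lam])  # 5 * 40 = 200 pairs
X_test = np.array([(l, T) for T in T_test for l in lam])    # 3 * 40 = 120 pairs

assert not set(T_train) & set(T_test)  # disjoint temperatures, as described
```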
Hardware Specification — No
LLM response: "The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments."
Software Dependencies — No
LLM response: "This project made use of open-source software, including Python, jax, objax, jupyter, numpy, matplotlib, scikit-learn."
Experiment Setup — Yes
LLM response: "For all experiments, we use an MLP consisting of 3 layers with 20 hidden units each. The standard MLP uses wavelength and temperature as features, and it does not require the output to be dimensionally correct. The model learns a scalar O(3)-invariant function H of the input vectors and uses a symplectic integrator to predict the dynamics: ... We implement H to be O(3)-invariant by simply restricting H to be a function of the inner products of the possible input vectors q1, q2, p1, p2, q0 and possibly g, following the fundamental theorem of invariant theory for O(d) (see Weyl 1946 for the theory, and Villar et al. 2021 for a discussion of machine learning implications)."
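The quoted construction restricts H to inner products of the input vectors, which by the first fundamental theorem of invariant theory for O(d) is sufficient to express any O(3)-invariant scalar function of them. A minimal sketch of that featurization with an invariance check (the function name and the check are illustrative, not the authors' code):

```python
import numpy as np

def o3_invariant_features(*vectors):
    """Return the pairwise inner products <v_i, v_j>, i <= j, of 3-vectors.
    Any O(3)-invariant scalar function of the vectors can be written as a
    function of these features, so an MLP on them is O(3)-invariant."""
    V = np.stack(vectors)                # shape (n, 3)
    G = V @ V.T                          # Gram matrix of inner products
    iu = np.triu_indices(len(vectors))   # keep i <= j; G is symmetric
    return G[iu]                         # n * (n + 1) / 2 features

# Invariance check: features are unchanged under a random element of O(3)
# (QR of a Gaussian matrix yields an orthogonal Q, possibly a reflection).
rng = np.random.default_rng(1)
q1, q2, p1, p2 = rng.standard_normal((4, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
f = o3_invariant_features(q1, q2, p1, p2)
f_rot = o3_invariant_features(Q @ q1, Q @ q2, Q @ p1, Q @ p2)
assert np.allclose(f, f_rot)
```

For four input vectors this yields 10 invariant features, which would then be fed to the 3-layer, 20-unit MLP described in the quote.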