Dimensionless machine learning: Imposing exact units equivariance
Authors: Soledad Villar, Weichi Yao, David W. Hogg, Ben Blum-Smith, Bianca Dumitrascu
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our approach with simple numerical examples involving dynamical systems in physics and ecology. (...) We demonstrate with a few simple numerical regression problems that the reduction of model capacity (at fixed complexity) delivered by the units equivariance leads to improvements in generalization (in-distribution and out-of-distribution). In this context, we discuss symbolic regression and emulator-related tasks. We also discuss the limitations of our approach in the context of unknown dimensional constants. (...) 5. Experimental demonstrations |
| Researcher Affiliation | Academia | Soledad Villar EMAIL, Department of Applied Mathematics and Statistics, Johns Hopkins University; Mathematical Institute for Data Science, Johns Hopkins University. Weichi Yao EMAIL, Department of Technology, Operations and Statistics, New York University. David W. Hogg EMAIL, Center for Cosmology and Particle Physics, Department of Physics, New York University; Max-Planck-Institut für Astronomie; Flatiron Institute, a Division of the Simons Foundation. Ben Blum-Smith EMAIL, Department of Applied Mathematics and Statistics, Johns Hopkins University. Bianca Dumitrascu EMAIL, Department of Computer Science and Technology, Cambridge University |
| Pseudocode | No | The paper describes the methodology using prose and diagrams (e.g., Figure 1: Overview of the general approach), but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly available at Google Colab. (...) The source code is published in https://github.com/weichiyao/ScalarEMLP/tree/dimensionless. |
| Open Datasets | No | For the symbolic regression, training-set objects are drawn from distributions in m, ks, L, g, p, q, in which the scalars m, ks, L are drawn from uniforms and the vectors g, p, q are drawn from isotropically oriented unit vectors times magnitudes drawn from uniforms. For the springy double pendulum, m1, m2, ks1, ks2, L1, L2 are randomly generated from Unif(1, 2), as well as the norm of the gravitational acceleration vector g. Initializations at t0 of the pendulum positions and momenta are generated as those in Finzi et al. (2021) and Yao et al. (2021). For the arid vegetation model, we consider random initial conditions, and random choice of parameters, uniformly sampled between 0.5 and 1.5 times the default value. The paper does not provide public access links or citations for these generated datasets. |
| Dataset Splits | Yes | In the L2 case, 8192 training-set objects are used, and in the LASSO case, 128. (...) We use the same training data N = 30000 for all three experiments and each test set consists of 500 data points. (...) We produce a training set of 1000 initial configurations and a test set of 100 configurations. |
| Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing Hamiltonian neural networks (HNNs) with scalar-based MLPs, and integrating the Rietkerk model using Euler’s method, but it does not specify any particular software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We construct all rational scalar monomials of the inputs up to a well-defined degree, including, for example, m k_s \|q\| (g·p)^2, where the vectors g, p, q are implicitly column vectors, \|q\| is the magnitude of q, and g·p is the scalar (inner) product of g and p. For our purposes, the degree of a rational monomial is the maximum absolute value of any exponent appearing in the expression, so the example has degree 2. (...) In the L2 case, 8192 training-set objects are used, and in the LASSO case, 128. (...) We use the same training data N = 30000 for all three experiments and each test set consists of 500 data points. (...) The dimensional scalars-based and the dimensionless scalars-based MLPs both have equal numbers of model parameters, and are trained with the same set of hyper-parameters (number of training epochs, learning rate, etc.). (...) For each choice of parameters, we use finite differences to estimate the derivatives and Laplacian, and integrate the Rietkerk model using Euler's method with time step 0.005 d, on a 200 m × 200 m grid with 2 m pixel spacing. (...) The baseline regression uses 33 features: the dimensional parameters, their inverses, and the dimensionless constant 1, which describes affine linear functions. The dimensionless linear regression uses the method described in Section 3. It uses the Smith normal form to construct a basis of 12 dimensionless features, and it uses them, their inverses, and the constant 1, obtaining 25 regression features. |
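The dimensionless-feature construction quoted above (a Smith-normal-form basis over the input quantities) amounts to finding integer vectors in the kernel of the dimension matrix: each such vector gives the exponents of a dimensionless monomial. A minimal sketch of that idea, assuming a toy unit table for illustrative quantities (the names m, ks, L0, g, p and their units are assumptions, not the paper's exact setup):

```python
# Sketch: dimensionless monomial exponents from a dimension matrix.
# Each column is an input quantity; each row a base unit (M, L, T).
# Integer nullspace vectors v give dimensionless products prod_i x_i**v_i.
from math import lcm
from sympy import Matrix

# Illustrative quantities: mass m [M], spring constant ks [M T^-2],
# length L0 [L], gravity g [L T^-2], momentum p [M L T^-1].
D = Matrix([
    [1, 1, 0, 0, 1],    # mass exponents
    [0, 0, 1, 1, 1],    # length exponents
    [0, -2, 0, -2, -1], # time exponents
])

basis = []
for v in D.nullspace():           # rational kernel vectors
    scale = lcm(*(term.q for term in v))  # clear denominators
    basis.append([int(scale * term) for term in v])

print(basis)  # each row: integer exponents of one dimensionless monomial
```

Any such basis vector, e.g. exponents (1, -1, -1, 1, 0) for (m / ks) · (g / L0), multiplies out to a unit-free scalar; the paper's construction produces 12 such features for its regression experiment.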
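The dataset rows above quote the paper's sampling scheme: scalar parameters drawn from Unif(1, 2) and vectors drawn as isotropically oriented unit vectors times uniform magnitudes. A hedged sketch of that recipe (the seed, the magnitude range for the vectors, and the 3-dimensional ambient space are assumptions for illustration):

```python
# Sketch of the sampling the paper describes for the springy double
# pendulum; variable names and ranges beyond Unif(1, 2) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample_scalars(n):
    # m1, m2, ks1, ks2, L1, L2 ~ Unif(1, 2), as stated in the paper.
    return rng.uniform(1.0, 2.0, size=(n, 6))

def sample_isotropic_vectors(n, lo=1.0, hi=2.0, dim=3):
    # Isotropic directions via normalized Gaussians, then scaled by
    # magnitudes drawn from a uniform (range assumed here).
    u = rng.normal(size=(n, dim))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = rng.uniform(lo, hi, size=(n, 1))
    return r * u

scalars = sample_scalars(4)   # one row per training object
g = sample_isotropic_vectors(4)
```

Normalizing Gaussian draws is the standard way to get a uniform distribution over directions, matching the "isotropically oriented unit vectors" description.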