Dimension Reduction for Symbolic Regression
Authors: Paul Kahlmeyer, Markus Fischer, Joachim Giesen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach in Section 5 on the Wikipedia eponymous equations data set (Guimerà et al. 2020) and on the Feynman symbolic regression data set¹. Finally, we draw some conclusions in Section 6. [...] In a second experiment, we evaluate the effectiveness of combining our beam search with different state-of-the-art symbolic regression algorithms. [...] Results on the Feynman equations data set are shown in Table 4. |
| Researcher Affiliation | Academia | Paul Kahlmeyer, Markus Fischer, Joachim Giesen Friedrich Schiller University Jena EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the beam search process and uses an illustration in Figure 2, but it does not provide a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement from the authors releasing their code or a direct link to a code repository for their methodology. It mentions third-party tools like SymPy and other symbolic regression algorithms, and provides a link to a dataset used, but not their own implementation's source code. |
| Open Datasets | Yes | For our experiments, we have used two sets of regression problems, namely, Wikipedia's list of 880 eponymous equations (Guimerà et al. 2020) and 114 formulas that were extracted from the Feynman lecture notes of physics (Udrescu and Tegmark 2020). We provide more details about both data sets in the full version of the paper. ¹ https://space.mit.edu/home/tegmark/aifeynman.html |
| Dataset Splits | No | The paper mentions using "hold out data" and sampling from functions, and notes that the Feynman dataset samples are "directly given by La Cava et al. (2021), who also add different levels of Gaussian noise". However, it does not specify exact training/test/validation splits (e.g., percentages, sample counts, or references to predefined splits) in the provided text. |
| Hardware Specification | Yes | All experiments were run on a computer with an Intel Xeon Gold 6226R 64-core processor, 128 GB of RAM, running Python 3.10. |
| Software Dependencies | No | While "Python 3.10" is mentioned, no other key libraries or software components are listed with specific version numbers. References to "SymPy" or other symbolic regression algorithms are to the tools themselves, not the specific versions used by the authors as dependencies for their own implementation. |
| Experiment Setup | No | The paper states, "The beam search in the experiment uses beam size 1 and the CODEC functional dependence measure." and "To keep the search space small, we consider only expression DAGs with at most one intermediary node and one output node". However, it explicitly defers details for other settings: "An overview of the respective hyperparameters can be found in the full version of the paper," indicating that specific hyperparameter values for the symbolic regression algorithms are not provided in this extract. |