AutoSciLab: A Self-Driving Laboratory for Interpretable Scientific Discovery

Authors: Saaketh Desai, Sadhvikas Addamane, Jeffrey Y. Tsao, Igal Brener, Laura P. Swiler, Remi Dingreville, Prasad P. Iyer

AAAI 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We validate the generalizability of AutoSciLab by rediscovering (a) the principles of projectile motion and (b) the phase transitions within the spin states of the Ising model (an NP-hard problem). Applying our framework to an open-ended nanophotonics challenge, AutoSciLab uncovers a fundamentally novel method for directing incoherent light emission that surpasses the current state-of-the-art (Iyer et al. 2023b, 2020).
Researcher Affiliation | Academia | 1 Center for Integrated Nanotechnologies, Sandia National Laboratories, Albuquerque, NM; 2 Material, Physical and Chemical Sciences Center, Sandia National Laboratories, Albuquerque, NM; 3 Center for Computing Research, Sandia National Laboratories, Albuquerque, NM
Pseudocode | No | The paper describes the AutoSciLab framework and its components (VAE, active learning, directional autoencoder, neural network equation learner) in detail, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code for the experiments and results is available in the Appendix.
Open Datasets | No | The paper refers to creating training sets (e.g., "Given a training set X of candidate experiments {x1, x2, ..., xn}") and an initial database of experiments ("an initial, small database of experiments Yinit"), but it does not provide concrete access information (links, DOIs, repository names, or formal citations to established public datasets) for any dataset used in the experiments. While it mentions "symbolic regression benchmarks (La Cava et al. 2021; Udrescu and Tegmark 2020)", it does not specify which datasets from these benchmarks were used or how to access them for the presented work.
Dataset Splits | No | The paper refers to a "training set X" and an "initial, small database of experiments Yinit" for the VAE and active-learning components, but it does not specify exact percentages, sample counts, or methodologies for splitting data into training, validation, or test sets for reproduction.
Hardware Specification | No | The paper does not provide any details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper describes methodological components such as variational autoencoders (VAEs), Gaussian process models, and the neural network equation learner (nn-EQL) conceptually, but it does not list any specific software libraries, frameworks, or version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x) used for implementation, which are necessary for reproducibility.
Experiment Setup | No | The paper describes the overall framework and its components, including how the VAE, active learning, directional autoencoder, and neural network equation learner function, and it refers to Appendix sections for further details (e.g., "See Appendix Section S2 (Desai et al. 2024) for details on VAE architectures and training sets" and "See Appendix Section S5 (Desai et al. 2024) for more training details"). However, the main text itself does not contain specific experimental setup details such as concrete hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or optimizer settings.