Probabilistic Factorial Experimental Design for Combinatorial Interventions

Authors: Divya Shyamal, Jiaqi Zhang, Caroline Uhler

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to validate our theoretical results, as well as show a comparison to fractional factorial design, using simulated data.
Researcher Affiliation | Academia | (1) Department of Electrical Engineering and Computer Science, MIT; (2) Department of Mathematics, MIT; (3) Eric and Wendy Schmidt Center, Broad Institute. Correspondence to: Jiaqi Zhang <EMAIL>, Caroline Uhler <EMAIL>.
Pseudocode | Yes | Algorithm 1: Active probabilistic factorial experimental design.
Open Source Code | Yes | Code can be found at the linked repository.
Open Datasets | No | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., the coefficient vector is drawn from U(-1, 1)^K. (The paper explicitly states that the data is simulated rather than taken from a public dataset, so no access information for a public dataset is provided.)
Dataset Splits | No | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., the coefficient vector is drawn from U(-1, 1)^K. (The paper uses simulated data and therefore does not discuss predefined train/validation/test splits.)
Hardware Specification | Yes | Experiments were run on a device with a 16-core Intel Core Ultra 7 165H processor with 32 GB RAM, and an NVIDIA RTX 4000 Mobile Ada Generation 12 GB GPU.
Software Dependencies | Yes | The code is implemented in Python, utilizing the cupy and numba libraries, among others. The active design optimization was done using the scipy SLSQP solver. (The reference Virtanen et al., 2020 specifies 'SciPy 1.0', indicating a version for the scipy library.)
Experiment Setup | Yes | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., the coefficient vector is drawn from U(-1, 1)^K. We add standard Gaussian noise to the outcomes. In each of the following simulations, we keep the coefficients constant through all iterations of each run. The curves are generated with values p = 10, k = 2, n = 200; p = 20, k = 2, n = 1000; and p = 30, k = 2, n = 1000.
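
The simulation setup described in the last row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the names `interaction_sets`, `fourier_features`, `simulate`, and `beta` are hypothetical. The sketch also makes several assumptions that the summary above does not state: a parity (Boolean Fourier) basis over all interaction sets of size at most k, each treatment applied independently with probability `dosage` (0.5 here), and a fixed random seed.

```python
from itertools import combinations

import numpy as np


def interaction_sets(p, k):
    """All subsets of {0, ..., p-1} with at most k elements (empty set included)."""
    return [s for size in range(k + 1) for s in combinations(range(p), size)]


def fourier_features(X, subsets):
    """Parity features: column j is the product of (-1)^x_i over i in subsets[j]."""
    Z = 1.0 - 2.0 * X  # map 0 -> +1 and 1 -> -1
    cols = []
    for s in subsets:
        idx = np.array(s, dtype=int)
        # The empty subset yields an all-ones (intercept) column.
        cols.append(Z[:, idx].prod(axis=1))
    return np.column_stack(cols)


def simulate(p, k, n, dosage=0.5, seed=0):
    """Draw coefficients from U(-1, 1)^K and noisy outcomes for n random units.

    Assumption: each of the p treatments is applied to a unit independently
    with probability `dosage`; the summary above does not specify this.
    """
    rng = np.random.default_rng(seed)
    subsets = interaction_sets(p, k)
    K = len(subsets)
    beta = rng.uniform(-1.0, 1.0, size=K)           # fixed for the whole run
    X = (rng.random((n, p)) < dosage).astype(int)   # random combinations
    Phi = fourier_features(X, subsets)
    y = Phi @ beta + rng.standard_normal(n)         # standard Gaussian noise
    return X, Phi, y, beta


# Smallest configuration listed in the experiment setup.
X, Phi, y, beta = simulate(p=10, k=2, n=200)
```

Under these assumptions, p = 10 and k = 2 give K = 1 + 10 + 45 = 56 coefficients, so `Phi` has shape (200, 56).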