Stochastic Subspace Descent Accelerated via Bi-fidelity Line Search

Authors: Nuojin Cheng, Alireza Doostan, Stephen Becker

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. A comprehensive empirical evaluation of BF-SSD is conducted across four distinct problems: a synthetic optimization benchmark, dual-form kernel ridge regression, black-box adversarial attacks on machine learning models, and black-box fine-tuning of transformer-based language models. The numerical results consistently show that BF-SSD achieves superior optimization performance, particularly in solution quality per high-fidelity (HF) function evaluation, compared against the relevant baseline methods.
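One of the benchmark problems named above, dual-form kernel ridge regression, is compact enough to sketch: the dual coefficients solve a regularized kernel system, so each objective evaluation reduces to a kernel solve. The snippet below is a minimal illustration with an RBF kernel in NumPy, not the paper's implementation; the function names, the kernel choice, and the `lam * n` regularization scaling are assumptions.

```python
import numpy as np

def dual_krr_fit(X, y, lam=1e-3, gamma=1.0):
    """Fit kernel ridge regression in its dual form.

    Solves (K + lam * n * I) alpha = y for the dual coefficients alpha,
    where K is the RBF Gram matrix of the training inputs.
    """
    # pairwise squared distances -> RBF kernel Gram matrix (n x n)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)
    n = len(X)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return alpha

def dual_krr_predict(X_train, X_test, alpha, gamma=1.0):
    """Predict via the kernel expansion: f(x) = sum_i alpha_i k(x, x_i)."""
    sq = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq) @ alpha
```

With a very small regularizer, the training predictions nearly interpolate the targets, which is a quick sanity check on the dual solve.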
Researcher Affiliation: Academia. Nuojin Cheng (EMAIL), Department of Applied Mathematics, University of Colorado Boulder; Alireza Doostan (EMAIL), Smead Aerospace Engineering Sciences Department, University of Colorado Boulder; Stephen Becker (EMAIL), Department of Applied Mathematics, University of Colorado Boulder.
Pseudocode: Yes. Algorithm 1: Surrogate Construction; Algorithm 2: BF-Backtracking; Algorithm 3: Bi-Fidelity Line Search SSD Algorithm.
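The core idea behind the BF-Backtracking routine listed above, screening candidate step sizes with the cheap low-fidelity model and spending high-fidelity evaluations only to confirm the accepted step, can be sketched as follows. This is a hedged reconstruction under stated assumptions (an Armijo-style sufficient-decrease test on the low-fidelity surrogate, geometric step shrinking, one confirming high-fidelity evaluation); the signature and constants are hypothetical, not the paper's.

```python
import numpy as np

def bf_backtracking(f_hf, f_lf, x, d, slope, fx_hf,
                    alpha0=1.0, shrink=0.5, max_iter=30):
    """Hypothetical sketch of a bi-fidelity backtracking line search.

    f_lf:  cheap low-fidelity surrogate, used to screen step sizes
    f_hf:  expensive high-fidelity objective, evaluated once to confirm
    slope: directional-derivative estimate g^T d (should be negative)
    fx_hf: current high-fidelity value f_hf(x), assumed already known
    """
    fx_lf = f_lf(x)
    alpha = alpha0
    for _ in range(max_iter):
        # Armijo-style sufficient-decrease test on the LF surrogate only
        if f_lf(x + alpha * d) <= fx_lf + 1e-4 * alpha * slope:
            break
        alpha *= shrink
    # one HF evaluation: accept the step only if it decreases f_hf
    x_new = x + alpha * d
    if f_hf(x_new) < fx_hf:
        return x_new, alpha
    return x, 0.0
```

The design point is the evaluation budget: all trial steps inside the loop cost only low-fidelity calls, and exactly one high-fidelity call is spent per line search.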
Open Source Code: No. The paper contains no explicit statement about code release and provides no link to a code repository; it discusses methodology and results without mentioning the availability of implementation code.
Open Datasets: Yes. For the regression data, the first D = 1,000 samples are selected from the California housing dataset provided in the scikit-learn library (Pedregosa et al., 2011). Two convolutional neural network (CNN) architectures model the HF and LF representations for classification on the MNIST dataset, and sentiment analysis on the aclImdb dataset is used as a soft-prompting task.
Dataset Splits: Yes. Two convolutional neural network (CNN) architectures model the HF and LF representations for classification on the MNIST dataset, with 60,000 training samples and 10,000 test samples.
Hardware Specification: No. The paper provides no specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory; it only refers to general aspects such as "black-box" scenarios.
Software Dependencies: No. The paper mentions software components such as the scikit-learn library, the BERT model, and DistilBERT, but it does not specify version numbers for these or any other software dependencies.
Experiment Setup: Yes. The dimension is set to D = 1,000 with ℓ = 20 and L = 20, and the backtracking parameter is β = ℓ/(2D). A hyperparameter study is conducted over c ∈ {0.8, 0.9, 0.99} and ℓ ∈ {5, 10, 20}. For the SSD methods (including the line-search version), the parameters are ℓ = 50 and αmax = 2.0. Elsewhere, ℓ = 50 and c = 0.99 are used, and the methods without line search take a fixed step size of 1 × 10⁻².
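To make these settings concrete, a single stochastic subspace descent step with ℓ random directions and a fixed step size can be sketched as below. This is a minimal illustration only: the QR-based subspace draw, the finite-difference gradient estimate, and the D/ℓ scaling are assumptions, not the paper's exact implementation.

```python
import numpy as np

def ssd_step(f, x, ell=50, step=1e-2, fd_eps=1e-5, rng=None):
    """One stochastic subspace descent step (hypothetical sketch).

    Projects the gradient onto a random ell-dimensional subspace,
    estimated by forward finite differences (ell + 1 function calls),
    then takes a fixed-size step along the projected descent direction.
    """
    rng = np.random.default_rng() if rng is None else rng
    D = x.size
    # random subspace basis with orthonormal columns (D x ell)
    P, _ = np.linalg.qr(rng.standard_normal((D, ell)))
    fx = f(x)
    # finite-difference estimate of P^T grad f(x)
    g = np.array([(f(x + fd_eps * P[:, i]) - fx) / fd_eps
                  for i in range(ell)])
    # D/ell scaling keeps the subspace gradient estimate unbiased
    return x - step * (D / ell) * (P @ g)
```

On a simple quadratic, each step decreases the objective almost surely, since the update moves against the gradient's projection onto the sampled subspace.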