Adaptive Partitioning Schemes for Optimistic Optimization

Authors: Raja Sunkara, Ardhendu Tripathy

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental When the function is a low-dimensional multi-index function, we theoretically prove improved regret bounds, shown in Table 1. Empirically, we demonstrate the improvement in optimization error on several benchmark functions, including Rastrigin (multi-modal), Branin (multiple minima), and Sharp Ridge (non-differentiable). We pose the quantization of a Large Language Model (LLM) as a high-dimensional black-box optimization problem and obtain an improved perplexity value.
Researcher Affiliation Collaboration ¹Missouri University of Science & Technology, Rolla, MO, US; ²Ops Canvas, Alexandria, VA, US. Correspondence to: Raja Sunkara <EMAIL>, Ardhendu Tripathy <EMAIL>.
Pseudocode Yes Algorithm 1: Obtaining directions for an adaptive partitioning scheme. Require: T, oracle for f which is a multi-index function defined using A (see (1))... Algorithm 2: SequOOL on an adaptive partitioning scheme with a direction selection strategy. Require: Total number of openings n, number of samples T for updating f, integer c stating how often f is updated, number of dimensions m, oracle for f, direction selection strategy τh... Algorithm 3: Implementing lookahead direction selection strategy τh(f). Require: Current partition tree T, height h, estimated function f
Open Source Code Yes All implementation details, benchmark functions, and experiment scripts can be found at our GitHub repository: https://github.com/raja-sunkara/Learned-Partitions-SequOOL
Open Datasets Yes We evaluated our approach on the OPT-1.3B model (Zhang et al., 2022), with results presented in Table 2. Our proposed objective function using SequOOL over 72 dimensions outperformed AWQ, achieving lower perplexity on both WikiText-2 (Merity et al., 2016) and the calibration set (Pile dataset (Gao et al., 2020)).
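The perplexity metric used for the comparison above is the exponential of the average per-token negative log-likelihood. A minimal illustrative sketch (the actual evaluation runs the OPT-1.3B model over WikiText-2 and the Pile calibration set; the helper name below is ours, not from the paper's code):

```python
import math

def perplexity(token_log_probs):
    """Corpus perplexity: exp of the average negative log-likelihood per token."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)
```

For example, a model assigning probability 0.5 to every token has perplexity exactly 2.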
Dataset Splits No Here X denotes the input features to the block, cached from a calibration dataset. AWQ uses the parameterization s = s_X^α, where s_X is the activation scale computed from X, α ∈ [0, 1], Q is the quantization function, and W are the original full-precision weights. To determine the optimal α, AWQ applies a 1D grid search over the interval [0, 1]; this parameter controls the scale of activations and influences quantization error. The paper mentions using a "calibration dataset" for LLM quantization but does not provide specific details on how this dataset is split into training, validation, or test sets, nor does it give percentages or counts for these splits. The text focuses on the use of the dataset for caching input features and calculating perplexity, not on its partitioning for model training/evaluation.
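The 1D grid search over α described above can be sketched as follows. This is a simplified illustration under assumed details (uniform symmetric quantization, mean-absolute activation scale, MSE on the block output); function names are ours, not AWQ's actual API:

```python
import numpy as np

def quantize(w, n_bits=4):
    """Simplified uniform symmetric quantization of a weight matrix."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / q_max + 1e-12
    return np.round(w / scale).clip(-q_max, q_max) * scale

def grid_search_alpha(W, X, n_grid=20, n_bits=4):
    """Search alpha in [0, 1] minimizing output error of the quantized block.

    Uses s = s_X ** alpha, with s_X a per-channel activation scale from X.
    Weights are multiplied by s before quantization and inputs divided by s,
    so the full-precision product X @ W is unchanged by the rescaling.
    """
    s_X = np.abs(X).mean(axis=0) + 1e-12     # per-input-channel activation scale
    ref = X @ W                              # full-precision block output
    best_alpha, best_err = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, n_grid):
        s = s_X ** alpha
        W_q = quantize(W * s[:, None], n_bits)          # scale up, then quantize
        err = np.mean((ref - (X / s) @ W_q) ** 2)       # MSE vs. full precision
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err
```

The paper replaces this per-block 1D search with a higher-dimensional black-box optimization solved by SequOOL.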
Hardware Specification Yes We implemented our Large Language Model (LLM) code on hardware equipped with one Quadro RTX 5000 GPU having 16GB VRAM.
Software Dependencies No We employed the Ray package for hyper-parameter tuning. We used the Adam optimizer, and our search space included hidden layer sizes (500, 1000, 2000, 3000), learning rates (log-uniform from 1×10⁻⁴ to 1×10⁻¹), weight decay (log-uniform from 1×10⁻² to 1×10⁻¹), learning-rate step decay with gamma values (uniform from 0.9 to 0.99), and step sizes (500, 1000, 2000). The paper mentions software such as the "Ray package" and "Adam optimizer" but does not specify their version numbers or the version of Python/PyTorch (or similar frameworks) used.
Experiment Setup Yes We used the Adam optimizer, and our search space included hidden layer sizes (500, 1000, 2000, 3000), learning rates (log-uniform from 1×10⁻⁴ to 1×10⁻¹), weight decay (log-uniform from 1×10⁻² to 1×10⁻¹), and learning-rate step decay with gamma values (uniform from 0.9 to 0.99) and step sizes (500, 1000, 2000). We utilized early stopping to prevent overfitting.
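The search space described above can be sketched as a single-sample draw. The paper used the Ray package (which provides equivalent samplers such as tune.choice, tune.loguniform, and tune.uniform); the standard-library version below only mirrors the stated ranges:

```python
import math
import random

def sample_config(rng: random.Random) -> dict:
    """Draw one hyper-parameter configuration from the stated search space."""
    def loguniform(lo, hi):
        # Sample uniformly in log space, then exponentiate.
        return math.exp(rng.uniform(math.log(lo), math.log(hi)))
    return {
        "hidden_size": rng.choice([500, 1000, 2000, 3000]),
        "lr": loguniform(1e-4, 1e-1),             # log-uniform learning rate
        "weight_decay": loguniform(1e-2, 1e-1),   # log-uniform weight decay
        "gamma": rng.uniform(0.9, 0.99),          # step-decay factor
        "step_size": rng.choice([500, 1000, 2000]),
    }
```

In Ray Tune, each entry would map directly to a search-space primitive passed to the tuner.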