reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Optimization over Sparse Support-Preserving Sets: Two-Step Projection with Global Optimality Guarantees

Authors: William De Vazelhes, Xiaotong Yuan, Bin Gu

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	H. Experiments: In this section, we provide some experiments to validate our theoretical results experimentally. Our experiments take 2h30 to run in total on a MacBook Pro with a 2.6GHz 6-core Intel Core i7 and 16GB of memory. Before describing our experiments, we provide a short discussion about the settings and algorithms that we will illustrate. ... In Section H.1, we illustrate on a synthetic example the trade-off between sparsity and optimality that is introduced by the extra constraint Γ, and that is balanced by the parameter ρ. In Section H.2, we consider a portfolio index tracking problem where the goal is to illustrate a real-life application of our methods. In Section H.3, we consider a multi-class logistic regression on a real-life dataset, to illustrate in more detail the stochastic and the zeroth-order versions of our method in particular.
Researcher Affiliation	Collaboration	¹GenBio AI (work done while at MBZUAI, Abu Dhabi, UAE); ²School of Intelligence Science and Technology, Nanjing University, Suzhou, China; ³School of Artificial Intelligence, Jilin University, China.
Pseudocode	Yes	Algorithm 1: Deterministic IHT with extra constraints (IHT-2SP) ... Algorithm 2: Hybrid Stochastic IHT with Extra Constraints (HSG-HT-2SP) ... Algorithm 3: Hybrid ZO IHT with Extra Constraints (HZO-HT-2SP)
Open Source Code	Yes	For reproducibility, our code is available at https://github.com/wdevazelhes/2SP_icml2025.
Open Datasets	Yes	We compare our algorithms on three portfolio index datasets: S&P500 ... HSI ... CSI300 ... The data for those three indices is scraped from the web using the beautifulsoup library to gather information about the index, and the yfinance library to scrape the returns of such stocks during the considered time period. ... We consider the dna dataset from the LibSVM dataset repository (Chang & Lin, 2011)
Dataset Splits	Yes	We learn the weights of the portfolio on 80% of the considered period, and evaluate the out of sample (test set) performance on the remaining 20% (shaded area in the figure).
Hardware Specification	Yes	Our experiments take 2h30 to run in total on a MacBook Pro with a 2.6GHz 6-core Intel Core i7 and 16GB of memory.
Software Dependencies	No	The data for those three indices is scraped from the web using the beautifulsoup library to gather information about the index, and the yfinance library to scrape the returns of such stocks during the considered time period. This text lists libraries but does not provide version numbers.
Experiment Setup	Yes	We consider the dna dataset from the LibSVM dataset repository (Chang & Lin, 2011), and we choose D = 0.5, λ = 10. For the stochastic case we take B = 1e5, and for the stochastic and ZO case we take α = 2. Note that in the stochastic case, if the growing batch size required by Theorem 4.3 becomes larger than n, we keep it fixed to n (i.e., in such a case we take the whole dataset at each step). In the zeroth-order case, we take µ = 1e-6. We set all other hyperparameters as per Theorems 3.7, 4.3 and 4.8.
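
The "Pseudocode" row names Algorithm 1, a deterministic IHT with extra constraints built on a two-step projection. The following is a minimal sketch of that general idea, not the authors' algorithm. It assumes a least-squares loss, an ℓ2 ball as the extra constraint set Γ, and a fixed step size 1/L. The function names (`hard_threshold`, `project_l2_ball`, `iht_two_step`) and all parameter values are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        out[keep] = x[keep]
    return out


def project_l2_ball(x, radius):
    """Euclidean projection onto {x : ||x||_2 <= radius}.

    Rescaling never turns a zero entry into a nonzero one, so the
    support chosen by the hard-thresholding step is preserved.
    """
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def iht_two_step(A, b, k, radius, n_iters=200):
    """Illustrative IHT for min 0.5*||Ax - b||^2 with ||x||_0 <= k, ||x||_2 <= radius.

    Assumption: the extra constraint is an l2 ball and the step size is 1/L,
    where L is the squared spectral norm of A.
    """
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)
        x = hard_threshold(x - step * grad, k)  # step 1: enforce sparsity
        x = project_l2_ball(x, radius)          # step 2: enforce the extra constraint
    return x


# Small synthetic usage example (data and sizes are made up for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [1.0, -2.0, 0.5]
x_hat = iht_two_step(A, A @ x_true, k=3, radius=1.5)
```

In this sketch the returned iterate always has at most `k` nonzero entries and norm at most `radius`. Because `radius` is smaller than the norm of `x_true`, the sketch cannot recover `x_true` exactly, which loosely mirrors the sparsity versus optimality trade-off mentioned in the "Research Type" row.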