Robust Stochastic Optimization via Gradient Quantile Clipping

Authors: Ibrahim Merad, Stéphane Gaïffas

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an implementation of this algorithm using rolling quantiles which leads to a highly efficient optimization procedure with strong robustness properties, as confirmed by our numerical experiments. Finally, we provide experiments to demonstrate that QC-SGD can be easily and efficiently implemented by estimating Qp(‖G̃(θt, ζt)‖) with rolling quantiles. In particular, we show that the iteration is indeed robust to heavy tails and corruption on multiple stochastic optimization tasks.
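The paper's Algorithm 2 (Rolling QC-SGD) is not reproduced in this summary; the following is a minimal sketch of quantile clipping with a rolling buffer of gradient norms, using the buffer size S = 100 and quantile p = 0.9 reported in the experiments. The function and variable names are ours, not the paper's.

```python
import numpy as np

def rolling_qc_sgd_step(theta, grad, buffer, p=0.9, beta=1e-3, S=100):
    """One sketched QC-SGD step: clip the stochastic gradient at the
    rolling p-quantile of recently observed gradient norms.

    `buffer` is a plain list holding the last <= S gradient norms."""
    g_norm = np.linalg.norm(grad)
    # Rolling estimate of the p-quantile Q_p over the buffered norms;
    # before any norm is buffered, fall back to the current norm (no clip).
    tau = np.quantile(buffer, p) if buffer else g_norm
    # Clip: rescale the gradient whenever its norm exceeds the threshold.
    if g_norm > tau and g_norm > 0:
        grad = grad * (tau / g_norm)
    # Update the rolling window of norms.
    buffer.append(g_norm)
    if len(buffer) > S:
        buffer.pop(0)
    # Plain SGD update with constant step size beta.
    return theta - beta * grad, buffer
```

On a well-conditioned objective the clipping threshold tracks the decaying gradient norms, so the iteration behaves like constant-step SGD while capping the influence of heavy-tailed or corrupted samples.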
Researcher Affiliation | Academia | Ibrahim Merad (LPSM, UMR 8001, Université Paris Cité, Paris, France); Stéphane Gaïffas (LPSM, UMR 8001, Université Paris Cité, Paris, France; DMA, École normale supérieure)
Pseudocode | Yes | Algorithm 1: Aggregation of cycling iterates; Algorithm 2: Rolling QC-SGD
Open Source Code | No | The paper does not provide an explicit statement or link for the source code of its own methodology. It only mentions: "We do not include a comparison with (Diakonikolas et al., 2022) whose procedure has no implementation we are aware of and is difficult to use in practice."
Open Datasets | Yes | Dataset for Sensorless Drive Diagnosis, UCI Machine Learning Repository, 2015, DOI: https://doi.org/10.24432/C5VP5F; Jock Blackard, Covertype, UCI Machine Learning Repository, 1998, DOI: https://doi.org/10.24432/C50K5N; Abdelhakim Hannousse and Salima Yahiouche, Web page phishing detection, Mendeley Data, 2, 2020; Byron Roe, MiniBooNE particle identification, UCI Machine Learning Repository, 2010, DOI: https://doi.org/10.24432/C5QC87; Codrna (Uzilov et al., 2006), 488,565 samples, 8 features, 2 classes, OpenML
Dataset Splits | Yes | We use a 10% share of each dataset as a test set in order to compute the test loss plotted in Figures 2 and 3. We also ensure the test set contains at least 5000 elements. Optimization is run using the remaining train set which is corrupted as specified next.
Hardware Specification | No | The paper describes experimental results on synthetic and real datasets but does not provide any specific details about the hardware used for these experiments.
Software Dependencies | No | The paper does not provide specific software names with version numbers used for the experiments. It describes the algorithms and their implementation conceptually but lacks details on the programming languages, libraries, or frameworks with their versions.
Experiment Setup | Yes | Our experiments on synthetic data consider an infinite horizon, dimension d = 128, and a constant step size for all methods. We use step size β = 10⁻³. We use step size β = 6 × 10⁻³. We use one sample per iteration and step size β = 10⁻² for all methods. As previously, RQC-SGD is run with buffer size S = 100 and τ_unif = 10. The quantile value was set to p = 0.9. We compute the gradient norms over a batch of samples of size S at the beginning of the optimization and use the quantiles of order p = 0.25, 0.5 and 0.75 as the clipping level for the constant clipping baselines.
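The constant-clipping baselines described above fix their clipping levels once, from quantiles of gradient norms computed on an initial batch of size S. A minimal sketch; the heavy-tailed draw standing in for per-sample gradient norms is illustrative, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-sample gradient norms on an initial batch of size S = 100;
# a shifted Pareto draw mimics the heavy-tailed setting studied in the paper.
grad_norms = rng.pareto(2.0, size=100) + 1.0
# Constant clipping levels for the baselines: quantiles of order 0.25, 0.5, 0.75,
# each kept fixed for the whole optimization run.
clip_levels = {p: np.quantile(grad_norms, p) for p in (0.25, 0.5, 0.75)}
```

Unlike Rolling QC-SGD, these levels never adapt after initialization, which is what the paper's comparison is designed to probe.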