Aggregated Hold-Out

Authors: Guillaume Maillard, Sylvain Arlot, Matthieu Lerasle

JMLR 2021

Reproducibility assessment (Variable / Result / LLM Response):
Research Type: Experimental. LLM Response: "For further insights into agghoo and majhoo, we conduct in Section 5 a numerical study on simulated data sets. Its results confirm our intuition: in all settings considered, agghoo and majhoo actually perform much better than the hold-out, and sometimes better than CV, provided their parameters are well-chosen. This section investigates how agghoo's and majhoo's performance varies with their parameters V = |T| and τ = n_t/n, and how it compares to the performance of CV and related methods at a similar computational cost, that is, for the same values of V and τ. Two settings are considered, corresponding to Corollary 12 (ε-regression) and Theorem 13 (classification)."
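The quoted passage studies how agghoo's performance varies with the number of splits V and the training fraction τ. A minimal, hypothetical sketch of the aggregated hold-out idea in a regression setting follows: for each of V random splits, select the hyperparameter with smallest hold-out risk, then average the V selected predictors. The base learner here is a simple Nadaraya-Watson smoother, not the paper's SVM-based ε-regression, and all names are illustrative (majhoo, for classification, would replace the final average by a majority vote):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_kernel_reg(X_tr, y_tr, bandwidth):
    """Nadaraya-Watson regressor: a stand-in learner for the sketch."""
    def predict(x):
        w = np.exp(-0.5 * ((x[:, None] - X_tr[None, :]) / bandwidth) ** 2)
        return (w @ y_tr) / np.maximum(w.sum(axis=1), 1e-12)
    return predict

def agghoo(X, y, grid, V=5, tau=0.8):
    """Aggregated hold-out: V random splits, hold-out selection on each,
    then averaging of the V selected predictors."""
    n = len(X)
    n_t = int(tau * n)
    chosen = []
    for _ in range(V):
        perm = rng.permutation(n)
        tr, va = perm[:n_t], perm[n_t:]
        # hold-out risk of each grid value on this split
        risks = [np.mean((fit_kernel_reg(X[tr], y[tr], g)(X[va]) - y[va]) ** 2)
                 for g in grid]
        best = grid[int(np.argmin(risks))]
        chosen.append(fit_kernel_reg(X[tr], y[tr], best))
    # agghoo aggregates by averaging the selected predictors
    return lambda x: np.mean([f(x) for f in chosen], axis=0)

# toy data, same flavor as the paper's simulations
X = rng.normal(0.0, np.pi, size=500)
y = np.sin(X) + rng.normal(0.0, 0.5, size=500)
f_hat = agghoo(X, y, grid=[0.1, 0.3, 1.0, 3.0], V=5, tau=0.8)
mse = float(np.mean((f_hat(X) - y) ** 2))
print(f"in-sample MSE of aggregated predictor: {mse:.3f}")
```

Running agghoo and CV with the same V and τ gives them comparable computational cost, which is the comparison the quoted passage describes.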
Researcher Affiliation: Academia. LLM Response: Guillaume Maillard (EMAIL), Université Paris-Saclay, CNRS, Inria, Laboratoire de mathématiques d'Orsay, 91405 Orsay, France; Sylvain Arlot (EMAIL), Université Paris-Saclay, CNRS, Inria, Laboratoire de mathématiques d'Orsay, 91405 Orsay, France, and Institut Universitaire de France (IUF); Matthieu Lerasle (EMAIL), Université Paris-Saclay, CNRS, Inria, Laboratoire de mathématiques d'Orsay, 91405 Orsay, France.
Pseudocode: No. LLM Response: The paper describes the Aggregated Hold-Out (agghoo) procedure and its variants, such as majhoo, using mathematical definitions and textual descriptions of the steps. There are no explicitly labeled pseudocode or algorithm blocks in the document.
Open Source Code: No. LLM Response: The paper mentions the scikit-learn library as a tool commonly used in the machine learning community, but it does not provide any explicit statement or link indicating that the authors have released their own code for the methods described in this paper. For example, it states: "According to Varoquaux et al. (2017), agghoo is commonly used by the machine learning community thanks to the scikit-learn library (Pedregosa et al., 2011)."
Open Datasets: No. LLM Response: The paper states: "For further insights into agghoo and majhoo, we conduct in Section 5 a numerical study on simulated data sets." It then describes the generation process for these datasets across different experimental setups, such as: "Data are generated as follows: (X_1, Y_1), ..., (X_n, Y_n) are independent, with X_i ~ N(0, π²), Y_i = s(X_i) + Z_i, with Z_i ~ N(0, 1/4) independent from X_i." The paper describes how data is simulated but does not mention the use of, or provide access to, any publicly available datasets.
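The quoted generation process is easy to reproduce. A minimal sketch follows, assuming a hypothetical regression function s (the excerpt does not specify it, so `np.sin` stands in); note that N(0, π²) means variance π², i.e. standard deviation π:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(n, s):
    """Draw (X_i, Y_i), i = 1..n, independently, with X_i ~ N(0, pi^2)
    and Y_i = s(X_i) + Z_i, where Z_i ~ N(0, 1/4) independent of X_i."""
    X = rng.normal(loc=0.0, scale=np.pi, size=n)   # scale = std dev = pi
    Z = rng.normal(loc=0.0, scale=0.5, size=n)     # variance 1/4
    return X, s(X) + Z

# hypothetical choice of s; the excerpt leaves it unspecified
X, Y = simulate(500, s=np.sin)
print(X.shape, Y.shape)  # prints: (500,) (500,)
```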
Dataset Splits: Yes. LLM Response: "Agghoo and CV training sets T ∈ 𝒯 are chosen independently and uniformly among the subsets of {1, ..., n} with cardinality ⌊τn⌋, for different values of τ and V = |𝒯|; hence, CV corresponds to what is usually called Monte-Carlo CV (Arlot and Celisse, 2010). Each algorithm is run on 1000 independent samples of size n = 500, and independent test samples of size 1000 are used for estimating the excess risks... For bagged K-FCV, V is the number of bagging resamples considered, and τ = (K-1)/K (or equivalently, K = 1/(1-τ))."
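The split scheme in the quote, V training sets drawn independently and uniformly among the subsets of a given cardinality (Monte-Carlo CV), can be sketched as follows; the function name and defaults are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def monte_carlo_splits(n, tau, V):
    """Draw V training sets independently and uniformly among the
    subsets of {0, ..., n-1} of cardinality floor(tau * n); the
    complement of each training set is its validation set."""
    n_t = int(tau * n)
    splits = []
    for _ in range(V):
        perm = rng.permutation(n)
        splits.append((np.sort(perm[:n_t]), np.sort(perm[n_t:])))
    return splits

splits = monte_carlo_splits(n=500, tau=0.8, V=5)
train, val = splits[0]
print(len(train), len(val))  # prints: 400 100
```

Under the quoted correspondence, bagged K-FCV reuses the same scheme with τ = (K-1)/K and V set to the number of bagging resamples.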
Hardware Specification: No. LLM Response: The paper describes numerical experiments in Section 5, but it does not provide any specific hardware details such as GPU or CPU models, memory, or cloud computing resources used for these experiments.
Software Dependencies: No. LLM Response: The paper mentions the "R implementation svm from package e1071" and that computations are performed "numerically using the scipy.integrate python library". However, specific version numbers for R, the e1071 package, Python, or the scipy library are not provided.
Experiment Setup: Yes. LLM Response: "Each algorithm is run on 1000 independent samples of size n = 500, and independent test samples of size 1000 are used for estimating the excess risks... Agghoo and CV are applied to (A_λ)_{λ∈Λ} over the grid Λ = {2^j/(500 n_t) : 0 ≤ j ≤ 17}, corresponding to the grid {500/2^j : 0 ≤ j ≤ 17} over the cost parameter C = 1/(2λn_t) of the R implementation svm from package e1071. ... we have τ ∈ {0.8, 0.9} and V ∈ {5, 10}. ... The kernel parameter is h = 1/2 and the threshold for the ε-insensitive loss is ε = 1/4. ... The Bayes classifier is s : x ↦ I{h(x) ≥ b}, where b = 1.18 and λ = 0.05."
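The quoted grids can be written out explicitly. A small sketch, assuming n = 500 and τ = 0.8 so that n_t = 400 (one of the quoted settings); note that with λ = 2^j/(500 n_t), the cost C = 1/(2λn_t) works out to 500/2^(j+1), one dyadic step off the quoted {500/2^j}, which suggests a factor of 2 may have been lost in extraction:

```python
# Hypothetical reconstruction of the hyperparameter grids in the quote.
n, tau = 500, 0.8
n_t = int(tau * n)                                   # training-set size, 400
lam_grid = [2 ** j / (500 * n_t) for j in range(18)]  # lambda grid, 0 <= j <= 17
C_grid = [1 / (2 * lam * n_t) for lam in lam_grid]    # corresponding SVM costs
print(C_grid[0], C_grid[-1])
```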