sunny-as2: Enhancing SUNNY for Algorithm Selection

Authors: Tong Liu, Roberto Amadini, Maurizio Gabbrielli, Jacopo Mauro

JAIR 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "In this work we present the technical advancements of sunny-as2, detailing them through several empirical evaluations and providing new insights. We performed a considerable number of experiments to understand the impact of the new technical improvements. Section 5 describes the experiments over different configurations of sunny-as2, while Section 6 provides more insights on the SUNNY algorithm, including a comparison with other AS approaches."
Researcher Affiliation: Academia — Tong Liu (EMAIL), Faculty of Computer Science, Free University of Bozen-Bolzano, Italy; Roberto Amadini (EMAIL) and Maurizio Gabbrielli (EMAIL), Department of Computer Science and Engineering, University of Bologna, Italy; Jacopo Mauro (EMAIL), Department of Mathematics and Computer Science, University of Southern Denmark, Denmark.
Pseudocode: Yes — Algorithm 1 shows through pseudocode how sunny-as2-fk selects the features and the k-value.

Algorithm 1 Configuration procedure of sunny-as2-fk.
 1: function learnFK(A, λ, I, maxK, F, maxF)
 2:   bestF ← ∅
 3:   bestK ← 1
 4:   bestScore ← −∞
 5:   while |bestF| < maxF do
 6:     currScore ← −∞
 7:     for f ∈ F do
 8:       currFeatures ← bestF ∪ {f}
 9:       for k ← 1, ..., maxK do
10:         tmpScore ← getScore(A, λ, I, k, currFeatures)
11:         if tmpScore > currScore then
12:           currScore ← tmpScore
13:           currFeat ← f
14:           currK ← k
15:         end if
16:       end for
17:     end for
18:     if currScore ≤ bestScore then   ▷ Cannot improve the best score
19:       break
20:     end if
21:     bestScore ← currScore
22:     bestF ← bestF ∪ {currFeat}
23:     bestK ← currK
24:     F ← F \ {currFeat}
25:   end while
26:   return bestF, bestK
27: end function
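The greedy loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: `get_score` is a hypothetical stand-in for SUNNY's cross-validated scoring of a (k, feature-subset) pair, and the algorithm's A, λ, I arguments are folded into it for brevity.

```python
import math

def learn_fk(features, max_k, max_f, get_score):
    """Greedy forward selection of a feature subset and neighborhood size k,
    mirroring the structure of Algorithm 1. `get_score(k, feature_set)` is a
    hypothetical stand-in for SUNNY's validation score (higher is better)."""
    best_f, best_k, best_score = set(), 1, -math.inf
    remaining = set(features)
    while len(best_f) < max_f and remaining:
        curr_score, curr_feat, curr_k = -math.inf, None, None
        for f in remaining:                       # try adding each unused feature
            candidate = best_f | {f}
            for k in range(1, max_k + 1):         # try every neighborhood size
                tmp = get_score(k, candidate)
                if tmp > curr_score:
                    curr_score, curr_feat, curr_k = tmp, f, k
        if curr_score <= best_score:              # cannot improve the best score
            break
        best_score, best_k = curr_score, curr_k
        best_f.add(curr_feat)
        remaining.discard(curr_feat)
    return best_f, best_k
```

The early `break` is what keeps the search cheap: as soon as adding any remaining feature fails to improve the best score, the procedure stops instead of exhausting all `max_f` slots.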
Open Source Code: No — The paper does not provide concrete access to source code for the sunny-as2 methodology. While it references the publicly available ASlib scenarios at 'https://github.com/coseal/aslib_data' and the AutoFolio repository at 'https://github.com/mlindauer/AutoFolio', these cover datasets and third-party tools, not the authors' own implementation of sunny-as2.
Open Datasets: Yes — "To address this problem, the Algorithm Selection library (ASlib) (Bischl et al., 2016) has been proposed. ASlib consists of scenarios collected from a broad range of domains, aiming to enable an across-the-board comparison of different AS techniques on the same ground. The ASlib scenarios are publicly available at https://github.com/coseal/aslib_data."
Dataset Splits: Yes — "To evaluate the performance of our algorithm selector while avoiding overfitting, and to obtain more robust and rigorous results, in this work we adopted a repeated nested cross-validation approach (Loughrey & Cunningham, 2005). A nested cross-validation consists of two CVs: an outer CV which forms training-test pairs, and an inner CV applied on the training sets to learn a model that is later assessed on the outer test sets. The original dataset is split into five folds, obtaining five pairs (T_1, S_1), ..., (T_5, S_5), where the T_i are the outer training sets and the S_i are the (outer) test sets, for i = 1, ..., 5. For each T_i we then perform an inner 10-fold CV to get a suitable parameter setting: we split each T_i into ten sub-folds T_{i,1}, ..., T_{i,10}, and in turn, for j = 1, ..., 10, we use sub-fold T_{i,j} as the validation set to assess the parameter setting computed on the inner training set ∪_{k≠j} T_{i,k}, i.e., the union of the other nine sub-folds. We then select, among the 10 configurations obtained, the one for which SUNNY achieves the best PAR10 score on the corresponding validation set. The selected configuration is used to run SUNNY on the paired test set S_i. Finally, to reduce variability and increase the robustness of our approach, we repeated the whole process five times using different random partitions."
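The split scheme above (5 outer folds, an inner 10-fold CV per outer training set, the whole process repeated with fresh random partitions) can be sketched as follows. This is a structural illustration only: `tune` and `evaluate` are hypothetical stand-ins for sunny-as2's parameter search and its PAR10 scoring (lower PAR10 is better).

```python
import random

def k_folds(items, k, rng):
    """Shuffle and partition items into k roughly equal folds."""
    items = list(items)
    rng.shuffle(items)
    return [items[i::k] for i in range(k)]

def repeated_nested_cv(instances, tune, evaluate, repeats=5, seed=100):
    """Sketch of the repeated nested CV described above. For each outer
    training set T_i, each inner split yields one tuned configuration; the
    one with the best (lowest) validation PAR10 is then run on the paired
    outer test set S_i."""
    results = []
    for r in range(repeats):
        rng = random.Random(seed + r)             # a fresh random partition per repeat
        outer = k_folds(instances, 5, rng)
        for i, s_i in enumerate(outer):           # S_i: outer test set
            t_i = [x for j, f in enumerate(outer) if j != i for x in f]
            inner = k_folds(t_i, 10, rng)
            candidates = []
            for j, valid in enumerate(inner):     # T_{i,j}: validation sub-fold
                inner_train = [x for l, f in enumerate(inner) if l != j for x in f]
                cfg = tune(inner_train)           # parameter setting from the inner training set
                candidates.append((evaluate(cfg, inner_train, valid), cfg))
            best_cfg = min(candidates, key=lambda c: c[0])[1]
            results.append(evaluate(best_cfg, t_i, s_i))
    return results
```

With 5 repeats, 5 outer folds, and 10 inner folds, `tune` runs 250 times, which is why the paper caps the training time of each configuration search.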
Hardware Specification: Yes — "All the experiments were conducted on Linux machines equipped with Intel Core i5 3.30 GHz processors and 8 GB of RAM."
Software Dependencies: No — The paper mentions implementing a baseline with Scikit-learn (Pedregosa et al., 2011) but does not specify a version number for Scikit-learn or for any other software used by sunny-as2 itself. Other software mentioned (CPLEX, Gecode, Choco) appears in the related-work section and is not part of the authors' implementation, so no specific versions are given.
Experiment Setup: Yes — "The default values of these parameters were decided by conducting an extensive set of manual experiments over ASlib scenarios, with the goal of reaching a good trade-off between the performance and the time needed for the training phase (i.e., at most one day)."
- split mode: the way of creating validation folds for the inner CV; one of random, rank, and stratified split. Default: rank.
- training instances limit: the maximum number of instances used for training. Default: 700.
- feature limit: the maximum number of features for feature selection, used by sunny-as2-f and sunny-as2-fk. Default: 5.
- k range: the range of neighborhood sizes used by both sunny-as2-k and sunny-as2-fk. Default: [1, 30].
- schedule limit for training (λ): the limit of the schedule size for greedy-SUNNY. Default: 3.
- seed: the seed used to split the training set into folds. Default: 100.
- time cap: the time cap used by sunny-as2-f and sunny-as2-fk to perform the training. Default: 24 h.
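For reference, the defaults listed above can be collected into a single configuration mapping. The key names below are illustrative only (they are not sunny-as2's actual command-line flags); the values are the defaults reported in the paper.

```python
# Hypothetical configuration mapping mirroring the reported sunny-as2 defaults.
# Key names are illustrative, not the tool's actual option names.
SUNNY_AS2_DEFAULTS = {
    "split_mode": "rank",             # inner-CV fold creation: random | rank | stratified
    "training_instances_limit": 700,  # max instances used for training
    "feature_limit": 5,               # max features (sunny-as2-f, sunny-as2-fk)
    "k_range": (1, 30),               # neighborhood sizes (sunny-as2-k, sunny-as2-fk)
    "schedule_limit": 3,              # λ: schedule size for greedy-SUNNY
    "seed": 100,                      # seed for splitting the training set into folds
    "time_cap_hours": 24,             # training time cap (sunny-as2-f, sunny-as2-fk)
}
```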