A Hierarchical Nearest Neighbour Approach to Contextual Bandits

Authors: Stephen Pasteris, Madeleine Dwyer, Chris Hicks, Vasilios Mavroudis

TMLR 2025

Reproducibility assessment (each variable's result, followed by the LLM's supporting response):
Research Type: Experimental
LLM Response: We also give the results of real-world experiments, empirically demonstrating the performance of our algorithm. In Section 6 we give the results of two real-world experiments comparing the performance of HNN to that of NN and three state-of-the-art algorithms, for the i.i.d. stochastic problem (Figure 3a) and a real-world adversarial problem (Figure 3b).
Researcher Affiliation: Academia
LLM Response: Stephen Pasteris (EMAIL), The Alan Turing Institute, London, UK; Madeleine Dwyer (EMAIL), The Alan Turing Institute, London, UK; Chris Hicks (EMAIL), The Alan Turing Institute, London, UK; Vasilios Mavroudis (EMAIL), The Alan Turing Institute, London, UK
Pseudocode: Yes
LLM Response:
Algorithm: First part of trial t.
1. For all d ∈ [h] ∪ {0}, let s_d be a c-nearest neighbour of t in H_d.
2. If there exists d ∈ [h] ∪ {0} with ∆(s_d, t) ≤ ε, then p_t ← s_d for any such d.
3. If ∆(s_d, t) > ε for all d ∈ [h] ∪ {0}, then:
   (a) δ ← max{d ∈ [h] | ∆(s_d, t) ≤ 1/2^d}.
   (b) If δ = h, then:
       i. h ← h + 1.
       ii. H_h ← ∅.
   (c) H_{δ+1} ← H_{δ+1} ∪ {t}.
   (d) p_t ← s_δ.
Algorithm: Second part of trial t (when using belief propagation).
1. For all a ∈ [K], set φ_{1,a} ← 1.
2. Descend the tree T from 1 (i.e. the root) to t. When at a vertex s ≠ 1, set, for all a ∈ [K]:
   φ_{s,a} ← Σ_{b ∈ [K]} φ_{↑s,b} e_{↑s,b} τ_{b,a}, where ↑s denotes the parent of s.
3. For all a ∈ [K], set π_a ← φ_{t,a} / Σ_{b ∈ [K]} φ_{t,b}.
4. Sample a_t such that, for all a ∈ [K], we have a_t = a with probability π_a.
5. Play a_t.
6. Receive the loss ℓ_{t,a_t}.
7. For all a ∈ [K], set e_{t,a} ← exp(−η ⟦a_t = a⟧ ℓ_{t,a_t}).
8. Ascend the tree T from t to 1 (i.e. the root). When at a vertex s ≠ 1, set, for all a ∈ [K]:
   ϑ_{s,a} ← Σ_{b ∈ [K]} e_{s,b} τ_{b,a} ; e_{↑s,a} ← ϑ_{s,a} e_{↑s,a}.
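The first algorithm above (the hierarchy update) can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it fixes c = 1 (a single nearest neighbour per level), assumes distances bounded by 1, and the class and method names (`HNNHierarchy`, `insert`) are invented for the example.

```python
class HNNHierarchy:
    """Sketch of the per-trial hierarchy update: levels H[0..h] store
    past contexts, with level d searched at resolution 1/2^d."""

    def __init__(self, dist, eps):
        self.dist = dist  # metric on contexts, assumed to map into [0, 1]
        self.eps = eps    # radius below which an old context is reused
        self.h = 0        # current depth of the hierarchy
        self.H = {0: []}  # H[d] holds the contexts stored at level d

    def insert(self, x):
        """Process a new context x and return its parent p_t
        (None for the very first context, which seeds level 0)."""
        if not self.H[0]:
            self.H[0].append(x)
            return None
        # 1. s[d]: a nearest neighbour of x in H[d] (None if H[d] is empty).
        s = {d: min(self.H[d], key=lambda p: self.dist(p, x))
             if self.H[d] else None
             for d in range(self.h + 1)}
        # 2. If some stored context lies within eps of x, reuse it as
        #    the parent and leave the hierarchy unchanged.
        for d in range(self.h + 1):
            if s[d] is not None and self.dist(s[d], x) <= self.eps:
                return s[d]
        # 3a. delta: deepest level whose nearest neighbour is within that
        #     level's resolution 1/2^d. Level 0 is included as a fallback:
        #     distances are at most 1, so d = 0 always qualifies.
        delta = max(d for d in range(self.h + 1)
                    if s[d] is not None and self.dist(s[d], x) <= 0.5 ** d)
        # 3b. If x refines the deepest level, open a new (empty) level.
        if delta == self.h:
            self.h += 1
            self.H[self.h] = []
        # 3c-d. Store x one level below delta; its parent is s[delta].
        self.H[delta + 1].append(x)
        return s[delta]
```

For instance, with `dist=lambda a, b: abs(a - b)` on points in [0, 1] and `eps=0.05`, the first context seeds level 0, a distant context is stored at level 1 with the first as parent, and any later context within `eps` of a stored one simply reuses it.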
Open Source Code: No
LLM Response: The paper does not explicitly provide a link to source code, state that code is available in supplementary materials, or make an unambiguous statement of code release for the methodology described.
Open Datasets: Yes
LLM Response: For our first experiment, we used the UCI ML Firewall dataset int (2019), which is internet firewall data. [...] UCI Machine Learning Repository, 2019. DOI: https://doi.org/10.24432/C5131M. For the second experiment, we used another real-world dataset, using a subset of the CICIDS2017 intrusion dataset Sharafaldin et al. (2018).
Dataset Splits: Yes
LLM Response: For our first experiment, we used the UCI ML Firewall dataset int (2019), which is internet firewall data. We randomly shuffled and ran the dataset multiple times to obtain the average cumulative loss of each algorithm. For the second experiment, we used another real-world dataset, a subset of the CICIDS2017 intrusion dataset Sharafaldin et al. (2018). For this experiment, we used the machine-learning CVE data, which consisted of the network traffic flows. The CICIDS2017 dataset has multiple days of data; on each day there is benign and attack traffic. We took a subset of the Wednesday data, which contained multiple different DoS attacks, and we maintained the temporal order from the original dataset so that the experiment was conducted in the adversarial setting.
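The evaluation protocol described here (repeated random shuffles with averaged cumulative loss for the stochastic experiment; preserved temporal order for the adversarial one) can be sketched as below. The policy interface (`act`/`update`) and all names are assumptions for illustration, not the paper's API.

```python
import random

class ConstantPolicy:
    """Trivial stand-in that always plays action 0; the real experiment
    would plug in HNN, NN, or one of the baseline algorithms here."""
    def act(self, context):
        return 0
    def update(self, context, action, loss):
        pass

def average_cumulative_loss(data, make_policy, runs=10, seed=0):
    """Shuffle-and-average protocol: reshuffle the dataset each run,
    play it through once, and average the cumulative 0/1 loss.
    For the adversarial setting one would instead keep the original
    temporal order and use a single pass (runs=1, no shuffle)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        rows = list(data)
        rng.shuffle(rows)
        policy = make_policy()
        cumulative = 0
        for context, label in rows:
            action = policy.act(context)
            loss = 0 if action == label else 1  # loss 1 iff the action misses the label
            policy.update(context, action, loss)
            cumulative += loss
        totals.append(cumulative)
    return sum(totals) / runs
```

Because the 0/1 loss only depends on how often the played action misses the label, the constant policy's average is just the count of rows with a non-zero label, whatever the shuffle.
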
Hardware Specification: No
LLM Response: The paper does not specify any particular hardware (e.g., CPU, GPU, or TPU models, or detailed cloud/cluster specs) used for running the experiments.

Software Dependencies: No
LLM Response: The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CPLEX 12.4).
Experiment Setup: Yes
LLM Response: The loss was determined by whether the chosen action matched the label from the dataset, with a loss of 1 if it did not, and 0 otherwise. For the distance metric, we took the Euclidean difference of each of the non-categorical features. For the categorical features (the Port features) we used a distance of 0 if the values were identical and 1 otherwise. For each feature, we independently scaled the pairwise differences to the [0, 1/6] range (6 being the total number of features), to ensure that all features contributed equally to the overall distance computation (i.e. we had no weightings across the features). The final distances were therefore constrained to the [0, 1] range. We also compared against NN with binning radii of 0.1 and 0.2.
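The distance construction described above can be sketched as follows. The function name, NumPy layout, and the max-based per-feature scaling are illustrative assumptions (the paper does not say how the scaling constants were obtained); the per-feature cap of 1/k and the 0/1 categorical mismatch follow the description.

```python
import numpy as np

def pairwise_mixed_distances(X, categorical_cols):
    """Pairwise distances for an (n, k) feature matrix: absolute
    (per-feature Euclidean) differences for numeric columns, 0/1
    mismatch for categorical ones, each column's differences scaled
    to [0, 1/k] (k = 6 in the paper), then summed so that the total
    distance lies in [0, 1]."""
    n, k = X.shape
    total = np.zeros((n, n))
    for j in range(k):
        col = X[:, j]
        if j in categorical_cols:
            # categorical feature: distance 0 if identical, 1 otherwise
            diff = (col[:, None] != col[None, :]).astype(float)
        else:
            diff = np.abs(col[:, None] - col[None, :])
        m = diff.max()
        if m > 0:
            diff = diff / m   # scale this feature's differences to [0, 1]
        total += diff / k     # each feature contributes at most 1/k
    return total
```

With equal per-feature caps, no single feature (e.g. a wide-range byte count) can dominate the metric, matching the "no weightings across the features" statement.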