Targeted Active Learning for Bayesian Decision-Making
Authors: Louis Filstroff, Iiris Sundin, Petrus Mikkola, Aleksei Tiulpin, Juuso Kylmäoja, Samuel Kaski
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our targeted active learning strategy to existing alternatives on both simulated and real data and show improved performance in decision-making accuracy. [...] We empirically demonstrate the advantages of the proposed method with respect to existing AL baselines, both in simulated and real-world experiments. |
| Researcher Affiliation | Academia | Louis Filstroff (Univ. Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France); Iiris Sundin (Department of Computer Science, Aalto University, Finland); Petrus Mikkola (Department of Computer Science, Aalto University, Finland; University of Helsinki, Finland); Aleksei Tiulpin (Research Unit of Health Sciences and Technology, University of Oulu, Finland); Juuso Kylmäoja (Department of Computer Science, Aalto University, Finland); Samuel Kaski (Department of Computer Science, Aalto University, Finland) |
| Pseudocode | Yes | Algorithm 1: Estimating the criterion Eq. (10) for (x_j, d_j) ∈ U |
| Open Source Code | No | The paper states that the "Python implementation is carried out with the framework GPy⁴ (open-source, under BSD licence)", but this refers to a third-party framework, not to a release of the authors' own implementation of the proposed method. |
| Open Datasets | Yes | IHDP dataset² (Hill, 2011), a semi-synthetic dataset which consists of 747 patients with 25 covariates. ... ² Available online as part of the supplementary material of Hill (2011). ... Osteoarthritis Initiative (OAI) database³ ... ³ https://nda.nih.gov/oai/ |
| Dataset Splits | No | The paper states that "each considered dataset is randomly split into a training set D, query set U, and a test set." and that "Experiments are run with a starting training set of size 100 for the synthetic dataset and the OAI dataset, and of size 50 for the IHDP dataset." It also mentions a test population of N_t = 50 points. However, it gives neither the size of the query set (as a percentage or an absolute count) nor the total size of every dataset, so the split proportions cannot be deduced and the data partitioning cannot be fully reproduced. |
| Hardware Specification | No | The paper states only that "All experiments were run on a high-performance computing cluster." This is too general: it gives no GPU or CPU models, memory sizes, or other hardware details. |
| Software Dependencies | Yes | "Python implementation is carried out with the framework GPy⁴ (open-source, under BSD licence)." |
| Experiment Setup | No | The paper states that "GP hyperparameters (variance, lengthscales), as well as the noise variance, are estimated with maximum marginal likelihood." This describes how the hyperparameters are found, but it reports neither their fitted values nor other training-related settings such as learning rates, batch sizes, or number of epochs. |
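
The Dataset Splits row notes that the random partition into a training set D, a query set U, and a test set cannot be reproduced from the paper alone. As a minimal sketch of what a fully specified split could look like, the Python snippet below draws a seeded random partition. The function name `split_indices`, the seed value, and the choice to place every remaining point in the query set are assumptions made for illustration; none of them comes from the paper. The sizes in the usage lines pair the 747 IHDP patients with the reported starting training set of 50 and a test population of 50, and that pairing is itself an assumption.

```python
import random


def split_indices(n_total, n_train, n_test, seed=0):
    """Randomly partition range(n_total) into train, query, and test index lists.

    Assumption (not from the paper): the query set is everything left over
    after the training and test sets have been drawn.
    """
    if n_train + n_test > n_total:
        raise ValueError("n_train + n_test exceeds n_total")
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    idx = list(range(n_total))
    rng.shuffle(idx)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    query = idx[n_train + n_test:]
    return train, query, test


# Illustrative sizes: 747 points, 50 for training, 50 for testing.
train, query, test = split_indices(n_total=747, n_train=50, n_test=50, seed=0)
print(len(train), len(query), len(test))  # 50 647 50
```

Reporting the seed together with the three set sizes would be enough for a reader to regenerate the same partition.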