Bayesian Quantification with Black-Box Estimators

Authors: Albert Ziegler, Paweł Czyż

TMLR 2024

Reproducibility assessment (variable, result, and supporting excerpt):
Research Type: Experimental — "We compare the introduced model against the established point estimators in a variety of scenarios, and show it is competitive with, and in some cases superior to, the non-Bayesian alternatives." (Section 4, Experimental results)
Researcher Affiliation: Collaboration — Albert Ziegler (EMAIL; XBOW, Head of AI, Uppsala, Sweden) and Paweł Czyż (EMAIL; ETH AI Center and Department of Biosystems Science and Engineering, ETH Zürich, Zürich, Switzerland)
Pseudocode: No — The paper describes algorithms such as Expectation-Maximization and the Gibbs sampler in prose, and refers to the NUTS algorithm, but does not present them in a structured pseudocode or algorithm block.
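To make the prose-only Expectation-Maximization procedure concrete, here is a minimal sketch (not taken from the paper; the function name and defaults are illustrative) of EM for estimating mixture proportions π from discrete black-box outputs, assuming the matrix P(C | Y) is known:

```python
import numpy as np

def em_prevalence(phi, counts, n_iter=1000, tol=1e-10):
    """EM for the prevalence vector pi, given phi[y, c] = P(C = c | Y = y)
    and observed counts of each black-box output C on the unlabeled set."""
    L, K = phi.shape
    pi = np.full(L, 1.0 / L)  # start from the uniform prevalence
    total = counts.sum()
    for _ in range(n_iter):
        # E-step: responsibilities P(Y = y | C = c) under the current pi
        joint = pi[:, None] * phi          # shape (L, K)
        resp = joint / joint.sum(axis=0)   # normalize over y
        # M-step: re-estimate pi from the expected label counts
        new_pi = (resp * counts).sum(axis=1) / total
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi
        pi = new_pi
    return pi
```

With an invertible phi and exact expected counts, the estimate recovers the true prevalence; on finite samples it returns the maximum-likelihood proportions.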
Open Source Code: Yes — "The code and workflows used to run the experiments and generate the figures are available in the https://github.com/pawel-czyz/labelshift repository."
Open Datasets: Yes — "Darmanis et al. (2017) collected biopsy specimens from four glioblastoma multiforme tumors corresponding to four different populations of cells. [...] We downloaded the TPM-normalized (Zhao et al., 2021) data sequenced by Darmanis et al. (2017) from the Curated Cancer Cell Atlas."
Dataset Splits: Yes — "We fix the data set sizes N = 10³ and N′ = 500 and use L = K = 5 as a default setting. [...] We consider a semi-realistic scenario in which one wants to estimate cell prevalence in an automated fashion employing a given black-box cell type classifier. We treat the first two samples as an auxiliary cell atlas on which a generic black-box cell type classifier was trained (we use a random forest), the third sample as an available labeled data set, {(X_i, Y_i)}, and the fourth sample as an unlabeled data set, {X_j}, for the quantification problem."
Hardware Specification: Yes — "Experiments described in Appendices E.1, E.4, and E.5 were run on a laptop with 32 GiB RAM and 16 CPU cores clocked at 4680 MHz and finished under six hours. Experiments described in Appendices E.2 and E.3 [...] We ran them sequentially on a cluster equipped with 384 GiB RAM and 128 CPU cores clocked at 2.25–3.7 GHz."
Software Dependencies: Yes — "As a random forest we used the scikit-learn implementation (Pedregosa et al., 2011, v. 1.4.1) with default hyperparameters and 20 trees."
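The black-box classifier setup can be sketched as follows; this is an illustrative stand-in using synthetic data (the paper trains on cell-atlas data), with only the reported choices — scikit-learn's random forest, 20 trees, otherwise default hyperparameters — taken from the excerpt above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the auxiliary training data (cell atlas in the paper)
X_train = rng.normal(size=(200, 4))
y_train = (X_train[:, 0] > 0).astype(int)

# Black-box classifier: 20 trees, default hyperparameters, as reported
clf = RandomForestClassifier(n_estimators=20, random_state=0)
clf.fit(X_train, y_train)

# Discrete outputs C on the unlabeled set, which the quantifier consumes
X_unlabeled = rng.normal(size=(50, 4))
hard_labels = clf.predict(X_unlabeled)
```

The quantification step then only sees the counts of each predicted class, which is what makes the estimator "black-box": no probabilities or internals of the forest are required.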
Experiment Setup: Yes — "We fix the data set sizes N = 10³ and N′ = 500 and use L = K = 5 as a default setting. The ground-truth prevalence vectors are parametrized as π = (1/L, ..., 1/L) and π^(r) = (r, (1−r)/(L−1), ..., (1−r)/(L−1)). By default, we use r = 0.7. The ground-truth matrix P(C | Y) is parameterized as φ_yy = q and φ_yk = (1−q)/(K−1) for k ≠ y and K = L, with the default value q = 0.85. [...] For each simulated data set, we ran four Markov chains with 500 warm-up steps and 1000 samples each using the NUTS algorithm of Hoffman & Gelman (2014)."
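The ground-truth parametrization above can be written out directly; this is a minimal NumPy sketch (the function name is illustrative, and the NUTS sampling step is omitted), with defaults r = 0.7 and q = 0.85 as in the excerpt:

```python
import numpy as np

def ground_truth_params(L=5, K=5, r=0.7, q=0.85):
    """Ground-truth prevalence vector pi^(r) and matrix P(C | Y)
    as parametrized in the experiment setup."""
    # pi^(r): mass r on the first class, the remainder shared equally
    pi = np.full(L, (1 - r) / (L - 1))
    pi[0] = r
    # phi[y, c] = P(C = c | Y = y): q on the diagonal, rest shared equally
    phi = np.full((L, K), (1 - q) / (K - 1))
    np.fill_diagonal(phi, q)
    return pi, phi
```

Setting r = 1/L recovers the uniform vector π = (1/L, ..., 1/L); larger q makes the black-box outputs more informative about the true labels.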