Prediction-Powered Adaptive Shrinkage Estimation

Authors: Sida Li, Nikolaos Ignatiadis

ICML 2025

Reproducibility assessment. Each variable below is listed with its result, followed by the supporting LLM response.
Research Type: Experimental
"Experiments on both synthetic and real-world datasets show that PAS adapts to the reliability of the ML predictions and outperforms traditional and modern baselines in large-scale applications."
Researcher Affiliation: Academia
"1 Data Science Institute, The University of Chicago; 2 Department of Statistics, The University of Chicago. Correspondence to: Sida Li <EMAIL>."
Pseudocode: Yes
"A pseudo-code implementation is also presented in Algorithm 1."
Open Source Code: Yes
"The code for reproducing the experiments is available at https://github.com/listar2000/predictionpowered-adaptive-shrinkage."
Open Datasets: Yes
"Fisch et al. (2024) have shown improvements in estimating the fraction of spiral galaxies using predictions on images from the Galaxy Zoo 2 dataset (Willett et al., 2013)." "Amazon Review Ratings (SNAP, 2014). The Amazon Fine Food Reviews dataset, provided by the Stanford Network Analysis Project (SNAP; SNAP (2014)) on Kaggle."
Dataset Splits: Yes
"For both datasets, we randomly split the data points of each problem (a food product or galaxy subgroup) into a labeled and unlabeled partition with a 20/80 ratio."
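The 20/80 labeled/unlabeled partition quoted above can be sketched as follows. This is an illustrative reconstruction, not code from the paper's repository; the function name and fixed seed are assumptions.

```python
import random

def labeled_unlabeled_split(items, labeled_frac=0.2, seed=0):
    """Randomly partition one problem's data points into a labeled
    and an unlabeled subset (20/80 by default, as described above)."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_labeled = int(round(labeled_frac * len(shuffled)))
    return shuffled[:n_labeled], shuffled[n_labeled:]

# Example: 100 data points for one problem -> 20 labeled, 80 unlabeled.
labeled, unlabeled = labeled_unlabeled_split(range(100))
```

In the paper's setting this split would be applied per problem (each food product or galaxy subgroup), so every problem contributes both labeled and unlabeled points.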
Hardware Specification: Yes
"All the experiments were conducted on a compute cluster with an Intel Xeon Silver 4514Y (16-core) CPU, an Nvidia A100 (80 GB) GPU, and 64 GB of memory."
Software Dependencies: No
The paper mentions software such as Hugging Face's transformers library (Wolf, 2019), the bert-base-multilingual-uncased-sentiment model (Town, 2023), the ResNet50 architecture (He et al., 2016), and the Adam optimizer (Kingma & Ba, 2015). However, specific version numbers for these libraries and frameworks are not provided, which is required for a reproducible description of ancillary software.
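Since the gap flagged here is missing version numbers, a minimal sketch of how exact versions of ancillary packages could be recorded for a reproducibility report (the helper name is illustrative, and the packages queried are examples):

```python
from importlib import metadata

def report_versions(packages):
    """Look up the installed version of each named package so a
    report can cite exact releases; mark absent packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions

# e.g. report_versions(["transformers", "torch"]) in the paper's environment
```

Shipping such a listing (or a pinned requirements file) alongside the code would resolve the "No" result for this variable.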
Experiment Setup: Yes
"We use a batch size of 256 and Adam optimizer (Kingma & Ba, 2015) with a learning rate of 1e-3. After 20 epochs, the model achieves 87% training accuracy and 83% test accuracy."