Prediction-Powered Adaptive Shrinkage Estimation
Authors: Sida Li, Nikolaos Ignatiadis
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both synthetic and real-world datasets show that PAS adapts to the reliability of the ML predictions and outperforms traditional and modern baselines in large-scale applications. We conduct extensive experiments on both synthetic and real-world datasets. |
| Researcher Affiliation | Academia | 1Data Science Institute, The University of Chicago 2Department of Statistics, The University of Chicago. Correspondence to: Sida Li <EMAIL>. |
| Pseudocode | Yes | A pseudo-code implementation is also presented in Algorithm 1. |
| Open Source Code | Yes | The code for reproducing the experiments is available at https://github.com/listar2000/predictionpowered-adaptive-shrinkage. |
| Open Datasets | Yes | Fisch et al. (2024) have shown improvements in estimating the fraction of spiral galaxies using predictions on images from the Galaxy Zoo 2 dataset (Willett et al., 2013). Amazon Review Ratings (SNAP, 2014). The Amazon Fine Food Reviews dataset, provided by the Stanford Network Analysis Project (SNAP; SNAP (2014)) on Kaggle. |
| Dataset Splits | Yes | For both datasets, we randomly split the data points for each problem (a food product or galaxy subgroup) into a labeled and an unlabeled partition with a 20/80 split ratio. |
| Hardware Specification | Yes | All the experiments were conducted on a compute cluster with Intel Xeon Silver 4514Y (16 cores) CPU, Nvidia A100 (80GB) GPU, and 64GB of memory. |
| Software Dependencies | No | The paper mentions software such as 'Hugging Face's transformers library (Wolf, 2019)', the 'bert-base-multilingual-uncased-sentiment model (Town, 2023)', the 'ResNet50 architecture (He et al., 2016)', and the 'Adam optimizer (Kingma & Ba, 2015)'. However, specific version numbers for these libraries and frameworks are not provided, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | We use a batch size of 256 and Adam optimizer (Kingma & Ba, 2015) with a learning rate of 1e-3. After 20 epochs, the model achieves 87% training accuracy and 83% test accuracy. |
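The 20/80 labeled/unlabeled partitioning quoted in the table is easy to replicate; below is a minimal sketch in Python. The function name, the fixed random seed, and the rounding rule are our own illustrative assumptions and are not taken from the paper's released code.

```python
import random

def split_labeled_unlabeled(items, labeled_frac=0.2, seed=0):
    # Shuffle a copy so the caller's sequence is untouched,
    # then cut at the labeled fraction (20% by default).
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_labeled = int(round(labeled_frac * len(shuffled)))
    return shuffled[:n_labeled], shuffled[n_labeled:]

# One "problem" (e.g. a food product or galaxy subgroup) with 100 data points.
labeled, unlabeled = split_labeled_unlabeled(range(100))
print(len(labeled), len(unlabeled))  # 20 80
```

In the paper's setup this split would be applied independently to each problem before the ML predictions are used on the unlabeled partition.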