Poster
Prediction-Powered Adaptive Shrinkage Estimation
Sida Li · Nikolaos Ignatiadis
East Exhibition Hall A-B #E-2210
Scientists and data analysts often face a common challenge: they have a lot of data features (like images of galaxies or product details) but only a small amount of reliable, "gold-standard" labeled information (like which galaxies have spirals or actual user ratings). This scarcity makes it hard to answer many related questions accurately, such as finding the fraction of spiral galaxies in different clusters or the average ratings for many different products.

We developed a new statistical method called Prediction-Powered Adaptive Shrinkage (PAS). PAS cleverly combines these limited gold-standard labels with predictions from modern machine learning (ML) models. First, for each specific question (e.g., for one galaxy cluster), it uses the ML predictions to make initial estimates more precise. Then, it "borrows strength" across all the different questions by using these same ML predictions as a common reference point, intelligently adjusting how much to rely on them based on their estimated quality.

Our method, PAS, allows researchers to get more accurate answers even when high-quality labeled data is scarce for each individual question. It automatically adapts to how good the ML predictions are, outperforming existing approaches in diverse real-world scenarios, from astronomy to analyzing customer reviews. This helps extract more reliable insights from complex datasets with many parallel questions.
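
To make the two steps concrete, here is a rough Python sketch for a single question (say, one galaxy cluster). It is a simplified illustration under stated assumptions, not the PAS procedure itself: the function names are hypothetical, the first step is the standard prediction-powered mean estimate (prediction average plus a label-based bias correction), and the second step uses a generic shrinkage weight with a user-supplied `tau2` (an assumed measure of how far the ML predictions tend to be from the truth). How the method actually chooses that weight from the data across questions is not reproduced here.

```python
from statistics import mean, variance


def ppi_estimate(y_labeled, f_labeled, f_unlabeled):
    """Step 1 (sketch): average ML prediction on unlabeled data,
    corrected by the average prediction error seen on labeled data."""
    correction = mean(y - f for y, f in zip(y_labeled, f_labeled))
    return mean(f_unlabeled) + correction


def ppi_variance(y_labeled, f_labeled, f_unlabeled):
    """Plug-in variance of the estimate above (needs >= 2 points per sample)."""
    residuals = [y - f for y, f in zip(y_labeled, f_labeled)]
    return (variance(residuals) / len(residuals)
            + variance(f_unlabeled) / len(f_unlabeled))


def shrink_toward_prediction(estimate, est_var, prediction_mean, tau2):
    """Step 2 (sketch): pull the estimate toward the ML prediction mean.
    tau2 is an ASSUMED input; small tau2 means "trust the predictions more".
    Requires est_var > 0."""
    weight = tau2 / (tau2 + est_var)  # weight kept on the data-driven estimate
    return weight * estimate + (1 - weight) * prediction_mean


# Toy data for one question: 4 gold-standard labels, 4 unlabeled items.
y_labeled = [1, 0, 1, 1]            # true labels
f_labeled = [0.8, 0.2, 0.6, 0.9]    # ML predictions on the labeled items
f_unlabeled = [0.7, 0.5, 0.9, 0.3]  # ML predictions on unlabeled items

estimate = ppi_estimate(y_labeled, f_labeled, f_unlabeled)
est_var = ppi_variance(y_labeled, f_labeled, f_unlabeled)
shrunk = shrink_toward_prediction(estimate, est_var, mean(f_unlabeled), tau2=0.01)
print(estimate, est_var, shrunk)
```

With a small `tau2` the result sits close to the prediction average; with a large `tau2` it stays close to the label-corrected estimate. The adaptive part of the method is about setting that trade-off automatically from the estimated quality of the predictions rather than by hand.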