

Poster

Learning With Multi-Group Guarantees For Clusterable Subpopulations

Jessica Dai · Nika Haghtalab · Eric Zhao

West Exhibition Hall B2-B3 #W-1017
Thu 17 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract: A canonical desideratum for prediction problems is that performance guarantees should hold not just on average over the population, but also for meaningful subpopulations within the overall population. But what constitutes a meaningful subpopulation? In this work, we take the perspective that relevant subpopulations should be defined with respect to the clusters that naturally emerge from the distribution of individuals for which predictions are being made. In this view, a population refers to a mixture model whose components constitute the relevant subpopulations. We suggest two formalisms for capturing per-subgroup guarantees: first, by attributing each individual to the component from which they were most likely drawn, given their features; and second, by attributing each individual to all components in proportion to their relative likelihood of having been drawn from each component. Using online calibration as a case study, we study a multi-objective algorithm that provides guarantees for each of these formalisms by handling all plausible underlying subpopulation structures simultaneously, achieving an $O(T^{1/2})$ rate even when the subpopulations are not well-separated. In comparison, the more natural cluster-then-predict approach that first recovers the structure of the subpopulations and then makes predictions suffers from an $O(T^{2/3})$ rate and requires the subpopulations to be separable. Along the way, we prove that providing per-subgroup calibration guarantees for underlying clusters can be easier than learning the clusters: separation between median subgroup features is required for the latter but not the former.
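To make the two attribution formalisms concrete, here is a minimal sketch for a one-dimensional, two-component Gaussian mixture. This is an illustration of the attribution rules only, not the paper's calibration algorithm; the function names and the Gaussian parameterization are our own assumptions.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def soft_attribution(x, weights, mus, sigmas):
    """Second formalism: attribute x to every component in proportion to the
    posterior probability that x was drawn from it."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
    total = sum(joint)
    return [j / total for j in joint]

def hard_attribution(x, weights, mus, sigmas):
    """First formalism: attribute x to the single most likely component."""
    posterior = soft_attribution(x, weights, mus, sigmas)
    return max(range(len(posterior)), key=posterior.__getitem__)

# Example: an equal-weight mixture of N(0, 1) and N(4, 1).
# A point at x = 1 is attributed entirely to component 0 under the hard
# rule, while the soft rule splits its attribution across both components.
print(hard_attribution(1.0, [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]))
print(soft_attribution(1.0, [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]))
```

When the components overlap heavily, the soft posteriors stay far from 0 and 1, which is one intuition for why per-component guarantees need not require the separation that cluster recovery does.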

Lay Summary:

One common goal for designing algorithms that make predictions is for those algorithms to perform well not just on average over the whole population, but also for meaningful subsets of that population. For example, the algorithm should be equally accurate across gender, race, and their intersections. In this work, we explore an alternative way to define subgroups from a population. Rather than using static, pre-defined identity categories (like gender and race), can we provide performance guarantees on subgroups that we learn directly from the relevant data? We study this question from a theoretical perspective and answer in the affirmative, exhibiting an algorithmic strategy that, surprisingly, can perform almost as well as an algorithm given the groups in advance.
