Poster
in
Workshop: 2nd Workshop on Models of Human Feedback for AI Alignment (MoFA)

Selective Preference Aggregation

Shreyas Kadekodi · Hayden McTavish · Berk Ustun


Abstract: Many applications in machine learning and decision-making rely on procedures to aggregate the preferences of individuals, from voting to search to alignment. In this paper, we introduce a paradigm for selective aggregation, where we can either abstain from comparison or arbitrate dissent. Given a dataset of individual preferences, we summarize collective preferences as a selective ranking: a partial order that only allows comparisons for items on which at least a $1 - \tau$ proportion of individuals agree. We develop fast algorithms to construct selective rankings that achieve all possible trade-offs between comparability and dissent, paired with practical guarantees to ensure safety and reliability. We conduct extensive experiments to benchmark our approach on real-world datasets for ranking and learning. Our results demonstrate how selective rankings can promote transparency, robustness, and fairness by revealing disagreement and abstaining from arbitration.
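To make the thresholding idea concrete, here is a minimal sketch of a pairwise-threshold rule: a comparison between two items is kept only when at least a $1 - \tau$ fraction of individuals agree on their order, and is abstained from otherwise. This is an illustrative brute-force construction, not the authors' fast algorithms, and the function name and input format are assumptions for the example.

```python
from itertools import combinations

def selective_comparisons(prefs, tau):
    """Sketch of a pairwise-threshold rule for selective aggregation.

    prefs: list of individual rankings, each a list of the same items
           ordered from most to least preferred.
    tau:   maximum tolerated dissent, in [0, 1].

    Returns the set of ordered pairs (a, b) meaning "a is preferred to b"
    by at least a 1 - tau fraction of individuals; all other pairs are
    left incomparable (abstention).
    """
    n = len(prefs)
    items = prefs[0]
    kept = set()
    for a, b in combinations(items, 2):
        a_over_b = sum(r.index(a) < r.index(b) for r in prefs)
        if a_over_b / n >= 1 - tau:
            kept.add((a, b))       # enough agreement that a beats b
        elif (n - a_over_b) / n >= 1 - tau:
            kept.add((b, a))       # enough agreement that b beats a
        # otherwise: dissent exceeds tau, so abstain from comparing a and b
    return kept

# With tau = 0 (unanimity), only comparisons every individual agrees on survive.
prefs = [["x", "y", "z"], ["x", "z", "y"], ["x", "y", "z"]]
print(selective_comparisons(prefs, 0.0))  # {("x", "y"), ("x", "z")}
```

In this toy example all three individuals rank `x` first, so `x`'s comparisons survive at $\tau = 0$, while `y` versus `z` is contested (2 of 3 individuals) and is abstained from. Raising $\tau$ trades off this abstention for more comparability.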