Poster

Collaborative Mean Estimation Among Heterogeneous Strategic Agents: Individual Rationality, Fairness, and Truthful Contribution

Alex Clinton · Yiding Chen · Jerry Zhu · Kirthevasan Kandasamy

West Exhibition Hall B2-B3 #W-815
Wed 16 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract: We study a collaborative learning problem where $m$ agents aim to estimate a vector $\mu =(\mu_1,\ldots,\mu_d)\in \mathbb{R}^d$ by sampling from associated univariate normal distributions $(\mathcal{N}(\mu_k, \sigma^2))_{k\in[d]}$. Agent $i$ incurs a cost $c_{i,k}$ to sample from $\mathcal{N}(\mu_k, \sigma^2)$. Instead of working independently, agents can exchange data, collecting cheaper samples and sharing them in return for costly data, thereby reducing both costs and estimation error. We design a mechanism to facilitate such collaboration, while addressing two key challenges: ensuring *individually rational (IR) and fair outcomes* so all agents benefit, and *preventing strategic behavior* (e.g. non-collection, data fabrication) to avoid socially undesirable outcomes. We design a mechanism and an associated Nash equilibrium (NE) which minimizes the social penalty (the sum of agents' estimation errors and collection costs) while being IR for all agents. We achieve a $\mathcal{O}(\sqrt{m})$-approximation to the minimum social penalty in the worst case and an $\mathcal{O}(1)$-approximation under favorable conditions. Additionally, we establish three hardness results: no nontrivial mechanism *(i)* guarantees a dominant strategy equilibrium where agents report truthfully, *(ii)* is IR for every strategy profile of other agents, or *(iii)* avoids a worst-case $\Omega(\sqrt{m})$ price of stability in any NE. Finally, by integrating concepts from axiomatic bargaining, we demonstrate that our mechanism supports fairer outcomes than one which minimizes social penalty.
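The benefit of collaboration in this setup follows from a standard fact: the expected squared error of the sample mean of $n$ draws from $\mathcal{N}(\mu_k, \sigma^2)$ is $\sigma^2/n$. The following sketch uses illustrative numbers (not taken from the paper) to show why pooling can make every agent better off, i.e., why an IR outcome can exist:

```python
# Illustrative sketch, not the paper's mechanism. Assumed toy setting:
# m agents estimate one coordinate mu_k, each sample costs c, and each
# agent collects n samples. Alone, an agent's penalty is its expected
# squared error sigma^2/n plus its collection cost n*c. If all m agents
# pool their samples, each keeps the same cost but estimates from m*n
# samples, so the error term shrinks to sigma^2/(m*n).
sigma2, c, n, m = 1.0, 0.1, 10, 5

alone_penalty = sigma2 / n + n * c          # per-agent penalty working alone
pooled_penalty = sigma2 / (m * n) + n * c   # per-agent penalty when pooling

print(alone_penalty, pooled_penalty)  # pooling strictly lowers each penalty
```

Here every agent's penalty drops from 1.1 to 1.02, so participation is individually rational in this toy instance; the paper's mechanism must additionally handle heterogeneous costs $c_{i,k}$ and strategic misreporting.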

Lay Summary:

Machine learning has increased the value of data, which is often costly to collect but easy to share. While most existing data sharing platforms assume honesty, participants may lie about their contributions to gain access to others' data. When participants have different data collection costs, there are two key challenges in designing truthful data sharing algorithms. The first is creating a method to validate the submission of each contributor using the other participants' data. The second is ensuring there is enough data from all participants so that each agent's submission can be sufficiently validated against the others, without compromising on efficiency. We address these problems in two parts. First, we determine how to fairly divide the work of data collection. Second, we reward participants based on the quality of the data they submitted. For each participant we compare the mean of their data to the mean of the others' data. Instead of returning the others' data to them, we first corrupt it based on the difference of the means. If a participant wants to receive the others' data with minimal corruption, it is in their best interest to collect a sufficient amount of data and share it truthfully.
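The corruption step described above can be sketched as follows. This is an illustrative simplification, not the paper's exact mechanism: the function `corrupted_return`, the corruption rule (Gaussian noise whose scale is proportional to the gap between means), and the scaling parameter `alpha` are all assumptions made for the example.

```python
import numpy as np

def corrupted_return(own_samples, others_samples, alpha=1.0, seed=None):
    """Sketch of the validation idea: return the other agents' data,
    corrupted by noise that grows with the gap between the submitting
    agent's sample mean and the others' sample mean. (Hypothetical
    rule; the paper's mechanism is more involved.)"""
    rng = np.random.default_rng(seed)
    gap = abs(np.mean(own_samples) - np.mean(others_samples))
    noise_scale = alpha * gap  # small gap -> nearly clean data back
    noise = rng.normal(0.0, noise_scale, size=len(others_samples))
    return others_samples + noise

# A truthful agent whose mean is close to the others' receives data
# with little corruption; a fabricated submission far from the truth
# yields a large gap and hence heavily corrupted data in return.
rng = np.random.default_rng(0)
others = rng.normal(5.0, 1.0, 100)
honest = rng.normal(5.0, 1.0, 100)          # samples from the true distribution
fabricated = np.full(100, 50.0)             # made-up data, far from the truth

clean = corrupted_return(honest, others, seed=1)
noisy = corrupted_return(fabricated, others, seed=1)
```

With this rule, the honest agent's returned data stays close to the original pool while the fabricator's copy is badly degraded, which is the incentive at the heart of the lay summary: collecting enough real data and reporting it truthfully is the cheapest way to get useful data back.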
