Skip to yearly menu bar Skip to main content


Poster

Dueling Convex Optimization with General Preferences

Aadirupa Saha · Tomer Koren · Yishay Mansour

West Exhibition Hall B2-B3 #W-914
[ ] [ ]
Wed 16 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract: We address the problem of convex optimization with dueling feedback, where the goal is to minimize a convex function given a weaker form of \emph{dueling} feedback. Each query consists of two points and the dueling feedback returns a (noisy) single-bit binary comparison of the function values of the two queried points.The translation of the function values to the single comparison bit is through a \emph{transfer function}.This problem has been addressed previously for some restricted classes of transfer functions, but here we consider a very general transfer function class which includes all functions that admit a series expansion about the origin.Our main contribution is an efficient algorithm with convergence rate of $O(\epsilon^{-4p})$ for smooth convex functions, and an optimal rate of $\widetilde O(\epsilon^{-2p})$ when the objective is both smooth and strongly convex, where $p$ is the minimal degree (with a non-zero coefficient) in the transfer's series expansion about the origin.

Lay Summary:

We study how to solve optimization problems in situations where we can't directly measure how good each option is, but can only make noisy comparisons between pairs of options—like asking someone which of two choices they prefer, without knowing why. This kind of feedback is called "dueling feedback." While earlier research has only worked for limited types of such comparisons, our work allows for a much broader and more realistic range of comparison styles. We design a new method that can efficiently find the best choice even when these comparisons are noisy and indirect. Our approach works especially well when the problem has a smooth and well-behaved structure, in fact, optimally when the underlying function is smooth and strongly convex. This work pushes the boundaries of optimization with pairwise comparisons, offering a powerful and general solution for real-world decision-making with limited, noisy relative feedback.

Chat is not available.