Poster
Correlation Clustering Beyond the Pivot Algorithm
Soheil Behnezhad · Moses Charikar · Vincent Cohen-Addad · Alma Ghafari · Weiyun ma
East Exhibition Hall A-B #E-2007
We study the classic correlation clustering problem. Given objects and a complete labeling of the object-pairs as either “similar” or “dissimilar”, the goal is to partition the objects into arbitrarily many clusters while minimizing disagreements with the labels.A well‐known method (called Pivot) does a pretty good job: it picks one object at random, gathers all the objects similar to it into one group, removes them, and repeats until everything is grouped. This approach guarantees you won’t make more than three times as many mistakes in expectation. However, that “three times” bound is the best Pivot can do, and in many practical situations one would like to do a better without adding complexity.In our work, we introduce a change to Pivot that we call Modified Pivot. It still runs just as quickly and easily as the original method, but it makes noticeably fewer mistakes in practice. Our experiments show it typically cuts the number of errors compared to Pivot. Even more importantly, we prove there is a small but definite improvement in the worst‐case guarantee: Modified Pivot makes strictly fewer mistakes than three times the optimum, by some fixed amount. This small but real gain carries over to “dynamic” settings, where labels change over time, giving a more accurate grouping.