Skip to yearly menu bar Skip to main content


Poster
in
Workshop: The 1st Workshop on Vector Databases

Vector Data Search with Sorting Transformation

Hongzhi Wang · Tanveer Syeda-Mahmood

[ ] [ Project Page ]
Fri 18 Jul 1:45 p.m. PDT — 3 p.m. PDT

Abstract:

For vector data, a sorting transformation sorts the elements of a vector by permuting the locations of its elements. It can be shown that among all permutation transformations, the sorting transformation minimizes L2 distance and maximizes similarity measures such as cosine similarity and Pearson correlation for vector data. Applying sorting transformation with vector/product quantization can substantially reduce compression errors when the same codebook size is applied, with a mild storage overhead for saving the sorting permutations for each compressed vector. For nearest neighbor search, the sorting transformation may produce false positive nearest neighbors when directly applied with the product quantization search algorithm. This problem was initially addressed via a re-ranking approach. In this work, we gave more in depth analysis to show that the re-ranking approach may be inadequate, especially for evenly distributed data. To address this problem, we adapted the lookup table approach for nearest neighbor search using sorting transformation based product quantization and showed its effectiveness on both real and simulated data.

Chat is not available.