Poster
Dimensionality Reduction on Complex Vector Spaces for Euclidean Distance with Dynamic Weights
Simone Moretti · Paolo Pellizzoni · Francesco Silvestri
West Exhibition Hall B2-B3 #W-1009
Machine learning models typically work with high-dimensional data, such as a document represented by thousands of words or a user profile described by hundreds of preferences. To make computations faster and more efficient, researchers use dimensionality reduction: a technique that compresses data into a smaller number of dimensions while preserving important information, such as the distances between data points.

In many real-world applications, not all features (i.e., dimensions) are equally important. For instance, in recommendation systems, some words in a document carry more weight than others. If the importance of each feature is known beforehand, it can be accounted for during dimensionality reduction. But what happens if we only learn which features are important after the data has already been compressed?

This paper addresses that challenge. It introduces a novel method that reduces the dimensionality of data in a way that is agnostic to future feature importance, yet still allows accurate distance estimation once those weights become known. To do this, the paper leverages complex numbers (i.e., numbers involving the square root of -1) as a mathematical tool. The proposed method compresses the original data into a complex vector space using a linear function, making it efficient and applicable at scale. Once the feature-importance weights are revealed, the method applies a special function to the compressed data to recover accurate estimates of weighted distances. This work opens the door to faster, more flexible machine learning systems, especially in settings where priorities change dynamically, such as personalized recommendations or real-time data analysis.
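The contrast the abstract draws can be illustrated with a classical real-valued random projection. The sketch below is a plain Johnson-Lindenstrauss-style Gaussian projection, not the paper's complex-valued construction: it shows that unweighted Euclidean distances survive compression, and that weights can be folded in by pre-scaling features only when they are known before compression. All dimensions, weights, and variable names here are illustrative.

```python
import math
import random

random.seed(0)
d, k = 1000, 200  # original and reduced dimensions (illustrative values)

# Random Gaussian projection matrix, scaled by 1/sqrt(k) so that
# Euclidean distances are preserved in expectation.
R = [[random.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]

def project(x):
    """Compress a d-dimensional vector to k dimensions with the linear map R."""
    return [sum(row[i] * x[i] for i in range(d)) for row in R]

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

x = [random.gauss(0.0, 1.0) for _ in range(d)]
y = [random.gauss(0.0, 1.0) for _ in range(d)]

# Unweighted distance is approximately preserved (Johnson-Lindenstrauss).
true_d = dist(x, y)
est_d = dist(project(x), project(y))
print(f"unweighted: true={true_d:.3f} estimated={est_d:.3f}")

# Weights known BEFORE compression: scale feature i by sqrt(w_i), then project.
w = [random.uniform(0.5, 2.0) for _ in range(d)]
xs = [math.sqrt(w[i]) * x[i] for i in range(d)]
ys = [math.sqrt(w[i]) * y[i] for i in range(d)]
true_wd = math.sqrt(sum(w[i] * (x[i] - y[i]) ** 2 for i in range(d)))
est_wd = dist(project(xs), project(ys))
print(f"weighted (weights known up front): true={true_wd:.3f} estimated={est_wd:.3f}")
```

If the weights arrive only after project(x) and project(y) have been stored, the pre-scaling trick above is no longer available; supporting that "weights revealed late" regime is precisely the gap the paper's complex-valued embedding addresses.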