Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
Learning Gaussian Mixture Models via Transformer Measure Flows
Aleksandr Zimin · Anastasiia Kutakh · Yury Polyanskiy · Philippe Rigollet
Keywords: [ GMM ] [ Measure-to-measure flow map ] [ Transformers ]
We introduce a transformer architecture that approximates Gaussian Mixture Models (GMMs) through a measure-to-measure flow interpretation. Rather than estimating explicit cluster parameters, our model predicts the underlying cluster probability distribution by minimizing the Wasserstein distance to the true measure. A key innovation is the flow speed hyperparameter, which adjusts clustering intensity by varying the transformer step size, indirectly controlling model depth according to the desired output complexity. Experimental results show performance comparable to or exceeding that of classical algorithms such as K-means, while the synthetic setup provides a lightweight, interpretable sandbox for investigating the foundations of transformer flows without the computational overhead of language-based benchmarks.
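The two ingredients above can be illustrated with a toy sketch: a self-attention-style flow on 1-D points whose step size `h` plays the role of the flow speed (more steps at smaller `h` behaving like a deeper model), and a 1-D Wasserstein-1 distance measuring how close the resulting empirical measure is to a target measure. This is a minimal illustration under simplifying assumptions (1-D points, Gaussian-kernel attention, equal-size empirical measures), not the paper's actual architecture or training objective.

```python
import math

def attention_flow_step(points, h, beta=5.0):
    """One Euler step of a toy self-attention flow on 1-D points.

    Each point moves toward a softmax-weighted average of all points,
    with weights given by a Gaussian similarity kernel; h is the step
    size (the 'flow speed' analogue). Illustrative only.
    """
    new_points = []
    for x in points:
        # Softmax attention weights from negative squared distances.
        logits = [-beta * (x - y) ** 2 for y in points]
        m = max(logits)  # subtract max for numerical stability
        weights = [math.exp(l - m) for l in logits]
        z = sum(weights)
        target = sum(w * y for w, y in zip(weights, points)) / z
        new_points.append(x + h * (target - x))
    return new_points

def wasserstein1_1d(a, b):
    """W1 distance between two equal-size 1-D empirical measures:
    mean absolute difference of sorted samples."""
    a, b = sorted(a), sorted(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Two well-separated 1-D 'clusters'; the flow contracts each toward its mean.
points = [-1.2, -1.0, -0.8, 0.9, 1.0, 1.1]
for _ in range(50):
    points = attention_flow_step(points, h=0.5)

# After many steps the empirical measure concentrates near the cluster
# means, so its W1 distance to the target measure is close to 0.
targets = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
print(wasserstein1_1d(points, targets))
```

Running the flow longer (or with a larger `h`) produces a more aggressively clustered output measure, which is the intuition behind tuning clustering intensity through the flow speed.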