Poster

Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with sharp convergence bounds under minimal assumptions

Le Tuyet Nhi PHAM · Dario Shariatian · Antonio Ocello · Giovanni Conforti · Alain Oliviero Durmus

East Exhibition Hall A-B #E-3312
Wed 16 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

This paper introduces the Discrete Markov Probabilistic Model (DMPM), a novel algorithm for discrete data generation. The algorithm operates in discrete space, where the noising process is a continuous-time Markov chain that can be sampled exactly via a Poissonian clock that flips labels uniformly at random. The time-reversal process, like the forward noise process, is a jump process, with its intensity governed by a discrete analogue of the classical score function. Crucially, this intensity is proven to be the conditional expectation of a function of the forward process, strengthening its theoretical alignment with score-based generative models while ensuring robustness and efficiency. We further establish convergence bounds for the algorithm under minimal assumptions and demonstrate its effectiveness through experiments on low-dimensional Bernoulli-distributed datasets and high-dimensional binary MNIST data. The results highlight its strong performance in generating discrete structures. This work bridges theoretical foundations and practical applications, advancing the development of effective and theoretically grounded discrete generative modeling.
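The forward noising process described in the abstract can be illustrated with a minimal sketch. It assumes binary data in {0,1}^d, a single Poisson clock of intensity `rate`, and a uniformly chosen coordinate flipped at each ring. These choices, and the names `sample_forward` and `flip_probability`, are illustrative assumptions and are not taken from the paper. Under them, each bit is flipped at rate `rate / d`, so the probability that a bit differs from its initial value at time t is (1 - exp(-2 * rate * t / d)) / 2.

```python
import math
import random


def sample_forward(x0, t, rate=1.0, rng=None):
    """Simulate the bit-flip chain started at x0 and return its state at time t.

    Assumed dynamics (hypothetical, for illustration): a Poisson clock of
    intensity `rate` rings, and at each ring one coordinate chosen uniformly
    at random is flipped.
    """
    if rng is None:
        rng = random.Random()
    x = list(x0)
    # Waiting times between rings are i.i.d. exponential with parameter `rate`.
    elapsed = rng.expovariate(rate)
    while elapsed < t:
        i = rng.randrange(len(x))  # uniformly chosen coordinate
        x[i] ^= 1                  # flip that bit
        elapsed += rng.expovariate(rate)
    return x


def flip_probability(t, d, rate=1.0):
    """Probability that a given bit differs from its initial value at time t.

    Each coordinate is flipped at rate `rate / d`; the bit has changed exactly
    when it was flipped an odd number of times.
    """
    return 0.5 * (1.0 - math.exp(-2.0 * rate * t / d))
```

Because the marginal law of each bit is available in closed form under these assumptions, a noised sample at time t could also be drawn directly by flipping each bit independently with probability `flip_probability(t, d, rate)`, without simulating the individual jumps.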

Lay Summary:

This paper presents a new algorithm for generating discrete data, like binary patterns or pixel images, using a mathematically grounded approach. The method works by gradually adding and removing noise in a controlled way, allowing it to learn how to produce realistic-looking data. Unlike many existing models, it is specifically designed for data made up of bits. We prove that the method is reliable and efficient, and show in experiments that it works well for both simple and complex datasets. Overall, the paper combines strong theory with practical results to advance how we generate structured, discrete data.
