

Spotlight Poster

Parallel Simulation for Log-concave Sampling and Score-based Diffusion Models

Huanjian Zhou · Masashi Sugiyama

East Exhibition Hall A-B #E-1103
Thu 17 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract: Sampling from high-dimensional probability distributions is fundamental in machine learning and statistics. As datasets grow larger, computational efficiency becomes increasingly important, particularly in reducing *adaptive complexity*, namely the number of sequential rounds required by sampling algorithms. While recent works have introduced several parallelizable techniques, they often exhibit suboptimal convergence rates and remain significantly weaker than the latest lower bounds for log-concave sampling. To address this, we propose a novel parallel sampling method that improves the adaptive complexity's dependence on the dimension $d$, reducing it from $\widetilde{\mathcal{O}}(\log^2 d)$ to $\widetilde{\mathcal{O}}(\log d)$. Our approach builds on parallel simulation techniques from scientific computing.
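The abstract does not spell out the algorithm, but the "parallel simulation techniques from scientific computing" it alludes to are commonly Picard-type fixed-point iterations over an entire trajectory. The following is a minimal, hypothetical NumPy sketch of that general idea for overdamped Langevin dynamics, not the authors' method: within a time block, all gradient evaluations of one Picard iteration can run in parallel, so the number of sequential rounds is the (small) number of Picard iterations rather than the number of grid points. The target `grad_f`, block length, grid size, and iteration count are illustrative assumptions.

```python
import numpy as np

def grad_f(x):
    # Hypothetical target: standard Gaussian, f(x) = ||x||^2 / 2, so grad f(x) = x.
    return x

def parallel_langevin_block(x0, T=1.0, n_grid=32, n_picard=8, rng=None):
    """Simulate overdamped Langevin dynamics dX = -grad f(X) dt + sqrt(2) dW on [0, T]
    via Picard (fixed-point) iteration over the whole path: each iteration evaluates
    the gradient at all grid points at once (parallelizable across the grid), so only
    n_picard sequential rounds are needed instead of n_grid."""
    rng = np.random.default_rng() if rng is None else rng
    d = x0.shape[0]
    h = T / n_grid
    # Brownian motion on the grid t_0 = 0, ..., t_n = T, fixed once for the block.
    dW = rng.normal(scale=np.sqrt(h), size=(n_grid, d))
    W = np.vstack([np.zeros(d), np.cumsum(dW, axis=0)])
    X = np.tile(x0, (n_grid + 1, 1))            # initial guess: constant path
    for _ in range(n_picard):                   # sequential rounds (adaptive complexity)
        G = grad_f(X[:-1])                      # gradients at t_0 .. t_{n-1}, all at once
        # Left-endpoint quadrature of the drift integral -int_0^{t_i} grad f(X_s) ds.
        drift = -h * np.concatenate([np.zeros((1, d)), np.cumsum(G, axis=0)])
        X = x0 + drift + np.sqrt(2.0) * W       # Picard update of the entire path
    return X[-1]                                # state at time T

# Usage sketch: chain a few blocks to march forward in time.
x = np.zeros(4)
for _ in range(10):
    x = parallel_langevin_block(x)
```

In this toy setup the depth of the sequential loop, rather than the grid resolution, plays the role of adaptive complexity; the paper's contribution concerns how few such rounds suffice as a function of $d$.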

Lay Summary:

Sampling from high-dimensional probability distributions is a fundamental task in machine learning and statistics, essential for applications like Bayesian inference and generative modeling. However, existing sampling methods often require numerous sequential steps, limiting their efficiency and scalability, especially as datasets grow larger.

Our research introduces a novel parallel sampling algorithm that significantly reduces the number of sequential steps, known as adaptive complexity, required to generate high-quality samples. By leveraging techniques from scientific computing, our method improves the adaptive complexity from approximately the square of the logarithm of the data's dimensionality to just the logarithm. As a result, our method speeds up sampling for both log-concave distributions and diffusion models.

This advancement enables more efficient processing of large-scale data, facilitating faster and more scalable applications in machine learning tasks such as generative modeling and Bayesian inference.
