
From Discrete-Time Policies to Continuous-Time Diffusion Samplers: Asymptotic Equivalences and Faster Training

by Julius Berner, Lorenz Richter, Marcin Sendera, Jarrid Rector-Brooks, Nikolay Malkin

Year:

2025

Publication:

arXiv preprint arXiv:2501.06148

Abstract:

We study the problem of training neural stochastic differential equations, or diffusion models, to sample from a Boltzmann distribution without access to target samples. Existing methods for training such models enforce time-reversal of the generative and noising processes, using either differentiable simulation or off-policy reinforcement learning (RL). We prove equivalences between families of objectives in the limit of infinitesimal discretization steps, linking entropic RL methods (GFlowNets) with continuous-time objects (partial differential equations and path space measures). We further show that an appropriate choice of coarse time discretization during training allows greatly improved sample efficiency and the use of time-local objectives, achieving competitive performance on standard sampling benchmarks with reduced computational cost.
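The abstract describes simulating a neural SDE with a coarse time discretization and training it against an unnormalized Boltzmann target using an off-policy, GFlowNet-style objective. The sketch below is purely illustrative and is not the authors' implementation: the toy Gaussian energy, the trajectory-balance-style loss, the fixed Brownian reverse kernel, and all hyperparameters (network size, number of steps, noise scale) are assumptions made here to keep the example self-contained.

```python
# Hypothetical sketch: train a coarsely discretized neural SDE to sample from
# an unnormalized Boltzmann density exp(-E(x)) with a trajectory-balance-style loss.
import torch
import torch.nn as nn

def energy(x):
    # Toy Boltzmann target: standard Gaussian energy E(x) = ||x||^2 / 2 (assumption).
    return 0.5 * (x ** 2).sum(dim=-1)

class Drift(nn.Module):
    # Time-dependent drift network u_theta(x, t) of the generative SDE (illustrative sizes).
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )
    def forward(self, x, t):
        t_col = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_col], dim=-1))

dim, n_steps, sigma = 2, 8, 1.0          # coarse discretization: only a few time steps
drift = Drift(dim)
log_Z = nn.Parameter(torch.zeros(()))    # learned log-normalizer, as in trajectory balance
opt = torch.optim.Adam(list(drift.parameters()) + [log_Z], lr=1e-3)
dt = 1.0 / n_steps

for it in range(1000):
    x = torch.zeros(64, dim)             # start all trajectories at the origin
    log_pf = torch.zeros(64)             # log-density of the generative (forward) path
    log_pb = torch.zeros(64)             # log-density of a fixed Brownian noising path
    for k in range(n_steps):
        t = torch.full((1,), k * dt)
        mean = x + drift(x, t) * dt      # Euler-Maruyama step of the generative SDE
        std = sigma * dt ** 0.5
        x_next = mean + std * torch.randn_like(x)
        # Gaussian transition log-densities; the shared normalizing constants cancel
        # in log_pf - log_pb and are dropped.
        log_pf += -((x_next - mean) ** 2).sum(-1) / (2 * std ** 2)
        log_pb += -((x - x_next) ** 2).sum(-1) / (2 * std ** 2)
        x = x_next
    log_reward = -energy(x)              # log of the unnormalized target density
    # Trajectory-balance-style squared discrepancy over entire paths.
    loss = ((log_Z + log_pf - log_pb - log_reward) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The loss above is time-global (it compares full forward and backward path densities); the paper's point about coarse discretizations and time-local objectives would correspond to cheaper variants of such a training loop, which this sketch does not attempt to reproduce.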

Link:

Read the paper

Additional Information



Dr. Lorenz Richter

Coming from a background in stochastics and numerical analysis (FU Berlin), the mathematician has been working on deep learning algorithms for several years. Alongside his penchant for theory, he has solved a wide range of practical data science problems over the past ten years. Lorenz leads the machine learning team.