A team from three institutions proposes VDT, a new generative modeling framework that unifies optimal control and optimal transport via linear programming, yielding straighter transport paths and faster inference.

Pablo Moreno-Muñoz and Gergely Neu (ICREA), researchers at Pompeu Fabra University (UPF), together with Adrian Müller from ETH Zürich, posted a paper to arXiv on May 21 (arXiv:2605.22507) introducing a novel generative modeling framework called “Value-Driven Transport” (VDT). The framework models transport problems as discrete-time stochastic control problems and reformulates them as a linear program (LP). The dual variables of this LP correspond exactly to the optimal value function of the control problem, which in turn encodes the optimal control policy. As a result, VDT brings optimal control, reinforcement learning (RL), optimal transport, and stochastic primal-dual optimization under one theoretical umbrella. Leveraging this LP structure, the researchers developed a simulation-free primal-dual algorithm that approximates the optimal value function, from which the VDT control policy is derived.
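The link between LP duals and value functions can be illustrated with the classical textbook LP for a small discounted Markov decision process. The sketch below is not the paper's algorithm or its transport formulation: the three-state problem, its costs, and all names (COST, P, NU0, value_iteration, occupancy) are hypothetical and chosen only for illustration. It computes the optimal value function by value iteration and then checks two facts: the value function satisfies the Bellman inequalities that form the dual LP constraints, and the expected cost under the greedy policy's occupancy measure (the primal LP objective) matches the dual objective.

```python
# Toy discounted MDP with costs; every number here is hypothetical.
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
# COST[s][a]: immediate cost of taking action a in state s.
COST = [[1.0, 2.0], [0.5, 0.3], [0.0, 1.0]]
# P[s][a][t]: probability of moving from state s to state t under action a.
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.0, 0.9]],
    [[0.0, 0.5, 0.5], [0.3, 0.7, 0.0]],
    [[0.0, 0.0, 1.0], [0.5, 0.5, 0.0]],
]
NU0 = [1.0, 0.0, 0.0]  # initial state distribution


def q_value(v, s, a):
    """One-step lookahead: c(s, a) + gamma * E[v(next state)]."""
    return COST[s][a] + GAMMA * sum(P[s][a][t] * v[t] for t in range(N_STATES))


def value_iteration(n_iter=2000):
    """Iterate the Bellman optimality operator (cost minimization)."""
    v = [0.0] * N_STATES
    for _ in range(n_iter):
        v = [min(q_value(v, s, a) for a in range(N_ACTIONS))
             for s in range(N_STATES)]
    return v


def greedy_policy(v):
    """Pick, in each state, the action with the smallest lookahead cost."""
    return [min(range(N_ACTIONS), key=lambda a: q_value(v, s, a))
            for s in range(N_STATES)]


def occupancy(policy, n_iter=2000):
    """Normalized discounted state occupancy of a deterministic policy."""
    d = list(NU0)
    for _ in range(n_iter):
        d = [(1 - GAMMA) * NU0[t]
             + GAMMA * sum(d[s] * P[s][policy[s]][t] for s in range(N_STATES))
             for t in range(N_STATES)]
    return d


v_star = value_iteration()
pi_star = greedy_policy(v_star)
d_star = occupancy(pi_star)

# Dual constraints: v(s) <= c(s, a) + gamma * E[v(s')] for every (s, a).
# The smallest slack is (numerically) zero, attained by the greedy actions.
bellman_slack = min(q_value(v_star, s, a) - v_star[s]
                    for s in range(N_STATES) for a in range(N_ACTIONS))

# Primal objective (expected cost under the occupancy measure) versus
# dual objective (value function averaged over the initial distribution).
primal_obj = sum(d_star[s] * COST[s][pi_star[s]] for s in range(N_STATES))
dual_obj = (1 - GAMMA) * sum(NU0[s] * v_star[s] for s in range(N_STATES))

print("optimal values:", v_star)
print("greedy policy:", pi_star)
print("primal objective:", primal_obj, "dual objective:", dual_obj)
```

In this toy setting the two objectives coincide up to floating-point error, which is the strong-duality statement behind the claim that the optimal dual variables are the optimal value function. How VDT incorporates the transport constraints and solves the LP without simulation is described in the paper itself.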

Compared with currently popular flow-matching, diffusion, and Schrödinger bridge models, VDT-generated transport paths tend to follow straighter trajectories, and they can be simulated quickly and robustly without explicitly parameterizing the drift term of the control process. VDT also supports the same extensions available for diffusion and flow models, such as conditional generation and classifier-free guidance. In a post on X, Gergely Neu remarked, “Maybe one day we won’t need to engineer rewards anymore,” hinting at broader implications of the framework for reward design in reinforcement learning. The paper includes illustrative experimental results, though the corresponding code has not yet been released.

arXiv | X (@neu_rips)