SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models Paper • 2510.09541 • Published Oct 10 • 14
Inpainting-Guided Policy Optimization for Diffusion Large Language Models Paper • 2509.10396 • Published Sep 12 • 15