From Noise to Control: Parameterized Diffusion Policies
Quick Take
The Parameterized Diffusion Policy (PDP) framework learns diffusion policies conditioned on low-dimensional, continuous parameters embedded in a learned behavior manifold, turning diffusion into a tool for behavior steering. The authors report that it significantly improves adaptation performance over standard diffusion policies on complex multimodal benchmarks, in both simulated and real-robot experiments.
Key Points
- PDP constructs a behavior manifold in which distances between latent representations reflect the semantic similarity between physical trajectories.
- It enables smooth interpolation between known strategies and efficient adaptation to novel constraints, without updating policy weights.
- It significantly improves adaptation performance over standard diffusion policies on complex multimodal benchmarks, in both simulated and real-robot experiments.
- The gains are most pronounced in scenarios that require synthesizing novel behaviors.
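To make the interpolation idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: `interpolate_params` and `toy_conditional_sampler` are hypothetical names, and the sampler is a toy stand-in for a frozen, parameter-conditioned policy (a simple iterative refinement from Gaussian noise, not a trained diffusion model). It only illustrates that blending two behavior parameters changes the generated trajectory while the sampler itself stays fixed.

```python
import numpy as np


def interpolate_params(z_a, z_b, num_steps):
    """Linearly interpolate between two behavior parameters (assumed Euclidean)."""
    alphas = np.linspace(0.0, 1.0, num_steps)
    return [(1.0 - a) * z_a + a * z_b for a in alphas]


def toy_conditional_sampler(z, rng, horizon=8, num_refine_steps=20):
    """Hypothetical stand-in for a frozen policy conditioned on parameter z.

    Starts from Gaussian noise and repeatedly moves the sample halfway toward
    a trajectory determined by z. Nothing here is learned or updated.
    """
    # Toy target: a straight-line path from the origin to z.
    target = np.outer(np.linspace(0.0, 1.0, horizon), z)
    x = rng.standard_normal(target.shape)
    for _ in range(num_refine_steps):
        x = x + 0.5 * (target - x)
    return x


rng = np.random.default_rng(0)
z_known_a = np.array([0.0, 0.0])  # parameter of one known strategy (made up)
z_known_b = np.array([1.0, 2.0])  # parameter of another known strategy (made up)

# Each intermediate parameter yields its own trajectory from the same sampler.
trajectories = [
    toy_conditional_sampler(z, rng)
    for z in interpolate_params(z_known_a, z_known_b, num_steps=5)
]
```

In this toy setup the midpoint parameter produces a path that ends halfway between the two known endpoints. Whether a real learned manifold interpolates this cleanly depends on how it is constructed, which the summary above does not detail.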
Article Excerpt
From source RSS / original summary
arXiv:2606.00336v1 Announce Type: new
Abstract: We propose Parameterized Diffusion Policy (PDP), a framework for learning diffusion policies conditioned on low-dimensional, continuous parameters embedded in a learned behavior manifold. By constructing this manifold so that distances between latent representations reflect the semantic similarity between physical trajectories, we transform diffusion from a mechanism for stochastic diversity into a precise and optimizable tool for behavior steering.
Our approach enables smooth interpolation between known strategies and efficient adaptation to novel constraints without updating policy weights. We demonstrate that PDP significantly improves adaptation performance on complex multimodal benchmarks in both simulated and real-robot experiments compared to standard diffusion policies, particularly in scenarios requiring the synthesis of novel behaviors.
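One way to picture "adaptation without updating policy weights" is as a search over the low-dimensional parameter alone. The sketch below rests on stated assumptions: the abstract does not say which optimizer PDP uses, so a simple random-search hill climb is swapped in, and `frozen_policy_rollout`, `constraint_cost`, and `adapt_parameter` are hypothetical names with a toy deterministic rollout in place of a real diffusion policy.

```python
import numpy as np


def frozen_policy_rollout(z):
    """Hypothetical frozen policy: maps a behavior parameter to a trajectory.

    Toy behavior: a straight-line path from the origin to z. No weights
    exist here, and nothing about the policy changes during adaptation.
    """
    t = np.linspace(0.0, 1.0, 10)[:, None]
    return t * z


def constraint_cost(trajectory, goal):
    """Toy 'novel constraint': squared distance of the final state to a goal."""
    return float(np.sum((trajectory[-1] - goal) ** 2))


def adapt_parameter(z_init, goal, rng, iters=200, step=0.2):
    """Random-search hill climbing over the parameter z only (an assumption)."""
    z_best = np.asarray(z_init, dtype=float)
    cost_best = constraint_cost(frozen_policy_rollout(z_best), goal)
    for _ in range(iters):
        candidate = z_best + step * rng.standard_normal(z_best.shape)
        cost = constraint_cost(frozen_policy_rollout(candidate), goal)
        if cost < cost_best:  # keep a candidate only if it lowers the cost
            z_best, cost_best = candidate, cost
    return z_best, cost_best


rng = np.random.default_rng(0)
goal = np.array([1.0, -0.5])  # made-up target for the final state
z_adapted, final_cost = adapt_parameter(np.zeros(2), goal, rng)
```

Because the search space is only as large as the behavior parameter, even a crude gradient-free method can make progress in this toy case. A differentiable policy would also permit gradient-based optimization of the parameter, though that is likewise an assumption rather than something the abstract states.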
