Motivation
- Diffusion-based policies (e.g., Diffusion Policy, Diffusion-QL) have demonstrated strong empirical results in robotics and offline RL.
Solution (Plan & References)
We will implement Phase-1 by adapting the core ideas and minimal components from Diffusion Policy: Visuomotor Policy Learning via Action Diffusion and its official codebase, aligning them with TorchRL’s API patterns (TensorDict / Modules / Objectives / Transforms).
Primary references
- [Diffusion Policy: Visuomotor Policy Learning via Action Diffusion](https://github.com/real-stanford/diffusion_policy) (paper + official codebase)

What we will port/adapt
- Actor (DiffusionActor)
  - A score-based policy that denoises latent actions conditioned on observations, implemented as `torchrl.modules.DiffusionActor`.
  - Pluggable score network (e.g., a small MLP for low-dimensional control; a CNN encoder later for pixels), scheduler (DDPM-style first), and `num_steps`.
  - Strict TensorDict contract: `in_keys=["observation"]` → `out_keys=["action"]`. (A minimal actor sketch follows after this list.)
- Objective (DiffusionBCLoss)
  - Supervised denoising / ε-prediction loss and a score-matching variant for imitation learning, following the paper’s training target while fitting TorchRL’s Objective API. (A loss sketch follows after this list.)
- Example & Repro Path
  - `examples/diffusion_bc_pendulum.py` (state-based control) to mirror the reference repo’s low-dim examples first. (A minimal training-step sketch follows after this list.)
  - Clear instructions to plug in public training data/config patterns analogous to the reference repo’s setup (e.g., single-seed and multi-seed runs). ([GitHub][1])
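
To make the module contract concrete, here is a minimal, illustrative sketch of the proposed actor. `torchrl.modules.DiffusionActor` does not exist yet; the `DenoisingPolicy` class, the MLP score-network sizes, the linear DDPM beta schedule, and the Pendulum-like dimensions (`obs_dim=3`, `action_dim=1`) are assumptions made for this sketch, not a final design.

```python
# Illustrative sketch only: DiffusionActor is a proposal, not an existing TorchRL module.
import torch
from torch import nn
from tensordict import TensorDict
from tensordict.nn import TensorDictModule


class DenoisingPolicy(nn.Module):
    """Samples an action by iteratively denoising Gaussian noise, conditioned on the observation."""

    def __init__(self, obs_dim: int, action_dim: int, hidden: int = 256, num_steps: int = 16):
        super().__init__()
        self.num_steps = num_steps
        # Score/epsilon network: predicts the noise added to the action,
        # conditioned on (noisy action, observation, diffusion step).
        self.eps_net = nn.Sequential(
            nn.Linear(action_dim + obs_dim + 1, hidden),
            nn.Mish(),
            nn.Linear(hidden, hidden),
            nn.Mish(),
            nn.Linear(hidden, action_dim),
        )
        # Simple linear beta schedule (DDPM-style); a pluggable scheduler is assumed.
        betas = torch.linspace(1e-4, 2e-2, num_steps)
        alphas = 1.0 - betas
        self.register_buffer("betas", betas)
        self.register_buffer("alphas", alphas)
        self.register_buffer("alpha_bars", torch.cumprod(alphas, dim=0))

    @torch.no_grad()
    def forward(self, observation: torch.Tensor) -> torch.Tensor:
        batch = observation.shape[:-1]
        action = torch.randn(*batch, self.eps_net[-1].out_features, device=observation.device)
        for t in reversed(range(self.num_steps)):
            t_embed = torch.full((*batch, 1), t / self.num_steps, device=observation.device)
            eps = self.eps_net(torch.cat([action, observation, t_embed], dim=-1))
            alpha, alpha_bar, beta = self.alphas[t], self.alpha_bars[t], self.betas[t]
            # DDPM posterior mean; add noise except at the final step.
            action = (action - beta / torch.sqrt(1 - alpha_bar) * eps) / torch.sqrt(alpha)
            if t > 0:
                action = action + torch.sqrt(beta) * torch.randn_like(action)
        return action


# Enforce the proposed TensorDict contract: in_keys=["observation"] -> out_keys=["action"].
actor = TensorDictModule(
    DenoisingPolicy(obs_dim=3, action_dim=1), in_keys=["observation"], out_keys=["action"]
)
td = actor(TensorDict({"observation": torch.randn(8, 3)}, batch_size=[8]))
print(td["action"].shape)  # torch.Size([8, 1])
```

Keeping the denoising loop inside the wrapped module means the actor can be dropped into TorchRL collectors and evaluation rollouts unchanged; `num_steps` is the main knob trading inference cost against sample quality.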
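
The corresponding ε-prediction BC target could look as follows. This is a standalone sketch: the eventual `DiffusionBCLoss` would subclass TorchRL’s objective base class and expose configurable tensordict keys, which is omitted here for brevity.

```python
# Illustrative sketch only: DiffusionBCLoss is a proposal, not an existing TorchRL objective.
import torch
from torch import nn
from tensordict import TensorDict


class DiffusionBCLoss(nn.Module):
    def __init__(self, eps_net: nn.Module, num_steps: int = 16):
        super().__init__()
        # eps_net is the same noise-prediction network used by the actor sketch above.
        self.eps_net = eps_net
        self.num_steps = num_steps
        betas = torch.linspace(1e-4, 2e-2, num_steps)
        self.register_buffer("alpha_bars", torch.cumprod(1.0 - betas, dim=0))

    def forward(self, td: TensorDict) -> TensorDict:
        obs, action = td["observation"], td["action"]  # expert (obs, action) pairs
        batch = action.shape[:-1]
        t = torch.randint(0, self.num_steps, batch, device=action.device)
        alpha_bar = self.alpha_bars[t].unsqueeze(-1)
        noise = torch.randn_like(action)
        # Forward-diffuse the expert action, then predict the injected noise (epsilon target).
        noisy_action = torch.sqrt(alpha_bar) * action + torch.sqrt(1 - alpha_bar) * noise
        t_embed = (t.float() / self.num_steps).unsqueeze(-1)
        pred = self.eps_net(torch.cat([noisy_action, obs, t_embed], dim=-1))
        loss = nn.functional.mse_loss(pred, noise)
        return TensorDict({"loss_diffusion_bc": loss}, batch_size=[])
```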
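
Finally, a rough shape of the training step that the proposed `examples/diffusion_bc_pendulum.py` could take, reusing the two sketches above. Random tensors stand in for expert demonstrations here; the real example would load a public Pendulum dataset or replay buffer, following the reference repo’s low-dim setup.

```python
# Minimal training-step sketch; random data stands in for an expert dataset (obs_dim=3, action_dim=1).
import torch
from tensordict import TensorDict

policy = DenoisingPolicy(obs_dim=3, action_dim=1)                      # actor sketch above
loss_module = DiffusionBCLoss(policy.eps_net, num_steps=policy.num_steps)
optim = torch.optim.Adam(loss_module.parameters(), lr=3e-4)

for _ in range(10):
    batch = TensorDict(
        {"observation": torch.randn(64, 3), "action": torch.randn(64, 1)},
        batch_size=[64],
    )
    loss_td = loss_module(batch)
    optim.zero_grad()
    loss_td["loss_diffusion_bc"].backward()
    optim.step()
```

A TorchRL-native version would replace the hand-built batch with a replay buffer of demonstrations and use the `TensorDictModule`-wrapped actor for evaluation rollouts.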
Checklist
- I have checked that there is no similar issue in the repo (required)
- This request aligns with TorchRL’s [call for contributions](https://github.com/pytorch/rl/issues) (feature proposals & new algorithm implementations).
- I will take ownership of this feature request and open PR(s) to implement Phase-1 (Actor + Loss + Example).
- The implementation plan references [Diffusion Policy: Visuomotor Policy Learning via Action Diffusion](https://github.com/real-stanford/diffusion_policy?tab=readme-ov-file) (paper + code) and adapts it to TorchRL’s API patterns (Modules/Objectives/Collectors).