Motivation
- Diffusion-based policies (e.g., Diffusion Policy, Diffusion-QL) have demonstrated strong empirical results in robotics and offline RL.
Solution (Plan & References)
We will implement Phase-1 by adapting the core ideas and minimal components from Diffusion Policy: Visuomotor Policy Learning via Action Diffusion and its official codebase, aligning them with TorchRL’s API patterns (TensorDict / Modules / Objectives / Transforms).
Primary references
- [Diffusion Policy: Visuomotor Policy Learning via Action Diffusion](https://github.com/real-stanford/diffusion_policy) (paper + official codebase)

What we will port/adapt
- Actor (DiffusionActor)
  - A score-based policy that denoises latent actions conditioned on observations, implemented as `torchrl.modules.DiffusionActor`.
  - Pluggable score network (e.g., a small MLP for low-dimensional control; a CNN encoder later for pixels), scheduler (DDPM-style first), and `num_steps`.
  - Strict TensorDict contract: `in_keys=["observation"]` → `out_keys=["action"]`. (A minimal actor sketch follows after this list.)
- Objective (DiffusionBCLoss)
  - Supervised denoising / ε-prediction loss and a score-matching variant for imitation learning, following the paper’s training target while fitting TorchRL’s Objective API. (A loss sketch follows after this list.)
- Example & Repro Path
  - `examples/diffusion_bc_pendulum.py` (state-based control) to mirror the reference repo’s low-dim examples first. (A minimal training-step sketch follows after this list.)
  - Clear instructions to plug in public training data/config patterns analogous to the reference repo’s setup (e.g., single-seed and multi-seed runs). ([GitHub][1])
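
To make the module contract concrete, here is a minimal, illustrative sketch of the proposed actor. `torchrl.modules.DiffusionActor` does not exist yet; the `DenoisingPolicy` class, the MLP score-network sizes, the linear DDPM beta schedule, and the Pendulum-like dimensions (`obs_dim=3`, `action_dim=1`) are assumptions made for this sketch, not a final design.

```python
# Illustrative sketch only: DiffusionActor is a proposal, not an existing TorchRL module.
import torch
from torch import nn
from tensordict import TensorDict
from tensordict.nn import TensorDictModule


class DenoisingPolicy(nn.Module):
    """Samples an action by iteratively denoising Gaussian noise, conditioned on the observation."""

    def __init__(self, obs_dim: int, action_dim: int, hidden: int = 256, num_steps: int = 16):
        super().__init__()
        self.num_steps = num_steps
        # Score/epsilon network: predicts the noise added to the action,
        # conditioned on (noisy action, observation, diffusion step).
        self.eps_net = nn.Sequential(
            nn.Linear(action_dim + obs_dim + 1, hidden),
            nn.Mish(),
            nn.Linear(hidden, hidden),
            nn.Mish(),
            nn.Linear(hidden, action_dim),
        )
        # Simple linear beta schedule (DDPM-style); a pluggable scheduler is assumed.
        betas = torch.linspace(1e-4, 2e-2, num_steps)
        alphas = 1.0 - betas
        self.register_buffer("betas", betas)
        self.register_buffer("alphas", alphas)
        self.register_buffer("alpha_bars", torch.cumprod(alphas, dim=0))

    @torch.no_grad()
    def forward(self, observation: torch.Tensor) -> torch.Tensor:
        batch = observation.shape[:-1]
        action = torch.randn(*batch, self.eps_net[-1].out_features, device=observation.device)
        for t in reversed(range(self.num_steps)):
            t_embed = torch.full((*batch, 1), t / self.num_steps, device=observation.device)
            eps = self.eps_net(torch.cat([action, observation, t_embed], dim=-1))
            alpha, alpha_bar, beta = self.alphas[t], self.alpha_bars[t], self.betas[t]
            # DDPM posterior mean; add noise except at the final step.
            action = (action - beta / torch.sqrt(1 - alpha_bar) * eps) / torch.sqrt(alpha)
            if t > 0:
                action = action + torch.sqrt(beta) * torch.randn_like(action)
        return action


# Enforce the proposed TensorDict contract: in_keys=["observation"] -> out_keys=["action"].
actor = TensorDictModule(
    DenoisingPolicy(obs_dim=3, action_dim=1), in_keys=["observation"], out_keys=["action"]
)
td = actor(TensorDict({"observation": torch.randn(8, 3)}, batch_size=[8]))
print(td["action"].shape)  # torch.Size([8, 1])
```

Keeping the denoising loop inside the wrapped module means the actor can be dropped into TorchRL collectors and evaluation rollouts unchanged; `num_steps` is the main knob trading inference cost against sample quality.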
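
The corresponding ε-prediction BC target could look as follows. This is a standalone sketch: the eventual `DiffusionBCLoss` would subclass TorchRL’s objective base class and expose configurable tensordict keys, which is omitted here for brevity.

```python
# Illustrative sketch only: DiffusionBCLoss is a proposal, not an existing TorchRL objective.
import torch
from torch import nn
from tensordict import TensorDict


class DiffusionBCLoss(nn.Module):
    def __init__(self, eps_net: nn.Module, num_steps: int = 16):
        super().__init__()
        # eps_net is the same noise-prediction network used by the actor sketch above.
        self.eps_net = eps_net
        self.num_steps = num_steps
        betas = torch.linspace(1e-4, 2e-2, num_steps)
        self.register_buffer("alpha_bars", torch.cumprod(1.0 - betas, dim=0))

    def forward(self, td: TensorDict) -> TensorDict:
        obs, action = td["observation"], td["action"]  # expert (obs, action) pairs
        batch = action.shape[:-1]
        t = torch.randint(0, self.num_steps, batch, device=action.device)
        alpha_bar = self.alpha_bars[t].unsqueeze(-1)
        noise = torch.randn_like(action)
        # Forward-diffuse the expert action, then predict the injected noise (epsilon target).
        noisy_action = torch.sqrt(alpha_bar) * action + torch.sqrt(1 - alpha_bar) * noise
        t_embed = (t.float() / self.num_steps).unsqueeze(-1)
        pred = self.eps_net(torch.cat([noisy_action, obs, t_embed], dim=-1))
        loss = nn.functional.mse_loss(pred, noise)
        return TensorDict({"loss_diffusion_bc": loss}, batch_size=[])
```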
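
Finally, a rough shape of the training step that the proposed `examples/diffusion_bc_pendulum.py` could take, reusing the two sketches above. Random tensors stand in for expert demonstrations here; the real example would load a public Pendulum dataset or replay buffer, following the reference repo’s low-dim setup.

```python
# Minimal training-step sketch; random data stands in for an expert dataset (obs_dim=3, action_dim=1).
import torch
from tensordict import TensorDict

policy = DenoisingPolicy(obs_dim=3, action_dim=1)                      # actor sketch above
loss_module = DiffusionBCLoss(policy.eps_net, num_steps=policy.num_steps)
optim = torch.optim.Adam(loss_module.parameters(), lr=3e-4)

for _ in range(10):
    batch = TensorDict(
        {"observation": torch.randn(64, 3), "action": torch.randn(64, 1)},
        batch_size=[64],
    )
    loss_td = loss_module(batch)
    optim.zero_grad()
    loss_td["loss_diffusion_bc"].backward()
    optim.step()
```

A TorchRL-native version would replace the hand-built batch with a replay buffer of demonstrations and use the `TensorDictModule`-wrapped actor for evaluation rollouts.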
Checklist
- I have checked that there is no similar issue in the repo (required)
- This request aligns with TorchRL’s [call for contributions](https://github.com/pytorch/rl/issues) (feature proposals & new algorithm implementations).
- I will take ownership of this feature request and open PR(s) to implement Phase-1 (Actor + Loss + Example).
- The implementation plan references [Diffusion Policy: Visuomotor Policy Learning via Action Diffusion](https://github.com/real-stanford/diffusion_policy?tab=readme-ov-file) (paper + code) and adapts it to TorchRL’s API patterns (Modules/Objectives/Collectors).