pytorch
diff --git a/.github/unittest/llm/scripts_llm/environment.yml b/.github/unittest/llm/scripts_llm/environment.yml
Lines changed: 1 addition & 0 deletions
diff --git a/.github/unittest/llm/scripts_llm/install.sh b/.github/unittest/llm/scripts_llm/install.sh
Lines changed: 14 additions & 0 deletions
diff --git a/docs/source/reference/index.rst b/docs/source/reference/index.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/source/reference/llms.rst b/docs/source/reference/llms.rst
Lines changed: 11 additions & 6 deletions
@@ -22,3 +22,4 @@ dependencies:
     - transformers
     - datasets
     - vllm
+    - mcp
@@ -61,3 +61,17 @@ python -m pip install -e . --no-build-isolation
 
 # smoke test
 python -c "import torchrl"
+
+# Install MCP dependencies for tool execution tests
+printf "* Installing MCP dependencies (uvx, Deno)\n"
+
+# Install uvx (universal package runner)
+pip install uvx
+
+# Install Deno (required by mcp-run-python)
+curl -fsSL https://deno.land/install.sh | sh
+export PATH="$HOME/.deno/bin:$PATH"
+
+# Verify installations
+uvx --version || echo "Warning: uvx not installed"
+deno --version || echo "Warning: Deno not installed"
@@ -10,6 +10,7 @@ API Reference
     llms
     modules
     objectives
+    services
     trainers
     utils
     config
@@ -930,7 +930,9 @@ Tools are usually implemented as transforms, and appended to a base environment
 such as :class:`~torchrl.envs.llm.ChatEnv`.
 
 An example of a tool transform is the :class:`~torchrl.envs.llm.transforms.PythonInterpreter` transform, which is used
-to execute Python code in the context of the LLM.
+to execute Python code in the context of the LLM. The PythonInterpreter can optionally use a shared 
+:class:`~torchrl.envs.llm.transforms.PythonExecutorService` for efficient resource usage across multiple environments.
+See :ref:`ref_services` for more details on the service registry system.
 
     >>> from torchrl.envs.llm.transforms import PythonInterpreter
     >>> from torchrl.envs.llm import ChatEnv
@@ -1141,6 +1143,7 @@ By following these design principles, reward transforms can be effectively integ
     KLRewardTransform
     MCPToolTransform
     PolicyVersion
+    PythonExecutorService
     PythonInterpreter
     RayDataLoadingPrimer
     RetrieveKL
@@ -1155,20 +1158,22 @@ Objectives
 
 LLM post-training requires specialized loss functions that are adapted to the unique characteristics of language models.
 
-GRPO
-~~~~
-
-The :class:`~torchrl.objectives.llm.GRPOLoss` class is a thin wrapper around the :class:`~torchrl.objectives.PPOLoss` class
-that codes the LLM-specific functionalities.
+GRPO, DAPO, CISPO
+^^^^^^^^^^^^^^^^^
 
 .. currentmodule:: torchrl.objectives.llm
 
 .. autosummary::
     :toctree: generated/
     :template: rl_template.rst
 
+    LLMLossOutput
     GRPOLoss
     GRPOLossOutput
+    CISPOLoss
+    CISPOLossOutput
+    DAPO
+    DAPOLossOutput
     MCAdvantage
 
 SFT