From 3752c1650455c31b7663e360002bbb5c2383078d Mon Sep 17 00:00:00 2001
From: Adrian Cole
Date: Thu, 4 Dec 2025 20:08:54 +0800
Subject: [PATCH 1/4] docs(rfd): Draft: Agent Telemetry Export

---
 docs/rfds/agent-telemetry-export.mdx | 163 +++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)
 create mode 100644 docs/rfds/agent-telemetry-export.mdx

diff --git a/docs/rfds/agent-telemetry-export.mdx b/docs/rfds/agent-telemetry-export.mdx
new file mode 100644
index 0000000..5367cf5
--- /dev/null
+++ b/docs/rfds/agent-telemetry-export.mdx
@@ -0,0 +1,163 @@
+---
+title: "Agent Telemetry Export"
+---
+
+- Author(s): [@codefromthecrypt](https://github.com/codefromthecrypt)
+
+## Elevator pitch
+
+> What are you proposing to change?
+
+Define how agents export telemetry (logs, metrics, traces) to clients without tunneling it over the ACP transport. Clients run a local telemetry receiver and pass standard OpenTelemetry environment variables when launching agents. This keeps telemetry out-of-band and enables editors to display agent activity, debug issues, and integrate with observability backends.
+
+## Status quo
+
+> How do things work today and what problems does this cause? Why would we change things?
+
+ACP defines how clients launch agents as subprocesses and communicate over stdio. The [meta-propagation RFD](./meta-propagation) addresses trace context propagation via `params._meta`, enabling trace correlation. However, there is no convention for how agents should export the actual telemetry data (spans, metrics, logs).
+
+Without a standard approach:
+
+1. **No visibility into agent behavior** - Editors cannot display what agents are doing (token usage, tool calls, timing)
+2. **Difficult debugging** - When agents fail, there's no structured way to capture diagnostics
+3. **Fragmented solutions** - Each agent/client pair invents their own telemetry mechanism
+4. 
**Credential exposure risk** - If agents need to send telemetry directly to backends, they need credentials
+
+Tunneling telemetry over the ACP stdio transport is problematic:
+
+- **Head-of-line blocking** - Telemetry traffic could delay agent messages
+- **Implementation burden** - ACP would need to define telemetry message formats
+- **Coupling** - Agents would need ACP-specific telemetry code instead of standard SDKs
+
+## What we propose to do about it
+
+> What are you proposing to improve the situation?
+
+Clients that want to receive agent telemetry run a local OTLP (OpenTelemetry Protocol) receiver and inject environment variables when launching agent subprocesses:
+
+```
+OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
+OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
+OTEL_SERVICE_NAME=agent-name
+```
+
+Agents using OpenTelemetry SDKs auto-configure from these variables. The client's receiver can:
+
+- Display telemetry in the editor UI (e.g., token counts, timing, errors)
+- Forward telemetry to the client's configured observability backend
+- Add client-side context before forwarding
+
+This follows the [OpenTelemetry collector deployment pattern](https://opentelemetry.io/docs/collector/deployment/agent/) where a local receiver proxies telemetry to backends.
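The injection step described above could be sketched roughly as follows. This is a minimal illustration in Python, not part of the RFD: `build_agent_env` and `launch_agent` are hypothetical helper names, the default endpoint simply reuses the value from the example variables, and "do not override" is read here as a per-variable check, which is only one possible interpretation.

```python
import os
import subprocess
from typing import Dict, List, Mapping


def build_agent_env(
    base_env: Mapping[str, str],
    service_name: str,
    endpoint: str = "http://localhost:4318",  # assumed local OTLP/HTTP receiver
) -> Dict[str, str]:
    """Return a copy of base_env with OTEL_* defaults filled in (hypothetical helper)."""
    env = dict(base_env)
    injected = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_SERVICE_NAME": service_name,
    }
    for key, value in injected.items():
        # setdefault keeps any value the user already configured.
        env.setdefault(key, value)
    return env


def launch_agent(command: List[str], service_name: str) -> subprocess.Popen:
    """Start the agent with stdio pipes for ACP and OTEL_* variables for telemetry."""
    return subprocess.Popen(
        command,
        env=build_agent_env(os.environ, service_name),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```

A client would call something like `launch_agent(["my-agent"], "my-agent")` (the command name is a placeholder). A stricter client might instead skip injection entirely whenever any `OTEL_*` variable is present, to avoid mixing user and client settings.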
+
+### Architecture
+
+```
+┌────────────────────────────────────────────────────────────┐
+│                       Client/Editor                        │
+│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
+│  │ ACP Handler  │    │OTLP Receiver │───▶│   Exporter   │  │
+│  └──────────────┘    └──────────────┘    └──────────────┘  │
+└────────┬─────────────────────▲──────────────────┬──────────┘
+         │ stdio               │ HTTP             │
+         ▼                     │                  ▼
+┌─────────────────────┐        │        ┌───────────────────┐
+│    Agent Process    │        │        │   Observability   │
+│  ┌──────────────┐   │        │        │      Backend      │
+│  │  ACP Agent   │   │        │        └───────────────────┘
+│  ├──────────────┤   │        │
+│  │  OTEL SDK    │────────────┘
+│  └──────────────┘   │
+└─────────────────────┘
+```
+
+### Discovery
+
+Environment variables must be set before launching the subprocess, but ACP capability exchange happens after connection. Options for discovery:
+
+1. **Optimistic injection** - Clients inject OTEL environment variables unconditionally. Agents without OpenTelemetry support simply ignore them. This is pragmatic since environment variables are low-cost and OTEL SDKs handle misconfiguration gracefully.
+
+2. **Registry metadata** - Agent registries (like the one proposed in PR #289) could include telemetry support in agent manifests, letting clients know ahead of time.
+
+3. **Manual configuration** - Users configure their client to enable telemetry collection for specific agents.
+
+## Shiny future
+
+> How will things play out once this feature exists?
+
+1. **Editor integration** - Editors can show agent activity: token usage, tool call timing, model switches, errors
+2. **Unified debugging** - When agents fail, structured telemetry is available for diagnosis
+3. **End-to-end traces** - Combined with `params._meta` trace propagation, traces flow from client through agent to any downstream services
+4. **No credential sharing** - Agents never see backend credentials; the client handles authentication
+5. 
**Standard SDKs** - Agent authors use normal OpenTelemetry SDKs that work in any context, not ACP-specific code
+
+## Implementation details
+
+> Tell me more about your implementation. What is your detailed implementation plan?
+
+### 1. Create `docs/protocol/observability.mdx`
+
+Add a new protocol documentation page covering observability practices for ACP. This page will describe:
+
+**For Clients/Editors:**
+- Running an OTLP receiver to collect agent telemetry
+- Injecting `OTEL_EXPORTER_*` environment variables when launching agent subprocesses
+- Respecting user-configured `OTEL_*` variables (do not override if already set)
+- Forwarding telemetry to configured backends with client credentials
+
+**For Agent Authors:**
+- Using OpenTelemetry SDKs with standard auto-configuration
+- Recommended spans, metrics, and log patterns for agent operations
+- How telemetry flows when `OTEL_*` variables are present vs absent
+
+### 2. Update `docs/protocol/extensibility.mdx`
+
+Add a section linking to the new observability doc, similar to how extensibility concepts relate to other protocol features. Add a brief mention that observability practices (telemetry export) are documented separately.
+
+### 3. Update `docs/docs.json`
+
+Add `protocol/observability` to the Protocol navigation group.
+
+## Frequently asked questions
+
+> What questions have arisen over the course of authoring this document or during subsequent discussions?
+
+### How does this relate to trace propagation in `params._meta`?
+
+They are complementary:
+
+- **Trace propagation** (`params._meta` with `traceparent`, etc.) passes trace context so spans can be correlated
+- **Telemetry export** (this RFD) defines where agents send the actual span/metric/log data
+
+Both are needed for end-to-end observability.
+
+### What if an agent doesn't use OpenTelemetry?
+
+Agents without OTEL SDKs simply ignore the environment variables. No harm is done. 
Over time, as more agents adopt OpenTelemetry, the ecosystem benefits.
+
+### What if the user already configured `OTEL_*` environment variables?
+
+If `OTEL_*` variables are already set in the environment, clients should not override them. User-configured telemetry settings take precedence, allowing users to direct agent telemetry to their own backends when desired.
+
+### Why not define ACP-specific telemetry messages?
+
+This would duplicate OTLP functionality, add implementation burden to ACP, and force agent authors to use non-standard APIs. Using OTLP means agents work with standard tooling and documentation.
+
+### What about agents that aren't launched as subprocesses?
+
+This RFD focuses on the stdio transport where clients launch agents. For other transports (HTTP, etc.), agents would need alternative configuration mechanisms, which could be addressed in future RFDs.
+
+### What alternative approaches did you consider, and why did you settle on this one?
+
+1. **Tunneling telemetry over ACP** - Rejected due to head-of-line blocking concerns and implementation complexity
+2. **Agents export directly to backends** - Rejected because it requires sharing credentials with agents
+3. 
**File-based telemetry** - Rejected because it doesn't support real-time display and adds complexity
+
+The environment variable approach:
+- Uses existing standards (OTLP, OpenTelemetry SDK conventions)
+- Keeps telemetry out-of-band from ACP messages
+- Lets clients control where telemetry goes without exposing credentials
+- Requires no changes to ACP message formats
+
+## Revision history
+
+- 2025-12-04: Initial draft

From dc62527ba044a740f8b9d4cb9d5cc8c341b908fb Mon Sep 17 00:00:00 2001
From: Ben Brandt
Date: Thu, 11 Dec 2025 14:29:44 +0100
Subject: [PATCH 2/4] Update website

---
 docs/docs.json   | 3 ++-
 docs/updates.mdx | 7 +++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/docs.json b/docs/docs.json
index cdd0c0b..287a1c6 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -110,7 +110,8 @@
  "rfds/request-cancellation",
  "rfds/session-resume",
  "rfds/meta-propagation",
- "rfds/session-info-update"
+ "rfds/session-info-update",
+ "rfds/agent-telemetry-export"
  ]
  },
  { "group": "Preview", "pages": [] },
diff --git a/docs/updates.mdx b/docs/updates.mdx
index 65e787c..d00d643 100644
--- a/docs/updates.mdx
+++ b/docs/updates.mdx
@@ -4,6 +4,13 @@ description: Updates and announcements about the Agent Client Protocol
 rss: true
 ---
+
+## Agent Telemetry Export RFD moves to Draft stage
+
+The RFD for providing more guidance on how agents should export telemetry has been moved to Draft stage. Please review the [RFD](./rfds/agent-telemetry-export) for more information on the current proposal and provide feedback as work on the implementation begins.
+
+
+
 ## session_info_update notification RFD moves to Draft stage

From 7e7294857ce991ea9e214772feddb506016b32f7 Mon Sep 17 00:00:00 2001
From: Ben Brandt
Date: Thu, 11 Dec 2025 14:30:44 +0100
Subject: [PATCH 3/4] format

---
 docs/rfds/agent-telemetry-export.mdx | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/rfds/agent-telemetry-export.mdx b/docs/rfds/agent-telemetry-export.mdx
index 5367cf5..2073077 100644
--- a/docs/rfds/agent-telemetry-export.mdx
+++ b/docs/rfds/agent-telemetry-export.mdx
@@ -99,12 +99,14 @@ Environment variables must be set before launching the subprocess, but ACP capab
 Add a new protocol documentation page covering observability practices for ACP. This page will describe:
 
 **For Clients/Editors:**
+
 - Running an OTLP receiver to collect agent telemetry
 - Injecting `OTEL_EXPORTER_*` environment variables when launching agent subprocesses
 - Respecting user-configured `OTEL_*` variables (do not override if already set)
 - Forwarding telemetry to configured backends with client credentials
 
 **For Agent Authors:**
+
 - Using OpenTelemetry SDKs with standard auto-configuration
 - Recommended spans, metrics, and log patterns for agent operations
 - How telemetry flows when `OTEL_*` variables are present vs absent
@@ -153,6 +155,7 @@ This RFD focuses on the stdio transport where clients launch agents. For other t
 3. 
**File-based telemetry** - Rejected because it doesn't support real-time display and adds complexity
 
 The environment variable approach:
+
 - Uses existing standards (OTLP, OpenTelemetry SDK conventions)
 - Keeps telemetry out-of-band from ACP messages
 - Lets clients control where telemetry goes without exposing credentials

From a21bc6c936adc637e82756bbd1d9df213895aa4b Mon Sep 17 00:00:00 2001
From: Ben Brandt
Date: Thu, 11 Dec 2025 14:32:27 +0100
Subject: [PATCH 4/4] add champion

---
 docs/rfds/agent-telemetry-export.mdx | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/rfds/agent-telemetry-export.mdx b/docs/rfds/agent-telemetry-export.mdx
index 2073077..b6a37db 100644
--- a/docs/rfds/agent-telemetry-export.mdx
+++ b/docs/rfds/agent-telemetry-export.mdx
@@ -3,6 +3,7 @@ title: "Agent Telemetry Export"
 ---
 
 - Author(s): [@codefromthecrypt](https://github.com/codefromthecrypt)
+- Champion: [@benbrandt](https://github.com/benbrandt)
 
 ## Elevator pitch