Commit c386fde

ROB-2405: document stream events (#1073)
1 parent c34af2d commit c386fde
File tree

1 file changed: +386 -0 lines changed
docs/reference/http-api.md

@@ -356,3 +356,389 @@ curl http://<HOLMES-URL>/api/model
  "model_name": ["gpt-4.1", "azure/gpt-4.1", "robusta"]
}
```

---

## Server-Sent Events (SSE) Reference

All streaming endpoints (`/api/stream/investigate`, `/api/stream/chat`, `/api/stream/issue_chat`, etc.) emit Server-Sent Events (SSE) to provide real-time updates during the investigation or chat process.
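As a rough illustration of how a client might consume such a stream, the sketch below parses raw SSE text into `(event_name, payload)` pairs. It assumes the standard `event:`/`data:` field framing with blank-line separators; the `parse_sse` helper and the sample fragment are hypothetical, not part of the API.

```python
import json

def parse_sse(raw: str):
    """Parse raw SSE text into (event_name, payload) tuples.

    Assumes standard `event:` / `data:` framing, with events
    separated by blank lines (an illustrative sketch only).
    """
    events = []
    name, data_lines = None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and name is not None:
            events.append((name, json.loads("".join(data_lines))))
            name, data_lines = None, []
    return events

# Hypothetical stream fragment for illustration:
sample = (
    'event: start_tool_calling\n'
    'data: {"tool_name": "kubectl_describe", "id": "call_abc123"}\n'
    '\n'
)
print(parse_sse(sample))
```

In a real client the raw text would come from the HTTP response body of one of the streaming endpoints above.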
### Metadata Object Reference

Many events include a `metadata` object that provides detailed information about token usage, context window limits, and message truncation. This section describes the complete structure of the metadata object.
#### Token Usage Information

**Structure:**
```json
{
  "metadata": {
    "usage": {
      "prompt_tokens": 2500,
      "completion_tokens": 150,
      "total_tokens": 2650
    },
    "tokens": {
      "total_tokens": 2650,
      "tools_tokens": 100,
      "system_tokens": 500,
      "user_tokens": 300,
      "tools_to_call_tokens": 50,
      "assistant_tokens": 1600,
      "other_tokens": 100
    },
    "max_tokens": 128000,
    "max_output_tokens": 16384
  }
}
```
**Fields:**

- `usage` (object): Token usage from the LLM provider (raw response from the model)
  - `prompt_tokens` (integer): Tokens in the prompt (input)
  - `completion_tokens` (integer): Tokens in the completion (output)
  - `total_tokens` (integer): Total tokens used (prompt + completion)

- `tokens` (object): HolmesGPT's detailed token count breakdown by message role
  - `total_tokens` (integer): Total tokens in the conversation
  - `tools_tokens` (integer): Tokens used by tool definitions
  - `system_tokens` (integer): Tokens in system messages
  - `user_tokens` (integer): Tokens in user messages
  - `tools_to_call_tokens` (integer): Tokens used for tool call requests from the assistant
  - `assistant_tokens` (integer): Tokens in assistant messages (excluding tool calls)
  - `other_tokens` (integer): Tokens from other message types

- `max_tokens` (integer): Maximum context window size for the model
- `max_output_tokens` (integer): Maximum tokens reserved for model output
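As one possible use of these fields, a client could estimate how much input room remains before truncation or compaction kicks in. The `remaining_input_budget` helper below is a hypothetical sketch built only on the fields documented above, not an API function.

```python
def remaining_input_budget(metadata: dict) -> int:
    """Estimate how many input tokens are still available.

    Hypothetical helper: subtracts the tokens already used in the
    conversation and the space reserved for output from the model's
    context window, using the documented metadata fields.
    """
    used = metadata["tokens"]["total_tokens"]
    reserved = metadata["max_output_tokens"]
    return metadata["max_tokens"] - reserved - used

# Sample metadata matching the structure shown above:
meta = {
    "tokens": {"total_tokens": 2650},
    "max_tokens": 128000,
    "max_output_tokens": 16384,
}
print(remaining_input_budget(meta))
```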
#### Truncation Information

When messages are truncated to fit within context limits, the metadata includes truncation details:
**Structure:**
```json
{
  "metadata": {
    "truncations": [
      {
        "tool_call_id": "call_abc123",
        "start_index": 0,
        "end_index": 5000,
        "tool_name": "kubectl_logs",
        "original_token_count": 15000
      }
    ]
  }
}
```
**Fields:**

- `truncations` (array): List of truncated tool results
  - `tool_call_id` (string): ID of the truncated tool call
  - `start_index` (integer): Character index where truncation starts (always 0)
  - `end_index` (integer): Character index where content was cut off
  - `tool_name` (string): Name of the tool whose output was truncated
  - `original_token_count` (integer): Original token count before truncation

Truncated content will include a `[TRUNCATED]` marker at the end.
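A client that wants to flag incomplete evidence might scan this array for affected tools. The `truncated_tools` helper below is an illustrative sketch over the documented `truncations` structure, not part of the API.

```python
def truncated_tools(metadata: dict) -> list:
    """Return the names of tools whose output was cut off.

    Illustrative helper built on the `truncations` array; returns
    an empty list when no truncation occurred.
    """
    return [t["tool_name"] for t in metadata.get("truncations", [])]

# Sample metadata matching the structure shown above:
meta = {
    "truncations": [
        {"tool_call_id": "call_abc123", "start_index": 0,
         "end_index": 5000, "tool_name": "kubectl_logs",
         "original_token_count": 15000}
    ]
}
print(truncated_tools(meta))
```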
---

### Event Types

#### `start_tool_calling`

Emitted when the AI begins executing a tool. This event is sent before the tool runs.
**Payload:**
```json
{
  "tool_name": "kubectl_describe",
  "id": "call_abc123"
}
```
**Fields:**

- `tool_name` (string): The name of the tool being called
- `id` (string): Unique identifier for this tool call

---
#### `tool_calling_result`

Emitted when a tool execution completes. Contains the tool's output and metadata.
**Payload:**
```json
{
  "tool_call_id": "call_abc123",
  "role": "tool",
  "description": "kubectl describe pod my-pod -n default",
  "name": "kubectl_describe",
  "result": {
    "status": "success",
    "data": "...",
    "error": null,
    "params": {"pod": "my-pod", "namespace": "default"}
  }
}
```
**Fields:**

- `tool_call_id` (string): Unique identifier matching the `start_tool_calling` event
- `role` (string): Always `"tool"`
- `description` (string): Human-readable description of what the tool did
- `name` (string): The name of the tool that was called
- `result` (object): Tool execution result
  - `status` (string): One of `"success"`, `"error"`, `"approval_required"`
  - `data` (string|object): The tool's output data (stringified if complex)
  - `error` (string|null): Error message if the tool failed
  - `params` (object): Parameters that were passed to the tool
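A client handler for this event will typically branch on `result.status`. The sketch below shows one way to do that; the `handle_tool_result` function and its return strings are hypothetical, only the status values come from the reference above.

```python
def handle_tool_result(event: dict) -> str:
    """Sketch of client-side handling of a `tool_calling_result` event.

    Dispatches on the documented `result.status` values; the returned
    strings are illustrative only.
    """
    result = event["result"]
    status = result["status"]
    if status == "success":
        return f"{event['name']} succeeded"
    if status == "error":
        return f"{event['name']} failed: {result['error']}"
    if status == "approval_required":
        return f"{event['name']} is waiting for user approval"
    return f"unknown status: {status}"

# Sample payload matching the structure shown above:
event = {
    "tool_call_id": "call_abc123",
    "role": "tool",
    "description": "kubectl describe pod my-pod -n default",
    "name": "kubectl_describe",
    "result": {"status": "success", "data": "...", "error": None,
               "params": {"pod": "my-pod", "namespace": "default"}},
}
print(handle_tool_result(event))
```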
---
#### `ai_message`

Emitted when the AI has a text message or reasoning to share (typically before tool calls).
**Payload:**
```json
{
  "content": "I need to check the pod logs to understand the issue.",
  "reasoning": "The pod is crashing, so examining logs will reveal the root cause.",
  "metadata": {...}
}
```
**Fields:**

- `content` (string|null): The AI's message content
- `reasoning` (string|null): The AI's internal reasoning (only present for models that support reasoning, such as o1)
- `metadata` (object): See [Metadata Object Reference](#metadata-object-reference) for the complete structure

---
#### `ai_answer_end`

Emitted when the investigation or chat is complete. This is the final event in the stream.
**For RCA/Investigation (`/api/stream/investigate`):**
```json
{
  "sections": {
    "Alert Explanation": "...",
    "Key Findings": "...",
    "Conclusions and Possible Root Causes": "...",
    "Next Steps": "...",
    "App or Infra?": "...",
    "External links": "..."
  },
  "analysis": "Full analysis text...",
  "instructions": ["runbook1", "runbook2"],
  "metadata": {...}
}
```
**For Chat (`/api/stream/chat`, `/api/stream/issue_chat`):**
```json
{
  "analysis": "The issue can be resolved by...",
  "conversation_history": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ],
  "follow_up_actions": [
    {
      "id": "action1",
      "action_label": "Run diagnostics",
      "pre_action_notification_text": "Running diagnostics...",
      "prompt": "Run diagnostic checks"
    }
  ],
  "metadata": {...}
}
```
**Common Fields:**

- `metadata` (object): See [Metadata Object Reference](#metadata-object-reference) for the complete structure, including token usage, truncations, and compaction info

**RCA-Specific Fields:**

- `sections` (object): Structured investigation output with predefined sections (customizable via request)
- `analysis` (string): Full analysis text (markdown format)
- `instructions` (array): List of runbooks that were used during investigation

**Chat-Specific Fields:**

- `analysis` (string): The AI's response (markdown format)
- `conversation_history` (array): Complete conversation history including the latest response
- `follow_up_actions` (array|null): Optional follow-up actions the user can take
  - `id` (string): Unique identifier for the action
  - `action_label` (string): Display label for the action
  - `pre_action_notification_text` (string): Text to show before executing the action
  - `prompt` (string): The prompt to send when the action is triggered

---
#### `approval_required`

Emitted when tool execution requires user approval (e.g., potentially destructive operations). The stream pauses until the user provides approval decisions via a subsequent request.
**Payload:**
```json
{
  "content": null,
  "conversation_history": [...],
  "follow_up_actions": [...],
  "requires_approval": true,
  "pending_approvals": [
    {
      "tool_call_id": "call_xyz789",
      "tool_name": "kubectl_delete",
      "description": "kubectl delete pod failed-pod -n default",
      "params": {"pod": "failed-pod", "namespace": "default"}
    }
  ]
}
```
**Fields:**

- `content` (null): No AI content when approval is required
- `conversation_history` (array): Current conversation state
- `follow_up_actions` (array|null): Optional follow-up actions
- `requires_approval` (boolean): Always `true` for this event
- `pending_approvals` (array): List of tools awaiting approval
  - `tool_call_id` (string): Unique identifier for the tool call
  - `tool_name` (string): Name of the tool requiring approval
  - `description` (string): Human-readable description
  - `params` (object): Parameters for the tool call
To continue after approval, send a new request with `tool_decisions`:
```json
{
  "conversation_history": [...],
  "tool_decisions": [
    {"tool_call_id": "call_xyz789", "approved": true}
  ]
}
```
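A client can derive the `tool_decisions` array directly from the `pending_approvals` it received. The `build_decisions` helper below is a hypothetical sketch that approves a chosen subset of tool calls and rejects the rest; only the field names come from the reference above.

```python
def build_decisions(pending_approvals, approved_ids):
    """Build the `tool_decisions` array for the follow-up request.

    Hypothetical helper: approves every pending tool call whose id is
    in `approved_ids` and rejects the others.
    """
    return [
        {"tool_call_id": p["tool_call_id"],
         "approved": p["tool_call_id"] in approved_ids}
        for p in pending_approvals
    ]

# Sample `pending_approvals` matching the payload shown above:
pending = [{"tool_call_id": "call_xyz789", "tool_name": "kubectl_delete",
            "description": "kubectl delete pod failed-pod -n default",
            "params": {"pod": "failed-pod", "namespace": "default"}}]
print(build_decisions(pending, {"call_xyz789"}))
```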

---

#### `token_count`

Emitted periodically to provide token usage updates during the investigation. This event is sent after each LLM iteration to help track resource consumption in real time.
**Payload:**
```json
{
  "metadata": {...}
}
```
**Fields:**

- `metadata` (object): See [Metadata Object Reference](#metadata-object-reference) for the complete token usage structure. This event carries the same metadata structure as other events, letting you monitor token consumption throughout the investigation.

---
#### `conversation_history_compacted`

Emitted when the conversation history has been compacted to fit within the context window. This happens automatically when the conversation grows too large.
**Payload:**
```json
{
  "content": "Conversation history was compacted to fit within context limits.",
  "messages": [...],
  "metadata": {
    "initial_tokens": 150000,
    "compacted_tokens": 80000
  }
}
```
**Fields:**

- `content` (string): Human-readable description of the compaction
- `messages` (array): The compacted conversation history
- `metadata` (object): Token information about the compaction
  - `initial_tokens` (integer): Token count before compaction
  - `compacted_tokens` (integer): Token count after compaction
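For monitoring, a client might report how aggressively history was compacted. The `compaction_ratio` helper below is a hypothetical convenience over the two fields above.

```python
def compaction_ratio(metadata: dict) -> float:
    """Fraction of the original tokens kept after compaction.

    Hypothetical helper over `initial_tokens` / `compacted_tokens`.
    """
    return metadata["compacted_tokens"] / metadata["initial_tokens"]

# With the sample payload above, roughly 53% of the tokens are kept:
print(compaction_ratio({"initial_tokens": 150000, "compacted_tokens": 80000}))
```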
---
#### `error`

Emitted when an error occurs during processing.
**Payload:**
```json
{
  "description": "Rate limit exceeded",
  "error_code": 5204,
  "msg": "Rate limit exceeded",
  "success": false
}
```
**Fields:**

- `description` (string): Detailed error description
- `error_code` (integer): Numeric error code
- `msg` (string): Error message
- `success` (boolean): Always `false`

**Common Error Codes:**

- `5204`: Rate limit exceeded
- `1`: Generic error

---
## Event Flow Examples

### Typical RCA Investigation Flow

```
1. start_tool_calling (tool 1)
2. start_tool_calling (tool 2)
3. tool_calling_result (tool 1)
4. tool_calling_result (tool 2)
5. token_count
6. start_tool_calling (tool 3)
7. tool_calling_result (tool 3)
8. token_count
9. ai_answer_end
```
### Chat with Approval Flow

```
1. ai_message
2. start_tool_calling (safe tool)
3. start_tool_calling (requires approval)
4. tool_calling_result (safe tool)
5. tool_calling_result (approval required, with status: "approval_required")
6. approval_required
[Client sends approval decisions]
7. tool_calling_result (approved tool executed)
[investigation resumes]
```
### Chat with History Compaction

```
1. conversation_history_compacted
2. start_tool_calling (tool 1)
3. tool_calling_result (tool 1)
4. token_count
5. ai_answer_end
```
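Putting the flows together, a minimal client might dispatch on event names in a loop like the one below. The sketch assumes events have already been parsed into `(name, payload)` tuples; `run_stream` and the sample stream are hypothetical, while the event names and payload fields match the reference above.

```python
def run_stream(events):
    """Minimal client-side dispatch loop over parsed SSE events.

    Tracks in-flight tool calls, surfaces `error` events as
    exceptions, and returns the final answer from `ai_answer_end`.
    """
    in_flight, final = {}, None
    for name, payload in events:
        if name == "start_tool_calling":
            in_flight[payload["id"]] = payload["tool_name"]
        elif name == "tool_calling_result":
            in_flight.pop(payload["tool_call_id"], None)
        elif name == "ai_answer_end":
            final = payload.get("analysis")
        elif name == "error":
            raise RuntimeError(payload["msg"])
    return final

# Hypothetical pre-parsed stream for illustration:
stream = [
    ("start_tool_calling", {"tool_name": "kubectl_describe", "id": "call_1"}),
    ("tool_calling_result", {"tool_call_id": "call_1", "name": "kubectl_describe",
                             "result": {"status": "success"}}),
    ("ai_answer_end", {"analysis": "The pod is OOMKilled."}),
]
print(run_stream(stream))
```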
