Skip to content

Conversation

@TejasGhatte
Copy link
Collaborator

@TejasGhatte TejasGhatte commented Oct 13, 2025

Summary

Implement Cohere ResponsesStream support and refactor token usage handling to improve reliability and reduce code duplication.

Changes

  • Added ResponsesStream implementation for Cohere provider
  • Refactored token usage handling to use non-pointer float64 values
  • Fixed content and tool call handling in streaming responses
  • Simplified delta message content handling in stream events
  • Reorganized code structure for better readability and maintainability

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Test the Cohere provider with ResponsesStream functionality:

# Core/Transports
go version
go test ./...

# Test ResponsesStream with Cohere
curl -X POST "http://localhost:8000/v1/providers/cohere/responses-stream" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "command",
    "messages": [
      {"role": "user", "content": "Tell me about quantum computing"}
    ]
  }'

Screenshots/Recordings

N/A

Breaking changes

  • Yes
  • No

Related issues

Implements ResponsesStream support for Cohere provider

Security considerations

No security implications.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

Copy link
Collaborator Author

TejasGhatte commented Oct 13, 2025

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 13, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added Cohere chat, responses, streaming, and embeddings support aligned with latest API.
    • Enabled real-time streaming with deltas, tool calls, citations, and usage metrics.
    • Introduced Gemini responses streaming.
  • Improvements

    • Standardized response fields (text, deltas, tool calls, usage, latency) across providers.
    • Enhanced token/usage reporting, including cached token details.
    • Added optional text field to citation annotations for richer context.
  • Bug Fixes

    • More robust error, timeout, and cancellation handling for streaming requests.

Walkthrough

Adds Cohere v2 public methods (ChatCompletion, Responses, ResponsesStream, Embedding), implements streaming translation to Bifrost events, normalizes usage/latency/raw payload handling, and enhances citation/usage types. Introduces a new text field on citation annotations. Gemini gains a ResponsesStream that delegates to ChatCompletionStream.

Changes

Cohort / File(s) Summary
Cohere provider API and flows
core/providers/cohere.go
Adds public ChatCompletion, Responses, ResponsesStream, and Embedding methods aligned to Cohere v2; implements request conversions, SSE-style streaming handling, mapping to Bifrost responses/streams, standardized latency/raw capture, and unified error/cancellation handling.
Cohere streaming event translation
core/schemas/providers/cohere/responses.go
Updates responses usage mapping (input/output/cached tokens); adds CohereStreamEvent.ToBifrostResponsesStream(...) translating message/content/tool/citation/debug events into Bifrost stream chunks, attaching usage on completion.
Cohere type system updates
core/schemas/providers/cohere/types.go
Introduces typed stream enums and structs (CohereStreamEventType, CohereContentBlockType, CohereCitationType, CohereSourceType), strengthens content/citation/source typing, adjusts ToolCalls/Citations shapes, and adds CachedTokens to usage.
Cohere embedding usage mapping
core/schemas/providers/cohere/embedding.go
Simplifies ToBifrostResponse usage assignment for embeddings and computes total tokens directly on the response.
Responses schema augmentation
core/schemas/responses.go
Adds optional Text *string field to ResponsesOutputMessageContentTextAnnotation for citation text payloads.
Gemini responses streaming
core/providers/gemini.go
Implements ResponsesStream by delegating to ChatCompletionStream (converts request and combines post-hook runner); removes previous unsupported stub.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant CohereProvider
  participant HTTP as Cohere /v2/chat (stream)
  participant Mapper as Stream Mapper

  Client->>CohereProvider: ResponsesStream(request, postHookRunner)
  CohereProvider->>CohereProvider: Build streaming request (stream=true)
  CohereProvider->>HTTP: POST /v2/chat (SSE)
  Note over HTTP,CohereProvider: Events: MessageStart, ContentStart/Delta/End,\nToolPlanDelta, ToolCall*, Citation*, MessageEnd, Debug
  loop For each SSE event
    HTTP-->>CohereProvider: CohereStreamEvent
    CohereProvider->>Mapper: ToBifrostResponsesStream(seq)
    Mapper-->>CohereProvider: BifrostStream chunk (delta/tool/usage)
    CohereProvider-->>Client: Emit chunk
  end
  CohereProvider->>Client: Close channel (on completion/error/timeout)
Loading
sequenceDiagram
  autonumber
  participant Client
  participant GeminiProvider
  participant CCStream as ChatCompletionStream

  Client->>GeminiProvider: ResponsesStream(request, postHookRunner)
  GeminiProvider->>GeminiProvider: Convert Responses -> Chat request
  GeminiProvider->>CCStream: ChatCompletionStream(converted, combined post-hook)
  CCStream-->>Client: Streamed Bifrost chunks
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • feat: added responses stream in cohere #608 — Alters the same Cohere provider files and streaming/event translation additions (ChatCompletion, Responses, ResponsesStream, Embedding and related schemas); likely a direct match.

Suggested reviewers

  • akshaydeo
  • danpiths

Poem

I twitch my ears at streaming breeze,
New cohorts hum like busy bees.
Citations hop with little text,
Tools call, then rest—what’s coming next?
Cohere and Gemini share the run—
I nibble bytes; the job is done. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly describes the primary change by indicating a new feature to add a responses stream to the Cohere provider in a concise phrase that aligns with the main objective of the PR.
Description Check ✅ Passed The description adheres closely to the repository template by providing a summary, detailed changes, type of change, affected areas, testing instructions, and other required sections, making it clear and actionable for reviewers.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b509450 and 14a101a.

📒 Files selected for processing (6)
  • core/providers/cohere.go (5 hunks)
  • core/providers/gemini.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@TejasGhatte TejasGhatte marked this pull request as ready for review October 13, 2025 05:03
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c3a9b2 and a1e473c.

📒 Files selected for processing (5)
  • core/providers/cohere.go (6 hunks)
  • core/schemas/providers/cohere/chat.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
core/schemas/providers/cohere/responses.go (3)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/schemas/responses.go (24)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1687-1716)
  • ResponsesStreamResponseTypeOutputItemAdded (1629-1629)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1632-1632)
  • ResponsesStreamResponseTypeOutputTextDelta (1636-1636)
  • ResponsesStreamResponseTypeContentPartDone (1633-1633)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1654-1654)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (692-715)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1642-1642)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1643-1643)
  • ResponsesOutputMessageContentTextAnnotation (669-679)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1677-1677)
  • ResponsesStreamResponseTypeCompleted (1625-1625)
  • ResponsesStreamResponseStruct (1718-1721)
  • ResponsesResponseUsage (236-239)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesResponseInputTokens (241-243)
core/schemas/providers/cohere/chat.go (1)
plugins/mocker/main.go (1)
  • Usage (138-142)
core/providers/cohere.go (7)
core/schemas/bifrost.go (16)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • ErrorField (426-433)
  • RequestCancelled (378-378)
  • BifrostResponseExtraFields (349-358)
  • BifrostErrorExtraFields (435-439)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (2)
  • Provider (194-219)
  • PostHookRunner (191-191)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/schemas/providers/cohere/types.go (3)
  • CohereStreamEvent (379-384)
  • CohereError (411-415)
  • CohereEmbeddingResponse (232-239)
core/schemas/responses.go (2)
  • ResponsesStreamResponse (1687-1716)
  • ResponsesStreamResponseStruct (1718-1721)
core/schemas/providers/cohere/embedding.go (1)
  • ToCohereEmbeddingRequest (6-62)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 10-12-feat_bedrock_responses_streaming_added to graphite-base/608 October 13, 2025 05:23
@TejasGhatte TejasGhatte force-pushed the 10-12-feat_added_responses_stream_in_cohere branch from a1e473c to 4d09345 Compare October 13, 2025 05:51
@TejasGhatte TejasGhatte changed the base branch from graphite-base/608 to 10-12-feat_bedrock_responses_streaming_added October 13, 2025 05:51
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 10-12-feat_bedrock_responses_streaming_added to graphite-base/608 October 13, 2025 06:07
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
core/schemas/providers/cohere/responses.go (1)

619-627: Fix citation-end stream mapping.

Re-emitting response.output_text_annotation.added here without an Annotation payload duplicates the “added” event and leaves clients with a phantom citation, because there’s no real completion signal tied to the original content_index. Please reuse the original annotation metadata (or emit a proper response.content_part.done for the same content_index) so downstream consumers can close out the citation correctly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1e473c and 4d09345.

📒 Files selected for processing (6)
  • core/providers/cohere.go (6 hunks)
  • core/schemas/providers/cohere/chat.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
core/schemas/providers/cohere/chat.go (1)
plugins/mocker/main.go (1)
  • Usage (138-142)
core/schemas/providers/cohere/responses.go (3)
core/schemas/responses.go (25)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1688-1717)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1655-1655)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1644-1644)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1678-1678)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1719-1722)
  • ResponsesResponseUsage (236-239)
  • ResponsesResponseInputTokens (241-243)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
core/providers/cohere.go (7)
core/schemas/bifrost.go (15)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • ErrorField (426-433)
  • RequestCancelled (378-378)
  • BifrostResponseExtraFields (349-358)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (2)
  • Provider (194-219)
  • PostHookRunner (191-191)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/utils.go (1)
  • Ptr (12-14)
core/schemas/providers/cohere/types.go (2)
  • CohereStreamEvent (379-384)
  • CohereError (411-415)
core/schemas/providers/cohere/embedding.go (1)
  • ToCohereEmbeddingRequest (6-62)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (7)
core/schemas/providers/cohere/types.go (7)

12-27: LGTM! Improved readability.

The field alignment and comment formatting improve code readability without any semantic changes.


324-337: LGTM! Strong typing improves API safety.

The new CohereCitationType and CohereSourceType enums replace string types with strongly-typed constants, reducing errors and improving code clarity.


342-356: LGTM! Enhanced citation and source structure.

The expanded CohereCitation and CohereSource types with typed enums and optional fields improve type safety and flexibility. The changes are well-structured:

  • ContentIndex and Type add context to citations
  • CohereSourceType enum improves type safety
  • Pointer fields (ID, ToolOutput, Document) correctly express optionality

298-301: LGTM! CachedTokens field added correctly.

The new optional CachedTokens field is properly typed as *float64, consistent with other optional numeric fields in the struct. This is an additive, non-breaking change.


404-406: LGTM! Improved type safety for stream content.

The change from *string to CohereContentBlockType enum improves type safety. The non-pointer Type field is appropriate since it's required for content events.


319-321: LGTM! CohereLogProb structure improved.

The changes enhance the structure:

  • TokenIDs as []int is appropriate
  • Text as *string correctly expresses optionality
  • LogProbs as []float64 is type-safe

282-282: LogProbs slice change is safe Verified that no code nil-checks the Cohere LogProbs slice; all references assign or use its length, so removing the pointer causes no break.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/cohere.go (1)

398-417: Add nil check before accessing ToolCalls.

Line 399 checks if event.Delta.Message.ToolCalls != nil, but line 401 then assigns cohereToolCall := event.Delta.Message.ToolCalls which is dereferencing it. The subsequent accesses to cohereToolCall.ID and cohereToolCall.Function need guards to prevent panics if those nested fields are nil.

Consider adding additional nil checks:

 			case cohere.StreamEventToolCallStart, cohere.StreamEventToolCallDelta:
 				if event.Delta != nil && event.Delta.Message != nil && event.Delta.Message.ToolCalls != nil {
 					// Handle single tool call object (tool-call-start/delta events)
 					cohereToolCall := event.Delta.Message.ToolCalls
 					toolCall := schemas.ChatAssistantMessageToolCall{}
 
 					if cohereToolCall.ID != nil {
 						toolCall.ID = cohereToolCall.ID
 					}
 
 					if cohereToolCall.Function != nil {
 						if cohereToolCall.Function.Name != nil {
 							toolCall.Function.Name = cohereToolCall.Function.Name
 						}
 						toolCall.Function.Arguments = cohereToolCall.Function.Arguments
 					}
 
 					response.Choices[0].BifrostStreamResponseChoice.Delta.ToolCalls = []schemas.ChatAssistantMessageToolCall{toolCall}
 				}

The current code already has good nil checks for nested fields. However, ensure that if cohereToolCall.Function is nil, accessing Function.Arguments at line 412 won't panic. Consider wrapping that in the existing nil check block.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1e473c and 4d09345.

📒 Files selected for processing (6)
  • core/providers/cohere.go (6 hunks)
  • core/schemas/providers/cohere/chat.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/schemas/providers/cohere/chat.go
🧰 Additional context used
🧬 Code graph analysis (3)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
core/schemas/providers/cohere/responses.go (3)
core/schemas/responses.go (25)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1688-1717)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1655-1655)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1644-1644)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1678-1678)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1719-1722)
  • ResponsesResponseUsage (236-239)
  • ResponsesResponseInputTokens (241-243)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/providers/cohere.go (7)
core/schemas/account.go (1)
  • Key (8-17)
core/schemas/bifrost.go (14)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • RequestCancelled (378-378)
  • BifrostResponseExtraFields (349-358)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (2)
  • Provider (194-219)
  • PostHookRunner (191-191)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/utils.go (1)
  • Ptr (12-14)
core/schemas/providers/cohere/types.go (3)
  • CohereStreamEvent (379-384)
  • CohereError (411-415)
  • CohereEmbeddingResponse (232-239)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (9)
core/schemas/responses.go (1)

673-673: LGTM!

The addition of the Text field to ResponsesOutputMessageContentTextAnnotation is properly structured with appropriate JSON tags and aligns with the enriched citation/annotation model introduced in this PR.

core/schemas/providers/cohere/responses.go (2)

619-630: Verify citation-end event handling semantics.

According to a past review comment, StreamEventCitationEnd is re-emitting response.output_text_annotation.added without annotation payload. The comment suggested returning a proper "done" signal or mirroring the original citation data. Please confirm whether the current implementation (lines 619-630) addresses that concern or if further changes are needed.

Based on past review comments at lines 619-628.


477-491: Add nil checks before accessing Content.Text.

Line 480 accesses chunk.Delta.Message.Content.Text without checking if Content is non-nil, risking a panic.

Apply this diff:

 	case StreamEventContentDelta:
-		if chunk.Index != nil && chunk.Delta != nil {
+		if chunk.Index != nil && chunk.Delta != nil && chunk.Delta.Message != nil && chunk.Delta.Message.Content != nil {
 			// Handle text content delta
-			if chunk.Delta.Message != nil && chunk.Delta.Message.Content != nil && chunk.Delta.Message.Content.Text != nil && *chunk.Delta.Message.Content.Text != "" {
+			if chunk.Delta.Message.Content.Text != nil && *chunk.Delta.Message.Content.Text != "" {
 				return &schemas.BifrostResponse{

Likely an incorrect or invalid review comment.

core/providers/cohere.go (3)

482-515: LGTM! Well-structured Responses method.

The Responses method properly:

  • Validates operation permissions
  • Converts the request using ToCohereResponsesRequest
  • Handles the request with reused handleCohereChatCompletionRequest logic
  • Maps the response using ToResponsesBifrostResponse
  • Sets appropriate metadata (provider, model, request type, latency)

The implementation is clean and follows the established patterns in the codebase.


517-701: LGTM! Comprehensive ResponsesStream implementation.

The ResponsesStream method provides a robust streaming implementation with:

  • Proper permission checks and error handling
  • SSE-style event parsing with data: prefix handling
  • Event-driven chunk processing via ToBifrostResponsesStream
  • Setup calls for initial/in-progress events (lines 647-649)
  • Proper end-of-stream handling with latency tracking (lines 668-679)
  • Post-hook integration throughout the stream lifecycle

The implementation demonstrates good separation of concerns and error propagation.


703-785: LGTM! Clean Embedding method implementation.

The Embedding method is well-structured with:

  • Proper permission validation
  • Request conversion via ToCohereEmbeddingRequest
  • Standard HTTP request/response handling with latency tracking
  • Error handling for various response codes
  • Appropriate metadata population (provider, model, request type, latency)
  • Conditional raw response inclusion based on config

The implementation follows established patterns and handles edge cases appropriately.

core/schemas/providers/cohere/types.go (3)

313-315: LGTM! Token fields correctly use float64 to match Cohere API.

The use of float64 for InputTokens and OutputTokens in CohereTokenUsage accurately reflects the Cohere API specification. While token counts are typically integers, Cohere's API returns these as floating-point numbers, so this representation is correct.

The change from pointer to non-pointer fields simplifies the code and is safe as these fields are always present in Cohere's response.


324-356: LGTM! Well-structured type system for citations and sources.

The addition of:

  • CohereCitationType enum with TEXT_CONTENT, THINKING_CONTENT, PLAN
  • CohereSourceType enum with tool, document
  • Expanded CohereCitation with ContentIndex and Type fields
  • Expanded CohereSource with Type, ToolOutput, and Document fields

These changes create a strongly-typed, clear representation of Cohere's citation and source model. The use of pointer types for optional fields is appropriate, and JSON tags are correctly applied.


363-406: LGTM! Comprehensive streaming event type system.

The addition of:

  • CohereStreamEventType enum covering all stream lifecycle events (message-start through message-end)
  • Updated CohereStreamEvent structure with proper fields
  • CohereStreamContent with typed content representation
  • Updated CohereStreamMessage with Citations *CohereCitation

These changes establish a well-organized type system for handling Cohere's streaming events. The shift from generic content representations to typed structures (e.g., CohereStreamContent with Type and Text fields) improves type safety and code clarity.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
core/schemas/providers/cohere/responses.go (1)

622-628: Still duplicating annotation.added on citation end.

This mirrors the earlier review: emitting another response.output_text_annotation.added with no annotation payload drops the “end” signal and confuses consumers. Please send a proper completion event (for example a ContentPartDone tied to the original content/annotation index, or reuse the prior annotation metadata) instead of replaying an empty “added”.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1e473c and 4d09345.

📒 Files selected for processing (6)
  • core/providers/cohere.go (6 hunks)
  • core/schemas/providers/cohere/chat.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/providers/cohere.go
🧰 Additional context used
🧬 Code graph analysis (3)
core/schemas/providers/cohere/responses.go (3)
core/schemas/responses.go (24)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1688-1717)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1655-1655)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1644-1644)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1719-1722)
  • ResponsesResponseUsage (236-239)
  • ResponsesResponseInputTokens (241-243)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/schemas/providers/cohere/chat.go (1)
plugins/mocker/main.go (1)
  • Usage (138-142)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check

@TejasGhatte TejasGhatte force-pushed the 10-12-feat_added_responses_stream_in_cohere branch from 4d09345 to 3490ee9 Compare October 13, 2025 06:46
@TejasGhatte TejasGhatte changed the base branch from graphite-base/608 to 10-12-feat_bedrock_responses_streaming_added October 13, 2025 06:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
core/schemas/providers/cohere/responses.go (1)

625-634: Fix citation end event mapping.

Line 629 still re-emits response.output_text.annotation.added with no annotation payload and assigns ContentIndex from chunk.Index, so consumers see a duplicate “added” event that points at the wrong content block. Please emit a completion signal instead (e.g. response.content_part.done) while preserving the original content index, and keep the annotation index separate so clients can reconcile the start/end pair.

-	case StreamEventCitationEnd:
-		if chunk.Index != nil {
-			// Citation end - indicate annotation is complete
-			return &schemas.BifrostResponse{
-				ResponsesStreamResponse: &schemas.ResponsesStreamResponse{
-					Type:            schemas.ResponsesStreamResponseTypeOutputTextAnnotationAdded,
-					SequenceNumber:  sequenceNumber,
-					ContentIndex:    chunk.Index,
-					AnnotationIndex: chunk.Index,
-				},
-			}, nil, false
-		}
-		return nil, nil, false
+	case StreamEventCitationEnd:
+		response := &schemas.BifrostResponse{
+			ResponsesStreamResponse: &schemas.ResponsesStreamResponse{
+				Type:           schemas.ResponsesStreamResponseTypeContentPartDone,
+				SequenceNumber: sequenceNumber,
+			},
+		}
+
+		if chunk.Delta != nil && chunk.Delta.Message != nil && chunk.Delta.Message.Citations != nil {
+			response.ResponsesStreamResponse.ContentIndex = schemas.Ptr(chunk.Delta.Message.Citations.ContentIndex)
+		}
+
+		response.ResponsesStreamResponse.AnnotationIndex = chunk.Index
+		return response, nil, false
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d09345 and 3490ee9.

📒 Files selected for processing (6)
  • core/providers/cohere.go (6 hunks)
  • core/schemas/providers/cohere/chat.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • core/schemas/providers/cohere/types.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/schemas/responses.go
🧰 Additional context used
🧬 Code graph analysis (3)
core/schemas/providers/cohere/responses.go (2)
core/schemas/responses.go (25)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesResponseInputTokens (241-243)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1689-1718)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1656-1656)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1645-1645)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1679-1679)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1720-1723)
  • ResponsesResponseUsage (236-239)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/providers/cohere.go (7)
core/schemas/bifrost.go (15)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • ErrorField (426-433)
  • BifrostResponseExtraFields (349-358)
  • BifrostErrorExtraFields (435-439)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (2)
  • Provider (194-219)
  • PostHookRunner (191-191)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/utils.go (1)
  • Ptr (12-14)
core/schemas/providers/cohere/types.go (3)
  • CohereStreamEvent (379-384)
  • CohereError (411-415)
  • CohereEmbeddingResponse (232-239)
core/schemas/responses.go (2)
  • ResponsesStreamResponse (1689-1718)
  • ResponsesStreamResponseStruct (1720-1723)
core/schemas/providers/cohere/chat.go (1)
plugins/mocker/main.go (1)
  • Usage (138-142)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 10-12-feat_bedrock_responses_streaming_added to graphite-base/608 October 13, 2025 07:29
@TejasGhatte TejasGhatte force-pushed the 10-12-feat_added_responses_stream_in_cohere branch from 3490ee9 to b509450 Compare October 13, 2025 08:14
@TejasGhatte TejasGhatte changed the base branch from graphite-base/608 to 10-12-feat_bedrock_responses_streaming_added October 13, 2025 08:15
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 10-12-feat_bedrock_responses_streaming_added to graphite-base/608 October 13, 2025 08:16
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
core/providers/cohere.go (1)

553-553: Minor: Use http.ErrHandlerTimeout consistently.

Line 553 checks for fasthttp.ErrTimeout in the context of an http.Client operation, but should use http.ErrHandlerTimeout for consistency with the standard library HTTP client.

Apply this diff:

-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
+		if errors.Is(err, http.ErrHandlerTimeout) || errors.Is(err, context.DeadlineExceeded) {

The same issue exists on line 581.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3490ee9 and b509450.

📒 Files selected for processing (6)
  • core/providers/cohere.go (5 hunks)
  • core/providers/gemini.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/schemas/responses.go
  • core/schemas/providers/cohere/embedding.go
🧰 Additional context used
🧬 Code graph analysis (4)
core/providers/gemini.go (2)
core/schemas/provider.go (1)
  • PostHookRunner (191-191)
core/schemas/bifrost.go (3)
  • BifrostResponsesRequest (197-203)
  • BifrostStream (383-386)
  • BifrostError (409-418)
core/schemas/providers/cohere/responses.go (3)
core/schemas/responses.go (25)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesResponseInputTokens (241-243)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1689-1718)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1656-1656)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1645-1645)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1679-1679)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1720-1723)
  • ResponsesResponseUsage (236-239)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/providers/cohere.go (7)
core/schemas/bifrost.go (15)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • RequestCancelled (378-378)
  • BifrostResponseExtraFields (349-358)
  • BilledLLMUsage (333-338)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (7)
  • Provider (194-219)
  • PostHookRunner (191-191)
  • ErrProviderJSONMarshaling (26-26)
  • ErrRequestCancelled (23-23)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • DefaultStreamBufferSize (17-17)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/utils.go (1)
  • Ptr (12-14)
core/schemas/providers/cohere/types.go (4)
  • CohereStreamEvent (379-384)
  • StreamEventMessageEnd (374-374)
  • CohereError (411-415)
  • CohereEmbeddingResponse (232-239)
core/schemas/providers/cohere/embedding.go (1)
  • ToCohereEmbeddingRequest (6-62)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (26)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (12)
core/providers/gemini.go (1)

211-218: LGTM! Clean delegation pattern.

The ResponsesStream implementation correctly delegates to ChatCompletionStream using a combined post-hook runner to convert chat completion chunks to responses stream format. This approach maintains consistency with the non-streaming Responses method.

core/schemas/providers/cohere/responses.go (4)

432-453: LGTM! MessageStart event properly initialized.

The message-start event correctly creates an OutputItemAdded response with empty content. The initialization is safe and follows the expected streaming event structure.


454-486: LGTM! Proper nil checks for nested Content access.

The code correctly verifies that Delta, Message, and Content are all non-nil before accessing Content.Type and Content.Text, preventing potential panics.


628-640: Verify citation-end event mapping.

The CitationEnd event emits an OutputTextAnnotationAdded event without an annotation payload, using only the index. This may not provide sufficient information for clients to match citation start/end pairs.

Consider emitting a ContentPartDone event or including the original citation data so downstream consumers can properly reconcile the citation lifecycle.

Based on past review discussion, is ContentPartDone the appropriate event type here, or should the annotation data be preserved?


656-675: LGTM! Proper usage token handling with nil checks.

The code correctly checks for the presence of Usage.Tokens before accessing InputTokens and OutputTokens, and properly handles CachedTokens mapping to InputTokensDetails. This addresses the critical nil-check issue from previous reviews.

core/providers/cohere.go (4)

188-224: LGTM! ChatCompletion implementation is correct.

The method properly:

  • Checks operation permissions
  • Converts the request to Cohere format
  • Handles the request through the shared handler
  • Maps the response with appropriate metadata
  • Includes raw response when configured

486-519: LGTM! Responses implementation follows the established pattern.

The method correctly delegates to the shared chat completion handler and converts the response to Responses format using ToResponsesBifrostResponse. The metadata fields are properly populated.


521-716: LGTM! Comprehensive ResponsesStream implementation.

The streaming implementation properly:

  • Creates and configures the HTTP streaming request
  • Handles SSE event parsing with proper state tracking
  • Converts Cohere events to Bifrost responses using ToBifrostResponsesStream
  • Propagates billed usage information for message-end events (addressing past review feedback)
  • Handles stream end and error cases correctly

The addition of billed usage propagation on lines 666-674 addresses the major issue flagged in previous reviews.


718-800: LGTM! Embedding implementation is complete and correct.

The method properly:

  • Validates operation permissions
  • Converts the request to Cohere embedding format
  • Executes the HTTP request with proper error handling
  • Parses both the structured and raw responses
  • Populates all metadata fields correctly
core/schemas/providers/cohere/types.go (3)

313-314: Breaking change noted: Token fields are now non-pointer.

The change from *float64 to *float64 for InputTokens and OutputTokens maintains the pointer type. However, past review comments suggested these should be non-pointer float64 fields. Please verify this is the intended design.

Clarify whether token fields should be:

  1. Pointers (current implementation) - allows distinguishing between "not provided" (nil) vs "zero tokens" (0.0)
  2. Non-pointers (past review suggestion) - simpler but loses nil optionality

Based on the usage in responses.go lines 103-108, the code expects pointers with nil checks, so the current implementation appears intentional.


324-356: LGTM! New citation and source types are well-structured.

The new CohereCitationType, CohereSourceType enums and updated CohereCitation/CohereSource structures provide clear type safety and comprehensive field coverage for citation handling. The design supports both tool and document sources appropriately.


395-406: LGTM! Stream message types updated correctly.

The CohereStreamMessage and CohereStreamContent types are properly structured to handle streaming events. The Type field using CohereContentBlockType (non-pointer) is appropriate for required fields in streaming contexts.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3490ee9 and b509450.

📒 Files selected for processing (6)
  • core/providers/cohere.go (5 hunks)
  • core/providers/gemini.go (1 hunks)
  • core/schemas/providers/cohere/embedding.go (1 hunks)
  • core/schemas/providers/cohere/responses.go (2 hunks)
  • core/schemas/providers/cohere/types.go (9 hunks)
  • core/schemas/responses.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/schemas/responses.go
  • core/schemas/providers/cohere/embedding.go
🧰 Additional context used
🧬 Code graph analysis (4)
core/providers/gemini.go (2)
core/schemas/provider.go (1)
  • PostHookRunner (191-191)
core/schemas/bifrost.go (3)
  • BifrostResponsesRequest (197-203)
  • BifrostStream (383-386)
  • BifrostError (409-418)
core/providers/cohere.go (6)
core/schemas/bifrost.go (13)
  • BifrostChatRequest (189-195)
  • BifrostResponse (238-254)
  • BifrostError (409-418)
  • Cohere (59-59)
  • ChatCompletionRequest (104-104)
  • RequestType (99-99)
  • BifrostResponsesRequest (197-203)
  • ResponsesRequest (106-106)
  • BifrostStream (383-386)
  • ResponsesStreamRequest (107-107)
  • BifrostResponseExtraFields (349-358)
  • BifrostEmbeddingRequest (205-211)
  • EmbeddingRequest (108-108)
core/schemas/providers/cohere/chat.go (1)
  • ToCohereChatCompletionRequest (8-179)
core/schemas/provider.go (2)
  • Provider (194-219)
  • PostHookRunner (191-191)
core/schemas/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (11-80)
core/schemas/providers/cohere/types.go (3)
  • CohereStreamEvent (379-384)
  • CohereError (411-415)
  • CohereEmbeddingResponse (232-239)
core/schemas/providers/cohere/embedding.go (1)
  • ToCohereEmbeddingRequest (6-62)
core/schemas/providers/cohere/responses.go (3)
core/schemas/responses.go (25)
  • ResponsesExtendedResponseUsage (229-234)
  • ResponsesResponseInputTokens (241-243)
  • ResponsesMessageTypeMessage (256-256)
  • ResponsesInputMessageRoleAssistant (566-566)
  • ResponsesMessage (280-292)
  • ResponsesMessageContent (573-576)
  • ResponsesStreamResponse (1689-1718)
  • ResponsesStreamResponseTypeOutputItemAdded (1630-1630)
  • ResponsesMessageContentBlockType (617-617)
  • ResponsesMessageContentBlock (631-642)
  • ResponsesOutputMessageContentTypeText (624-624)
  • ResponsesOutputMessageContentTypeReasoning (626-626)
  • ResponsesStreamResponseTypeContentPartAdded (1633-1633)
  • ResponsesStreamResponseTypeOutputTextDelta (1637-1637)
  • ResponsesStreamResponseTypeContentPartDone (1634-1634)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1656-1656)
  • ResponsesMessageTypeFunctionCall (261-261)
  • ResponsesToolMessage (693-716)
  • ResponsesStreamResponseTypeFunctionCallArgumentsAdded (1643-1643)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1645-1645)
  • ResponsesOutputMessageContentTextAnnotation (669-680)
  • ResponsesStreamResponseTypeOutputTextAnnotationAdded (1679-1679)
  • ResponsesStreamResponseTypeCompleted (1626-1626)
  • ResponsesStreamResponseStruct (1720-1723)
  • ResponsesResponseUsage (236-239)
core/schemas/providers/cohere/types.go (15)
  • CohereStreamEvent (379-384)
  • StreamEventMessageStart (364-364)
  • StreamEventContentStart (365-365)
  • CohereContentBlockTypeText (126-126)
  • CohereContentBlockTypeThinking (128-128)
  • StreamEventContentDelta (366-366)
  • StreamEventContentEnd (367-367)
  • StreamEventToolPlanDelta (368-368)
  • StreamEventToolCallStart (369-369)
  • StreamEventToolCallDelta (370-370)
  • StreamEventToolCallEnd (371-371)
  • StreamEventCitationStart (372-372)
  • StreamEventCitationEnd (373-373)
  • StreamEventMessageEnd (374-374)
  • StreamEventDebug (375-375)
core/schemas/bifrost.go (1)
  • BifrostResponse (238-254)
core/schemas/providers/cohere/types.go (1)
core/schemas/bifrost.go (1)
  • LogProbs (341-346)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (26)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (12)
core/providers/gemini.go (1)

211-218: LGTM! Clean delegation to chat completion stream.

The implementation correctly reuses ChatCompletionStream infrastructure for ResponsesStream by wrapping the post-hook runner with getResponsesChunkConverterCombinedPostHookRunner. This avoids code duplication and maintains consistency between the two streaming paths.

core/schemas/providers/cohere/responses.go (5)

103-116: LGTM! Token usage mapping is correct.

The usage conversion properly handles nil checks before dereferencing token pointers and correctly maps cached tokens to InputTokensDetails. The total token count is accurately computed from the sum of input and output tokens.


432-453: LGTM! Message start event mapping is correct.

The StreamEventMessageStart case properly initializes a ResponsesMessage with assistant role and empty content, then wraps it in an OutputItemAdded event. The implementation correctly sets the message ID and type.


454-486: LGTM! Content start with proper nil guards.

The StreamEventContentStart case has comprehensive nil checks (chunk.Delta != nil && chunk.Index != nil && chunk.Delta.Message != nil && chunk.Delta.Message.Content != nil) before accessing nested fields. The content type mapping from Cohere to Bifrost is accurate.


628-640: Verify citation-end event semantics.

The StreamEventCitationEnd case emits a ResponsesStreamResponseTypeOutputTextAnnotationAdded event with only indices but no annotation payload. This appears to signal completion by reusing the "added" event type.

Confirm whether:

  1. Downstream consumers expect a distinct "annotation done" event type, or
  2. They can distinguish end-of-annotation by the absence of an Annotation field

If a dedicated completion event is needed, consider adding a new response type like ResponsesStreamResponseTypeOutputTextAnnotationDone or reusing ResponsesStreamResponseTypeContentPartDone.


641-675: LGTM! Message end event properly finalizes the stream.

The StreamEventMessageEnd case correctly:

  • Creates a ResponsesStreamResponseTypeCompleted event
  • Maps token usage with proper nil checks for both Tokens and CachedTokens
  • Computes TotalTokens as the sum of input and output tokens
  • Returns isLastChunk=true to signal stream completion

The usage metadata handling is comprehensive and safe.

core/schemas/providers/cohere/types.go (1)

313-314: JSON serialization behavior change: tokens now mandatory.

The InputTokens and OutputTokens fields removed omitempty from their JSON tags, making them mandatory in serialized output even when zero. The Go types remain *float64 pointers, so existing dereference operations are correct. This is a minor serialization change but not a breaking change in Go code.

core/providers/cohere.go (5)

188-224: LGTM! ChatCompletion implementation follows best practices.

The method correctly:

  • Converts the request using ToCohereChatCompletionRequest
  • Handles errors from the request pipeline
  • Converts the response using ToBifrostResponse
  • Populates all required ExtraFields including provider, model, request type, and latency
  • Conditionally includes the raw response based on configuration

The implementation is clean and follows the established patterns.


385-388: LGTM! Content.Text access properly guarded.

The nil check chain (event.Delta != nil && event.Delta.Message != nil && event.Delta.Message.Content != nil && event.Delta.Message.Content.Text != nil) is comprehensive and prevents panic when any intermediate field is nil. This correctly addresses the issue flagged in previous reviews.


486-519: LGTM! Responses method correctly reuses chat infrastructure.

The implementation properly:

  • Converts requests using ToCohereResponsesRequest
  • Delegates to the shared handleCohereChatCompletionRequest handler
  • Converts responses using ToResponsesBifrostResponse for Responses-specific format
  • Sets RequestType to ResponsesRequest (not ChatCompletionRequest)
  • Follows the same error handling and field population patterns as ChatCompletion

The code reuse and consistency are excellent.


666-674: LGTM! Billed usage properly mapped from stream events.

The code correctly extracts BilledUnits from the final MessageEnd event and populates ExtraFields.BilledUsage with proper nil checks. This addresses the concern raised in previous reviews about missing billed usage in response streams.


718-800: LGTM! Embedding implementation is complete and correct.

The method properly:

  • Converts requests using ToCohereEmbeddingRequest
  • Makes requests to the correct /v2/embed endpoint
  • Handles errors with appropriate logging
  • Converts responses using ToBifrostResponse
  • Populates all required ExtraFields (provider, model, request type, latency)
  • Conditionally includes raw response data

The implementation is thorough and follows established patterns.

Copy link
Contributor

akshaydeo commented Oct 13, 2025

Merge activity

  • Oct 13, 9:55 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Oct 13, 9:59 AM UTC: Graphite couldn't merge this PR because it had merge conflicts.
  • Oct 13, 10:00 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Oct 13, 10:01 AM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo changed the base branch from graphite-base/608 to main October 13, 2025 09:57
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 10-12-feat_added_responses_stream_in_cohere branch from b509450 to 14a101a Compare October 13, 2025 10:00
@akshaydeo akshaydeo merged commit b42310e into main Oct 13, 2025
3 of 4 checks passed
@akshaydeo akshaydeo deleted the 10-12-feat_added_responses_stream_in_cohere branch October 13, 2025 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants