feat: fallbacks added to vk provider routing #557
Walkthrough
Adds request ID propagation for fallbacks, updates logging to capture parent-child request IDs, implements weighted provider routing with auto-generated fallbacks, enables SGL embeddings and Gemini transcription-aware responses, preloads provider configs in RDB, adds new configstore migrations, adjusts the HTTP server's default config directory, and updates UI pages to lazy-load governance-gated data.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Client
participant HTTP as HTTP Handler
participant Router as VKProviderRoutingMiddleware
participant Downstream as Bifrost Core
Note over Router: Weighted selection + auto-fallbacks
Client->>HTTP: POST /v1/chat/completions (with VK)
HTTP->>Router: Middleware
Router->>Router: Select provider by weight
Router->>Router: Build fallbacks (exclude chosen, sort by weight)
Router->>HTTP: Set model and fallbacks in request body
HTTP->>Downstream: Forward request (model + fallbacks)
Downstream-->>Client: Response/stream
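The routing steps in the diagram above (select a provider by weight, then build fallbacks from the remaining providers sorted by weight) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ProviderConfig`, `pickWeighted`, `buildFallbacks`, and the `provider/model` string format are all assumptions made for the example, as is the choice to keep zero-weight providers in the fallback list.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// ProviderConfig is a hypothetical stand-in for one provider entry on a virtual key.
type ProviderConfig struct {
	Provider string
	Weight   float64
}

// totalWeight sums the positive weights of all providers.
func totalWeight(configs []ProviderConfig) float64 {
	sum := 0.0
	for _, c := range configs {
		if c.Weight > 0 {
			sum += c.Weight
		}
	}
	return sum
}

// pickWeighted returns the index of the provider whose cumulative weight range
// contains r, where r is expected to lie in [0, totalWeight(configs)).
// Providers with non-positive weight are never picked. It returns -1 if no
// provider has a positive weight.
func pickWeighted(configs []ProviderConfig, r float64) int {
	cumulative := 0.0
	last := -1
	for i, c := range configs {
		if c.Weight <= 0 {
			continue
		}
		cumulative += c.Weight
		last = i
		if r < cumulative {
			return i
		}
	}
	return last // covers floating-point rounding at the upper edge
}

// buildFallbacks lists every provider except the chosen one, ordered by
// descending weight, formatted as "provider/model" (an assumed format).
func buildFallbacks(configs []ProviderConfig, chosen int, model string) []string {
	rest := make([]ProviderConfig, 0, len(configs))
	for i, c := range configs {
		if i != chosen {
			rest = append(rest, c)
		}
	}
	sort.SliceStable(rest, func(a, b int) bool { return rest[a].Weight > rest[b].Weight })
	out := make([]string, 0, len(rest))
	for _, c := range rest {
		out = append(out, c.Provider+"/"+model)
	}
	return out
}

func main() {
	configs := []ProviderConfig{
		{Provider: "openai", Weight: 0.7},
		{Provider: "anthropic", Weight: 0.2},
		{Provider: "azure", Weight: 0.1},
	}
	r := rand.Float64() * totalWeight(configs)
	chosen := pickWeighted(configs, r)
	if chosen < 0 {
		fmt.Println("no provider with positive weight")
		return
	}
	fmt.Println("model:", configs[chosen].Provider+"/gpt-4o")
	fmt.Println("fallbacks:", buildFallbacks(configs, chosen, "gpt-4o"))
}
```

Taking the random draw as a parameter keeps the selection deterministic under test; only `main` touches `math/rand`.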
sequenceDiagram
autonumber
actor Client
participant Core as Core (bifrost.go)
participant Provider as Provider Worker
participant Log as Logging Plugin
Note over Core,Provider: Fallback handling with request IDs
Client->>Core: Request
Core->>Provider: Try primary
alt failure -> fallback
Core->>Core: Generate fallback UUID
Core->>Provider: Invoke fallback with ctx[fallback-request-id]
Core->>Log: PreHook (RequestID=fallback, ParentRequestID=original)
Provider-->>Core: Result
Core->>Log: PostHook (uses fallback RequestID)
else success
Core->>Log: PreHook (RequestID=original)
Provider-->>Core: Result
Core->>Log: PostHook
end
Core-->>Client: Final response
sequenceDiagram
autonumber
actor Client
participant SGL as SGL Provider
participant Core as SGL Adapter
Note over Core: Embeddings implemented
Client->>Core: Embeddings request
Core->>SGL: POST /v1/embeddings
SGL-->>Core: Embeddings response
Core-->>Client: Normalized response
sequenceDiagram
autonumber
actor Client
participant Gemini as Gemini Adapter
Note over Gemini: Transcription-aware
Client->>Gemini: Chat response
Gemini->>Gemini: Detect transcription vs chat
alt Transcription
Gemini-->>Client: audio.transcription response
else Chat completion
Gemini-->>Client: chat.completion with choices and usage
end
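As a rough illustration of the branch in the diagram above, the sketch below shapes one provider result into either an `audio.transcription` response or a `chat.completion` response with choices and usage. How the adapter actually detects a transcription is not shown in this page, so the `isTranscription` flag is a placeholder, and the `Response`, `Choice`, `Message`, and `Usage` types and their JSON field names are assumptions for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage, Message, Choice, and Response are hypothetical, simplified types.
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Choice struct {
	Index   int     `json:"index"`
	Message Message `json:"message"`
}

type Response struct {
	Object  string   `json:"object"`
	Text    string   `json:"text,omitempty"`
	Choices []Choice `json:"choices,omitempty"`
	Usage   *Usage   `json:"usage,omitempty"`
}

// shapeResponse picks the response shape. isTranscription is a placeholder for
// whatever detection the adapter performs; that logic is not modeled here.
func shapeResponse(isTranscription bool, text string, usage Usage) Response {
	if isTranscription {
		return Response{Object: "audio.transcription", Text: text}
	}
	return Response{
		Object: "chat.completion",
		Choices: []Choice{
			{Index: 0, Message: Message{Role: "assistant", Content: text}},
		},
		Usage: &usage,
	}
}

func main() {
	usage := Usage{PromptTokens: 3, CompletionTokens: 2, TotalTokens: 5}
	for _, isTranscription := range []bool{true, false} {
		out, err := json.MarshalIndent(shapeResponse(isTranscription, "hello world", usage), "", "  ")
		if err != nil {
			fmt.Println("marshal error:", err)
			return
		}
		fmt.Println(string(out))
	}
}
```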