feat: add per-token latency tracking for streaming responses (#596)
## Summary
Add per-token latency tracking for streaming responses to improve observability and performance monitoring.
## Changes
- Added latency tracking for each token in streaming responses across all providers (Anthropic, Bedrock, Cohere, Gemini, OpenAI)
- Implemented two types of latency measurements:
  - Per-chunk latency: time since the previous chunk was received
  - Total latency: time from the start of the request to the final chunk
- Added a new Prometheus metric `bifrost_stream_token_latency_seconds` to track token latency
- Enhanced the telemetry plugin to record these metrics for each streaming chunk
- Improved JSON marshaling for BifrostStream to prevent field conflicts
- Simplified the streaming response handler in HTTP transport
## Type of change
- [x] Feature
- [x] Refactor
## Affected areas
- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
## How to test
Test streaming responses with different providers and verify latency metrics are being recorded:
```sh
# Start Bifrost with telemetry plugin enabled
go run cmd/bifrost/main.go
# Make streaming requests to different providers
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"Write a short story"}],"stream":true}'
# Check Prometheus metrics
curl http://localhost:8000/metrics | grep bifrost_stream_token_latency_seconds
```
## Breaking changes
- [x] No
## Related issues
Improves observability for streaming responses, which helps diagnose performance issues.
## Security considerations
No security implications as this only adds internal performance tracking.
## Checklist
- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)