
[bug] Token counts on Phoenix inconsistent vs. OpenRouter or model providers #2268

@nkim500

Description

Describe the bug
I have an OpenAI-Agents-SDK Agent that uses OpenRouter and LiteLLM to send (i) streaming and (ii) non-streaming messages to LLM providers (e.g. OpenAI, Anthropic). For non-streaming messages, the token counts for each span always match between the LLM providers and OpenRouter.

But for (i) streaming messages, the token counts shown in Phoenix are lower on every span than what the LLM providers and OpenRouter report.

Is this a bug in Phoenix, or is something wrong with the implementation on my end?

For example, here's an Agent trace with token counts for each span:
[screenshot: Phoenix trace view with per-span token counts]

This is what the actual token counts were for each span, in the same order:

tokens_prompt  tokens_completion  tokens_total
5325           84                 5409
9297           115                9412
9468           161                9629
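A sketch of one way such an undercount could happen (this is my assumption, not something confirmed from the Phoenix/OpenInference code; the chunk shapes and token numbers below are illustrative only): in OpenAI-style streaming, the content chunks carry no usage, and a final usage-only chunk is emitted only when `stream_options={"include_usage": True}` is set, so anything that aggregates usage from the stream but misses that final chunk will report partial or zero counts:

```python
# Hypothetical illustration of streaming token-usage reporting.
# Content chunks have usage=None; a final usage chunk appears only
# when the caller opts in via stream_options={"include_usage": True}.

def stream_chunks(include_usage: bool):
    # Simulated OpenAI-style stream; numbers are made up for illustration.
    yield {"choices": [{"delta": {"content": "Hel"}}], "usage": None}
    yield {"choices": [{"delta": {"content": "lo"}}], "usage": None}
    if include_usage:
        # Usage arrives only on this trailing chunk.
        yield {
            "choices": [],
            "usage": {"prompt_tokens": 5325, "completion_tokens": 84, "total_tokens": 5409},
        }

def record_usage(chunks):
    """Mimics an instrumentor scanning the stream for a usage payload."""
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage

print(record_usage(stream_chunks(include_usage=True)))   # full usage dict
print(record_usage(stream_chunks(include_usage=False)))  # None -> span undercounts
```

If the instrumentation relies on that trailing chunk (or estimates tokens itself when it is absent), that would explain why only the streaming spans diverge.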

The Agent code looks like this:
[screenshot: Agent code]

I auto-instrument the tracing with:

  • openinference-instrumentation-litellm==0.1.25
  • openinference-instrumentation-openai-agents==1.1.0

since practically all LLM messages are sent via LiteLLM (though not all via OpenRouter).
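For reference, the instrumentation setup is essentially the standard Phoenix/OpenInference registration pattern (a sketch; the project name and endpoint below are placeholders, not my actual values):

```python
from phoenix.otel import register
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

# Placeholder project/endpoint for illustration.
tracer_provider = register(
    project_name="my-agents",
    endpoint="http://localhost:6006/v1/traces",
)

# Auto-instrument both LiteLLM calls and the OpenAI Agents SDK.
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
OpenAIAgentsInstrumentor().instrument(tracer_provider=tracer_provider)
```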

I also saw this OpenRouter tracing page, but I think following it would create triple-nested spans and triple-count the token usage.

Additional context

arize-phoenix-client                        1.15.3
arize-phoenix-otel                          0.12.1
litellm                                     1.72.2
openai-agents                               0.2.11
openinference-instrumentation               0.1.35
opentelemetry-sdk                           1.36.0


Labels

  • bug (Something isn't working)
  • instrumantation: litellm (related to litellm and litellm proxy)
