Python: Bug: Incorrect token usage reporting in Semantic Kernel SDK Agents – Intermediate LLM calls not included #13173

@NISHANTSHRIVASTAV

Describe the bug
When using the Semantic Kernel SDK with the Group Chat Orchestration pattern, the token usage reported by the SDK does not match the usage shown in the Azure portal. The SDK's numbers are lower.

To Reproduce
Steps to reproduce the behavior:

  1. Create an agent using ChatCompletionAgent.
  2. Execute the agent and retrieve token usage via the Semantic Kernel SDK.
  3. Compare the SDK’s token usage with the Azure portal’s token usage for the same request.
  4. Observe that the token counts differ.

Expected behavior
Token usage reported by the Semantic Kernel SDK for Group Chat Orchestration should match the token usage displayed in the Azure portal.

Screenshots

Semantic kernel SDK single agent token usage:

Image

Azure portal single agent token usage:

Image

Platform

  • Language: Python
  • Source: [semantic-kernel==1.36.0, python 3.13.3]
  • AI model: azure-gpt-4o (Creating agent using ChatCompletionAgent)
  • IDE: VS Code
  • OS: Windows

Additional Context

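The SDK currently appears to report token usage for the final LLM call only. It does not include the intermediate LLM calls made while the agent is invoking tools (each tool result is sent back to the model in a further call). This likely explains the under-reporting compared to the Azure portal, which counts every call.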
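
To illustrate the suspected cause, here is a minimal, self-contained sketch of the difference between reporting only the last call and summing every call in a tool-calling loop. It does not use the Semantic Kernel API: `Usage`, `last_call_usage`, and `aggregate_usage` are hypothetical names made up for this example, and the token counts are illustrative, not taken from the screenshots above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Token counts for one LLM call (hypothetical stand-in, not an SDK type)."""

    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.prompt_tokens + other.prompt_tokens,
            self.completion_tokens + other.completion_tokens,
        )


def last_call_usage(calls: list[Usage]) -> Usage:
    """What the SDK seems to report today: usage of the final call only."""
    return calls[-1] if calls else Usage()


def aggregate_usage(calls: list[Usage]) -> Usage:
    """What the portal presumably counts: the sum over every call."""
    total = Usage()
    for call in calls:
        total = total + call
    return total


# Illustrative numbers only: one call that requests a tool,
# then a second call that turns the tool result into the final answer.
calls = [
    Usage(prompt_tokens=120, completion_tokens=25),
    Usage(prompt_tokens=180, completion_tokens=40),
]

reported = last_call_usage(calls)
billed = aggregate_usage(calls)
print(f"last call only: {reported.total_tokens}, all calls: {billed.total_tokens}")
```

If this is indeed what happens, the gap between the two numbers grows with the number of tool round trips in a single agent invocation, which would match the discrepancy described here.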

    Labels

    bug (Something isn't working), python (Pull requests for the Python Semantic Kernel), triage
