[Feature]: Expose LLM Metrics via a Central Metrics Object #1084

@sahil1610

Description

What problem does this feature solve?

Currently, there is no way to access metadata from LLM calls, such as token counts and model names. This prevents the integration of essential observability and cost-tracking tools like Langfuse. Without this data, it's impossible to monitor the performance and cost of applications built with Midscene in a production environment.

Describe the solution you'd like

I propose that Midscene implement a central, inspectable object that aggregates LLM usage metrics throughout the lifecycle of a request or session. This approach is inspired by the clean and simple implementation seen in the Stagehand documentation.

Instead of requiring complex callbacks or processing return values, a user could simply access the metrics from a static or session-level object.

What does the proposed API look like?

# Run some operations with Midscene
midscene.run("My first operation...")
midscene.run("My second operation...")

# Access the aggregated metrics
# The `.metrics` object would hold the combined usage from all previous calls.
current_usage = midscene.metrics

print(current_usage)
# Expected Output:
# {
#   "total_prompt_tokens": 512,
#   "total_completion_tokens": 256,
#   "total_tokens": 768,
#   "calls": 2
# }

# Now we can easily push this to any observability tool.
# For example, with Langfuse (illustrative; the exact call depends on the Langfuse SDK):
langfuse.track("MidsceneUsage", value=current_usage["total_tokens"], metadata=current_usage)

# The metrics could be reset when needed
midscene.reset_metrics()
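
To make the proposal concrete, here is a minimal sketch of what such an aggregator could look like internally. This is only an illustration of the accumulate / inspect / reset lifecycle described above; every name in it (LLMMetrics, record_call, snapshot) is hypothetical, not an existing Midscene API.

from dataclasses import dataclass, field
from threading import Lock


@dataclass
class LLMMetrics:
    total_prompt_tokens: int = 0
    total_completion_tokens: int = 0
    calls: int = 0
    _lock: Lock = field(default_factory=Lock, repr=False)

    def record_call(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Called once per LLM request with the usage reported by the provider.
        with self._lock:
            self.total_prompt_tokens += prompt_tokens
            self.total_completion_tokens += completion_tokens
            self.calls += 1

    def snapshot(self) -> dict:
        # Return a plain dict matching the expected output shape above.
        with self._lock:
            total = self.total_prompt_tokens + self.total_completion_tokens
            return {
                "total_prompt_tokens": self.total_prompt_tokens,
                "total_completion_tokens": self.total_completion_tokens,
                "total_tokens": total,
                "calls": self.calls,
            }

    def reset(self) -> None:
        # Equivalent of the proposed midscene.reset_metrics().
        with self._lock:
            self.total_prompt_tokens = 0
            self.total_completion_tokens = 0
            self.calls = 0


# Usage, mirroring the two-call example above:
metrics = LLMMetrics()
metrics.record_call(prompt_tokens=256, completion_tokens=128)
metrics.record_call(prompt_tokens=256, completion_tokens=128)
print(metrics.snapshot())
# {"total_prompt_tokens": 512, "total_completion_tokens": 256, "total_tokens": 768, "calls": 2}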
