Description
What problem does this feature solve?
Currently, there is no way to access metadata from LLM calls, such as token counts and model names. This prevents the integration of essential observability and cost-tracking tools like Langfuse. Without this data, it's impossible to monitor the performance and cost of applications built with Midscene in a production environment.
Describe the solution you'd like
I propose that Midscene implement a central, inspectable object that aggregates LLM usage metrics throughout the lifecycle of a request or session. This approach is inspired by the clean and simple implementation seen in the Stagehand documentation.
Instead of wiring up complex callbacks or parsing return values, a user could simply read the metrics from a static or session-level object.
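For context, the raw numbers are typically already available at the provider level: OpenAI-compatible chat completion responses include a usage block that a session-level aggregator could fold in after every call. A rough sketch, assuming the OpenAI Python SDK (not an existing Midscene hook, and the model name is only illustrative):

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": "Plan the next UI action"}],
)
# Every OpenAI-style response reports per-call token usage here:
usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)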
What does the proposed API look like?
# Run some operations with Midscene
midscene.run("My first operation...")
midscene.run("My second operation...")

# Access the aggregated metrics.
# The `metrics` object holds the combined usage from all previous calls.
current_usage = midscene.metrics
print(current_usage)

# Expected output:
# {
#     "total_prompt_tokens": 512,
#     "total_completion_tokens": 256,
#     "total_tokens": 768,
#     "calls": 2
# }

# Now we can easily push this to any observability tool.
# For example, with Langfuse (the exact call is illustrative; see the
# Langfuse SDK docs for the real API):
langfuse.track("MidsceneUsage", value=current_usage["total_tokens"], metadata=current_usage)

# The metrics could be reset when needed, e.g. between sessions:
midscene.reset_metrics()
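For concreteness, here is a minimal sketch of the aggregator that could back `midscene.metrics` and `midscene.reset_metrics()`; every name below is hypothetical, not an existing Midscene API:

from dataclasses import asdict, dataclass

@dataclass
class LLMMetrics:
    total_prompt_tokens: int = 0
    total_completion_tokens: int = 0
    total_tokens: int = 0
    calls: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Invoked once per LLM request with the usage reported by the provider.
        self.total_prompt_tokens += prompt_tokens
        self.total_completion_tokens += completion_tokens
        self.total_tokens += prompt_tokens + completion_tokens
        self.calls += 1

    def reset(self) -> None:
        # Would back the proposed `midscene.reset_metrics()`.
        self.total_prompt_tokens = 0
        self.total_completion_tokens = 0
        self.total_tokens = 0
        self.calls = 0

# Module-level singleton; `midscene.metrics` could expose `asdict(metrics)`.
metrics = LLMMetrics()

Keeping all counters on one object makes the reset semantics trivial, and a per-model breakdown (for the model names mentioned above) could be added later without changing the access pattern.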