Skip to content

Conversation

@enyst
Copy link
Collaborator

@enyst enyst commented Dec 6, 2025

Summary

  • Ensures the assistant tool_use is immediately followed by a tool_result that references the same tool_call_id when the user rejects an action.
  • Reverts forcing string serialization for tool results. Anthropic supports list-serialized content; the real requirement is strict adjacency between tool_use and tool_result.
  • Adds a focused unit test verifying ordering and id-matching for the rejection path.
  • Adds a minimal repro script to validate against Anthropic via the LiteLLM eval proxy.

Root cause

  • Anthropic rejects requests when a tool_result is not the next message immediately after its tool_use (same call id). Any intervening message (even a user/system/meta message) causes a 400: "tool_use ids were found without tool_result blocks immediately after".

What changed in this PR

  • sdk/event/llm_convertible/observation.py:
    • Keep tool_call_id threading in ObservationEvent and UserRejectObservation tool messages.
    • Do NOT force string serialization (reverted) — provider compatibility is fine with list serializer.
  • tests/sdk/event/test_user_reject_tool_result_order.py:
    • Asserts that ActionEvent (tool_use) followed by UserRejectObservation produces messages where the tool_result immediately follows the tool_use with matching tool_call_id.
  • examples/repro_reject_anthropic.py:
    • Small script to exercise the flow against Anthropic via the eval proxy.

Live validation against Anthropic

  • Environment: eval proxy to Anthropic Claude Sonnet 4.5
  • How to run:
    cd agent-sdk
    uv run python examples/repro_reject_anthropic.py
  • Expected/observed log excerpt:
    • LiteLLM completion() model= anthropic/claude-sonnet-4-5-20250929; provider = litellm_proxy
    • UserRejectObservation emitted referencing original tool_call_id
    • Second run completed without Anthropic tool_result error.

Why this fixes the issue

  • The user rejection now reliably manifests as a tool_result in the very next message after the corresponding tool_use, which satisfies Anthropic’s contract. The unit test enforces ordering and id-matching. The repro script demonstrates the same with a live Anthropic call via the eval proxy.

Tests

  • Unit test added: tests/sdk/event/test_user_reject_tool_result_order.py
  • Pre-commit: passing locally across the repo

Closes

Notes

  • Title kept for continuity; description updated to reflect the revert and live validation.

enyst added 2 commits December 6, 2025 11:49
…and are serialized as plain text to satisfy Anthropic requirements\n\nAlso set force_string_serializer=True for ObservationEvent payloads so tool results are always accepted.\n\nCo-authored-by: openhands <openhands@all-hands.dev>
…ately after tool_use and forces string serializer\n\nValidates fix for missing tool_result after user rejection.\n\nCo-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Contributor

github-actions bot commented Dec 6, 2025

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/event/llm_convertible
   observation.py621870%81–86, 89, 100–101, 106, 122–125, 130, 139–140, 145
TOTAL12425565354% 

enyst added 2 commits December 6, 2025 13:25
- shorten comments to pass E501 on observation tool-result messages
- no functional changes beyond commit 953a3a7
- Revert force_string_serializer on ObservationEvent/UserRejectObservation
- Root cause is ordering/adjacency of tool_use/tool_result, not list serialization
- Keep unit test; add repro script for Anthropic via eval proxy
@openhands-ai
Copy link

openhands-ai bot commented Dec 6, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Pre-commit checks
    • Run tests
    • Check Documented Examples

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1344 at branch `openhands/fix-tool-result-on-reject`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

@enyst enyst changed the title fix: emit tool_result for user rejection and force plain-text tool messages fix: correct tool_result pairing after user rejection; add Anthropic repro; keep list serializer Dec 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants